Getting Started
About the Project:
This project is a Python-based text analysis pipeline that processes cryptocurrency-related articles, extracts insights from them, and answers user queries. The pipeline runs in stages: it acquires articles from the web, converts them to PDFs, extracts the text, preprocesses it, vectorizes it for analysis, performs sentiment analysis, and retrieves ranked answers using the BM25 algorithm. The codebase is modular, with contributions from multiple authors, and relies on a range of NLP and machine learning libraries.
Objectives:
Scrape and gather cryptocurrency-related articles from the web.
Extract and refine text from these articles for analysis.
Analyze sentiment and identify cryptocurrency-specific entities.
Deliver ranked, relevant answers to user queries based on the processed text.
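To illustrate the final objective, here is a minimal sketch of BM25 ranking in plain Python. It is not the project's actual code, which may rely on a library instead. The function names (tokenize, bm25_rank), the parameter defaults (k1=1.5, b=0.75), the simple regex tokenizer, and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score.

    Hypothetical helper: k1 and b are common defaults, not values from the project.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for idx, toks in enumerate(tokenized):
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1.0 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Illustrative documents, not data from the project.
docs = [
    "Bitcoin price rallies as investors turn bullish",
    "Ethereum developers ship a network upgrade",
    "Central banks discuss interest rates",
]
ranking = bm25_rank("bitcoin price", docs)
```

In this sketch, documents that share no terms with the query score zero, so only the first document receives a positive score. In the real pipeline the documents would presumably be the preprocessed article text rather than raw sentences.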