
Backend

Integration of NLP Model with Flask Backend

This section documents the transition of the NLP model from a CLI-based interface to a web-integrated backend built on Flask. The changes allow the entire model pipeline to run over HTTP requests, improving accessibility, automation, and frontend interaction. model.py is adapted for integration with the Flask backend (app.py) instead of a command-line interface: its process_query() function takes structured inputs (coin, query) and returns structured JSON output for rendering in the frontend.

The function `process_query(coin, query)` is the main bridge between the Flask backend and the NLP model. It takes a cryptocurrency name and user input as arguments and returns structured results suitable for JSON serialization and frontend display.

This function:

  • Analyzes sentiment using TextBlob

  • Retrieves top matching answers using BM25 and TF-IDF algorithms

  • Combines and sorts the results

  • Returns a dictionary with formatted output and raw data for logging

{
    "query": query,
    "coin": coin,
    "sentiment": "Positive (Score: 0.65)",
    "top_answers": [
        "1. Bitcoin is a decentralized cryptocurrency... (Score: 6.53)"
    ],
    "raw_top_results": [(sentence, score)]
}

The `raw_top_results` field is retained only for server-side logging purposes. It is removed before the frontend receives the data. This keeps logs comprehensive while maintaining frontend efficiency.
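The body of process_query() is not reproduced in this section; the sketch below illustrates the flow it describes (sentiment scoring, retrieval, ranking, formatting). The word-overlap scorer and the polarity stub are stand-ins for the real BM25/TF-IDF retrieval and TextBlob sentiment, and the SENTENCES corpus is illustrative:

```python
# Sketch of process_query(); overlap_score() and sentiment_stub() are
# stand-ins for the BM25/TF-IDF and TextBlob components used in model.py.

SENTENCES = [
    "Bitcoin is a decentralized cryptocurrency secured by proof-of-work.",
    "Ethereum supports smart contracts and decentralized applications.",
    "Market volatility increased after the latest halving event.",
]

def sentiment_stub(text):
    # Placeholder polarity in [-1, 1]; the real pipeline would use
    # TextBlob(text).sentiment.polarity instead.
    positive = {"good", "gain", "up", "bull"}
    negative = {"bad", "loss", "down", "bear"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(-1.0, min(1.0, score / max(len(words), 1)))

def overlap_score(query, sentence):
    # Crude relevance: number of shared lowercase tokens (BM25 stand-in).
    return float(len(set(query.lower().split()) & set(sentence.lower().split())))

def process_query(coin, query):
    polarity = sentiment_stub(query)
    label = "Positive" if polarity > 0.1 else "Negative" if polarity < -0.1 else "Neutral"
    scored = sorted(
        ((s, overlap_score(query, s)) for s in SENTENCES),
        key=lambda pair: pair[1],
        reverse=True,
    )
    top_results = scored[:3]
    return {
        "query": query,
        "coin": coin,
        "sentiment": f"{label} (Score: {polarity:.2f})",
        "top_answers": [
            f"{i+1}. {answer[:500]}... (Score: {score:.2f})"
            for i, (answer, score) in enumerate(top_results)
        ],
        "raw_top_results": top_results,
    }
```

The return value mirrors the dictionary shown above, including the `raw_top_results` field that app.py strips before responding.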


Query Logging via logger.py

Each processed query is logged into `query_logs.json` for tracking. The log entry includes:

  • Timestamp of the query

  • Coin name

  • User question

  • Detected sentiment

  • Top answer text and scores

This is handled by the `log_query()` function, which appends entries to a JSON array using Python's built-in `json` module.

Summary of Key Enhancements:


  1. Unified Query Handler for API:

def process_query(coin, query):

  • Replaces CLI interaction with a reusable backend-compatible function.

  • Accepts two parameters: the selected cryptocurrency (coin) and user question (query).

  • Returns a structured response dictionary that includes:

    • Original query

    • Coin name

    • Sentiment result (with score)

    • Ranked top 3 answers (as formatted strings)

    • Raw answer-score pairs for logging

Why this matters: Previously, the output was printed to the terminal. Now, it is returned to the Flask route /query where it can be serialized to JSON and sent to the frontend.


  2. Output Formatting for Frontend Rendering:

"top_answers": [
    f"{i+1}. {answer[:500]}... (Score: {score:.2f})"
    for i, (answer, score) in enumerate(top_results)
]
  • Ensures each result is trimmed and human-readable.

  • Designed for HTML rendering inside the web interface (index.html).
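With sample data, the comprehension above behaves as follows (the `(answer, score)` pairs are illustrative values only):

```python
# Demonstrates the top_answers formatting with sample (answer, score) pairs.
top_results = [
    ("Bitcoin is a decentralized cryptocurrency", 6.53),
    ("Ethereum enables smart contracts", 5.10),
]

top_answers = [
    f"{i+1}. {answer[:500]}... (Score: {score:.2f})"
    for i, (answer, score) in enumerate(top_results)
]
# top_answers[0] == "1. Bitcoin is a decentralized cryptocurrency... (Score: 6.53)"
```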


  3. Compatibility with Logging Module:

"raw_top_results": top_results
  • This line is included in the returned dictionary only for backend logging.

  • It is removed before sending the response to the frontend in app.py.


  4. Coin Validation Integration:

COINS = ["bitcoin", "ethereum", "solana", "dogecoin", "hamstercoin", "cardano", "general crypto"]
  • Now shared across model.py and app.py.

  • Ensures any coin selected in the frontend is validated before query processing.

  5. Removed Command-Line Interface:

  • The interactive CLI loop in model.py (which prompted the user to type the coin and query) has been removed.

  • Replaced with stateless processing suited for HTTP requests.

# query_logs.json

Purpose: It is used to store a history of user queries and system responses during the text analysis process.

Usage in code:

  • Error Tracking: Can log query failures (e.g., empty results or low BM25 scores).

  • Audit Trail: Used to validate if the system is returning accurate and relevant answers over time.

Key Features:

  • JSON Format: Stores logs in a structured JSON array for easy access and readability.

  • Log Entries: Each log typically contains a timestamp, query text, response, sentiment, and confidence score.

Importance:

  • Transparency: Enables retrospective review of system outputs.

  • Model Improvement: Helps analyze mismatches between query intent and retrieved sentences.

  • User Insight: Useful for visualizing user interest trends over time.

  • Testing & Debugging: Makes regression testing easier by replaying previous queries.

[
  {
    "timestamp": "2025-04-09T00:04:19.530114",
    "coin": "solana",
    "query": "can i trade in this coin?",
    "sentiment": "Neutral (Score: 0.28)",
    "answers": []
  },
  {
    "timestamp": "2025-04-09T00:04:47.992848",
    "coin": "dogecoin",
    "query": "trends",
    "sentiment": "Neutral (Score: 0.33)",
    "answers": []
  }
]
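The regression-testing use mentioned above (replaying previous queries) could be sketched as follows. `replay_logs` and the dummy handler are hypothetical helpers, not part of the project files listed here:

```python
import json

def replay_logs(log_path, handler):
    # Re-run every logged query through `handler` (e.g. process_query)
    # and collect (old_entry, new_result) pairs for comparison.
    with open(log_path, encoding="utf-8") as f:
        entries = json.load(f)
    return [(entry, handler(entry["coin"], entry["query"])) for entry in entries]
```

Comparing each old entry's `answers` against the fresh result would reveal regressions after model or data changes.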


# keywords.py

Purpose: It is responsible for extracting or managing key cryptocurrency-related terms or phrases from articles. These keywords are used to enhance search relevance, preserve domain-specific context during preprocessing, and improve user query matching in the BM25 retrieval step.

Key Features:

  • Crypto-Specific Vocabulary: Maintains a curated list of crypto terms (e.g., "blockchain", "DeFi", "halving", "altcoin").

  • Custom Stopword Exceptions: Prevents removal of important crypto terms during preprocessing (e.g., during stopword filtering).

  • Keyword Matching: May provide utility functions to identify or highlight keyword presence in text.

  • Query Expansion (if implemented): Can expand user queries with related keywords for broader match.

Usage in code:

  • Preprocessing (Step: preprocess_sentence)

  • Ensures that key crypto terms are not stemmed or lemmatized to preserve their semantic identity.

  • Prevents accidental removal during stopword or special character cleaning.

  • Sentiment or Relevance Analysis:

  • Used to filter or tag sentences that contain crypto-related terms for priority scoring.

  • BM25 Index Creation:

  • Can be used to label which sentences are “keyword-rich” to weigh them higher in ranking.

Importance:

  • Domain Sensitivity: Crypto-specific terms often don't behave like general English (e.g., “hodl” isn’t in any dictionary). Preserving these terms improves model accuracy.

  • Search Optimization: Ensures BM25 index or query expansion functions operate with context-aware terms.

  • User Intent Matching: Improves alignment between user queries and text content by anchoring on domain keywords.

# Define cryptocurrency-related terms
crypto_terms = {
    "bitcoin", "ethereum", "blockchain", "dogecoin", "litecoin", "ripple",
    "cardano", "solana", "polkadot", "chainlink", "uniswap", "binance",
    "coinbase", "ftx", "kraken", "defi", "nft", "metaverse", "web3", "usdt",
    "Crypto", "Cryptocurrency", "Digital currency", "Virtual currency", "Decentralized finance",
    "Web3", "Digital assets", "Crypto assets", "Bitcoin", "BTC", "Ethereum", "ETH", "Altcoins",
    "Stablecoins", "Tether", "USDC", "Memecoins", "Shiba Inu", "Crypto exchange",
    "Decentralized exchange", "DEX", "Centralized exchange", "CEX", "Trading pairs",
    "Order book", "Liquidity", "Trading volume", "Market cap", "Market capitalization",
    "Bull market", "Bear market", "Volatility", "Technical analysis", "Fundamental analysis",
    "Trading bots", "Crypto wallet", "Digital wallet", "Hardware wallet", "Cold wallet",
    "Software wallet", "Hot wallet", "Private key", "Public key", "Seed phrase",
    "Cryptography", "Security audit", "Decentralization", "Distributed ledger technology",
    "DLT", "Smart contracts", "Consensus mechanism", "Tokenomics", "DeFi protocols",
    "Yield farming", "Liquidity mining", "Polygon", "Arbitrum", "Optimism", "Sidechains",
    "Sharding", "Zero-knowledge proofs", "zk-SNARKs", "zk-STARKs", "Interoperability",
    "Cosmos", "Oracles", "Cryptographic primitives", "Merkle tree", "Byzantine F",
    "Crypto mining", "Proof-of-Work", "PoW", "Proof-of-Stake", "PoS", "Staking",
    "Mining rig", "Hash rate", "Validator", "Crypto regulation", "KYC",
    "Know Your Customer", "AML", "Anti-Money Laundering", "Compliance", "NFT marketplace",
    "Digital art", "Collectibles", "Minting", "Gas fees", "Crypto investment", "Portfolio",
    "Hodl", "DCA", "Dollar-cost averaging", "Yield", "APY", "DAOs",
    "Decentralized Autonomous Organizations", "Airdrop", "hamstercoin", "token",
    "altcoin", "market", "coin", "price", "trend", "exchange", "invest", "crypto currency"
}
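valid_query is imported from model.py in app.py but is not reproduced in this section. A plausible implementation against crypto_terms (lowercased once so matching is case-insensitive) might look like this; the function body is an assumption, shown with a small excerpt of the full set above:

```python
# Hypothetical valid_query(): a query counts as crypto-related if any of
# its tokens appears in crypto_terms. Excerpt of the full set shown above.
crypto_terms = {"bitcoin", "ethereum", "blockchain", "coin", "price", "trend", "Market cap"}

# Normalize once so matching is case-insensitive.
_terms_lower = {t.lower() for t in crypto_terms}

def valid_query(query):
    tokens = query.lower().split()
    return any(tok.strip("?.,!") in _terms_lower for tok in tokens)
```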

Result of keywords.py, written to a separate JSON file for future QA:

{
  "timestamp": "2025-04-09T15:11:36.657009",
  "coin": "dogecoin",
  "query": "can i trade in this coin?",
  "sentiment": "Neutral (Score: 0.28)",
  "answers": [
    {
      "text": "- Bitcoin trades at $80,378, down 2.51%, with resistance at\n$85,000 and support at $78,000.",
      "score": 5.9655
    },
    {
      "text": "This has further instigated shifts in global trade\nrelationships.",
      "score": 5.6794
    },
    {
      "text": "It\nseems like meme coins are losing ground to utility-focused coins in 2025.",
      "score": 5.2208
    }
  ]
}

# logger.py

Purpose: It is used to configure and manage logging across the entire crypto text analysis pipeline. It centralizes the setup of the logging system so that other modules can easily import and use a consistent logging format and level. This helps in monitoring, debugging, and auditing the system.

Key Features:

  • Centralized Logger Configuration: Defines logging settings (level, format, output file).

  • Reusable Across Modules: Any script (e.g., scraper.py, bm25.py) can import the configured logger and write logs.

  • File and Console Output: Optionally writes logs to both the terminal and a log file (e.g., pipeline.log).

  • Custom Format: Includes timestamps, log levels (INFO, WARNING, ERROR), and message content.

Usage in code:

  • Imported into modules like scraper.py or bm25.py to log progress updates (e.g., "Scraping Bitcoin - Page 1"), warnings (e.g., missing articles), and errors (e.g., failed network requests) using logger.info(), logger.warning(), and logger.error().

  • Ensures all log messages are consistently formatted and saved to both the console and a log file (pipeline.log), supporting easy debugging and execution tracking.

Importance:

  • Traceability: Tracks every step (e.g., "Preprocessing complete", "BM25 index created").

  • Debugging: Helps locate where and why failures occurred (e.g., network issues, parsing errors).

  • Monitoring Progress: Useful for long-running processes like scraping multiple pages.

  • Maintenance: Provides system-level visibility for developers working on different modules.
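The centralized configuration described above (shared format, console plus pipeline.log output) could be set up with Python's logging module roughly as follows. The helper name and file path are assumptions, since this section only reproduces the query-logging helper:

```python
import logging

def get_logger(name, log_file="pipeline.log"):
    # Shared format: timestamp, level, module name, message.
    formatter = logging.Formatter(
        "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on re-import
        file_handler = logging.FileHandler(log_file, encoding="utf-8")
        file_handler.setFormatter(formatter)
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
        logger.addHandler(console_handler)
    return logger
```

A module such as scraper.py would then call `get_logger("scraper").info("Scraping Bitcoin - Page 1")` and the message would reach both the console and pipeline.log.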

import os
import json
from datetime import datetime

# Get the current directory where this file is located
current_directory = os.path.dirname(os.path.abspath(__file__))
LOG_FILE = os.path.join(current_directory, "query_logs.json")

def log_query(coin, user_query, sentiment, top_results):
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "coin": coin,
        "query": user_query,
        "sentiment": sentiment,
        "answers": [
            {
                "text": sentence,
                "score": round(score, 4)
            } for sentence, score in top_results
        ]
    }

    try:
        if os.path.exists(LOG_FILE):
            with open(LOG_FILE, "r+", encoding="utf-8") as f:
                data = json.load(f)
                data.append(log_entry)
                f.seek(0)
                json.dump(data, f, indent=4)
        else:
            with open(LOG_FILE, "w", encoding="utf-8") as f:
                json.dump([log_entry], f, indent=4)
    except Exception as e:
        print(f"Error logging query: {e}")

# app.py

Purpose: It serves as the main entry point for the application, integrating all components of the text analysis pipeline. It handles user input, processes queries using the BM25 model, and returns relevant answers with associated sentiment. It also initializes required resources such as the cleaned data, sentiment scores, and BM25 index.

Key Features:

  • User Interaction: Accepts user queries (via terminal or web interface) and returns ranked answers.

  • BM25 Integration: Leverages the BM25 model to retrieve the most relevant sentences from the processed dataset.

  • Sentiment Display: Enhances answers with sentiment scores and categories (positive, negative, neutral).

  • Modular Calls: Ties together preprocessed text, BM25 search, and sentiment data from other modules.

  • Execution Control: Acts as the main script run to launch the full pipeline or demo the system.

Usage in code:

  • Executes BM25 search to retrieve and rank the top N relevant sentences.

  • Displays matched sentences along with metadata like BM25 score, sentiment polarity, and article source.

  • Optionally logs the query and results into query_logs.json for tracking and debugging.

Importance:

  • Integration Point: Bridges data preprocessing, keyword relevance, and sentiment analysis into one cohesive flow.

  • User Interface Layer: Acts as the main access point for users to interact with the system.

  • Testing & Deployment: Ideal for quickly testing the pipeline with different queries or deploying a minimal prototype.

from flask import Flask, request, jsonify, render_template
from model import process_query, valid_query, COINS
from flask_cors import CORS
from logger import log_query

app = Flask(__name__)
CORS(app)

# Homepage endpoint (GET and POST)
@app.route("/", methods=['GET', 'POST'])
def index():
    # Try to render an index.html template if available; otherwise return a simple message.
    try:
        return render_template("index.html")
    except Exception as e:
        return jsonify({"message": "Crypto Insights is Down right now please try again later!"}), 200
# Query endpoint (POST)
@app.route("/query", methods=["POST"])
def handle_query():

    data = request.get_json()
    if not data:
        return jsonify({"error": "No input data provided."}), 400

    coin = data.get("coin", "").lower().strip()
    query = data.get("query", "").strip()

    if not coin or not query:
        return jsonify({"error": "Please provide both coin and query."}), 400

    # Validate coin against the list defined in model.py
    if coin not in COINS:
        return jsonify({"error": f"Unsupported coin. Choose from: {', '.join(COINS)}."}), 400

    # Validate that the query contains some crypto-related keywords
    if not valid_query(query):
        return jsonify({"error": "Query seems unrelated to crypto. Try asking something like 'What's the trend of Bitcoin this week?'"}), 400

    # Process the query using model.py
    result = process_query(coin, query)

    # Extract for logging
    top_results = result.get("raw_top_results", [])
    sentiment = result.get("sentiment", "unknown")

    # Log the query
    log_query(coin, query, sentiment, top_results)
    # Removing raw_top_results so it doesn't show on the frontend
    result.pop("raw_top_results", None)
    # Logging ends

    return jsonify(result), 200


app.py (continued)

Endpoints:

    • / – serves the frontend index.html

    • /query – accepts JSON input (coin + query) and returns results

Endpoint Logic Breakdown:

/ Route (GET, POST)

@app.route("/", methods=['GET', 'POST'])
def index():
    try:
        return render_template("index.html")
    except Exception as e:
        return jsonify({"message": "Crypto Insights is Down right now please try again later!"}), 200




  • Loads the interface via index.html from the templates/ directory.

  • Falls back to a JSON message in case of errors (e.g., a missing template).

/query Route (POST)

Handles the backend logic:

@app.route("/query", methods=["POST"])
def handle_query():

    data = request.get_json()

    if not data:
        return jsonify({"error": "No input data provided."}), 400

Input Handling and Validation:

  • Validates presence of data from request body.

Coin and Query Validation:

coin = data.get("coin", "").lower().strip()
query = data.get("query", "").strip()
  • Strips leading/trailing spaces

  • Normalizes to lowercase

  • Validates against supported coins defined in model.py

  • Ensures query relevance using keywords (valid_query(query))
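Taken together, the validation steps above amount to the logic sketched below. `validate_request` is a hypothetical helper for illustration; in app.py these checks live inline in handle_query:

```python
def validate_request(data, coins, valid_query):
    # Mirrors app.py's checks: presence, coin membership, query relevance.
    # Returns an error message string, or None if the request is acceptable.
    coin = data.get("coin", "").lower().strip()
    query = data.get("query", "").strip()
    if not coin or not query:
        return "Please provide both coin and query."
    if coin not in coins:
        return "Unsupported coin."
    if not valid_query(query):
        return "Query seems unrelated to crypto."
    return None
```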

Query Processing:

result = process_query(coin, query)
  • Calls core logic from model.py

  • Returns insights including sentiment and top-ranked responses

Logging User Interaction:

log_query(coin, query, sentiment, top_results)
  • Delegates logging to logger.py for persistent audit trail.

Output:

return jsonify(result), 200
  • Returns formatted response to frontend.

logger.py

Purpose: Logs each interaction into a local query_logs.json file with:

  • Timestamp

  • Coin name

  • User’s query

  • Sentiment result

  • Top answers with scores

Libraries Used:

  • os – for file path handling

  • json – to read/write logs

  • datetime – for timestamps

Key Function: def log_query(coin, user_query, sentiment, top_results):

  • Creates an entry object

  • Appends to existing log file or creates a new one

  • Maintains logs in list format

Resilience: Wrapped in a try-except block to handle logging errors without affecting backend flow.



# app.py

Libraries Used:

  • Flask: A micro web framework for Python. Enables routing, rendering HTML templates, handling HTTP requests (GET, POST), and serving JSON responses. Used here to define the two main endpoints.

  • Flask-CORS: Adds Cross-Origin Resource Sharing (CORS) headers to Flask responses. This allows frontend scripts from different domains (or file origins) to communicate with the Flask backend without security blocks. Used via CORS(app) to support frontend–backend communication.

  • render_template: Renders HTML files from the templates/ folder. Used to serve index.html.

  • request.get_json(): Parses incoming POST requests with JSON payloads into Python dictionaries. Used to fetch user input from the frontend (coin and query).
