Evaluation

1. OBJECTIVES

The objective of this project is to implement a robust validation mechanism for answers generated by a cryptocurrency-related question-answering system.

The goals include:

  • Ensuring that the answers are relevant to the specified cryptocurrency.

  • Detecting semantic and contextual alignment using transformer-based embeddings.

  • Generating a labeled dataset that identifies whether an answer is valid or invalid.

  • Enabling future integration into an NLP backend for real-time or batch processing of user queries.

2. DATA ACQUISITION

Input Format:

The input data is expected to be a JSON file structured with:

  • "coin": Name of the cryptocurrency (e.g., "bitcoin").

  • "query": User's question.

  • "answers": List of one or more answer objects, each with a "text" field.

Current Source:

The function load_data(file_path) is used to load this JSON data locally for testing.
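Given the JSON structure described above, a minimal sketch of what `load_data(file_path)` might look like (the exact implementation may differ):

```python
import json

def load_data(file_path):
    """Load the coin/query/answers records from a local JSON file."""
    with open(file_path, "r", encoding="utf-8") as f:
        return json.load(f)
```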

A placeholder for live data fetching is also present in the form of a commented-out function:

import requests

def fetch_api(url):
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    return None  # callers should handle a failed request

This enables future scalability where queries and answers can be fetched from a web service or user interaction layer.

3. LIBRARIES USED

Core Libraries:

  • json, csv: For reading/writing data.

  • pandas: For tabular data manipulation.

  • requests: For potential API integration.

  • sentence-transformers: For semantic vector encoding using the MiniLM BERT model.

  • sklearn: For TF-IDF vectorization and label encoding.

Transformer Model:

model = SentenceTransformer('all-MiniLM-L6-v2')

A lightweight and efficient model capable of encoding texts into embeddings for semantic comparison.

4. TEXT EXTRACTION & VALIDATION

The function is_valid_entry(entry) is central to the model.

Steps:

  1. Pre-check: If coin is "general crypto" → always valid.

  2. Null Checks: If query or answers is empty → invalid.

  3. Keyword Matching:

  • Checks whether the coin name appears in the query or in any of the answers.

  4. Semantic Matching:

  • Embeds the coin name and each answer using the Sentence-BERT model.

  • Computes the cosine similarity between the coin embedding and each answer embedding.

  • Applies a threshold of 0.5.

threshold = 0.5
cosine_scores = util.cos_sim(coin_embedding, answer_embeddings)[0]
similarity_valid = any(score >= threshold for score in cosine_scores)

If either keyword match or semantic score passes, the entry is valid.
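The combined decision can be sketched in plain Python. This illustrative version hand-rolls cosine similarity and takes precomputed embeddings as arguments (the actual pipeline uses `sentence-transformers` and `util.cos_sim`); the function name `passes_validation` and its signature are hypothetical:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def passes_validation(coin, query, answers, coin_emb, answer_embs, threshold=0.5):
    """Valid if the coin appears literally in the query/answers (keyword match)
    OR any answer embedding is close enough to the coin embedding."""
    texts = [query] + answers
    keyword_valid = any(coin.lower() in t.lower() for t in texts)
    similarity_valid = any(
        cosine_similarity(coin_emb, e) >= threshold for e in answer_embs
    )
    return keyword_valid or similarity_valid
```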

5. DATA PREPROCESSING

After validation:

  • A CSV is created with columns: coin, query, status (valid/invalid), and answers.

The answers are preprocessed to extract the first answer text and format it as:

text: <answer_content>

This is written using:

writer.writerow([coin, query, status, answers_str])
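The CSV export described above could be sketched as follows, assuming each entry carries a boolean `valid` flag from the validation step (the helper name `write_results` is illustrative):

```python
import csv

def write_results(entries, out_path):
    """Write validation results to a CSV with the columns described above.
    Only the first answer's text is kept, prefixed with 'text: '."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["coin", "query", "status", "answers"])
        for entry in entries:
            answers = entry.get("answers") or []
            first_text = answers[0]["text"] if answers else ""
            answers_str = f"text: {first_text}"
            status = "valid" if entry["valid"] else "invalid"
            writer.writerow([entry["coin"], entry["query"], status, answers_str])
```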

6. VECTORIZATION (TF-IDF)

Once labeled:

  • Text is prepared by concatenating the coin, query, and answers fields:

df['text'] = df['coin'] + ' ' + df['query'] + ' ' + df['answers']

  • The new text field is vectorized using TfidfVectorizer.

  • status is encoded using LabelEncoder.

Final output:

  • vectorised.csv: Matrix of TF-IDF features with status_encoded appended.

7. INTEGRATION WITH NLP MODEL AND BACKEND

This model can be integrated with a Flask backend via a process_query() function (to be implemented) that:

  • Takes the coin and query as input.

  • Returns validation results, e.g.:


{
  "coin": "bitcoin",
  "query": "What is bitcoin's trend?",
  "status": "valid"
}
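Since process_query() is not yet implemented, one possible shape is sketched below. The `validator` parameter is a hypothetical injection point for any callable with the is_valid_entry(entry) contract:

```python
def process_query(coin, query, answers, validator):
    """Run the validator on a coin/query/answers record and
    return a JSON-ready dict matching the example response above."""
    entry = {"coin": coin, "query": query, "answers": answers}
    status = "valid" if validator(entry) else "invalid"
    return {"coin": coin, "query": query, "status": status}
```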

Flask Route Integration

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/validate', methods=['POST'])
def validate():
    data = request.get_json()
    result = is_valid_entry(data)
    return jsonify({'status': 'valid' if result else 'invalid'})

This allows real-time validation of generated answers via a REST API.

8. RESULTS

Evaluation Dataset:

After applying is_valid_entry() to each row:

  • 63 entries evaluated.

  • Results saved to validation_results.csv.

TF-IDF Matrix:

  • Shape: (63, 129) → 63 rows, 129 unique words (features).

Label Encoding:

  • Valid: 1

  • Invalid: 0

These outputs are useful for training or analyzing a downstream classifier or performing visual diagnostics like confusion matrices.

9. STRENGTHS & LIMITATIONS

✅ Strengths:

  • Combines semantic and lexical checks.

  • Easy to extend to other domains.

  • Modular codebase (each step is a function).

  • Suitable for both batch and API use cases.

⚠️ Limitations:

  • The fixed similarity threshold (0.5) may be too strict or too lenient for certain coins or queries.

  • Only the first answer is evaluated from the list.

  • No integration yet for complex multi-turn dialogues or sentiment filtering.

10. CONCLUSION

This evaluation model effectively uses modern NLP techniques to determine the validity of answer passages in a cryptocurrency-focused QA pipeline. It supports:

  • Robust semantic checks using Sentence-BERT

  • Flexible preprocessing and vectorization

  • Backend-ready logic for real-time deployment


