Qasper

Overview

Qasper is a question-answering dataset built from NLP papers. Each entry contains the full text of a paper (plus figure and table captions) together with questions and answers about that paper, so you can evaluate the performance of your RAG workflow on a closed-domain QA task.

Example Use

Note: We recommend setting a reasonable evaluate_size. This evaluator ingests the corresponding NLP paper for each evaluated question, so setting evaluate_size to 100 means 100 papers are ingested at evaluation time, which can take a long time. You can set random_state to evaluate the same questions across runs; if you change random_state, a different set of questions will be evaluated, even with the same evaluate_size.

from RAGchain.benchmark.dataset import QasperEvaluator

pipeline = <your pipeline>
retrievals = [<your retrieval>]
db = <your db>

evaluator = QasperEvaluator(pipeline, evaluate_size=20, random_state=60)
evaluator.ingest(retrievals, db)
result = evaluator.evaluate()

# print result summary (mean values)
print(result.results)
# print result DataFrame
print(result.each_results)
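
If you want to see how the placeholders above might be filled in, the sketch below wires a BM25 retrieval, a Pickle DB, and a BasicRunPipeline (all described elsewhere in these docs) into the evaluator. The import paths and constructor arguments shown here are assumptions and may differ in your installed RAGchain version; check the corresponding pages before copying them.

# Illustrative sketch only: import paths and constructor arguments below are
# assumptions and may not match your installed RAGchain version.
from RAGchain.benchmark.dataset import QasperEvaluator
from RAGchain.DB import PickleDB                  # assumed import path
from RAGchain.retrieval import BM25Retrieval      # assumed import path
from RAGchain.pipeline import BasicRunPipeline    # assumed import path

db = PickleDB(save_path="./qasper_db.pkl")                # assumed parameter name
retrieval = BM25Retrieval(save_path="./qasper_bm25.pkl")  # assumed parameter name
pipeline = BasicRunPipeline(retrieval=retrieval)          # assumed parameter name

evaluator = QasperEvaluator(pipeline, evaluate_size=20, random_state=60)
evaluator.ingest([retrieval], db)   # ingests the papers for the sampled questions
result = evaluator.evaluate()

print(result.results)       # summary metrics (mean values)
print(result.each_results)  # per-question results as a DataFrame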