ms-marco

Overview

MSMARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage ranking. It contains questions, passages, and answers. The passages are the top-k results returned by the Bing search engine for each question, and the human editors who wrote the answers referred to these passages and marked which ones they used.

MSMARCO v1.1 was the original question answering dataset, with 100,000 real Bing questions and human-generated answers. Since then, several other datasets have been released, including a 1,000,000-question dataset, a natural language generation dataset, a passage ranking dataset, a keyphrase extraction dataset, a web crawling dataset, and an interactive search dataset. For more information about the MSMARCO dataset, refer to the link below!

MSMARCOEvaluator also supports rank-aware metrics such as NDCG, AP, CG, IDCG, and RR.
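
To make these metrics concrete, here is a small standalone sketch (plain Python, not part of the RAGchain API) that computes RR and NDCG for a toy ranked list of binary relevance labels. MSMARCOEvaluator computes these for you; the sketch only illustrates what the reported numbers mean.

import math

def reciprocal_rank(relevances):
    # RR: 1 / rank of the first relevant passage (0 if none is relevant)
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def dcg(relevances):
    # Discounted cumulative gain over a ranked list of relevance labels
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    # DCG normalized by the ideal DCG (IDCG) of the same labels
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

ranked = [0, 1, 0, 1, 0]  # toy ranking: 1 = relevant, 0 = non-relevant
print(reciprocal_rank(ranked))  # 0.5 (first relevant passage at rank 2)
print(ndcg(ranked))             # ~0.65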

Example Use

Note: The MSMARCO dataset version is optional (v1.1 or v2.1); the default is v1.1. For v2.1, we use the validation set because the is_selected values in the v2.1 passage data are all -1. The ingest size must be equal to or larger than the evaluate size.

from RAGchain.benchmark.dataset import MSMARCOEvaluator

pipeline = <your pipeline>
retrievals = [<your retrieval>]
db = <your db>

evaluator = MSMARCOEvaluator(pipeline, evaluate_size=20)
evaluator.ingest(retrievals, db) # ingest dataset to db and retrievals
result = evaluator.evaluate()

# print result summary (mean values)
print(result.results)
# print result DataFrame
print(result.each_results)
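
If you want to evaluate against v2.1 instead of the default v1.1, pass the dataset version when constructing the evaluator. The keyword name below is an assumption for illustration; check the MSMARCOEvaluator signature in your RAGchain version for the exact parameter.

# assumed keyword argument; verify the exact name in the MSMARCOEvaluator API reference
evaluator = MSMARCOEvaluator(pipeline, evaluate_size=20, version='v2.1')
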
MSMARCO official GitHub