Ko-Strategy-QA is a Korean version of Strategy-QA, translated with DeepL. It is the only option for evaluating multi-hop questions in Korean. The dataset was created by the RAGchain makers.
Example Use

    from RAGchain.benchmark.dataset import KoStrategyQAEvaluator

    pipeline = <your pipeline>
    retrievals = [<your retrieval>]
    db = <your db>

    evaluator = KoStrategyQAEvaluator(pipeline, evaluate_size=100)

    # This will ingest all paragraphs of the Ko-Strategy-QA dataset. You only need to run it once.
    evaluator.ingest(retrievals, db)

    result = evaluator.evaluate()

    # print result summary (mean values)
    print(result.results)
    # print result DataFrame
    print(result.each_results)
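
The example above prints a summary of mean values and a per-question result table. How the two relate can be illustrated without RAGchain. The following is a minimal sketch using only the Python standard library. The metric names (`metric_a`, `metric_b`), the scores, and the `summarize` helper are hypothetical placeholders, not the columns or functions RAGchain actually provides.

```python
from statistics import mean

# Hypothetical per-question scores. The metric names and values are
# placeholders for illustration, not RAGchain's actual output.
each_results = [
    {"question": "q1", "metric_a": 1.0, "metric_b": 0.5},
    {"question": "q2", "metric_a": 0.0, "metric_b": 0.25},
    {"question": "q3", "metric_a": 0.5, "metric_b": 0.75},
]


def summarize(rows):
    """Return the mean of every metric column across all rows."""
    # Every key except the question identifier is treated as a metric.
    metrics = [key for key in rows[0] if key != "question"]
    return {m: mean(row[m] for row in rows) for m in metrics}


summary = summarize(each_results)
print(summary)
```

A summary like this is useful for comparing pipelines at a glance, while the per-question rows show which individual questions a pipeline gets wrong.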