Hybrid Retrieval
HybridRetrieval Class Documentation
Overview
The HybridRetrieval class retrieves passages from multiple retrievals and combines their scores using either the Reciprocal Rank Fusion (RRF) algorithm or the Convex Combination (CC) algorithm. RRF calculates the final similarity score from the rank of each passage in every retrieval. CC calculates a weighted combination of the scores, so each retrieval can be given a different weight.
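To make the rank-based idea concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is not the library's implementation: the function name `rrf_fuse`, the use of plain passage ids, and the default `rrf_k=60` are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, rrf_k=60, top_k=5):
    """Fuse several ranked lists of passage ids with Reciprocal Rank Fusion.

    Hypothetical helper: each ranking is a list of passage ids, best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1; a passage missing from a ranking adds nothing.
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (rrf_k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_k]


# Two illustrative rankings, e.g. one lexical and one dense retrieval.
bm25_ranking = ["a", "b", "c"]
dense_ranking = ["b", "d", "a"]

rrf_result = rrf_fuse([bm25_ranking, dense_ranking], rrf_k=60, top_k=3)
print([passage_id for passage_id, _ in rrf_result])
```

Only the ranks are used, so the raw scores of the individual retrievals do not need to be on a comparable scale.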
Usage
Initialize
To create an instance of the HybridRetrieval class, you need to provide a list of Retrieval objects.
You can set the p value, which is the number of passages retrieved from each retrieval before the RRF or CC algorithm runs. If p is too small, the hybrid retrieval might not collect enough passages to reach the top_k value. Use a larger p if your retrievals hold a large number of passages.
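The effect of p can be illustrated with a small sketch. The helper `candidate_pool` below is hypothetical and only models the idea that fusion works on the union of the top-p passages from each retrieval; the library's internals may differ.

```python
def candidate_pool(rankings, p):
    """Return the distinct passage ids among the top-p of each ranking.

    Hypothetical helper used only to illustrate the role of p.
    """
    pool = []
    seen = set()
    for ranking in rankings:
        for passage_id in ranking[:p]:
            if passage_id not in seen:
                seen.add(passage_id)
                pool.append(passage_id)
    return pool


bm25_ranking = ["a", "b", "c", "d", "e"]
dense_ranking = ["b", "a", "f", "g", "h"]
top_k = 5

small_pool = candidate_pool([bm25_ranking, dense_ranking], p=2)
large_pool = candidate_pool([bm25_ranking, dense_ranking], p=5)

# With p=2 both retrievals return the same two passages, so fewer than
# top_k candidates are available for fusion.
print(len(small_pool) >= top_k, len(large_pool) >= top_k)
```

Because the retrievals can overlap, the pool may hold fewer than p times the number of retrievals, which is why p usually needs to be comfortably larger than top_k.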
If you want to use the RRF algorithm, you can provide the rrf_k value, which is a hyperparameter of the RRF algorithm.
If you want to use the CC algorithm, you can provide a list of weights, one for each retrieval. The weights should sum to 1.0.
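The weighted combination can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the names `min_max_normalize` and `cc_fuse` are hypothetical, and min-max normalization is one common way to bring scores from different retrievals onto the same scale before weighting.

```python
def min_max_normalize(scores):
    """Scale a {passage_id: score} mapping to the range [0, 1] (assumed step)."""
    if not scores:
        return {}
    lowest, highest = min(scores.values()), max(scores.values())
    if highest == lowest:
        return {passage_id: 1.0 for passage_id in scores}
    return {
        passage_id: (score - lowest) / (highest - lowest)
        for passage_id, score in scores.items()
    }


def cc_fuse(score_maps, weights, top_k=5):
    """Fuse per-retrieval scores with a convex combination (hypothetical helper)."""
    if len(score_maps) != len(weights):
        raise ValueError("Provide exactly one weight per retrieval.")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("Weights should sum to 1.0.")
    fused = {}
    for scores, weight in zip(score_maps, weights):
        for passage_id, score in min_max_normalize(scores).items():
            fused[passage_id] = fused.get(passage_id, 0.0) + weight * score
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Illustrative raw scores on different scales.
bm25_scores = {"a": 12.0, "b": 8.0, "c": 4.0}
dense_scores = {"b": 0.9, "d": 0.7, "a": 0.5}

cc_result = cc_fuse([bm25_scores, dense_scores], weights=[0.3, 0.7], top_k=3)
print([passage_id for passage_id, _ in cc_result])
```

Unlike RRF, this approach uses the score values themselves, so the chosen weights directly control how much each retrieval contributes.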
Ingest
Ingest a list of Passage objects into all retrievals in the hybrid retrieval.
Retrieve
Retrieve top-k passages for a given query.
Retrieve with filter
You can also filter the retrieved passages. Use the retrieve_with_filter method and provide the query, the top-k value, and a list of content, filepath, or metadata values to filter by.
This method uses the DB.search method. Please refer here for further information.
Here's an example:
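A minimal sketch of this kind of filtering, written without the library so that it runs on its own. The `Passage` dataclass and the `filter_passages` function are hypothetical stand-ins, not the actual retrieve_with_filter signature, and the rule that every supplied filter must match is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Passage:
    """Hypothetical stand-in for the library's Passage class."""
    content: str
    filepath: str
    metadata: dict = field(default_factory=dict)


def filter_passages(passages, top_k, content=None, filepath=None,
                    metadata_values=None):
    """Keep already-ranked passages that match every supplied filter list."""
    kept = []
    for passage in passages:
        if content is not None and passage.content not in content:
            continue
        if filepath is not None and passage.filepath not in filepath:
            continue
        if metadata_values is not None and not any(
            value in metadata_values for value in passage.metadata.values()
        ):
            continue
        kept.append(passage)
    return kept[:top_k]


# Passages are assumed to be sorted by relevance already.
ranked_passages = [
    Passage("RRF uses ranks.", "docs/rrf.md", {"topic": "fusion"}),
    Passage("BM25 is lexical.", "docs/bm25.md", {"topic": "retrieval"}),
    Passage("CC uses weights.", "docs/cc.md", {"topic": "fusion"}),
]

filtered = filter_passages(
    ranked_passages,
    top_k=2,
    filepath=["docs/rrf.md", "docs/cc.md"],
    metadata_values=["fusion"],
)
print([passage.filepath for passage in filtered])
```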