RAGChain Docs

DB

Store passages in a traditional database


Last updated 1 year ago

Overview

Our framework currently supports two types of databases: MongoDB and PickleDB. We are planning to add more database types in the future to provide more flexibility and options for users.

1. MongoDB

MongoDB is a popular NoSQL database that provides high performance, high availability, and easy scalability. It works on the concept of collections and documents, making it a good choice for storing complex and hierarchical data structures. In our framework, MongoDB is used for storing and retrieving passage contents.

2. PickleDB

PickleDB is a very simple store built on Python's pickle module, which serializes and deserializes Python object structures. It keeps data in a local file in pickle format, making it a good choice for small projects or for trying out RAGchain quickly. We do not recommend PickleDB for production use, but it is a great way to get started with the RAGchain framework.
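
To make the idea of a pickle-backed store concrete, here is a minimal sketch that uses only Python's standard pickle module. It is not RAGchain's PickleDB implementation: the function names `save_passages` and `load_passages`, the dictionary layout of a passage, and the file name are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Dict, List


def save_passages(path: Path, passages: List[Dict[str, str]]) -> None:
    """Serialize the whole passage list into one local file (hypothetical helper)."""
    with path.open("wb") as f:
        pickle.dump(passages, f)


def load_passages(path: Path) -> List[Dict[str, str]]:
    """Load passages back; return an empty list if the file does not exist yet."""
    if not path.exists():
        return []
    with path.open("rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "passages.pkl"  # illustrative file name
    missing = load_passages(db_path)      # nothing saved yet
    save_passages(db_path, [{"id": "p1", "content": "hello"}])
    restored = load_passages(db_path)
```

Because the whole store lives in a single file that is rewritten on every save, and because unpickling data from an untrusted source can execute arbitrary code, this style of storage fits local experiments better than production workloads.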

Role of the DB in the Framework

The role of the database in our framework is to store and manage passage contents and their metadata. The database can save passages, fetch passages by their IDs, and search for passages based on filters.
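
The three operations above can be sketched with a small in-memory stand-in. This is an assumption-laden illustration rather than RAGchain's actual classes: the `Passage` fields, the `InMemoryDB` class, and the method signatures are hypothetical and chosen only to mirror the save, fetch, and search roles described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Passage:
    # Hypothetical passage schema used only for this sketch.
    id: str
    content: str
    filepath: str
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryDB:
    """Illustrative stand-in for a passage database."""

    def __init__(self) -> None:
        self._passages: Dict[str, Passage] = {}

    def save(self, passages: List[Passage]) -> None:
        # Store each passage under its ID; a repeated ID overwrites the old entry.
        for passage in passages:
            self._passages[passage.id] = passage

    def fetch(self, ids: List[str]) -> List[Passage]:
        # Return passages in the order requested, skipping unknown IDs.
        return [self._passages[i] for i in ids if i in self._passages]

    def search(self, filepath: Optional[str] = None, **metadata: str) -> List[Passage]:
        # Keep passages that match the filepath (if given) and every metadata filter.
        results = []
        for passage in self._passages.values():
            if filepath is not None and passage.filepath != filepath:
                continue
            if any(passage.metadata.get(k) != v for k, v in metadata.items()):
                continue
            results.append(passage)
        return results


db = InMemoryDB()
p1 = Passage("p1", "MongoDB stores documents.", "a.txt", {"topic": "db"})
p2 = Passage("p2", "BM25 ranks passages.", "b.txt", {"topic": "retrieval"})
db.save([p1, p2])
fetched = db.fetch(["p2", "unknown-id"])
by_file = db.search(filepath="a.txt")
by_topic = db.search(topic="retrieval")
```

A retriever typically returns passage IDs, so `fetch` is the call that turns those IDs back into full passage contents.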

Advantages of DBs

One of the main advantages of using databases in our framework is that you can continue to use your existing databases. This means you don't have to migrate your data to a new database system to use our framework.

Another advantage is that you can use multiple databases at the same time. This is useful when your data is spread across different databases and you want to access all of it from our framework. In MongoDB, for example, you can use many collections. Managing all of those collections for RAG by hand is painful, and RAGchain takes that pain away.

Furthermore, storing passages in a database frees you from limits such as a vector DB's character limit. You can store and manage large amounts of data without worrying about hitting those limits.

Also, our framework lets you search directly in the database. This is useful when you want to find specific passages using filters such as passage ID, content, filepath, and additional metadata. The search function returns a list of Passage objects that match the filters, so you can easily look up passages that came from multiple vector stores and retrievers.
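
As a rough sketch of filter-based search over more than one store, the example below treats each store as a plain list of dictionaries and accepts a list of allowed values per field. The function names `matches` and `search_all`, the `lang` metadata field, and the filter semantics are assumptions for illustration, not RAGchain's search signature.

```python
from typing import Dict, Iterable, List

PassageDict = Dict[str, str]


def matches(passage: PassageDict, filters: Dict[str, List[str]]) -> bool:
    """True when, for every filtered field, the passage's value is one of the allowed values."""
    return all(passage.get(name) in allowed for name, allowed in filters.items())


def search_all(stores: Iterable[List[PassageDict]], **filters: List[str]) -> List[PassageDict]:
    """Apply the same filters to every store and concatenate the hits in store order."""
    return [p for store in stores for p in store if matches(p, filters)]


# Two hypothetical stores, for example two MongoDB collections.
store_a = [
    {"id": "a1", "content": "Intro to RAG", "filepath": "docs/a.txt", "lang": "en"},
    {"id": "a2", "content": "RAG overview", "filepath": "docs/a.txt", "lang": "ko"},
]
store_b = [
    {"id": "b1", "content": "Reranking basics", "filepath": "docs/b.txt", "lang": "en"},
]

english_ids = [p["id"] for p in search_all([store_a, store_b], lang=["en"])]
narrow_ids = [p["id"] for p in search_all([store_a, store_b], filepath=["docs/a.txt"], lang=["ko"])]
```

Filters on different fields are combined with AND, while the values listed for a single field act as OR, which keeps a single call expressive enough for lookups such as "any of these files, in this language".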

https://www.mongodb.com