
File Loader

Load various files into RAGchain, compatible with Langchain


Last updated 1 year ago

Overview

The File Loader is a utility that loads various files into a List of Document Objects. It is an integral part of our framework and provides the initial step in processing documents for similarity search in the RAG workflow. The loader is fully compatible with Langchain's Document Loader and can be used interchangeably with it, while also offering special loaders unique to our framework.

Roles of the File Loader in the Framework

The primary role of the File Loader is to facilitate document ingestion into your application by loading different file types into a standardized List of Document Objects. Here are some key roles:

  1. Document Ingestion: The File Loader simplifies document ingestion by accepting various file types and converting them into a unified format (Document Objects). This makes it easy to handle different kinds of documents within your application.

  2. Compatibility with Langchain's Document Loader: The File Loader inherits from the same parent class as Langchain's Document Loader, ensuring full compatibility between both loaders. You can use either loader, depending on your document file types.

  3. Initial Step in RAG Workflow: Once documents are loaded via the File Loader, they can be split into passages and converted into vector representations for similarity search. Loading is the first step of the RAG workflow.
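The three roles above can be illustrated with a minimal, self-contained sketch. It does not use the RAGchain or Langchain APIs: `Document`, `BaseLoader`, `TextFileLoader`, `DirectoryLoader`, and `split_into_passages` are hypothetical stand-ins that only mirror the general shape described here (a shared parent class, a `load()` method returning a list of Document objects, and a splitting step afterwards). The fixed-size character splitter is an assumption chosen for brevity, not the framework's splitter.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a Document object: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class BaseLoader(ABC):
    """Hypothetical shared parent class: every loader exposes load()."""

    @abstractmethod
    def load(self) -> List[Document]:
        ...


class TextFileLoader(BaseLoader):
    """Loads a single UTF-8 text file into one Document."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self) -> List[Document]:
        text = self.path.read_text(encoding="utf-8")
        return [Document(text, {"source": str(self.path)})]


class DirectoryLoader(BaseLoader):
    """Loads every .txt file in a directory, in sorted filename order."""

    def __init__(self, target_dir):
        self.target_dir = Path(target_dir)

    def load(self) -> List[Document]:
        docs: List[Document] = []
        for path in sorted(self.target_dir.glob("*.txt")):
            docs.extend(TextFileLoader(path).load())
        return docs


def split_into_passages(docs: List[Document], chunk_size: int = 40) -> List[Document]:
    """Naive fixed-size character splitter (an assumption for illustration)."""
    passages: List[Document] = []
    for doc in docs:
        for start in range(0, len(doc.page_content), chunk_size):
            chunk = doc.page_content[start:start + chunk_size]
            # Copy the metadata so each passage keeps a link to its source file.
            passages.append(Document(chunk, dict(doc.metadata)))
    return passages
```

Because both loaders share the same parent class and return the same type, code written against `load()` works with either one, which is the property that makes loaders interchangeable. The resulting passages would then be embedded and indexed for similarity search.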

Advantages of File Loader

The following are some key advantages offered by our File Loaders:

  1. Compatibility with Langchain's Document Loader: Thanks to the shared inheritance, you get all the benefits of Langchain's Document Loader along with the additional features of our framework's special loaders. The full list of document loaders is available in Langchain's documentation.

  2. OCR Loaders: Our File Loader includes OCR (Optical Character Recognition) loaders. These extract and digitize text from images, scanned documents, and PDFs, which is useful when you want to ingest complex documents that contain tables.

  3. HWP Loader: Recognizing the prevalence and importance of HWP files in South Korea, our File Loader includes a dedicated HWP loader. This ensures seamless loading and processing of one of South Korea's most popular document formats.

  4. ODQA Dataset Loader: Our File Loaders will support various ODQA dataset loaders, enabling easy ingestion and processing of open-domain question answering datasets for RAG pipeline benchmarking.
