File Loader
Load various files to RAGchain - compatible with Langchain
Last updated
Load various files to RAGchain - compatible with Langchain
Last updated
The File Loader is a utility designed to load various files into a List of Objects. It is an integral part of our framework, providing the initial step in processing documents for similarity search using the RAG workflow. This loader is fully compatible with Langchain's and can be used interchangeably, offering additional special loaders unique to our framework.
The primary role of the File Loader is to facilitate document ingestion into your application by loading different file types into a standardized List of Objects. Here are some key roles:
Document Ingestion: The File Loader simplifies document ingestion by accepting various file types and converting them into a unified format ( Objects). This makes it easy to handle different kinds of documents within your application.
Compatibility with Langchain's : The File Loader inherits from the same parent class as Langchain's Document loader, ensuring full compatibility between both loaders. You can use either loader based on your specific document file types.
Initial Step in RAG Workflow: Once documents are loaded via the File Loader, they can be split into passages and converted into vector representations for similarity searches. It is the first step of RAG workflow.
The following are some key advantages offered by our File Loaders:
Compatibility with Langchain's : Thanks to its shared inheritance, you get all benefits provided by Langchain’s Document loader along with additional features from our framework's special loaders. You can check out all document loaders from langchain at .
: Our File Loader includes an OCR (Optical Character Recognition) loader. This allows for the extraction and digitization of text from images or scanned documents, and pdfs. It is useful when you want to ingest complex documents with tables.
: Recognizing the prevalence and importance of HWP files in South Korea, our File Loader includes a dedicated HWP loader. This ensures seamless loading and processing of one of South Korea's most popular document formats.
: Our File Loaders will support various ODQA dataset loaders, enabling easy ingestion and processing of open-domain question answering datasets for RAG pipeline benchmarking.