Recursive Text Splitter
Overview
The RecursiveTextSplitter splits a document into passages by recursively splitting on a list of separators. It also accepts a window size and an overlap size (chunk_size and chunk_overlap in the example under Usage), so the document can be split into overlapping passages.
Most of its features are similar to LangChain's RecursiveCharacterTextSplitter.
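To make the idea of recursive splitting with overlap more concrete, here is a minimal pure-Python sketch. It is not RAGchain's or LangChain's actual implementation: the function names (recursive_split, merge_with_overlap, split_text), the default separator list, and the choice to re-join pieces with a single space are all assumptions made for illustration.

```python
from typing import List


def recursive_split(text: str, separators: List[str], chunk_size: int) -> List[str]:
    """Break text into pieces of at most chunk_size characters.

    Separators are tried in order; a piece that is still too long is split
    again with the next separator. Separators are assumed to be non-empty.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(recursive_split(part, rest, chunk_size))
    return pieces


def merge_with_overlap(pieces: List[str], chunk_size: int, chunk_overlap: int,
                       joiner: str = " ") -> List[str]:
    """Greedily pack pieces into chunks, carrying trailing pieces forward as overlap."""
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        if current and len(joiner.join(current + [piece])) > chunk_size:
            chunks.append(joiner.join(current))
            # Keep only as many trailing pieces as fit in the overlap budget.
            while current and len(joiner.join(current)) > chunk_overlap:
                current.pop(0)
            # Drop more if the carried-over text plus the new piece is still too long.
            while current and len(joiner.join(current + [piece])) > chunk_size:
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(joiner.join(current))
    return chunks


def split_text(text: str, chunk_size: int, chunk_overlap: int,
               separators: List[str] = ["\n\n", "\n", " "]) -> List[str]:
    """Hypothetical helper combining the two steps above."""
    pieces = recursive_split(text, separators, chunk_size)
    return merge_with_overlap(pieces, chunk_size, chunk_overlap)


chunks = split_text("aaa bbb ccc ddd", chunk_size=7, chunk_overlap=3)
print(chunks)  # ['aaa bbb', 'bbb ccc', 'ccc ddd']
```

In this sketch the overlap is made of whole pieces rather than an exact character count, and the original separators are replaced by a space when pieces are merged; a real splitter may handle both differently.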
Usage
Initialization
First, initialize an instance of RecursiveTextSplitter. For example:
from RAGchain.preprocess.text_splitter import RecursiveTextSplitter
splitter = RecursiveTextSplitter(chunk_size=500, chunk_overlap=50)
Split document
You can split a document with the split_document() method, which returns a list of Passage objects. For example:
passages = splitter.split_document(document)