Recursive Text Splitter
Overview
The RecursiveTextSplitter splits a document into passages by recursively splitting on a list of separators. The class also lets you specify a window size and an overlap size, so the document can be split into overlapping passages.
Most of its features are similar to LangChain's RecursiveCharacterTextSplitter.
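To make the idea of recursive splitting concrete, here is a minimal sketch in plain Python. It is not RAGchain's implementation: the function names `recursive_split` and `merge_pieces`, the separator list, and the character-based length limit are all assumptions chosen for illustration, and overlap is left out to keep the example short.

```python
from typing import List


def merge_pieces(pieces: List[str], sep: str, chunk_size: int) -> List[str]:
    """Greedily join adjacent pieces with `sep` while staying within chunk_size."""
    merged: List[str] = []
    current = ""
    for piece in pieces:
        if not piece:
            continue  # skip empty strings produced by consecutive separators
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                merged.append(current)
            current = piece
    if current:
        merged.append(current)
    return merged


def recursive_split(text: str, separators: List[str], chunk_size: int) -> List[str]:
    """Hypothetical helper: split on the first separator, then recurse into
    any part that is still too long using the remaining separators."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to hard cuts every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    chunks: List[str] = []
    buffer: List[str] = []  # short parts waiting to be merged at this level
    for part in text.split(sep):
        if len(part) <= chunk_size:
            buffer.append(part)
        else:
            chunks.extend(merge_pieces(buffer, sep, chunk_size))
            buffer = []
            chunks.extend(recursive_split(part, rest, chunk_size))
    chunks.extend(merge_pieces(buffer, sep, chunk_size))
    return chunks


sample = "First paragraph here.\n\nSecond one is a bit longer than the first."
chunks = recursive_split(sample, ["\n\n", "\n", " "], chunk_size=25)
for chunk in chunks:
    print(repr(chunk))
```

The first paragraph fits within the limit and is kept whole, while the second is too long for the paragraph and line separators and is therefore split on spaces and merged back into pieces that fit.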
Usage
Initialization
First, initialize an instance of RecursiveTextSplitter. For example:
from RAGchain.preprocess.text_splitter import RecursiveTextSplitter
splitter = RecursiveTextSplitter(chunk_size=500, chunk_overlap=50)
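As a rough illustration of how a chunk size and a chunk overlap interact, the sketch below cuts a string into fixed-size windows in which each window repeats the last few characters of the previous one. The helper `sliding_windows` is hypothetical, and it assumes both values are measured in characters; how RAGchain measures and applies them internally may differ.

```python
from typing import List


def sliding_windows(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Hypothetical helper: fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each window advances
    # Stop early enough that the last window always adds new characters.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


windows = sliding_windows("abcdefghijkl", chunk_size=5, chunk_overlap=2)
print(windows)  # ['abcde', 'defgh', 'ghijk', 'jkl']
```

Under the same assumption, `chunk_size=500` and `chunk_overlap=50` would advance the window by 450 characters at a time.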
Split document
You can split a document using the split_document() method. It returns a list of Passage objects. For example:
passages = splitter.split_document(document)