Markdown Header Splitter
Overview
The MarkDownHeaderSplitter splits a document into passages based on the document's headers, which are matched against a list of separators. It works much like LangChain's MarkdownHeaderTextSplitter: every split falls on a header.
The metadata_etc of each Passage contains the header information and the original document's information. The header information in metadata_etc is updated in two cases.
First, whenever a new header appears in the document, its information is appended to metadata_etc.
Second, when a header at the same level as an earlier one appears (an equivalent relationship), the header metadata is reset and the newly appeared header is added to it.
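The two update rules can be illustrated with a small standalone sketch. This is not the MarkDownHeaderSplitter implementation: the function name split_by_headers, the "Header N" metadata keys, and the choice to keep parent headers when a same-level header resets the metadata are all assumptions made for illustration. It also treats any line starting with # as a header, so it does not handle # comments inside fenced code.

```python
import re
from typing import Dict, List, Tuple

# ATX-style markdown header: one to six '#' characters, a space, then the title.
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_by_headers(text: str) -> List[Tuple[Dict[str, str], str]]:
    """Split markdown text into (header_metadata, content) pairs (hypothetical helper)."""
    passages: List[Tuple[Dict[str, str], str]] = []
    stack: List[Tuple[int, str]] = []  # (level, title) of the headers currently in effect
    buffer: List[str] = []             # content lines collected since the last header

    def flush() -> None:
        # Emit the buffered content together with a snapshot of the header stack.
        content = "\n".join(buffer).strip()
        if content:
            metadata = {f"Header {level}": title for level, title in stack}
            passages.append((metadata, content))
        buffer.clear()

    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match is None:
            buffer.append(line)
            continue
        flush()
        level = len(match.group(1))
        # Case 2 (assumed interpretation): a header at the same or a shallower
        # level discards the headers it replaces; parent headers are kept.
        while stack and stack[-1][0] >= level:
            stack.pop()
        # Case 1: the newly seen header is appended to the header information.
        stack.append((level, match.group(2)))
    flush()
    return passages


sample = "# Guide\nintro\n## Install\npip step\n## Usage\nrun step\n"
result = split_by_headers(sample)
for metadata, content in result:
    print(metadata, "|", content)
```

In this sample, "## Install" is appended under "# Guide" (first case), and "## Usage" replaces "## Install" because both sit at the same level (second case).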
Usage
Initialization
First, initialize an instance of MarkDownHeaderSplitter. For example:
Split document
You can split a document using the split_document() method. It returns a list of Passage objects. For example: