langchain_community.document_loaders.parsers.grobid.GrobidParser
- class langchain_community.document_loaders.parsers.grobid.GrobidParser(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument')
Load article PDF files using Grobid.
Methods
- __init__(segment_sentences[, grobid_server])
- lazy_parse(blob): Lazy parsing interface.
- parse(blob): Eagerly parse the blob into a document or documents.
- process_xml(file_path, xml_data, ...): Process the XML file from Grobid.
- Parameters
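segment_sentences (bool) – Whether to segment the parsed text into sentences.
grobid_server (str) – URL of the Grobid endpoint that processes the document. Defaults to 'http://localhost:8070/api/processFulltextDocument'.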
- __init__(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument') -> None
- Parameters
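segment_sentences (bool) – Whether to segment the parsed text into sentences.
grobid_server (str) – URL of the Grobid endpoint that processes the document.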
- Return type
None
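A minimal sketch of constructing the parser and driving it over a folder of PDFs. The helper name load_papers is hypothetical, and pairing the parser with GenericLoader.from_filesystem is an assumption about typical usage rather than something this page states. It also assumes a Grobid server is already listening at the default endpoint.

```python
from typing import Any, List


def load_papers(pdf_dir: str, segment_sentences: bool = False) -> List[Any]:
    """Hypothetical helper: parse every PDF directly under pdf_dir with Grobid."""
    # Imported inside the function so this module still imports cleanly
    # on machines where langchain_community is not installed.
    from langchain_community.document_loaders.generic import GenericLoader
    from langchain_community.document_loaders.parsers import GrobidParser

    parser = GrobidParser(
        segment_sentences=segment_sentences,
        # Default endpoint from the class signature; change it if your
        # Grobid server runs on another host or port.
        grobid_server="http://localhost:8070/api/processFulltextDocument",
    )

    # Assumption: GenericLoader walks the directory and hands each PDF
    # to the parser as a blob.
    loader = GenericLoader.from_filesystem(
        pdf_dir, glob="*", suffixes=[".pdf"], parser=parser
    )
    return loader.load()
```

Calling load_papers("papers/") would return a list of Document objects, provided the Grobid server is reachable.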
- lazy_parse(blob: Blob) -> Iterator[Document]
Lazy parsing interface.
Subclasses are required to implement this method.
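The difference between the lazy and eager interfaces can be illustrated with a toy class. ToyParser is a hypothetical stand-in that works on plain strings. It is not the GrobidParser implementation, and building parse on top of lazy_parse is an assumption about the usual pattern.

```python
from typing import Iterator, List


class ToyParser:
    """Hypothetical stand-in for a blob parser, operating on plain strings."""

    def __init__(self) -> None:
        self.chunks_produced = 0

    def lazy_parse(self, blob: str) -> Iterator[str]:
        # Yield one chunk at a time; work happens only as the caller iterates.
        for paragraph in blob.split("\n\n"):
            self.chunks_produced += 1
            yield paragraph.strip()

    def parse(self, blob: str) -> List[str]:
        # Eager variant: exhaust the lazy iterator and return everything at once.
        return list(self.lazy_parse(blob))


text = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."

parser = ToyParser()
lazy = parser.lazy_parse(text)
first = next(lazy)  # only the first chunk has been produced at this point
produced_after_first = parser.chunks_produced

everything = ToyParser().parse(text)  # all chunks materialized as a list
```

The lazy form suits large inputs where documents can be consumed one at a time, while the eager form is convenient when the whole result fits in memory.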