`langchain_community.document_loaders.concurrent`.ConcurrentLoader

class langchain_community.document_loaders.concurrent.ConcurrentLoader(blob_loader: BlobLoader, blob_parser: BaseBlobParser, num_workers: int = 4)

Load and parse Documents concurrently.

A generic document loader.

Parameters

blob_loader (BlobLoader) – A blob loader which knows how to yield blobs
blob_parser (BaseBlobParser) – A blob parser which knows how to parse blobs into documents
num_workers (int) – Max number of concurrent workers to use. Defaults to 4.

Methods

`__init__`(blob_loader, blob_parser[, num_workers])	A generic document loader.
`alazy_load`()	A lazy loader for Documents.
`from_filesystem`(path, *[, glob, exclude, ...])	Create a concurrent generic document loader using a filesystem blob loader.
`get_parser`(**kwargs)	Override this method to associate a default parser with the class.
`lazy_load`()	Load documents lazily with concurrent parsing.
`load`()	Load data into Document objects.
`load_and_split`([text_splitter])	Load Documents and split them into chunks.

__init__(blob_loader: BlobLoader, blob_parser: BaseBlobParser, num_workers: int = 4) → None

A generic document loader.

Parameters

blob_loader (BlobLoader) – A blob loader which knows how to yield blobs
blob_parser (BaseBlobParser) – A blob parser which knows how to parse blobs into documents
num_workers (int) – Max number of concurrent workers to use. Defaults to 4.

Return type

None

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type: AsyncIterator[Document]
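
An AsyncIterator is consumed with `async for` inside a coroutine. The sketch below shows that pattern only; `fake_alazy_load` is a hypothetical stand-in that yields plain strings, so it runs without LangChain installed. With a real loader, the assumption is that `loader.alazy_load()` would be passed in its place.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_alazy_load() -> AsyncIterator[str]:
    """Hypothetical stand-in for loader.alazy_load(); yields strings, not Documents."""
    for text in ("first", "second", "third"):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield text


async def take(source: AsyncIterator[str], limit: int) -> List[str]:
    """Collect at most `limit` items, then stop consuming the iterator."""
    items: List[str] = []
    if limit <= 0:
        return items
    async for item in source:
        items.append(item)
        if len(items) >= limit:
            break  # later items are never requested
    return items
```

Calling `asyncio.run(take(fake_alazy_load(), limit=2))` drives the coroutine from synchronous code.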

classmethod from_filesystem(path: Union[str, Path], *, glob: str = '**/[!.]*', exclude: Sequence[str] = (), suffixes: Optional[Sequence[str]] = None, show_progress: bool = False, parser: Union[Literal['default'], BaseBlobParser] = 'default', num_workers: int = 4, parser_kwargs: Optional[dict] = None) → ConcurrentLoader

Create a concurrent generic document loader using a filesystem blob loader.

Parameters

path (Union[str, Path]) – The path to the directory to load documents from.
glob (str) – The glob pattern to use to find documents.
suffixes (Optional[Sequence[str]]) – The suffixes to use to filter documents. If None, all files matching the glob will be loaded.
exclude (Sequence[str]) – A list of patterns to exclude from the loader.
show_progress (bool) – Whether to show a progress bar or not (requires tqdm). Proxies to the file system loader.
parser (Union[Literal['default'], ~langchain_core.document_loaders.base.BaseBlobParser]) – A blob parser which knows how to parse blobs into documents
num_workers (int) – Max number of concurrent workers to use.
parser_kwargs (Optional[dict]) – Keyword arguments to pass to the parser.

Return type

ConcurrentLoader
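
To make the interplay of `glob`, `exclude`, and `suffixes` concrete, here is a rough approximation of that selection logic using only `pathlib`. It is a sketch under assumptions, not the filesystem blob loader's code: `select_files` is a hypothetical helper, and the exact matching rules of the real loader may differ.

```python
from pathlib import Path
from typing import Iterator, Optional, Sequence


def select_files(
    root: Path,
    glob: str = "**/[!.]*",
    exclude: Sequence[str] = (),
    suffixes: Optional[Sequence[str]] = None,
) -> Iterator[Path]:
    """Yield files under `root` that pass the glob, suffix, and exclude filters."""
    for path in sorted(root.glob(glob)):
        if not path.is_file():
            continue  # the default glob also matches directories
        if suffixes is not None and path.suffix not in suffixes:
            continue  # keep only the requested extensions, e.g. [".txt"]
        if any(path.match(pattern) for pattern in exclude):
            continue  # drop anything matching an exclude pattern
        yield path
```

With the default glob, names that start with a dot are skipped because `[!.]` rejects a leading `.` in the final path component.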

static get_parser(**kwargs: Any) → BaseBlobParser

Override this method to associate a default parser with the class.

Parameters: kwargs (Any) – Keyword arguments to pass to the parser.
Return type: BaseBlobParser
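
The override hook can be illustrated with stand-in classes. Everything named here (`MiniLoader`, `LineLoader`, `with_default_parser`, the `keep_blank` option) is hypothetical and exists only to show the pattern: a base class exposes a static `get_parser`, and a subclass overrides it so that a factory method can build the class's default parser.

```python
from typing import Any, Callable, List

Parser = Callable[[str], List[str]]  # hypothetical stand-in for BaseBlobParser


class MiniLoader:
    """Hypothetical base class; it does not know a default parser."""

    def __init__(self, parser: Parser) -> None:
        self.parser = parser

    @staticmethod
    def get_parser(**kwargs: Any) -> Parser:
        raise NotImplementedError("subclasses may supply a default parser")

    @classmethod
    def with_default_parser(cls, **parser_kwargs: Any) -> "MiniLoader":
        # Looking the hook up on `cls` lets a subclass override take effect.
        return cls(cls.get_parser(**parser_kwargs))


class LineLoader(MiniLoader):
    """Subclass that associates a line-splitting parser with the class."""

    @staticmethod
    def get_parser(**kwargs: Any) -> Parser:
        keep_blank = bool(kwargs.get("keep_blank", False))

        def parse(text: str) -> List[str]:
            return [line for line in text.splitlines() if keep_blank or line.strip()]

        return parse
```

In this sketch the keyword arguments flow from the factory method into `get_parser`, which mirrors how a `parser_kwargs` argument could configure a default parser.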

lazy_load() → Iterator[Document]

Load documents lazily with concurrent parsing.

Return type: Iterator[Document]
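
One plausible way to combine lazy iteration with concurrent parsing is a thread pool whose results are yielded as they complete. The sketch below illustrates that idea with the standard library only; it is not presented as the library's implementation, and `parse_concurrently` and `split_words` are hypothetical names operating on plain strings instead of blobs and Documents.

```python
import concurrent.futures
from typing import Callable, Iterable, Iterator, List


def parse_concurrently(
    blobs: Iterable[str],
    parse: Callable[[str], List[str]],
    num_workers: int = 4,
) -> Iterator[str]:
    """Parse each blob on a worker thread and yield results as they finish."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as pool:
        # All blobs are submitted once the generator is first advanced.
        futures = [pool.submit(parse, blob) for blob in blobs]
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception from the worker thread.
            yield from future.result()


def split_words(blob: str) -> List[str]:
    """Hypothetical parser: one 'document' per whitespace-separated word."""
    return blob.split()
```

Because results arrive in completion order in this sketch, the output order can differ from the input order; callers that need a stable order would have to sort or tag the results themselves.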

load() → List[Document]

Load data into Document objects.

Return type: List[Document]

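load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split them into chunks. The chunks are returned as Documents.

Parameters: text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.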
Return type: List[Document]
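
To show what splitting into chunks means, here is a deliberately simple fixed-window splitter with overlap. It is a sketch only: `split_into_chunks` is a hypothetical helper working on raw strings, and real text splitters are more elaborate (for example, they may prefer to break on separators such as paragraphs or sentences).

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 20, overlap: int = 5) -> List[str]:
    """Cut `text` into windows of `chunk_size` characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks
```

The overlap keeps some shared context between neighbouring chunks, which is a common reason for not cutting at hard boundaries.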

Examples using ConcurrentLoader

Concurrent Loader
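
A minimal usage sketch based on the `from_filesystem` signature documented above. It assumes `langchain_community` is installed and that `TextParser` is importable from `langchain_community.document_loaders.parsers.txt`; `load_text_files` is a hypothetical helper name, and the imports are deferred into the function so the module can be imported without the optional package.

```python
from pathlib import Path
from typing import Any, List, Union


def load_text_files(directory: Union[str, Path], num_workers: int = 4) -> List[Any]:
    """Load every .txt file under `directory` as Documents, parsing in parallel."""
    # Imported lazily; both import paths are assumptions about the installed package.
    from langchain_community.document_loaders.concurrent import ConcurrentLoader
    from langchain_community.document_loaders.parsers.txt import TextParser

    loader = ConcurrentLoader.from_filesystem(
        directory,
        glob="**/[!.]*",          # the documented default pattern
        suffixes=[".txt"],        # keep only plain-text files
        parser=TextParser(),      # explicit parser instead of 'default'
        num_workers=num_workers,  # upper bound on concurrent workers
    )
    # lazy_load() yields Documents one at a time; list() collects them all.
    return list(loader.lazy_load())
```

Passing an explicit parser avoids relying on whatever `'default'` resolves to for a given file type.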