langchain_community.document_loaders.chromium.AsyncChromiumLoader
- class langchain_community.document_loaders.chromium.AsyncChromiumLoader(urls: List[str])
- Scrape HTML pages from URLs using a headless instance of Chromium.
- Initialize the loader with a list of URLs.
- Parameters
- urls (List[str]) – A list of URLs to scrape content from. 
- Raises
- ImportError – If the required ‘playwright’ package is not installed. 
 - Methods
- __init__(urls): Initialize the loader with a list of URLs.
- ascrape_playwright(url): Asynchronously scrape the content of a given URL using Playwright’s async API.
- lazy_load(): Lazily load text content from the provided URLs.
- load(): Load and return all Documents from the provided URLs.
- load_and_split([text_splitter]): Load Documents and split into chunks.
 - __init__(urls: List[str])
- Initialize the loader with a list of URLs.
- Parameters
- urls (List[str]) – A list of URLs to scrape content from. 
- Raises
- ImportError – If the required ‘playwright’ package is not installed. 
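The constructor behaviour described above can be sketched as follows. This is a minimal illustration, not the library's own code: `build_loader` is a hypothetical helper, and the import is deferred into the function so that a missing package is caught where the loader is created.

```python
from typing import Any, List, Optional


def build_loader(urls: List[str]) -> Optional[Any]:
    """Return an AsyncChromiumLoader, or None if a dependency is missing.

    `build_loader` is a hypothetical helper written for this sketch.
    """
    try:
        # Deferred import: fails here if langchain_community is absent.
        from langchain_community.document_loaders.chromium import (
            AsyncChromiumLoader,
        )

        # Per the docs above, this raises ImportError without 'playwright'.
        return AsyncChromiumLoader(urls)
    except ImportError as exc:
        print(f"Cannot create loader: {exc}")
        return None
```

When a loader is returned, calling `loader.load()` on it yields the list of Documents described under `load()`. That call starts a browser and fetches the pages, so it is left out of the sketch.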
 
 - async ascrape_playwright(url: str) → str
- Asynchronously scrape the content of a given URL using Playwright’s async API.
- Parameters
- url (str) – The URL to scrape. 
- Returns
- The scraped HTML content or an error message if an exception occurs. 
- Return type
- str 
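Because `ascrape_playwright` is a coroutine, it can be awaited directly from async code. The sketch below shows one possible way to scrape several URLs with bounded concurrency. `scrape_all` and `max_concurrency` are hypothetical names introduced here, and the function accepts any object exposing `ascrape_playwright`. Whether running several scrapes at once is appropriate depends on the machine, since each call is assumed to drive its own browser session.

```python
import asyncio
from typing import Any, List


async def scrape_all(
    loader: Any, urls: List[str], max_concurrency: int = 3
) -> List[str]:
    """Scrape each URL via loader.ascrape_playwright, a few at a time.

    `max_concurrency` is a knob of this sketch, not a loader option.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url: str) -> str:
        async with semaphore:
            # Returns HTML, or an error message string if scraping failed.
            return await loader.ascrape_playwright(url)

    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*(scrape_one(url) for url in urls))
```

From synchronous code this could be driven with `asyncio.run(scrape_all(loader, urls))`. Since failures come back as a string rather than an exception, callers need to inspect each result themselves.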
 
 - lazy_load() → Iterator[Document]
- Lazily load text content from the provided URLs.
- This method yields Documents one at a time as they’re scraped, instead of waiting to scrape all URLs before returning.
- Yields
- Document – The scraped content encapsulated within a Document object. 
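One practical consequence of lazy loading is that a caller can stop early without scraping the remaining URLs. A minimal sketch, assuming each Document exposes `page_content` and a `metadata` dict; the `"source"` key and the helper name `preview_pages` are assumptions made for illustration:

```python
from itertools import islice
from typing import Any, List, Tuple


def preview_pages(loader: Any, limit: int) -> List[Tuple[str, int]]:
    """Return (source, content_length) for the first `limit` Documents only."""
    previews = []
    # islice stops pulling from the generator once `limit` items are seen,
    # so later URLs are never requested.
    for doc in islice(loader.lazy_load(), limit):
        source = doc.metadata.get("source", "unknown")  # assumed metadata key
        previews.append((source, len(doc.page_content)))
    return previews
```

The same loop written against `load()` would scrape every URL before the first Document became available.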
 
 - load() → List[Document]
- Load and return all Documents from the provided URLs.
- Returns
- A list of Document objects containing the scraped content from each URL. 
- Return type
- List[Document] 
 
 - load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]
- Load Documents and split into chunks. Chunks are returned as Documents.
- Parameters
- text_splitter – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter. 
- Returns
- List of Documents.
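To see how `load_and_split` distributes chunks across pages, the returned Documents can be grouped by origin. This is a sketch under assumptions: `chunk_counts` is a hypothetical helper, and it assumes each chunk keeps a `metadata` dict with a `"source"` entry inherited from its parent Document.

```python
from collections import Counter
from typing import Any, Dict, Optional


def chunk_counts(loader: Any, text_splitter: Optional[Any] = None) -> Dict[str, int]:
    """Count how many chunks load_and_split produced for each source."""
    # Passing None lets the loader fall back to its default splitter.
    chunks = loader.load_and_split(text_splitter=text_splitter)
    counts = Counter(
        chunk.metadata.get("source", "unknown")  # assumed metadata key
        for chunk in chunks
    )
    return dict(counts)
```

With the real class, the second argument could be a configured splitter such as `RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)`. Omitting it uses the default named in the parameter description above.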