langchain_community.document_loaders.blackboard.BlackboardLoader
- class langchain_community.document_loaders.blackboard.BlackboardLoader(blackboard_course_url: str, bbrouter: str, load_all_recursively: bool = True, basic_auth: Optional[Tuple[str, str]] = None, cookies: Optional[dict] = None, continue_on_failure: bool = False)
Load a Blackboard course.
This loader is not compatible with all Blackboard courses. It is only compatible with courses that use the new Blackboard interface. To use this loader, you must have the BbRouter cookie. You can get this cookie by logging into the course and then copying the value of the BbRouter cookie from the browser’s developer tools.
Example
    from langchain_community.document_loaders import BlackboardLoader

    loader = BlackboardLoader(
        blackboard_course_url="https://blackboard.example.com/webapps/blackboard/execute/announcement?method=search&context=course_entry&course_id=_123456_1",
        bbrouter="expires:12345...",
    )
    documents = loader.load()
Initialize with Blackboard course URL.
The BbRouter cookie is required for most Blackboard courses.
- Parameters
blackboard_course_url (str) – Blackboard course URL.
bbrouter (str) – BbRouter cookie.
load_all_recursively (bool) – If True, load all documents recursively.
basic_auth (Optional[Tuple[str, str]]) – Basic auth (username, password) credentials.
cookies (Optional[dict]) – Additional cookies to send with each request.
continue_on_failure (bool) – Whether to continue loading the course if an error occurs while loading a URL, emitting a warning instead of raising an exception. Setting this to True makes the loader more robust but may result in missing data. Default: False
- Raises
ValueError – If blackboard course url is invalid.
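For instance, a minimal construction sketch that passes the optional arguments; the URL, cookie values, and credentials below are placeholders, not real values:

    from langchain_community.document_loaders import BlackboardLoader

    loader = BlackboardLoader(
        blackboard_course_url="https://blackboard.example.com/webapps/blackboard/execute/announcement?method=search&context=course_entry&course_id=_123456_1",
        bbrouter="expires:12345...",  # copied from the browser's developer tools
        load_all_recursively=True,
        basic_auth=("username", "password"),  # placeholder credentials
        cookies={"JSESSIONID": "abc123"},  # hypothetical extra cookie
        continue_on_failure=True,  # warn instead of raising on a failed URL
    )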
Attributes
- web_path
Methods
- __init__(blackboard_course_url, bbrouter[, ...]) – Initialize with Blackboard course URL.
- alazy_load() – A lazy loader for Documents.
- aload() – Load text from the urls in web_path async into Documents.
- check_bs4() – Check if BeautifulSoup4 is installed.
- download(path) – Download a file from a URL.
- fetch_all(urls) – Fetch all urls concurrently with rate limiting.
- lazy_load() – Lazy load text from the url(s) in web_path.
- load() – Load data into Document objects.
- load_and_split([text_splitter]) – Load Documents and split into chunks.
- parse_filename(url) – Parse the filename from a URL.
- scrape([parser]) – Scrape data from webpage and return it in BeautifulSoup format.
- scrape_all(urls[, parser]) – Fetch all urls, then return soups for all results.
- __init__(blackboard_course_url: str, bbrouter: str, load_all_recursively: bool = True, basic_auth: Optional[Tuple[str, str]] = None, cookies: Optional[dict] = None, continue_on_failure: bool = False)
Initialize with Blackboard course URL.
The BbRouter cookie is required for most Blackboard courses.
- Parameters
blackboard_course_url (str) – Blackboard course URL.
bbrouter (str) – BbRouter cookie.
load_all_recursively (bool) – If True, load all documents recursively.
basic_auth (Optional[Tuple[str, str]]) – Basic auth (username, password) credentials.
cookies (Optional[dict]) – Additional cookies to send with each request.
continue_on_failure (bool) – Whether to continue loading the course if an error occurs while loading a URL, emitting a warning instead of raising an exception. Setting this to True makes the loader more robust but may result in missing data. Default: False
- Raises
ValueError – If blackboard course url is invalid.
- async alazy_load() → AsyncIterator[Document]
A lazy loader for Documents.
- Return type
AsyncIterator[Document]
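A usage sketch, assuming a loader constructed as in the example above:

    import asyncio

    async def main() -> None:
        # Documents are yielded one at a time as they are fetched.
        async for doc in loader.alazy_load():
            print(doc.metadata)

    asyncio.run(main())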
- aload() → List[Document]
Load text from the urls in web_path async into Documents.
- Return type
List[Document]
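Per the signature above (no async prefix), aload is called like a regular method even though it fetches the URLs asynchronously under the hood. A sketch, assuming the loader from the example above:

    documents = loader.aload()
    print(len(documents))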
- check_bs4() → None
Check if BeautifulSoup4 is installed.
- Raises
ImportError – If BeautifulSoup4 is not installed.
- Return type
None
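A small sketch; the call simply returns None when bs4 is importable:

    try:
        loader.check_bs4()
    except ImportError:
        print("Install BeautifulSoup4 first: pip install beautifulsoup4")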
- download(path: str) → None
Download a file from a URL.
- Parameters
path (str) – Path to the file.
- Return type
None
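The parameter is documented as a path, while the summary says the file comes from a URL; a sketch assuming a full file URL is accepted (the URL below is a placeholder):

    # Hypothetical file URL; download() returns None and saves the file locally.
    loader.download("https://blackboard.example.com/bbcswebdav/courses/notes.pdf")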
- async fetch_all(urls: List[str]) → Any
Fetch all urls concurrently with rate limiting.
- Parameters
urls (List[str]) –
- Return type
Any
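A sketch, assuming the loader from the example above; the URLs are placeholders:

    import asyncio

    # fetch_all is a coroutine, so it must be awaited or run in an event loop.
    pages = asyncio.run(
        loader.fetch_all(
            [
                "https://blackboard.example.com/page1",
                "https://blackboard.example.com/page2",
            ]
        )
    )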
- lazy_load() → Iterator[Document]
Lazy load text from the url(s) in web_path.
- Return type
Iterator[Document]
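A usage sketch, assuming the loader from the example above:

    # Iterates without holding every Document in memory at once.
    for doc in loader.lazy_load():
        print(doc.metadata.get("source"))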
- load() → List[Document]
Load data into Document objects.
- Returns
List of Documents.
- Return type
List[Document]
- load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method; it should be considered deprecated.
- Parameters
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns
List of Documents.
- Return type
List[Document]
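A sketch, assuming the langchain_text_splitters package is installed and the loader from the example above:

    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = loader.load_and_split(text_splitter=splitter)
    print(len(chunks))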
- parse_filename(url: str) → str
Parse the filename from a URL.
- Parameters
url (str) – URL to parse the filename from.
- Returns
The filename.
- Return type
str
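A sketch with a placeholder URL; the expected result is the trailing file name:

    name = loader.parse_filename(
        "https://blackboard.example.com/bbcswebdav/courses/lecture-01.pdf"
    )
    print(name)  # expected: lecture-01.pdf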
- scrape(parser: Optional[str] = None) → Any
Scrape data from webpage and return it in BeautifulSoup format.
- Parameters
parser (Optional[str]) –
- Return type
Any
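A usage sketch, assuming the loader from the example above; the parser name is one of the standard BeautifulSoup parsers:

    # Returns a BeautifulSoup object for the course page.
    soup = loader.scrape(parser="html.parser")
    print(soup.title)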
- scrape_all(urls: List[str], parser: Optional[str] = None) → List[Any]
Fetch all urls, then return soups for all results.
- Parameters
urls (List[str]) –
parser (Optional[str]) –
- Return type
List[Any]
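A sketch with placeholder URLs, assuming one soup is returned per URL, in order:

    soups = loader.scrape_all(
        [
            "https://blackboard.example.com/page1",
            "https://blackboard.example.com/page2",
        ]
    )
    for soup in soups:
        print(soup.title)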