langchain_community.document_loaders.obs_file.OBSFileLoader
- class langchain_community.document_loaders.obs_file.OBSFileLoader(bucket: str, key: str, client: Any = None, endpoint: str = '', config: Optional[dict] = None)
Load a file from Huawei OBS (Object Storage Service).
Initialize the OBSFileLoader with the specified settings.
- Parameters
bucket (str) – The name of the OBS bucket to be used.
key (str) – The name of the object in the OBS bucket.
client (ObsClient, optional) – An instance of the ObsClient to connect to OBS.
endpoint (str, optional) – The endpoint URL of your OBS bucket. This parameter is mandatory if client is not provided.
config (dict, optional) – The parameters for connecting to OBS, provided as a dictionary. This parameter is ignored if client is provided. The dictionary could have the following keys:
- "ak" (str, optional): Your OBS access key (required if get_token_from_ecs is False and bucket policy is not public read).
- "sk" (str, optional): Your OBS secret key (required if get_token_from_ecs is False and bucket policy is not public read).
- "token" (str, optional): Your security token (required if using temporary credentials).
- "get_token_from_ecs" (bool, optional): Whether to retrieve the security token from ECS. Defaults to False if not provided. If set to True, ak, sk, and token will be ignored.
- Raises
ValueError – If the esdk-obs-python package is not installed.
TypeError – If the provided client is not an instance of ObsClient.
ValueError – If client is not provided, but endpoint is missing.
Note
Before using this class, make sure you have registered with OBS and have the necessary credentials. The ak, sk, and endpoint values are mandatory unless get_token_from_ecs is True or the bucket policy is public read. token is required when using temporary credentials.
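The requirements in the note above can be checked before constructing the loader. The following is a minimal sketch and not part of the library: the helper name preflight_problems and the public_read flag are hypothetical, and the checks only mirror the rules documented here.

```python
from typing import Any, List, Optional


def preflight_problems(
    client: Any = None,
    endpoint: str = "",
    config: Optional[dict] = None,
    public_read: bool = False,  # hypothetical flag: caller asserts the bucket is public read
) -> List[str]:
    """Return the documented requirements that the given arguments do not meet."""
    if client is not None:
        # An existing client is used as-is; config is ignored in that case.
        return []
    problems: List[str] = []
    if not endpoint:
        problems.append("endpoint is required when client is not provided")
    config = config or {}
    if config.get("get_token_from_ecs") or public_read:
        # ak, sk, and token are ignored (ECS) or not required (public read).
        return problems
    for key in ("ak", "sk"):
        if not config.get(key):
            problems.append(f"config[{key!r}] is required")
    return problems
```

An empty list means the arguments satisfy the documented rules; it does not guarantee that the credentials themselves are valid.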
Example
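To create a new OBSFileLoader with a new client:
```
from langchain_community.document_loaders.obs_file import OBSFileLoader

config = {
    "ak": "your-access-key",
    "sk": "your-secret-key",
}
# endpoint is mandatory when no client is passed
obs_loader = OBSFileLoader("your-bucket-name", "your-object-key", endpoint="your-endpoint-url", config=config)
```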
To create a new OBSFileLoader with an existing client:
```
from obs import ObsClient

# Assuming you have an existing ObsClient object 'obs_client'
obs_loader = OBSFileLoader("your-bucket-name", "your-object-key", client=obs_client)
```
To create a new OBSFileLoader without an existing client or credentials (for example, when the bucket policy is public read):
```
obs_loader = OBSFileLoader("your-bucket-name", "your-object-key", endpoint="your-endpoint-url")
```
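Rather than hard-coding credentials, the config dictionary could be assembled from the environment. This is a minimal sketch under stated assumptions: the helper build_obs_config and the variable names OBS_AK, OBS_SK, OBS_TOKEN, and OBS_USE_ECS_TOKEN are hypothetical; only the resulting dictionary keys come from the parameter description above.

```python
import os
from typing import Mapping, Optional


def build_obs_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a config dict that uses only the documented keys."""
    env = os.environ if env is None else env
    # Hypothetical switch: when set, rely on the ECS-provided token.
    # Per the docs, ak, sk, and token are ignored in that mode.
    if env.get("OBS_USE_ECS_TOKEN", "").lower() in ("1", "true", "yes"):
        return {"get_token_from_ecs": True}
    config = {}
    for key, var in (("ak", "OBS_AK"), ("sk", "OBS_SK"), ("token", "OBS_TOKEN")):
        value = env.get(var)
        if value:  # skip unset or empty variables
            config[key] = value
    return config
```

The result can be passed as config=build_obs_config() together with an endpoint.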
Methods
__init__(bucket, key[, client, endpoint, config]) – Initialize the OBSFileLoader with the specified settings.
alazy_load() – A lazy loader for Documents.
lazy_load() – A lazy loader for Documents.
load() – Load documents.
load_and_split([text_splitter]) – Load Documents and split into chunks.
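A lazy loader can be consumed one Document at a time instead of building the full list first. The sketch below is illustrative only: preview_documents is a hypothetical helper, and it assumes a loader exposing lazy_load() whose items have a page_content string attribute, as LangChain Documents do.

```python
from itertools import islice
from typing import Any, List


def preview_documents(loader: Any, limit: int = 3, width: int = 80) -> List[str]:
    """Return the first `width` characters of up to `limit` documents."""
    # islice stops pulling from the lazy iterator after `limit` documents.
    return [doc.page_content[:width] for doc in islice(loader.lazy_load(), limit)]
```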
- __init__(bucket: str, key: str, client: Any = None, endpoint: str = '', config: Optional[dict] = None) -> None
Initialize the OBSFileLoader with the specified settings.
- Return type
None
- async alazy_load() -> AsyncIterator[Document]
A lazy loader for Documents.
- Return type
AsyncIterator[Document]
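Because alazy_load() returns an AsyncIterator, it is consumed with async for inside a coroutine. A minimal sketch, assuming only a loader with an alazy_load() method; collect_documents and its limit parameter are hypothetical names used for illustration.

```python
import asyncio
from typing import Any, List


async def collect_documents(loader: Any, limit: int = 10) -> List[Any]:
    """Gather at most `limit` documents from an async lazy loader."""
    docs: List[Any] = []
    if limit <= 0:
        return docs
    async for doc in loader.alazy_load():
        docs.append(doc)
        if len(docs) >= limit:
            break  # stop early without draining the iterator
    return docs
```

From synchronous code, the coroutine can be run with asyncio.run(collect_documents(loader)).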
- load_and_split(text_splitter: Optional[TextSplitter] = None) -> List[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method; it should be considered deprecated.
- Parameters
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns
List of Documents.
- Return type
List[Document]
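Since this method is described as deprecated, a caller may prefer to load and split in two explicit steps. A minimal sketch, assuming a loader with load() and a LangChain TextSplitter, whose split_documents method returns the chunks as Documents; load_chunks is a hypothetical helper name.

```python
from typing import Any, List


def load_chunks(loader: Any, text_splitter: Any) -> List[Any]:
    """Load all documents, then split them with the given splitter."""
    documents = loader.load()
    # split_documents takes a list of Documents and returns the chunks.
    return text_splitter.split_documents(documents)
```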