langchain_core.document_loaders.blob_loaders
.Blob¶
- class langchain_core.document_loaders.blob_loaders.Blob[source]¶
Bases:
BaseModel
Blob represents raw data by either reference or value.
Provides an interface to materialize the blob in different representations, and help to decouple the development of data loaders from the downstream parsing of the raw data.
Inspired by: https://developer.mozilla.org/en-US/docs/Web/API/Blob
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param data: Optional[Union[bytes, str]] = None¶
Raw data associated with the blob.
- param encoding: str = 'utf-8'¶
Encoding to use if decoding the bytes into a string.
Use utf-8 as default encoding, if decoding to string.
- param metadata: Dict[str, Any] [Optional]¶
Metadata about the blob (e.g., source)
- param mimetype: Optional[str] = None¶
MimeType not to be confused with a file extension.
- param path: Optional[Union[str, PurePath]] = None¶
Location where the original content was found.
- as_bytes_io() Generator[Union[BytesIO, BufferedReader], None, None] [source]¶
Read data as a byte stream.
- Return type
Generator[Union[BytesIO, BufferedReader], None, None]
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model ¶
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = ‘allow’ was set since it adds all passed values
- Parameters
_fields_set (Optional[SetStr]) –
values (Any) –
- Return type
Model
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model ¶
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to include in new model
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to exclude from new model, as with values this takes precedence over include
update (Optional[DictStrAny]) – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep (bool) – set to True to make a deep copy of the model
self (Model) –
- Returns
new model instance
- Return type
Model
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny ¶
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
- Return type
DictStrAny
- classmethod from_data(data: Union[str, bytes], *, encoding: str = 'utf-8', mime_type: Optional[str] = None, path: Optional[str] = None, metadata: Optional[dict] = None) Blob [source]¶
Initialize the blob from in-memory data.
- Parameters
data (Union[str, bytes]) – the in-memory data associated with the blob
encoding (str) – Encoding to use if decoding the bytes into a string
mime_type (Optional[str]) – if provided, will be set as the mime-type of the data
path (Optional[str]) – if provided, will be set as the source from which the data came
metadata (Optional[dict]) – Metadata to associate with the blob
- Returns
Blob instance
- Return type
- classmethod from_orm(obj: Any) Model ¶
- Parameters
obj (Any) –
- Return type
Model
- classmethod from_path(path: Union[str, PurePath], *, encoding: str = 'utf-8', mime_type: Optional[str] = None, guess_type: bool = True, metadata: Optional[dict] = None) Blob [source]¶
Load the blob from a path like object.
- Parameters
path (Union[str, PurePath]) – path like object to file to be read
encoding (str) – Encoding to use if decoding the bytes into a string
mime_type (Optional[str]) – if provided, will be set as the mime-type of the data
guess_type (bool) – If True, the mimetype will be guessed from the file extension, if a mime-type was not provided
metadata (Optional[dict]) – Metadata to associate with the blob
- Returns
Blob instance
- Return type
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode ¶
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
encoder (Optional[Callable[[Any], Any]]) –
models_as_dict (bool) –
dumps_kwargs (Any) –
- Return type
unicode
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ¶
- Parameters
path (Union[str, Path]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –
- Return type
Model
- classmethod parse_obj(obj: Any) Model ¶
- Parameters
obj (Any) –
- Return type
Model
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ¶
- Parameters
b (Union[str, bytes]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –
- Return type
Model
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') DictStrAny ¶
- Parameters
by_alias (bool) –
ref_template (unicode) –
- Return type
DictStrAny
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) unicode ¶
- Parameters
by_alias (bool) –
ref_template (unicode) –
dumps_kwargs (Any) –
- Return type
unicode
- classmethod update_forward_refs(**localns: Any) None ¶
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- Parameters
localns (Any) –
- Return type
None
- classmethod validate(value: Any) Model ¶
- Parameters
value (Any) –
- Return type
Model
- property source: Optional[str]¶
The source location of the blob as string if known otherwise none.
If a path is associated with the blob, it will default to the path location.
Unless explicitly set via a metadata field called “source”, in which case that value will be used instead.