langchain_community.document_transformers.openai_functions.OpenAIMetadataTagger
- class langchain_community.document_transformers.openai_functions.OpenAIMetadataTagger
Bases: BaseDocumentTransformer, BaseModel
Extract metadata tags from document contents using OpenAI functions.
- Example:
    from langchain.chains import create_tagging_chain
    from langchain_community.chat_models import ChatOpenAI
    from langchain_community.document_transformers import OpenAIMetadataTagger
    from langchain_core.documents import Document

    schema = {
        "properties": {
            "movie_title": {"type": "string"},
            "critic": {"type": "string"},
            "tone": {
                "type": "string",
                "enum": ["positive", "negative"]
            },
            "rating": {
                "type": "integer",
                "description": "The number of stars the critic rated the movie"
            }
        },
        "required": ["movie_title", "critic", "tone"]
    }

    # Must be an OpenAI model that supports functions
    llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
    tagging_chain = create_tagging_chain(schema, llm)
    document_transformer = OpenAIMetadataTagger(tagging_chain=tagging_chain)

    original_documents = [
        Document(
            page_content="Review of The Bee Movie\nBy Roger Ebert\n\nThis is the greatest movie ever made. 4 out of 5 stars."
        ),
        Document(
            page_content="Review of The Godfather\nBy Anonymous\n\nThis movie was super boring. 1 out of 5 stars.",
            metadata={"reliable": False},
        ),
    ]

    # Each returned document carries the extracted tags in its metadata
    enhanced_documents = document_transformer.transform_documents(original_documents)
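The schema in the example marks three properties as required and restricts tone to an enum. As a rough illustration of what those constraints mean for the extracted metadata, the sketch below checks a plain dictionary against such a schema. `check_metadata` is a hypothetical helper, not part of LangChain, and it handles only the `required` and `enum` keywords.

```python
from typing import Any, Dict, List

schema: Dict[str, Any] = {
    "properties": {
        "movie_title": {"type": "string"},
        "critic": {"type": "string"},
        "tone": {"type": "string", "enum": ["positive", "negative"]},
        "rating": {"type": "integer"},
    },
    "required": ["movie_title", "critic", "tone"],
}


def check_metadata(metadata: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list violations of `required` and `enum` only."""
    problems: List[str] = []
    # Every required property must be present.
    for name in schema.get("required", []):
        if name not in metadata:
            problems.append(f"missing required property: {name}")
    # Properties with an enum must take one of the listed values.
    for name, spec in schema.get("properties", {}).items():
        allowed = spec.get("enum")
        if allowed is not None and name in metadata and metadata[name] not in allowed:
            problems.append(f"{name}={metadata[name]!r} not in {allowed}")
    return problems


good = {"movie_title": "The Bee Movie", "critic": "Roger Ebert", "tone": "positive", "rating": 4}
bad = {"movie_title": "The Godfather", "tone": "mixed"}

good_problems = check_metadata(good, schema)
bad_problems = check_metadata(bad, schema)
```

A check of this kind can be useful as a guard after tagging, since the metadata comes from a model response.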
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param tagging_chain: Any = None
The chain used to extract metadata from each document.
- async atransform_documents(documents: Sequence[Document], **kwargs: Any) -> Sequence[Document]
Asynchronously transform a list of documents.
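Whether this class ships a native asynchronous implementation may depend on the installed version. If it does not, one common fallback is to run the blocking transform in a worker thread. The sketch below shows that pattern with the standard library only; `upper_transform` and `atransform` are hypothetical names, and the string-based transform merely stands in for `transform_documents`.

```python
import asyncio
from typing import Callable, List, Sequence


def upper_transform(documents: Sequence[str]) -> List[str]:
    """Hypothetical blocking transform standing in for transform_documents."""
    return [d.upper() for d in documents]


async def atransform(
    transform: Callable[[Sequence[str]], List[str]],
    documents: Sequence[str],
) -> List[str]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(transform, documents)


out = asyncio.run(atransform(upper_transform, ["a review", "another review"]))
```

`asyncio.to_thread` requires Python 3.9 or later.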
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) -> Model
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.
- Parameters
_fields_set (Optional[SetStr])
values (Any)
- Return type
Model
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) -> Model
Duplicate a model, optionally choosing which fields to include, exclude, and change.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]): fields to include in the new model
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]): fields to exclude from the new model; as with values, this takes precedence over include
update (Optional[DictStrAny]): values to change or add in the new model. Note: the data is not validated before the new model is created, so you should trust this data
deep (bool): set to True to make a deep copy of the model
self (Model)
- Returns
new model instance
- Return type
Model
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) -> DictStrAny
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]])
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]])
by_alias (bool)
skip_defaults (Optional[bool])
exclude_unset (bool)
exclude_defaults (bool)
exclude_none (bool)
- Return type
DictStrAny
- classmethod from_orm(obj: Any) -> Model
- Parameters
obj (Any)
- Return type
Model
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) -> unicode
Generate a JSON representation of the model. The include and exclude arguments behave as in dict().
encoder is an optional function to supply as the default argument to json.dumps(); the other arguments are passed through to json.dumps().
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]])
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]])
by_alias (bool)
skip_defaults (Optional[bool])
exclude_unset (bool)
exclude_defaults (bool)
exclude_none (bool)
encoder (Optional[Callable[[Any], Any]])
models_as_dict (bool)
dumps_kwargs (Any)
- Return type
unicode
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) -> Model
- Parameters
path (Union[str, Path])
content_type (unicode)
encoding (unicode)
proto (Protocol)
allow_pickle (bool)
- Return type
Model
- classmethod parse_obj(obj: Any) -> Model
- Parameters
obj (Any)
- Return type
Model
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) -> Model
- Parameters
b (Union[str, bytes])
content_type (unicode)
encoding (unicode)
proto (Protocol)
allow_pickle (bool)
- Return type
Model
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') -> DictStrAny
- Parameters
by_alias (bool)
ref_template (unicode)
- Return type
DictStrAny
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) -> unicode
- Parameters
by_alias (bool)
ref_template (unicode)
dumps_kwargs (Any)
- Return type
unicode
- transform_documents(documents: Sequence[Document], **kwargs: Any) -> Sequence[Document]
Automatically extract and populate metadata for each document according to the provided schema.
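The behaviour described above can be sketched without calling OpenAI. The following is a minimal illustration, not the library's implementation: `Doc` and `StubTaggingChain` are hypothetical stand-ins for `Document` and a real tagging chain, and the merge order (existing metadata wins over extracted values) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class StubTaggingChain:
    """Hypothetical chain: returns fixed tags instead of calling a model."""

    def run(self, text: str) -> Dict[str, Any]:
        tone = "negative" if "boring" in text else "positive"
        return {"tone": tone, "reliable": True}


def tag_documents(chain: StubTaggingChain, documents: Sequence[Doc]) -> List[Doc]:
    tagged: List[Doc] = []
    for doc in documents:
        extracted = chain.run(doc.page_content)
        # Assumption: existing metadata takes precedence over extracted values.
        tagged.append(Doc(doc.page_content, {**extracted, **doc.metadata}))
    return tagged


docs = [
    Doc("This is the greatest movie ever made."),
    Doc("This movie was super boring.", {"reliable": False}),
]
result = tag_documents(StubTaggingChain(), docs)
```

In this sketch the input documents are left unmodified and new objects are returned, so the original metadata remains available for comparison.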
- classmethod update_forward_refs(**localns: Any) -> None
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- Parameters
localns (Any)
- Return type
None
- classmethod validate(value: Any) -> Model
- Parameters
value (Any)
- Return type
Model