`langchain_experimental.data_anonymizer.presidio`.PresidioAnonymizerBase

class langchain_experimental.data_anonymizer.presidio.PresidioAnonymizerBase(analyzed_fields: Optional[List[str]] = None, operators: Optional[Dict[str, OperatorConfig]] = None, languages_config: Dict = {'models': [{'lang_code': 'en', 'model_name': 'en_core_web_lg'}], 'nlp_engine_name': 'spacy'}, add_default_faker_operators: bool = True, faker_seed: Optional[int] = None)

Parameters

analyzed_fields – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config – Configuration for the NLP engine. The first language in the list is used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
add_default_faker_operators – Whether to add the default Faker-based operators. Defaults to True.
faker_seed – Seed used to initialize Faker. Defaults to None, in which case Faker is seeded randomly and produces random values.
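The shape of languages_config can be illustrated with a two-language variant of the default value shown in the signature. This is a minimal sketch: the Spanish entry and its model name are assumptions (the model would have to be installed separately), and main_language is a hypothetical helper written only to show which entry acts as the main language; it is not part of the library.

```python
from typing import Any, Dict

# Same structure as the default value in the class signature, extended with a
# second (assumed) Spanish model. The first entry in "models" is the one the
# parameter description calls the main language.
languages_config: Dict[str, Any] = {
    "nlp_engine_name": "spacy",
    "models": [
        {"lang_code": "en", "model_name": "en_core_web_lg"},
        {"lang_code": "es", "model_name": "es_core_news_md"},
    ],
}


def main_language(config: Dict[str, Any]) -> str:
    """Hypothetical helper: return the language code of the first configured model."""
    return config["models"][0]["lang_code"]


print(main_language(languages_config))
```

A dictionary of this shape would be passed as the languages_config argument when constructing a concrete anonymizer.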

Methods

`__init__`([analyzed_fields, operators, ...])	Create the anonymizer; analyzed_fields is the list of fields to detect and then anonymize.
`add_operators`(operators)	Add operators to the anonymizer
`add_recognizer`(recognizer)	Add a recognizer to the analyzer
`anonymize`(text[, language, allow_list])	Anonymize text

__init__(analyzed_fields: Optional[List[str]] = None, operators: Optional[Dict[str, OperatorConfig]] = None, languages_config: Dict = {'models': [{'lang_code': 'en', 'model_name': 'en_core_web_lg'}], 'nlp_engine_name': 'spacy'}, add_default_faker_operators: bool = True, faker_seed: Optional[int] = None)

Parameters

analyzed_fields – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config – Configuration for the NLP engine. The first language in the list is used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
add_default_faker_operators – Whether to add the default Faker-based operators. Defaults to True.
faker_seed – Seed used to initialize Faker. Defaults to None, in which case Faker is seeded randomly and produces random values.

add_operators(operators: Dict[str, OperatorConfig]) → None

Add operators to the anonymizer

Parameters: operators – Operators to add to the anonymizer.
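As a rough illustration of the Dict[str, OperatorConfig] argument, the sketch below builds two operators using Presidio's built-in "replace" and "mask" operator names. The entity types, replacement text, and masking parameters are illustrative choices, and OPERATOR_SPECS and build_operators are hypothetical names introduced here, not part of the library.

```python
from typing import Any, Dict, Tuple

# Plain-data description of the operators: entity type -> (operator name, params).
# The entity types and parameter values are illustrative choices.
OPERATOR_SPECS: Dict[str, Tuple[str, Dict[str, Any]]] = {
    "PERSON": ("replace", {"new_value": "<PERSON>"}),
    "PHONE_NUMBER": (
        "mask",
        {"masking_char": "*", "chars_to_mask": 4, "from_end": True},
    ),
}


def build_operators() -> Dict[str, Any]:
    """Turn OPERATOR_SPECS into a mapping of entity type to OperatorConfig."""
    # Imported lazily so that presidio-anonymizer is only required when called.
    from presidio_anonymizer.entities import OperatorConfig

    return {
        entity: OperatorConfig(name, params)
        for entity, (name, params) in OPERATOR_SPECS.items()
    }
```

The result of build_operators() could then be passed to add_operators on an instance of a concrete subclass (for example PresidioAnonymizer, assuming it is the subclass in use).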

add_recognizer(recognizer: EntityRecognizer) → None

Add a recognizer to the analyzer

Parameters: recognizer – Recognizer to add to the analyzer.
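One common kind of EntityRecognizer is Presidio's regex-based PatternRecognizer. The sketch below builds one for a made-up EMPLOYEE_ID entity; the entity name, the regular expression, and the score are all assumptions chosen for illustration, and build_employee_id_recognizer is a hypothetical helper.

```python
import re

# Hypothetical identifier format: "EMP-" followed by exactly six digits.
EMPLOYEE_ID_REGEX = r"\bEMP-\d{6}\b"


def build_employee_id_recognizer():
    """Build a regex-based recognizer for the made-up EMPLOYEE_ID entity."""
    # Imported lazily so that presidio-analyzer is only required when called.
    from presidio_analyzer import Pattern, PatternRecognizer

    pattern = Pattern(name="employee_id", regex=EMPLOYEE_ID_REGEX, score=0.9)
    return PatternRecognizer(supported_entity="EMPLOYEE_ID", patterns=[pattern])


# Quick check of the regular expression itself, independent of Presidio.
match = re.search(EMPLOYEE_ID_REGEX, "Badge EMP-004217 was revoked")
print(match.group(0) if match else None)
```

The recognizer returned by build_employee_id_recognizer() could be handed to add_recognizer; how the new entity is then replaced depends on the operators configured for it.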

anonymize(text: str, language: Optional[str] = None, allow_list: Optional[List[str]] = None) → str

Anonymize text
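A call to anonymize might look like the following sketch. It assumes the concrete subclass PresidioAnonymizer from langchain_experimental.data_anonymizer, an installed en_core_web_lg spaCy model, and illustrative values for analyzed_fields and faker_seed; anonymize_example is a hypothetical wrapper, and the interpretation of allow_list as values to leave untouched is an assumption based on the parameter name.

```python
from typing import List, Optional


def anonymize_example(text: str, allow_list: Optional[List[str]] = None) -> str:
    """Anonymize `text` with a concrete anonymizer (sketch, see assumptions above)."""
    # Imported lazily so that langchain_experimental is only required when called.
    from langchain_experimental.data_anonymizer import PresidioAnonymizer

    anonymizer = PresidioAnonymizer(
        analyzed_fields=["PERSON", "EMAIL_ADDRESS"],  # illustrative subset
        faker_seed=42,  # fixed seed so repeated runs use the same fake values
    )
    # language is given explicitly here; when omitted, the first language in
    # languages_config is used, per the parameter description.
    return anonymizer.anonymize(text, language="en", allow_list=allow_list)
```

For instance, anonymize_example("Contact John Smith at john@example.com", allow_list=["John Smith"]) would request that the listed name be exempt from anonymization.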
