langchain_community.document_loaders.hugging_face_model.HuggingFaceModelLoader

class langchain_community.document_loaders.hugging_face_model.HuggingFaceModelLoader(*, search: Optional[str] = None, author: Optional[str] = None, filter: Optional[str] = None, sort: Optional[str] = None, direction: Optional[str] = None, limit: Optional[int] = 3, full: Optional[bool] = None, config: Optional[bool] = None)

Load model information from Hugging Face Hub, including README content.

This loader interfaces with the Hugging Face Models API to fetch and load model metadata and README files. The API allows you to search and filter models based on specific criteria such as model tags, authors, and more.

API URL: https://huggingface.co/api/models

DOC URL: https://huggingface.co/docs/hub/en/api

Examples

from langchain_community.document_loaders import HuggingFaceModelLoader

# Initialize the loader with search criteria
loader = HuggingFaceModelLoader(search="bert", limit=10)

# Load models
documents = loader.load()

# Iterate through the fetched documents
for doc in documents:
    print(doc.page_content)  # README content of the model
    print(doc.metadata)      # Metadata of the model

Initialize the HuggingFaceModelLoader.

Parameters
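  • search (Optional[str]) – Filter based on substrings of repo names and their usernames, such as resnet or microsoft.

  • author (Optional[str]) – Filter models by an author or organization, such as huggingface or microsoft.

  • filter (Optional[str]) – Filter based on tags, such as text-classification or spacy.

  • sort (Optional[str]) – Property to use when sorting, such as downloads or author.

  • direction (Optional[str]) – Direction in which to sort. Per the Hub API documentation, -1 means descending and any other value means ascending.

  • limit (Optional[int]) – Maximum number of models to fetch. Defaults to 3.

  • full (Optional[bool]) – Whether to fetch most model data, such as all tags and the file list.

  • config (Optional[bool]) – Whether to also fetch the repo config.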
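
The parameters above correspond to query parameters of the Models API. As a rough illustration, the sketch below shows how such arguments could be turned into a request URL by dropping unset values and URL-encoding the rest. The helper build_models_url is hypothetical and not part of the loader; only the API URL is taken from this page, and the loader's actual request logic may differ.

```python
from typing import Optional
from urllib.parse import urlencode

API_URL = "https://huggingface.co/api/models"  # API URL given in this reference


def build_models_url(
    *,
    search: Optional[str] = None,
    author: Optional[str] = None,
    filter: Optional[str] = None,
    sort: Optional[str] = None,
    direction: Optional[str] = None,
    limit: Optional[int] = 3,
    full: Optional[bool] = None,
    config: Optional[bool] = None,
) -> str:
    """Hypothetical helper: build a Models API URL from loader-style arguments."""
    params = {
        "search": search,
        "author": author,
        "filter": filter,
        "sort": sort,
        "direction": direction,
        "limit": limit,
        "full": full,
        "config": config,
    }
    # Keep only the arguments that were actually set.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{API_URL}?{urlencode(query)}" if query else API_URL


print(build_models_url(search="bert", limit=10))
# https://huggingface.co/api/models?search=bert&limit=10
```

Leaving every argument at its default still sends limit=3, which mirrors the default in the constructor signature.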

Attributes

BASE_URL

README_BASE_URL

Methods

__init__(*[, search, author, filter, sort, ...])

Initialize the HuggingFaceModelLoader.

alazy_load()

A lazy loader for Documents.

fetch_models()

Fetch model information from Hugging Face Hub.

fetch_readme_content(model_id)

Fetch the README content for a given model.

lazy_load()

Load model information lazily, including README content.

load()

Load data into Document objects.

load_and_split([text_splitter])

Load Documents and split into chunks.

__init__(*, search: Optional[str] = None, author: Optional[str] = None, filter: Optional[str] = None, sort: Optional[str] = None, direction: Optional[str] = None, limit: Optional[int] = 3, full: Optional[bool] = None, config: Optional[bool] = None)

Initialize the HuggingFaceModelLoader.

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type

AsyncIterator[Document]
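
To show how an AsyncIterator[Document] is typically consumed, here is a minimal sketch built on a stand-in async generator. FakeDocument and fake_alazy_load are hypothetical placeholders so the example runs without network access, and the modelId metadata key is an assumption. With a real loader, the same async for loop would iterate over loader.alazy_load().

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, List


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a Document: text content plus metadata."""

    page_content: str
    metadata: dict = field(default_factory=dict)


async def fake_alazy_load() -> AsyncIterator[FakeDocument]:
    """Hypothetical stand-in for loader.alazy_load(); yields one item at a time."""
    for model_id in ("model-a", "model-b"):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield FakeDocument(
            page_content=f"README of {model_id}",
            metadata={"modelId": model_id},  # assumed metadata key
        )


async def collect_ids() -> List[str]:
    """Consume the async iterator with `async for` and collect model IDs."""
    ids = []
    async for doc in fake_alazy_load():
        ids.append(doc.metadata["modelId"])
    return ids


print(asyncio.run(collect_ids()))
```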

fetch_models() → List[dict]

Fetch model information from Hugging Face Hub.

Return type

List[dict]

fetch_readme_content(model_id: str) → str

Fetch the README content for a given model.

Parameters

model_id (str) – ID of the model whose README should be fetched.

Return type

str
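
The value of README_BASE_URL is not shown on this page. As an illustration only, the sketch below assumes a URL template of the form https://huggingface.co/{model_id}/raw/main/README.md and fetches the file with the standard library. The names README_URL_TEMPLATE, readme_url, and fetch_readme are hypothetical, and the error handling shown (returning an empty string) is a design choice of this sketch rather than documented loader behavior.

```python
from urllib.request import urlopen

# Assumed template; the loader's real README_BASE_URL may differ.
README_URL_TEMPLATE = "https://huggingface.co/{model_id}/raw/main/README.md"


def readme_url(model_id: str) -> str:
    """Hypothetical helper: build the README URL for a model ID."""
    return README_URL_TEMPLATE.format(model_id=model_id)


def fetch_readme(model_id: str, timeout: float = 10.0) -> str:
    """Hypothetical helper: return the README text, or "" if it cannot be read."""
    try:
        with urlopen(readme_url(model_id), timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (OSError, UnicodeDecodeError):
        # Network errors, HTTP errors, and timeouts are all OSError subclasses.
        return ""


print(readme_url("some-org/some-model"))
# https://huggingface.co/some-org/some-model/raw/main/README.md
```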

lazy_load() → Iterator[Document]

Load model information lazily, including README content.

Return type

Iterator[Document]
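
The lazy-loading pattern described here can be sketched as a generator that pairs each model's metadata with its README content. This is an illustrative stand-in, not the library's implementation: SketchLoader, SketchDocument, and the stubbed data are hypothetical, and the modelId metadata key is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a Document: text content plus metadata."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class SketchLoader:
    """Hypothetical loader that reads from in-memory stubs instead of the Hub."""

    def __init__(self, models: List[dict], readmes: Dict[str, str]) -> None:
        self._models = models
        self._readmes = readmes

    def fetch_models(self) -> List[dict]:
        # Stub: a real loader would query the Models API here.
        return self._models

    def fetch_readme_content(self, model_id: str) -> str:
        # Stub: a missing README is represented by an empty string.
        return self._readmes.get(model_id, "")

    def lazy_load(self) -> Iterator[SketchDocument]:
        # Yield one document per model; each README is looked up on demand.
        for model in self.fetch_models():
            model_id = model.get("modelId", "")  # assumed metadata key
            yield SketchDocument(
                page_content=self.fetch_readme_content(model_id),
                metadata=model,
            )

    def load(self) -> List[SketchDocument]:
        # Eager variant: exhaust the generator into a list.
        return list(self.lazy_load())


loader = SketchLoader(
    models=[{"modelId": "model-a"}, {"modelId": "model-b"}],
    readmes={"model-a": "# Model A"},
)
first = next(loader.lazy_load())
print(first.page_content)
```

In this sketch, a caller that stops iterating early never triggers the README lookup for the remaining models, which is the main benefit of the generator form over load().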

load() → List[Document]

Load data into Document objects.

Return type

List[Document]

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method; it should be considered deprecated.

Parameters

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns

List of Documents.

Return type

List[Document]