graph_retriever

Provides retrieval functions that combine vector search with graph traversal.

The main functions are traverse and atraverse, which perform synchronous and asynchronous traversals, respectively.

Content dataclass

Content(
    id: str,
    content: str,
    embedding: list[float],
    metadata: dict[str, Any] = dict(),
    mime_type: str = "text/plain",
    score: float | None = None,
)

Model representing retrieved content.

PARAMETER DESCRIPTION
id

The ID of the content.

TYPE: str

content

The content.

TYPE: str

embedding

The embedding of the content.

TYPE: list[float]

score

The similarity of the embedding to the query. This is optional and may be unset (None) for some content.

TYPE: float | None DEFAULT: None

metadata

The metadata associated with the content.

TYPE: dict[str, Any] DEFAULT: dict()

mime_type

The MIME type of the content.

TYPE: str DEFAULT: 'text/plain'

new staticmethod

new(
    id: str,
    content: str,
    embedding: list[float] | Callable[[str], list[float]],
    score: float | None = None,
    metadata: dict[str, Any] | None = None,
    mime_type: str = "text/plain",
) -> Content

Create a new content.

PARAMETER DESCRIPTION
id

The ID of the content.

TYPE: str

content

The content.

TYPE: str

embedding

The embedding, or a function to apply to the content to compute the embedding.

TYPE: list[float] | Callable[[str], list[float]]

score

The similarity of the embedding to the query.

TYPE: float | None DEFAULT: None

metadata

The metadata associated with the content.

TYPE: dict[str, Any] | None DEFAULT: None

mime_type

The MIME type of the content.

TYPE: str DEFAULT: 'text/plain'

RETURNS DESCRIPTION
Content

The created content.

Source code in packages/graph-retriever/src/graph_retriever/content.py
@staticmethod
def new(
    id: str,
    content: str,
    embedding: list[float] | Callable[[str], list[float]],
    score: float | None = None,
    metadata: dict[str, Any] | None = None,
    mime_type: str = "text/plain",
) -> Content:
    """
    Create a new content.

    Parameters
    ----------
    id :
        The ID of the content.
    content :
        The content.
    embedding :
        The embedding, or a function to apply to the content to compute the
        embedding.
    score :
        The similarity of the embedding to the query.
    metadata :
        The metadata associated with the content.
    mime_type :
        The MIME type of the content.

    Returns
    -------
    :
        The created content.
    """
    return Content(
        id=id,
        content=content,
        embedding=embedding(content) if callable(embedding) else embedding,
        score=score,
        metadata=metadata or {},
        mime_type=mime_type,
    )
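The listing above accepts either a precomputed embedding or a callable that is applied to the content. The following is a minimal sketch of both call styles. To stay self-contained it defines a local stand-in `Content` dataclass that mirrors the documented fields and the `new` body shown above, rather than importing the package. `toy_embed` is a hypothetical embedding function invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Content:
    """Local stand-in mirroring the documented fields (not the library class)."""

    id: str
    content: str
    embedding: list[float]
    metadata: dict[str, Any] = field(default_factory=dict)
    mime_type: str = "text/plain"
    score: float | None = None

    @staticmethod
    def new(
        id: str,
        content: str,
        embedding: list[float] | Callable[[str], list[float]],
        score: float | None = None,
        metadata: dict[str, Any] | None = None,
        mime_type: str = "text/plain",
    ) -> Content:
        # Same logic as the source listing: call the embedding if it is callable.
        return Content(
            id=id,
            content=content,
            embedding=embedding(content) if callable(embedding) else embedding,
            score=score,
            metadata=metadata or {},
            mime_type=mime_type,
        )


def toy_embed(text: str) -> list[float]:
    """Hypothetical embedding: [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]


# Pass a precomputed vector as-is.
precomputed = Content.new("doc-1", "hello world", embedding=[0.1, 0.2])

# Pass a callable; it is applied to the content string.
computed = Content.new("doc-2", "hello world", embedding=toy_embed)
```

Because `new` substitutes a fresh `{}` when `metadata` is `None`, the two instances above do not share a metadata dictionary.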

Node dataclass

Node(
    id: str,
    content: str,
    depth: int,
    embedding: list[float],
    metadata: dict[str, Any] = dict(),
    incoming_edges: set[Edge] = set(),
    outgoing_edges: set[Edge] = set(),
    extra_metadata: dict[str, Any] = dict(),
)

Represents a node in the traversal graph.

The Node class contains information about a document during graph traversal, including its depth, embedding, edges, and metadata.

PARAMETER DESCRIPTION
id

The unique identifier of the document represented by this node.

TYPE: str

content

The content of the document represented by this node.

TYPE: str

depth

The depth at which this node was discovered, measured as the number of edges traversed to reach it. This depth may not reflect the true depth in the full graph if only a subset of edges is retrieved.

TYPE: int

embedding

The embedding vector of the document, used for similarity calculations.

TYPE: list[float]

metadata

Metadata from the original document. This is a reference to the original document metadata and should not be modified directly. Any updates to metadata should be made to extra_metadata.

TYPE: dict[str, Any] DEFAULT: dict()

extra_metadata

Additional metadata to override or augment the original document metadata during traversal.

TYPE: dict[str, Any] DEFAULT: dict()
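The split between metadata and extra_metadata can be illustrated with a small sketch. It uses a reduced local stand-in for `Node` (edge fields omitted) and a hypothetical helper, `merged_metadata`, which assumes that keys in extra_metadata take precedence on conflict, as "override" suggests. How the library itself combines the two is not shown in this section.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """Reduced stand-in for the documented Node (edge fields omitted)."""

    id: str
    content: str
    depth: int
    embedding: list[float]
    metadata: dict[str, Any] = field(default_factory=dict)
    extra_metadata: dict[str, Any] = field(default_factory=dict)


def merged_metadata(node: Node) -> dict[str, Any]:
    """Hypothetical helper: extra_metadata overrides or augments metadata."""
    return {**node.metadata, **node.extra_metadata}


original = {"source": "wiki", "lang": "en"}
node = Node(id="doc-1", content="...", depth=1, embedding=[0.0], metadata=original)

# Write updates to extra_metadata, leaving the original document metadata alone.
node.extra_metadata["lang"] = "fr"
node.extra_metadata["visited"] = True

view = merged_metadata(node)
```

The original dictionary is left unchanged, which matters because `metadata` is documented as a reference to the source document's metadata.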

atraverse async

atraverse(
    query: str,
    *,
    edges: list[EdgeSpec] | EdgeFunction,
    strategy: Strategy,
    store: Adapter,
    metadata_filter: dict[str, Any] | None = None,
    initial_root_ids: Sequence[str] = (),
    store_kwargs: dict[str, Any] = {},
) -> list[Node]

Asynchronously perform a graph traversal to retrieve nodes for a specific query.

PARAMETER DESCRIPTION
query

The query string for the traversal.

TYPE: str

edges

A list of EdgeSpec for use in creating a MetadataEdgeFunction, or an EdgeFunction.

TYPE: list[EdgeSpec] | EdgeFunction

strategy

The traversal strategy that defines how nodes are discovered, selected, and finalized.

TYPE: Strategy

store

The vector store adapter used for similarity searches and document retrieval.

TYPE: Adapter

metadata_filter

Optional filter for metadata during traversal.

TYPE: dict[str, Any] | None DEFAULT: None

initial_root_ids

IDs of the initial root nodes for the traversal.

TYPE: Sequence[str] DEFAULT: ()

store_kwargs

Additional arguments passed to the store adapter.

TYPE: dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
list[Node]

Nodes returned by the traversal.

Source code in packages/graph-retriever/src/graph_retriever/traversal.py
async def atraverse(
    query: str,
    *,
    edges: list[EdgeSpec] | EdgeFunction,
    strategy: Strategy,
    store: Adapter,
    metadata_filter: dict[str, Any] | None = None,
    initial_root_ids: Sequence[str] = (),
    store_kwargs: dict[str, Any] = {},
) -> list[Node]:
    """
    Asynchronously perform a graph traversal to retrieve nodes for a specific query.

    Parameters
    ----------
    query :
        The query string for the traversal.
    edges :
        A list of [EdgeSpec][graph_retriever.edges.EdgeSpec] for use in creating a
        [MetadataEdgeFunction][graph_retriever.edges.MetadataEdgeFunction],
        or an [EdgeFunction][graph_retriever.edges.EdgeFunction].
    strategy :
        The traversal strategy that defines how nodes are discovered, selected,
        and finalized.
    store :
        The vector store adapter used for similarity searches and document
        retrieval.
    metadata_filter :
        Optional filter for metadata during traversal.
    initial_root_ids :
        IDs of the initial root nodes for the traversal.
    store_kwargs :
        Additional arguments passed to the store adapter.

    Returns
    -------
    :
        Nodes returned by the traversal.
    """
    traversal = _Traversal(
        query=query,
        edges=edges,
        strategy=copy.deepcopy(strategy),
        store=store,
        metadata_filter=metadata_filter,
        initial_root_ids=initial_root_ids,
        store_kwargs=store_kwargs,
    )
    return await traversal.atraverse()
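Since atraverse is a coroutine function, it has to be awaited, and several traversals can be scheduled concurrently. The sketch below shows that calling pattern with `asyncio.gather`. `fake_atraverse` is a hypothetical stand-in with a similar keyword-only shape. It looks up IDs in a plain dictionary and does not reproduce the library's traversal logic or its `Adapter` type.

```python
from __future__ import annotations

import asyncio


async def fake_atraverse(query: str, *, store: dict[str, list[str]]) -> list[str]:
    """Hypothetical stand-in for atraverse: returns the IDs stored for a query."""
    await asyncio.sleep(0)  # yield to the event loop, as a real store call would
    return store.get(query, [])


async def run_queries(
    queries: list[str], store: dict[str, list[str]]
) -> list[list[str]]:
    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(fake_atraverse(q, store=store) for q in queries))


store = {"cats": ["doc-1", "doc-2"], "dogs": ["doc-3"]}
results = asyncio.run(run_queries(["cats", "dogs", "birds"], store))
```

With the real function, each call would also take the `edges` and `strategy` keyword arguments documented above.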

traverse

traverse(
    query: str,
    *,
    edges: list[EdgeSpec] | EdgeFunction,
    strategy: Strategy,
    store: Adapter,
    metadata_filter: dict[str, Any] | None = None,
    initial_root_ids: Sequence[str] = (),
    store_kwargs: dict[str, Any] = {},
) -> list[Node]

Perform a graph traversal to retrieve nodes for a specific query.

PARAMETER DESCRIPTION
query

The query string for the traversal.

TYPE: str

edges

A list of EdgeSpec for use in creating a MetadataEdgeFunction, or an EdgeFunction.

TYPE: list[EdgeSpec] | EdgeFunction

strategy

The traversal strategy that defines how nodes are discovered, selected, and finalized.

TYPE: Strategy

store

The vector store adapter used for similarity searches and document retrieval.

TYPE: Adapter

metadata_filter

Optional filter for metadata during traversal.

TYPE: dict[str, Any] | None DEFAULT: None

initial_root_ids

IDs of the initial root nodes for the traversal.

TYPE: Sequence[str] DEFAULT: ()

store_kwargs

Additional arguments passed to the store adapter.

TYPE: dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
list[Node]

Nodes returned by the traversal.

Source code in packages/graph-retriever/src/graph_retriever/traversal.py
def traverse(
    query: str,
    *,
    edges: list[EdgeSpec] | EdgeFunction,
    strategy: Strategy,
    store: Adapter,
    metadata_filter: dict[str, Any] | None = None,
    initial_root_ids: Sequence[str] = (),
    store_kwargs: dict[str, Any] = {},
) -> list[Node]:
    """
    Perform a graph traversal to retrieve nodes for a specific query.

    Parameters
    ----------
    query :
        The query string for the traversal.
    edges :
        A list of [EdgeSpec][graph_retriever.edges.EdgeSpec] for use in creating a
        [MetadataEdgeFunction][graph_retriever.edges.MetadataEdgeFunction],
        or an [EdgeFunction][graph_retriever.edges.EdgeFunction].
    strategy :
        The traversal strategy that defines how nodes are discovered, selected,
        and finalized.
    store :
        The vector store adapter used for similarity searches and document
        retrieval.
    metadata_filter :
        Optional filter for metadata during traversal.
    initial_root_ids :
        IDs of the initial root nodes for the traversal.
    store_kwargs :
        Additional arguments passed to the store adapter.

    Returns
    -------
    :
        Nodes returned by the traversal.
    """
    traversal = _Traversal(
        query=query,
        edges=edges,
        strategy=copy.deepcopy(strategy),
        store=store,
        metadata_filter=metadata_filter,
        initial_root_ids=initial_root_ids,
        store_kwargs=store_kwargs,
    )
    return traversal.traverse()
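Both listings pass `copy.deepcopy(strategy)` to the traversal, so each call works on its own copy of the strategy. The sketch below illustrates the effect of that pattern under stated assumptions: `ToyStrategy` and `toy_traverse` are hypothetical stand-ins with mutable state, not the library's `Strategy` or traversal implementation.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class ToyStrategy:
    """Hypothetical strategy that accumulates state while traversing."""

    k: int = 2
    selected: list[str] = field(default_factory=list)

    def select(self, node_id: str) -> None:
        # Keep at most k node IDs.
        if len(self.selected) < self.k:
            self.selected.append(node_id)


def toy_traverse(candidates: list[str], *, strategy: ToyStrategy) -> list[str]:
    # Mirror the listing: work on a deep copy so the caller's object is untouched.
    working = copy.deepcopy(strategy)
    for node_id in candidates:
        working.select(node_id)
    return working.selected


shared = ToyStrategy(k=2)
first = toy_traverse(["a", "b", "c"], strategy=shared)
second = toy_traverse(["x", "y", "z"], strategy=shared)
```

Because the copy absorbs all mutation, the same `shared` instance can be reused across calls without state from the first traversal leaking into the second.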