graph_retriever.edges

Specification and implementation of edge functions.

These are responsible for extracting edges from nodes and expressing them in a way that the adapters can implement.

EdgeFunction module-attribute

EdgeFunction: TypeAlias = Callable[[Content], Edges]

A function for extracting edges from nodes.

Implementations should be deterministic.
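
As a rough illustration of the shape of an edge function, the sketch below defines a hypothetical `author_edges` function. The `Content`, `Edges`, and `MetadataEdge` classes here are simplified stand-ins written only so the example runs on its own; they are not the library's definitions.

```python
from dataclasses import dataclass, field
from typing import Any


# Simplified stand-ins for the library's types (assumptions, not the real classes).
@dataclass(frozen=True)
class MetadataEdge:
    incoming_field: str
    value: Any


@dataclass
class Content:
    id: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edges:
    incoming: set
    outgoing: set


def author_edges(content: Content) -> Edges:
    """Hypothetical edge function: link nodes that share an author."""
    author = content.metadata.get("author")
    edges = {MetadataEdge("author", author)} if author is not None else set()
    # The result depends only on the input, so repeated calls agree.
    return Edges(incoming=set(edges), outgoing=set(edges))


first = author_edges(Content(id="doc1", metadata={"author": "ada"}))
second = author_edges(Content(id="doc1", metadata={"author": "ada"}))
```

Calling the function twice on equal content yields equal `Edges`, which is what "deterministic" means in practice here.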

EdgeSpec module-attribute

EdgeSpec: TypeAlias = tuple[str | Id, str | Id]

The definition of an edge for traversal, given as a pair of fields naming the source and target of the edge. Each may be:

  • A string, key, indicating doc.metadata[key] as the value.
  • The placeholder Id, indicating doc.id as the value.

Examples:

url_to_href_edge          = ("url", "href")
keywords_to_keywords_edge = ("keywords", "keywords")
mentions_to_id_edge       = ("mentions", Id())
id_to_mentions_edge       = (Id(), "mentions")
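
To make the two kinds of field concrete, the following sketch shows how each side of a spec could be turned into a value for a given document. The `resolve` helper and the local `Id` class are hypothetical and exist only for this illustration.

```python
from typing import Any


class Id:
    """Stand-in for the library's Id placeholder (assumption)."""


def resolve(field: Any, doc_id: str, metadata: dict[str, Any]) -> Any:
    """Hypothetical helper: the value that one side of an edge spec refers to."""
    if isinstance(field, Id):
        return doc_id  # the placeholder selects the document's own ID
    return metadata.get(field)  # a string selects a metadata key


doc_id = "doc1"
metadata = {"url": "https://example.com/a", "mentions": ["doc2", "doc3"]}

source, target = ("mentions", Id())
source_value = resolve(source, doc_id, metadata)  # the "mentions" metadata
target_value = resolve(target, doc_id, metadata)  # the document ID
```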

Edge

Bases: ABC

An edge identifies properties necessary for finding matching nodes.

Sub-classes should be hashable.

Edges dataclass

Edges(incoming: set[Edge], outgoing: set[Edge])

Information about the incoming and outgoing edges.

PARAMETER DESCRIPTION
incoming

Incoming edges that link to this node.

TYPE: set[Edge]

outgoing

Edges that this node links to. These edges should be defined in terms of the incoming Edge they match. For instance, a link from "mentions" to "id" would link to IdEdge(...).

TYPE: set[Edge]
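
A minimal sketch of why outgoing edges are written in terms of the incoming edge they match: with hashable edges, checking for a link reduces to a set intersection. The classes below are simplified stand-ins, and the intersection check is an assumption about how a traversal might compare edges, not the library's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen dataclasses with eq are hashable
class IdEdge:
    id: str


@dataclass
class Edges:
    incoming: set
    outgoing: set


# "doc1" mentions "doc2", so its outgoing edge is the IdEdge that "doc2" accepts.
doc1 = Edges(incoming={IdEdge("doc1")}, outgoing={IdEdge("doc2")})
doc2 = Edges(incoming={IdEdge("doc2")}, outgoing=set())

doc1_links_to_doc2 = bool(doc1.outgoing & doc2.incoming)
doc2_links_to_doc1 = bool(doc2.outgoing & doc1.incoming)
```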

Id

Placeholder type indicating that the ID should be used.

IdEdge dataclass

IdEdge(id: str)

Bases: Edge

An IdEdge connects to nodes with node.id == id.

PARAMETER DESCRIPTION
id

The ID of the node to link to.

TYPE: str

MetadataEdge dataclass

MetadataEdge(incoming_field: str, value: Any)

Bases: Edge

Link to nodes with specific metadata.

A MetadataEdge connects to nodes with either:

  • node.metadata[field] == value
  • node.metadata[field] CONTAINS value (if the metadata is a collection).

PARAMETER DESCRIPTION
incoming_field

The name of the metadata field storing incoming edges.

TYPE: str

value

The value associated with the key for this edge.

TYPE: Any
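
The two matching rules can be sketched as a plain predicate over a metadata dictionary. The `matches` function is hypothetical, and treating strings as scalars rather than collections is an assumption made for this example; real adapters express the same idea in their own query language.

```python
from collections.abc import Collection
from typing import Any


def matches(metadata: dict[str, Any], field: str, value: Any) -> bool:
    """Hypothetical check mirroring the equality and containment rules."""
    if field not in metadata:
        return False
    stored = metadata[field]
    if stored == value:
        return True
    # Assumption: strings count as scalars, so "ab" does not match "abc".
    if isinstance(stored, Collection) and not isinstance(stored, (str, bytes)):
        return value in stored
    return False


equal_match = matches({"url": "https://example.com"}, "url", "https://example.com")
contains_match = matches({"keywords": ["graphs", "search"]}, "keywords", "graphs")
```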

Source code in packages/graph-retriever/src/graph_retriever/edges/_base.py
def __init__(self, incoming_field: str, value: Any) -> None:
    # `self.field = value` and `setattr(self, "field", value)` don't work
    # because the dataclass is frozen. We need to call `object.__setattr__`
    # directly (as the default `__init__` would do) to initialize the fields
    # of the frozen dataclass.
    object.__setattr__(self, "incoming_field", incoming_field)

    if isinstance(value, dict):
        value = immutabledict(value)
    object.__setattr__(self, "value", value)
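
The pattern above can be tried in isolation with the sketch below. `FrozenEdge` is a hypothetical class, and `frozenset(value.items())` is used as a standard-library stand-in for `immutabledict`, so the stored value differs from what the library keeps.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any


@dataclass(frozen=True)
class FrozenEdge:
    """Hypothetical frozen dataclass with a hand-written __init__."""

    incoming_field: str
    value: Any

    def __init__(self, incoming_field: str, value: Any) -> None:
        # Normal assignment is blocked on frozen instances, so bypass it.
        object.__setattr__(self, "incoming_field", incoming_field)
        if isinstance(value, dict):
            # Stand-in for immutabledict: a hashable snapshot of the items.
            value = frozenset(value.items())
        object.__setattr__(self, "value", value)


e1 = FrozenEdge("links", {"a": 1, "b": 2})
e2 = FrozenEdge("links", {"b": 2, "a": 1})

try:
    e1.value = "changed"  # type: ignore[misc]
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True
```

Converting a dict to an immutable value keeps the edge hashable, which is what allows it to be stored in the `incoming` and `outgoing` sets.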

MetadataEdgeFunction

MetadataEdgeFunction(edges: list[EdgeSpec])

Helper for extracting and encoding edges in metadata.

This class provides tools to extract incoming and outgoing edges from document metadata. Both incoming and outgoing edges use the same target name, enabling equality matching for keys.

PARAMETER DESCRIPTION
edges

Definitions of edges for traversal, each given as a pair of fields naming the source and target of the edge.

TYPE: list[EdgeSpec]

RAISES DESCRIPTION
ValueError

If an invalid edge definition is provided.

Source code in packages/graph-retriever/src/graph_retriever/edges/metadata.py
def __init__(
    self,
    edges: list[EdgeSpec],
) -> None:
    self.edges = edges
    for source, target in edges:
        if not isinstance(source, str | Id):
            raise ValueError(f"Expected 'str | Id' but got: {source}")
        if not isinstance(target, str | Id):
            raise ValueError(f"Expected 'str | Id' but got: {target}")
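
The validation can be exercised on its own with a sketch like the following. `validate_edge_specs` and the local `Id` class are hypothetical; the tuple form of `isinstance` is used here because the `str | Id` form requires Python 3.10 or later.

```python
from typing import Any


class Id:
    """Stand-in for the library's Id placeholder (assumption)."""


def validate_edge_specs(edges: list[tuple[Any, Any]]) -> None:
    """Hypothetical standalone version of the checks in the constructor."""
    for source, target in edges:
        for part in (source, target):
            if not isinstance(part, (str, Id)):
                raise ValueError(f"Expected 'str | Id' but got: {part}")


validate_edge_specs([("url", "href"), ("mentions", Id())])  # accepted

try:
    validate_edge_specs([("mentions", 42)])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```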

__call__

__call__(content: Content) -> Edges

Extract incoming and outgoing edges for a piece of content.

This method retrieves edges based on the declared edge definitions, taking into account whether nested metadata is used.

PARAMETER DESCRIPTION
content

The content to extract edges from.

TYPE: Content

RETURNS DESCRIPTION
Edges

The incoming and outgoing edges of the node.

Source code in packages/graph-retriever/src/graph_retriever/edges/metadata.py
def __call__(self, content: Content) -> Edges:
    """
    Extract incoming and outgoing edges for a piece of content.

    This method retrieves edges based on the declared edge definitions, taking
    into account whether nested metadata is used.

    Parameters
    ----------
    content :
        The content to extract edges from.

    Returns
    -------
    :
        the incoming and outgoing edges of the node
    """
    outgoing_edges = self._edges_from_dict(content.id, content.metadata)
    incoming_edges = self._edges_from_dict(
        content.id, content.metadata, incoming=True
    )

    return Edges(incoming=incoming_edges, outgoing=outgoing_edges)
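
The body of `_edges_from_dict` is not shown on this page. The sketch below is one guess at the behavior implied by the descriptions above (incoming and outgoing edges are both encoded under the target field), using simplified stand-in classes. It should be read as an illustration under that assumption, not as the library's implementation.

```python
from dataclasses import dataclass
from typing import Any


class Id:
    """Stand-in for the library's Id placeholder (assumption)."""


@dataclass(frozen=True)
class IdEdge:
    id: str


@dataclass(frozen=True)
class MetadataEdge:
    incoming_field: str
    value: Any


def _values(field: Any, doc_id: str, metadata: dict[str, Any]) -> list[Any]:
    """Values named by one side of a spec, always as a list."""
    if isinstance(field, Id):
        return [doc_id]
    value = metadata.get(field)
    if value is None:
        return []
    if isinstance(value, (list, tuple, set, frozenset)):
        return list(value)
    return [value]


def edges_from_dict(specs, doc_id, metadata, incoming=False):
    """Hypothetical extraction of edges for one document."""
    edges = set()
    for source, target in specs:
        # Incoming edges read the target field and outgoing edges read the
        # source field. Both are encoded under the target so they compare equal.
        for value in _values(target if incoming else source, doc_id, metadata):
            if isinstance(target, Id):
                edges.add(IdEdge(id=value))
            else:
                edges.add(MetadataEdge(incoming_field=target, value=value))
    return edges


specs = [("mentions", Id()), ("keywords", "keywords")]
doc1_metadata = {"mentions": ["doc2"], "keywords": ["graphs"]}
doc2_metadata = {"keywords": ["graphs", "search"]}

doc1_outgoing = edges_from_dict(specs, "doc1", doc1_metadata)
doc2_incoming = edges_from_dict(specs, "doc2", doc2_metadata, incoming=True)
shared = doc1_outgoing & doc2_incoming
```

In this sketch "doc1" reaches "doc2" in two ways: through the mention of its ID and through the shared keyword.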