Skip to content

graph_retriever.strategies

Strategies determine which nodes are selected during traversal.

Eager dataclass

Eager(
    _nodes: list[Node] = list(),
    *,
    k: int = 5,
    start_k: int = 4,
    adjacent_k: int = 10,
    max_depth: int | None = None,
    _query_embedding: list[float] = list(),
)

Bases: Strategy

Eager traversal strategy (breadth-first).

This strategy selects all discovered nodes at each traversal step. It ensures breadth-first traversal by processing nodes layer by layer, which is useful for scenarios where all nodes at the current depth should be explored before proceeding to the next depth.

PARAMETER DESCRIPTION
k

Maximum number of nodes to retrieve during traversal.

TYPE: int DEFAULT: 5

start_k

Number of documents to fetch via similarity for starting the traversal. Added to any initial roots provided to the traversal.

TYPE: int DEFAULT: 4

adjacent_k

Number of documents to fetch for each outgoing edge.

TYPE: int DEFAULT: 10

max_depth

Maximum traversal depth. If None, there is no limit.

TYPE: int | None DEFAULT: None

build staticmethod

build(base_strategy: Strategy, **kwargs: Any) -> Strategy

Build a strategy for a retrieval operation.

Combines a base strategy with any provided keyword arguments to create a customized traversal strategy.

PARAMETER DESCRIPTION
base_strategy

The base strategy to start with.

TYPE: Strategy

kwargs

Additional configuration options for the strategy.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Strategy

A configured strategy instance.

RAISES DESCRIPTION
ValueError

If 'strategy' is set incorrectly or extra arguments are invalid.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@staticmethod
def build(
    base_strategy: Strategy,
    **kwargs: Any,
) -> Strategy:
    """
    Build a strategy for a retrieval operation.

    Combines a base strategy with any provided keyword arguments to
    create a customized traversal strategy.

    Parameters
    ----------
    base_strategy :
        The base strategy to start with.
    kwargs :
        Additional configuration options for the strategy.

    Returns
    -------
    :
        A configured strategy instance.

    Raises
    ------
    ValueError
        If 'strategy' is set incorrectly or extra arguments are invalid.
    """
    # Check if there is a new strategy to use. Otherwise, use the base.
    strategy: Strategy
    if "strategy" in kwargs:
        if next(iter(kwargs.keys())) != "strategy":
            raise ValueError("Error: 'strategy' must be set before other args.")
        strategy = kwargs.pop("strategy")
        if not isinstance(strategy, Strategy):
            raise ValueError(
                f"Unsupported 'strategy' type {type(strategy).__name__}."
                " Must be a sub-class of Strategy"
            )
    elif base_strategy is not None:
        strategy = base_strategy
    else:
        raise ValueError("'strategy' must be set in `__init__` or invocation")

    # Apply the kwargs to update the strategy.
    assert strategy is not None
    strategy = dataclasses.replace(strategy, **kwargs)

    return strategy

discover_nodes

discover_nodes(nodes: dict[str, Node]) -> None

Add discovered nodes to the strategy.

This method updates the strategy's state with nodes discovered during the traversal process.

PARAMETER DESCRIPTION
nodes

Discovered nodes keyed by their IDs.

TYPE: dict[str, Node]

Source code in packages/graph-retriever/src/graph_retriever/strategies/eager.py
@override
def discover_nodes(self, nodes: dict[str, Node]) -> None:
    self._nodes.extend(nodes.values())

finalize_nodes

finalize_nodes(nodes: Iterable[Node]) -> Iterable[Node]

Finalize the selected nodes.

This method is called before returning the final set of nodes.

PARAMETER DESCRIPTION
nodes

Nodes selected for finalization.

TYPE: Iterable[Node]

RETURNS DESCRIPTION
Iterable[Node]

Finalized nodes.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
def finalize_nodes(self, nodes: Iterable[Node]) -> Iterable[Node]:
    """
    Finalize the selected nodes.

    This method is called before returning the final set of nodes.

    Parameters
    ----------
    nodes :
        Nodes selected for finalization.

    Returns
    -------
    :
        Finalized nodes.
    """
    return nodes

select_nodes

select_nodes(*, limit: int) -> Iterable[Node]

Select discovered nodes to visit in the next iteration.

This method determines which nodes will be traversed next. If it returns an empty list, traversal ends even if fewer than k nodes have been selected.

PARAMETER DESCRIPTION
limit

Maximum number of nodes to select.

TYPE: int

RETURNS DESCRIPTION
Iterable[Node]

Selected nodes for the next iteration. Traversal ends if this is empty.

Source code in packages/graph-retriever/src/graph_retriever/strategies/eager.py
@override
def select_nodes(self, *, limit: int) -> Iterable[Node]:
    nodes = self._nodes[:limit]
    self._nodes = []
    return nodes

Mmr dataclass

Mmr(
    lambda_mult: float = 0.5,
    min_mmr_score: float = NEG_INF,
    _selected_ids: list[str] = list(),
    _candidate_id_to_index: dict[str, int] = dict(),
    _candidates: list[_MmrCandidate] = list(),
    _best_score: float = NEG_INF,
    _best_id: str | None = None,
    *,
    k: int = 5,
    start_k: int = 4,
    adjacent_k: int = 10,
    max_depth: int | None = None,
    _query_embedding: list[float] = list(),
)

Bases: Strategy

Maximal Marginal Relevance (MMR) traversal strategy.

This strategy selects nodes by balancing relevance to the query and diversity among the results. It uses a lambda_mult parameter to control the trade-off between relevance and redundancy. Nodes are scored based on their similarity to the query and their distance from already selected nodes.

PARAMETER DESCRIPTION
k

Maximum number of nodes to retrieve during traversal.

TYPE: int DEFAULT: 5

start_k

Number of documents to fetch via similarity for starting the traversal. Added to any initial roots provided to the traversal.

TYPE: int DEFAULT: 4

adjacent_k

Number of documents to fetch for each outgoing edge.

TYPE: int DEFAULT: 10

max_depth

Maximum traversal depth. If None, there is no limit.

TYPE: int | None DEFAULT: None

lambda_mult

Controls the trade-off between relevance and diversity. A value closer to 1 prioritizes relevance, while a value closer to 0 prioritizes diversity. Must be between 0 and 1 (inclusive).

TYPE: float DEFAULT: 0.5

min_mmr_score

Only nodes with a score greater than or equal to this value will be selected.

TYPE: float DEFAULT: NEG_INF

build staticmethod

build(base_strategy: Strategy, **kwargs: Any) -> Strategy

Build a strategy for a retrieval operation.

Combines a base strategy with any provided keyword arguments to create a customized traversal strategy.

PARAMETER DESCRIPTION
base_strategy

The base strategy to start with.

TYPE: Strategy

kwargs

Additional configuration options for the strategy.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Strategy

A configured strategy instance.

RAISES DESCRIPTION
ValueError

If 'strategy' is set incorrectly or extra arguments are invalid.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@staticmethod
def build(
    base_strategy: Strategy,
    **kwargs: Any,
) -> Strategy:
    """
    Build a strategy for a retrieval operation.

    Combines a base strategy with any provided keyword arguments to
    create a customized traversal strategy.

    Parameters
    ----------
    base_strategy :
        The base strategy to start with.
    kwargs :
        Additional configuration options for the strategy.

    Returns
    -------
    :
        A configured strategy instance.

    Raises
    ------
    ValueError
        If 'strategy' is set incorrectly or extra arguments are invalid.
    """
    # Check if there is a new strategy to use. Otherwise, use the base.
    strategy: Strategy
    if "strategy" in kwargs:
        if next(iter(kwargs.keys())) != "strategy":
            raise ValueError("Error: 'strategy' must be set before other args.")
        strategy = kwargs.pop("strategy")
        if not isinstance(strategy, Strategy):
            raise ValueError(
                f"Unsupported 'strategy' type {type(strategy).__name__}."
                " Must be a sub-class of Strategy"
            )
    elif base_strategy is not None:
        strategy = base_strategy
    else:
        raise ValueError("'strategy' must be set in `__init__` or invocation")

    # Apply the kwargs to update the strategy.
    assert strategy is not None
    strategy = dataclasses.replace(strategy, **kwargs)

    return strategy

candidate_ids

candidate_ids() -> Iterable[str]

Return the IDs of the candidates.

RETURNS DESCRIPTION
Iterable[str]

The IDs of the candidates.

Source code in packages/graph-retriever/src/graph_retriever/strategies/mmr.py
def candidate_ids(self) -> Iterable[str]:
    """
    Return the IDs of the candidates.

    Returns
    -------
    Iterable[str]
        The IDs of the candidates.
    """
    return self._candidate_id_to_index.keys()

discover_nodes

discover_nodes(nodes: dict[str, Node]) -> None

Add candidates to the consideration set.

Source code in packages/graph-retriever/src/graph_retriever/strategies/mmr.py
@override
def discover_nodes(self, nodes: dict[str, Node]) -> None:
    """Add candidates to the consideration set."""
    # Determine the keys to actually include.
    # These are the candidates that aren't already selected
    # or under consideration.

    include_ids_set = set(nodes.keys())
    include_ids_set.difference_update(self._selected_ids)
    include_ids_set.difference_update(self._candidate_id_to_index.keys())
    include_ids = list(include_ids_set)

    # Now, build up a matrix of the remaining candidate embeddings.
    # And add them to the
    new_embeddings: NDArray[np.float32] = np.ndarray(
        (
            len(include_ids),
            self._dimensions,
        )
    )
    offset = self._candidate_embeddings.shape[0]
    for index, candidate_id in enumerate(include_ids):
        self._candidate_id_to_index[candidate_id] = offset + index
        new_embeddings[index] = nodes[candidate_id].embedding

    # Compute the similarity to the query.
    similarity = cosine_similarity(new_embeddings, self._nd_query_embedding)

    # Compute the distance metrics of all of pairs in the selected set with
    # the new candidates.
    redundancy = cosine_similarity(
        new_embeddings, self._already_selected_embeddings()
    )
    for index, candidate_id in enumerate(include_ids):
        max_redundancy = 0.0
        if redundancy.shape[0] > 0:
            max_redundancy = redundancy[index].max()
        candidate = _MmrCandidate(
            node=nodes[candidate_id],
            similarity=similarity[index][0],
            weighted_similarity=self.lambda_mult * similarity[index][0],
            weighted_redundancy=self._lambda_mult_complement * max_redundancy,
        )
        self._candidates.append(candidate)

        if candidate.score >= self._best_score:
            self._best_score = candidate.score
            self._best_id = candidate.node.id

    # Add the new embeddings to the candidate set.
    self._candidate_embeddings = np.vstack(
        (
            self._candidate_embeddings,
            new_embeddings,
        )
    )

finalize_nodes

finalize_nodes(nodes: Iterable[Node]) -> Iterable[Node]

Finalize the selected nodes.

This method is called before returning the final set of nodes.

PARAMETER DESCRIPTION
nodes

Nodes selected for finalization.

TYPE: Iterable[Node]

RETURNS DESCRIPTION
Iterable[Node]

Finalized nodes.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
def finalize_nodes(self, nodes: Iterable[Node]) -> Iterable[Node]:
    """
    Finalize the selected nodes.

    This method is called before returning the final set of nodes.

    Parameters
    ----------
    nodes :
        Nodes selected for finalization.

    Returns
    -------
    :
        Finalized nodes.
    """
    return nodes

select_nodes

select_nodes(*, limit: int) -> Iterable[Node]

Select and pop the best item being considered.

Updates the consideration set based on it.

RETURNS DESCRIPTION
A tuple containing the ID of the best item.
Source code in packages/graph-retriever/src/graph_retriever/strategies/mmr.py
@override
def select_nodes(self, *, limit: int) -> Iterable[Node]:
    """
    Select and pop the best item being considered.

    Updates the consideration set based on it.

    Returns
    -------
        A tuple containing the ID of the best item.
    """
    if limit == 0:
        return []
    if self._best_id is None or self._best_score < self.min_mmr_score:
        return []

    # Get the selection and remove from candidates.
    selected_id = self._best_id
    selected, selected_embedding = self._pop_candidate(selected_id)

    # Add the ID and embedding to the selected information.
    selection_index = len(self._selected_ids)
    self._selected_ids.append(selected_id)
    self._selected_embeddings[selection_index] = selected_embedding

    # Create the selected result node.
    selected_node = selected.node
    selected_node.extra_metadata = {
        "_similarity_score": selected.similarity,
        "_mmr_score": self._best_score,
    }

    # Reset the best score / best ID.
    self._best_score = NEG_INF
    self._best_id = None

    # Update the candidates redundancy, tracking the best node.
    if self._candidate_embeddings.shape[0] > 0:
        similarity = cosine_similarity(
            self._candidate_embeddings, np.expand_dims(selected_embedding, axis=0)
        )
        for index, candidate in enumerate(self._candidates):
            candidate.update_redundancy(similarity[index][0])
            if candidate.score > self._best_score:
                self._best_score = candidate.score
                self._best_id = candidate.node.id

    return [selected_node]

Scored dataclass

Scored(
    scorer: Callable[[Node], float],
    _nodes: list[_ScoredNode] = list(),
    per_iteration_limit: int | None = None,
    *,
    k: int = 5,
    start_k: int = 4,
    adjacent_k: int = 10,
    max_depth: int | None = None,
    _query_embedding: list[float] = list(),
)

Bases: Strategy

Strategy selecing nodes using a scoring function.

build staticmethod

build(base_strategy: Strategy, **kwargs: Any) -> Strategy

Build a strategy for a retrieval operation.

Combines a base strategy with any provided keyword arguments to create a customized traversal strategy.

PARAMETER DESCRIPTION
base_strategy

The base strategy to start with.

TYPE: Strategy

kwargs

Additional configuration options for the strategy.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Strategy

A configured strategy instance.

RAISES DESCRIPTION
ValueError

If 'strategy' is set incorrectly or extra arguments are invalid.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@staticmethod
def build(
    base_strategy: Strategy,
    **kwargs: Any,
) -> Strategy:
    """
    Build a strategy for a retrieval operation.

    Combines a base strategy with any provided keyword arguments to
    create a customized traversal strategy.

    Parameters
    ----------
    base_strategy :
        The base strategy to start with.
    kwargs :
        Additional configuration options for the strategy.

    Returns
    -------
    :
        A configured strategy instance.

    Raises
    ------
    ValueError
        If 'strategy' is set incorrectly or extra arguments are invalid.
    """
    # Check if there is a new strategy to use. Otherwise, use the base.
    strategy: Strategy
    if "strategy" in kwargs:
        if next(iter(kwargs.keys())) != "strategy":
            raise ValueError("Error: 'strategy' must be set before other args.")
        strategy = kwargs.pop("strategy")
        if not isinstance(strategy, Strategy):
            raise ValueError(
                f"Unsupported 'strategy' type {type(strategy).__name__}."
                " Must be a sub-class of Strategy"
            )
    elif base_strategy is not None:
        strategy = base_strategy
    else:
        raise ValueError("'strategy' must be set in `__init__` or invocation")

    # Apply the kwargs to update the strategy.
    assert strategy is not None
    strategy = dataclasses.replace(strategy, **kwargs)

    return strategy

discover_nodes

discover_nodes(nodes: dict[str, Node]) -> None

Add discovered nodes to the strategy.

This method updates the strategy's state with nodes discovered during the traversal process.

PARAMETER DESCRIPTION
nodes

Discovered nodes keyed by their IDs.

TYPE: dict[str, Node]

Source code in packages/graph-retriever/src/graph_retriever/strategies/scored.py
@override
def discover_nodes(self, nodes: dict[str, Node]) -> None:
    for node in nodes.values():
        heapq.heappush(self._nodes, _ScoredNode(self.scorer(node), node))

finalize_nodes

finalize_nodes(nodes: Iterable[Node]) -> Iterable[Node]

Finalize the selected nodes.

This method is called before returning the final set of nodes.

PARAMETER DESCRIPTION
nodes

Nodes selected for finalization.

TYPE: Iterable[Node]

RETURNS DESCRIPTION
Iterable[Node]

Finalized nodes.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
def finalize_nodes(self, nodes: Iterable[Node]) -> Iterable[Node]:
    """
    Finalize the selected nodes.

    This method is called before returning the final set of nodes.

    Parameters
    ----------
    nodes :
        Nodes selected for finalization.

    Returns
    -------
    :
        Finalized nodes.
    """
    return nodes

select_nodes

select_nodes(*, limit: int) -> Iterable[Node]

Select discovered nodes to visit in the next iteration.

This method determines which nodes will be traversed next. If it returns an empty list, traversal ends even if fewer than k nodes have been selected.

PARAMETER DESCRIPTION
limit

Maximum number of nodes to select.

TYPE: int

RETURNS DESCRIPTION
Iterable[Node]

Selected nodes for the next iteration. Traversal ends if this is empty.

Source code in packages/graph-retriever/src/graph_retriever/strategies/scored.py
@override
def select_nodes(self, *, limit: int) -> Iterable[Node]:
    if self.per_iteration_limit and self.per_iteration_limit < limit:
        limit = self.per_iteration_limit

    selected = []
    for _x in range(limit):
        if not self._nodes:
            break

        selected.append(heapq.heappop(self._nodes).node)
    return selected

Strategy dataclass

Strategy(
    *,
    k: int = 5,
    start_k: int = 4,
    adjacent_k: int = 10,
    max_depth: int | None = None,
    _query_embedding: list[float] = list(),
)

Bases: ABC

Interface for configuring node selection and traversal strategies.

This base class defines how nodes are selected, traversed, and finalized during a graph traversal. Implementations can customize behaviors like limiting the depth of traversal, scoring nodes, or selecting the next set of nodes for exploration.

PARAMETER DESCRIPTION
k

Maximum number of nodes to retrieve during traversal.

TYPE: int DEFAULT: 5

start_k

Number of documents to fetch via similarity for starting the traversal. Added to any initial roots provided to the traversal.

TYPE: int DEFAULT: 4

adjacent_k

Number of documents to fetch for each outgoing edge.

TYPE: int DEFAULT: 10

max_depth

Maximum traversal depth. If None, there is no limit.

TYPE: int | None DEFAULT: None

build staticmethod

build(base_strategy: Strategy, **kwargs: Any) -> Strategy

Build a strategy for a retrieval operation.

Combines a base strategy with any provided keyword arguments to create a customized traversal strategy.

PARAMETER DESCRIPTION
base_strategy

The base strategy to start with.

TYPE: Strategy

kwargs

Additional configuration options for the strategy.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Strategy

A configured strategy instance.

RAISES DESCRIPTION
ValueError

If 'strategy' is set incorrectly or extra arguments are invalid.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@staticmethod
def build(
    base_strategy: Strategy,
    **kwargs: Any,
) -> Strategy:
    """
    Build a strategy for a retrieval operation.

    Combines a base strategy with any provided keyword arguments to
    create a customized traversal strategy.

    Parameters
    ----------
    base_strategy :
        The base strategy to start with.
    kwargs :
        Additional configuration options for the strategy.

    Returns
    -------
    :
        A configured strategy instance.

    Raises
    ------
    ValueError
        If 'strategy' is set incorrectly or extra arguments are invalid.
    """
    # Check if there is a new strategy to use. Otherwise, use the base.
    strategy: Strategy
    if "strategy" in kwargs:
        if next(iter(kwargs.keys())) != "strategy":
            raise ValueError("Error: 'strategy' must be set before other args.")
        strategy = kwargs.pop("strategy")
        if not isinstance(strategy, Strategy):
            raise ValueError(
                f"Unsupported 'strategy' type {type(strategy).__name__}."
                " Must be a sub-class of Strategy"
            )
    elif base_strategy is not None:
        strategy = base_strategy
    else:
        raise ValueError("'strategy' must be set in `__init__` or invocation")

    # Apply the kwargs to update the strategy.
    assert strategy is not None
    strategy = dataclasses.replace(strategy, **kwargs)

    return strategy

discover_nodes abstractmethod

discover_nodes(nodes: dict[str, Node]) -> None

Add discovered nodes to the strategy.

This method updates the strategy's state with nodes discovered during the traversal process.

PARAMETER DESCRIPTION
nodes

Discovered nodes keyed by their IDs.

TYPE: dict[str, Node]

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@abc.abstractmethod
def discover_nodes(self, nodes: dict[str, Node]) -> None:
    """
    Add discovered nodes to the strategy.

    This method updates the strategy's state with nodes discovered during
    the traversal process.

    Parameters
    ----------
    nodes :
        Discovered nodes keyed by their IDs.
    """
    ...

finalize_nodes

finalize_nodes(nodes: Iterable[Node]) -> Iterable[Node]

Finalize the selected nodes.

This method is called before returning the final set of nodes.

PARAMETER DESCRIPTION
nodes

Nodes selected for finalization.

TYPE: Iterable[Node]

RETURNS DESCRIPTION
Iterable[Node]

Finalized nodes.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
def finalize_nodes(self, nodes: Iterable[Node]) -> Iterable[Node]:
    """
    Finalize the selected nodes.

    This method is called before returning the final set of nodes.

    Parameters
    ----------
    nodes :
        Nodes selected for finalization.

    Returns
    -------
    :
        Finalized nodes.
    """
    return nodes

select_nodes abstractmethod

select_nodes(*, limit: int) -> Iterable[Node]

Select discovered nodes to visit in the next iteration.

This method determines which nodes will be traversed next. If it returns an empty list, traversal ends even if fewer than k nodes have been selected.

PARAMETER DESCRIPTION
limit

Maximum number of nodes to select.

TYPE: int

RETURNS DESCRIPTION
Iterable[Node]

Selected nodes for the next iteration. Traversal ends if this is empty.

Source code in packages/graph-retriever/src/graph_retriever/strategies/base.py
@abc.abstractmethod
def select_nodes(self, *, limit: int) -> Iterable[Node]:
    """
    Select discovered nodes to visit in the next iteration.

    This method determines which nodes will be traversed next. If it returns
    an empty list, traversal ends even if fewer than `k` nodes have been selected.

    Parameters
    ----------
    limit :
        Maximum number of nodes to select.

    Returns
    -------
    :
        Selected nodes for the next iteration. Traversal ends if this is empty.
    """
    ...