
Response Models

The response module provides data models and parsing functions for LLM graph generation responses. It handles both edge list and adjacency matrix formats with robust JSON extraction.

Import Pattern

from causaliq_knowledge.graph import (
    ProposedEdge,
    GeneratedGraph,
    GenerationMetadata,
    parse_graph_response,
)

Data Models

ProposedEdge

A Pydantic model representing a single proposed causal edge from an LLM.

from causaliq_knowledge.graph import ProposedEdge

edge = ProposedEdge(
    source="smoking",
    target="lung_cancer",
    confidence=0.95,
    reasoning="Well-established causal relationship from epidemiological studies"
)

Attributes:

  • source (str): Name of the source (cause) variable
  • target (str): Name of the target (effect) variable
  • confidence (float): Confidence score between 0.0 and 1.0
  • reasoning (str, optional): LLM's reasoning for proposing this edge

GenerationMetadata

A dataclass containing metadata about the graph generation process.

from causaliq_knowledge.graph import GenerationMetadata

metadata = GenerationMetadata(
    model="gemini-2.0-flash",
    provider="gemini",
    input_tokens=450,
    output_tokens=320,
)

Attributes (key fields; see the API Reference below for the full list):

  • model (str): Name of the LLM model used
  • provider (str): LLM provider (e.g., "groq", "gemini")
  • input_tokens (int): Number of input tokens used
  • output_tokens (int): Number of output tokens generated
  • from_cache (bool): Whether the response was served from the cache
  • temperature (float): Sampling temperature used

GeneratedGraph

A dataclass representing a complete generated causal graph.

from causaliq_knowledge.graph import GeneratedGraph, ProposedEdge

graph = GeneratedGraph(
    edges=[
        ProposedEdge(source="A", target="B", confidence=0.9),
        ProposedEdge(source="B", target="C", confidence=0.85),
    ],
    variables=["A", "B", "C"],
    reasoning="Based on temporal ordering and domain knowledge...",
    metadata=None
)

Attributes:

  • edges (list[ProposedEdge]): List of proposed causal edges
  • variables (list[str]): List of variable names in the graph
  • reasoning (str, optional): Overall reasoning for the graph structure
  • metadata (GenerationMetadata, optional): Generation metadata

Parsing Functions

parse_graph_response

Parse an LLM response string into a GeneratedGraph object.

from causaliq_knowledge.graph import parse_graph_response

response_text = '''```json
{
    "edges": [
        {"source": "A", "target": "B", "confidence": 0.9},
        {"source": "B", "target": "C", "confidence": 0.85}
    ],
    "reasoning": "Based on causal principles..."
}
```'''

graph = parse_graph_response(
    response_text=response_text,
    variables=["A", "B", "C"],
    output_format="edge_list"
)

Parameters:

  • response_text (str): Raw LLM response text (may include markdown)
  • variables (list[str]): Expected variable names for validation
  • output_format (str): Expected format ("edge_list" or "adjacency_matrix")

Returns: GeneratedGraph object

Raises: ValueError if JSON parsing fails or format is invalid
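
The parser tolerates JSON wrapped in markdown code fences. A minimal sketch of that extraction step, assuming a regex-based approach (extract_json is an illustrative helper, not part of the package):

```python
import json
import re

def extract_json(response_text: str) -> dict:
    """Pull a JSON object out of an LLM reply that may wrap it in ```json fences."""
    # Prefer the contents of a fenced code block if one is present;
    # otherwise treat the whole reply as JSON.
    match = re.search(r"```(?:json)?\s*(.*?)```", response_text, re.DOTALL)
    payload = match.group(1) if match else response_text
    return json.loads(payload)

reply = '```json\n{"edges": [{"source": "A", "target": "B", "confidence": 0.9}]}\n```'
data = extract_json(reply)
# data["edges"][0] == {"source": "A", "target": "B", "confidence": 0.9}
```

Wrapping the call to parse_graph_response in a try/except ValueError lets callers recover from malformed LLM output.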

Response Formats

The module supports two response formats:

Edge List Format

{
    "edges": [
        {
            "source": "variable_a",
            "target": "variable_b",
            "confidence": 0.9,
            "reasoning": "Optional per-edge reasoning"
        }
    ],
    "reasoning": "Overall graph reasoning"
}

Adjacency Matrix Format

{
    "variables": ["A", "B", "C"],
    "adjacency_matrix": [
        [0.0, 0.9, 0.0],
        [0.0, 0.0, 0.85],
        [0.0, 0.0, 0.0]
    ],
    "reasoning": "Overall graph reasoning"
}

Values in the adjacency matrix represent confidence scores. A value at position [i][j] indicates an edge from variables[i] to variables[j].
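
This mapping can be sketched directly; nonzero entries become directed edges (matrix_to_edges is an illustrative helper, not part of the package):

```python
def matrix_to_edges(variables, matrix, threshold=0.0):
    """Turn an adjacency matrix of confidences into (source, target, confidence) tuples."""
    edges = []
    for i, source in enumerate(variables):
        for j, target in enumerate(variables):
            # Entry [i][j] is the confidence of an edge variables[i] -> variables[j].
            if matrix[i][j] > threshold:
                edges.append((source, target, matrix[i][j]))
    return edges

variables = ["A", "B", "C"]
matrix = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.85],
    [0.0, 0.0, 0.0],
]
edges = matrix_to_edges(variables, matrix)
# edges == [("A", "B", 0.9), ("B", "C", 0.85)]
```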

API Reference

Response models and parsing for LLM graph generation.

This module provides Pydantic models for representing LLM-generated causal graphs and functions for parsing LLM responses in both edge list and adjacency matrix formats.

Classes:

  • ProposedEdge – A proposed causal edge from LLM graph generation.
  • GenerationMetadata – Metadata about a graph generation request.
  • GeneratedGraph – A complete causal graph generated by an LLM.

Functions:

  • parse_graph_response – Parse an LLM response into a GeneratedGraph.

ProposedEdge

A proposed causal edge from LLM graph generation.

Represents a single directed edge in the proposed causal graph, with confidence score and optional reasoning.

Attributes:

  • source (str) –

    The name of the source variable (cause).

  • target (str) –

    The name of the target variable (effect).

  • confidence (float) –

    Confidence score from 0.0 to 1.0.

  • reasoning (Optional[str]) –

    Optional explanation for this specific edge.

Example

>>> edge = ProposedEdge(
...     source="smoking",
...     target="lung_cancer",
...     confidence=0.95,
... )
>>> print(f"{edge.source} -> {edge.target}: {edge.confidence}")
smoking -> lung_cancer: 0.95

Methods:

clamp_confidence classmethod

clamp_confidence(v: Any) -> float

Clamp confidence values to [0.0, 1.0] range.
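
A minimal sketch of what such clamping typically does (the package's validator may differ in detail, e.g. in how non-numeric input is handled):

```python
def clamp_confidence(v) -> float:
    """Coerce to float and clamp into the [0.0, 1.0] range."""
    return max(0.0, min(1.0, float(v)))

# Out-of-range values are pulled back to the nearest bound.
print(clamp_confidence(1.7))   # 1.0
print(clamp_confidence(-0.2))  # 0.0
print(clamp_confidence(0.95))  # 0.95
```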

GenerationMetadata dataclass

GenerationMetadata(
    model: str,
    provider: str = "",
    timestamp: datetime = field(default_factory=lambda: datetime.now(utc)),
    llm_timestamp: datetime = field(default_factory=lambda: datetime.now(utc)),
    llm_latency_ms: int = 0,
    input_tokens: int = 0,
    output_tokens: int = 0,
    from_cache: bool = False,
    messages: List[Dict[str, Any]] = field(default_factory=list),
    temperature: float = 0.1,
    max_tokens: int = 2000,
    finish_reason: str = "stop",
    llm_cost_usd: float = 0.0,
)

Metadata about a graph generation request.

Attributes:

  • model (str) –

    The LLM model used for generation.

  • provider (str) –

    The LLM provider (e.g., "groq", "gemini").

  • timestamp (datetime) –

    When this request was made (current request time).

  • llm_timestamp (datetime) –

    When the LLM originally responded (may differ from timestamp if the response came from cache).

  • llm_latency_ms (int) –

    Original LLM response latency in milliseconds.

  • input_tokens (int) –

    Number of input tokens used.

  • output_tokens (int) –

    Number of output tokens generated.

  • from_cache (bool) –

    Whether the response was from cache.

  • messages (List[Dict[str, Any]]) –

    The messages sent to the LLM.

  • temperature (float) –

    Sampling temperature used.

  • max_tokens (int) –

    Maximum tokens requested.

  • finish_reason (str) –

    Why generation stopped (stop, length, etc.).

  • llm_cost_usd (float) –

    Cost when the LLM request was originally made.

Methods:

  • to_dict

    Convert generation metadata to a dictionary.

initial_cost_usd property

initial_cost_usd: float

Alias for llm_cost_usd (backward compatibility).

latency_ms property

latency_ms: int

Alias for llm_latency_ms (backward compatibility).

to_dict

to_dict() -> Dict[str, Any]

Convert generation metadata to a dictionary.

Returns a dictionary suitable for JSON serialisation, containing all generation provenance information.

Returns:

  • Dict[str, Any]

    Dictionary with all metadata fields.

GeneratedGraph dataclass

GeneratedGraph(
    edges: List[ProposedEdge],
    variables: List[str],
    reasoning: str = "",
    metadata: Optional[GenerationMetadata] = None,
    raw_response: Optional[Dict[str, Any]] = None,
)

A complete causal graph generated by an LLM.

Represents the full output from an LLM graph generation query, including all proposed edges, metadata, and the LLM's reasoning.

Attributes:

  • edges (List[ProposedEdge]) –

    List of proposed causal edges.

  • variables (List[str]) –

    List of variable names in the graph.

  • reasoning (str) –

    Overall reasoning provided by the LLM.

  • metadata (Optional[GenerationMetadata]) –

    Generation metadata (model, timing, etc.).

  • raw_response (Optional[Dict[str, Any]]) –

    The original LLM response for debugging.

Example

>>> edge1 = ProposedEdge(
...     source="age", target="income", confidence=0.7
... )
>>> edge2 = ProposedEdge(
...     source="education", target="income", confidence=0.9
... )
>>> graph = GeneratedGraph(
...     edges=[edge1, edge2],
...     variables=["age", "education", "income"],
...     reasoning="Age and education both influence income.",
...     metadata=GenerationMetadata(model="llama-3.1-8b-instant"),
... )
>>> print(f"Generated {len(graph.edges)} edges")
Generated 2 edges

Methods:

filter_by_confidence

filter_by_confidence(threshold: float = 0.5) -> 'GeneratedGraph'

Return a new graph with only edges above the threshold.

Parameters:

  • threshold

    (float, default: 0.5) –

    Minimum confidence score to include.

Returns:

  • 'GeneratedGraph'

    New GeneratedGraph with filtered edges.
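
The filtering step itself reduces to a comprehension; a standalone sketch over plain (source, target, confidence) tuples (the package's method instead returns a new GeneratedGraph):

```python
def filter_edges(edges, threshold=0.5):
    """Keep only edges whose confidence meets the threshold (assumed inclusive)."""
    return [e for e in edges if e[2] >= threshold]

edges = [("age", "income", 0.7), ("noise", "income", 0.3)]
print(filter_edges(edges))  # [('age', 'income', 0.7)]
```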

get_adjacency_matrix

get_adjacency_matrix() -> List[List[float]]

Convert edges to an adjacency matrix.

Creates a square matrix where entry (i,j) represents the confidence that variable i causes variable j.

Returns:

  • List[List[float]]

    Square matrix of confidence scores.
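
The conversion can be sketched independently of the package's classes, again using plain (source, target, confidence) tuples (edges_to_matrix is an illustrative helper):

```python
def edges_to_matrix(variables, edges):
    """Build a square confidence matrix: entry (i, j) is the confidence
    that variables[i] causes variables[j]."""
    index = {name: i for i, name in enumerate(variables)}
    n = len(variables)
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, confidence in edges:
        matrix[index[source]][index[target]] = confidence
    return matrix

matrix = edges_to_matrix(["A", "B", "C"], [("A", "B", 0.9), ("B", "C", 0.85)])
# matrix == [[0.0, 0.9, 0.0], [0.0, 0.0, 0.85], [0.0, 0.0, 0.0]]
```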

get_edge_list

get_edge_list() -> List[tuple[str, str, float]]

Get edges as a list of tuples.

Returns:

  • List[tuple[str, str, float]]

    List of (source, target, confidence) tuples.

parse_graph_response

parse_graph_response(
    response_text: str, variables: List[str], output_format: str = "edge_list"
) -> GeneratedGraph

Parse an LLM response into a GeneratedGraph.

Handles JSON extraction from markdown code blocks and parses according to the specified output format.

Parameters:

  • response_text

    (str) –

    Raw text response from the LLM.

  • variables

    (List[str]) –

    List of valid variable names.

  • output_format

    (str, default: 'edge_list') –

    Expected format ("edge_list" or "adjacency_matrix").

Returns:

  • GeneratedGraph

    Parsed graph with proposed edges and any overall reasoning.

Raises:

  • ValueError

    If JSON parsing fails or format is invalid.