
Response Models

The response module provides data models and parsing functions for LLM graph generation responses. It handles both edge list and adjacency matrix formats with robust JSON extraction.

Import Pattern

from causaliq_knowledge.graph import (
    ProposedEdge,
    GeneratedGraph,
    GenerationMetadata,
    parse_graph_response,
)

Data Models

ProposedEdge

A Pydantic model representing a single proposed causal edge from an LLM.

from causaliq_knowledge.graph import ProposedEdge

edge = ProposedEdge(
    source="smoking",
    target="lung_cancer",
    confidence=0.95,
    reasoning="Well-established causal relationship from epidemiological studies"
)

Attributes:

  • source (str): Name of the source (cause) variable
  • target (str): Name of the target (effect) variable
  • confidence (float): Confidence score between 0.0 and 1.0
  • reasoning (str, optional): LLM's reasoning for proposing this edge

GenerationMetadata

A dataclass containing metadata about the graph generation process.

from causaliq_knowledge.graph import GenerationMetadata

metadata = GenerationMetadata(
    model="gemini-2.0-flash",
    provider="gemini",
    input_tokens=450,
    output_tokens=320,
)

Attributes (key fields; see the API Reference below for the full list):

  • model (str): Name of the LLM model used
  • provider (str): LLM provider (e.g., "groq", "gemini")
  • input_tokens (int): Number of input tokens used
  • output_tokens (int): Number of output tokens generated
  • from_cache (bool): Whether the response was served from the cache
  • temperature (float): Sampling temperature used

GeneratedGraph

A dataclass representing a complete generated causal graph.

from causaliq_knowledge.graph import GeneratedGraph, ProposedEdge

graph = GeneratedGraph(
    edges=[
        ProposedEdge(source="A", target="B", confidence=0.9),
        ProposedEdge(source="B", target="C", confidence=0.85),
    ],
    variables=["A", "B", "C"],
    reasoning="Based on temporal ordering and domain knowledge...",
    metadata=None
)

Attributes:

  • edges (list[ProposedEdge]): List of proposed causal edges
  • variables (list[str]): List of variable names in the graph
  • reasoning (str, optional): Overall reasoning for the graph structure
  • metadata (GenerationMetadata, optional): Generation metadata

Parsing Functions

parse_graph_response

Parse an LLM response string into a GeneratedGraph object.

from causaliq_knowledge.graph import parse_graph_response

response_text = '''```json
{
    "edges": [
        {"source": "A", "target": "B", "confidence": 0.9},
        {"source": "B", "target": "C", "confidence": 0.85}
    ],
    "reasoning": "Based on causal principles..."
}
```'''

graph = parse_graph_response(
    response_text=response_text,
    variables=["A", "B", "C"],
    output_format="edge_list"
)

Parameters:

  • response_text (str): Raw LLM response text (may include markdown)
  • variables (list[str]): Expected variable names for validation
  • output_format (str): Expected format ("edge_list" or "adjacency_matrix")

Returns: GeneratedGraph object

Raises: ValueError if JSON parsing fails or format is invalid
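
The parser tolerates JSON wrapped in markdown code fences. A minimal sketch of that extraction step, assuming a regex-based approach (extract_json is an illustrative helper, not part of the package):

```python
import json
import re

def extract_json(response_text: str) -> dict:
    """Pull a JSON object out of an LLM reply that may wrap it in ```json fences."""
    # Prefer the contents of a fenced code block if one is present;
    # otherwise treat the whole reply as JSON.
    match = re.search(r"```(?:json)?\s*(.*?)```", response_text, re.DOTALL)
    payload = match.group(1) if match else response_text
    return json.loads(payload)

reply = '```json\n{"edges": [{"source": "A", "target": "B", "confidence": 0.9}]}\n```'
data = extract_json(reply)
# data["edges"][0] == {"source": "A", "target": "B", "confidence": 0.9}
```

Wrapping the call to parse_graph_response in a try/except ValueError lets callers recover from malformed LLM output.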

Response Formats

The module supports two response formats:

Edge List Format

{
    "edges": [
        {
            "source": "variable_a",
            "target": "variable_b",
            "confidence": 0.9,
            "reasoning": "Optional per-edge reasoning"
        }
    ],
    "reasoning": "Overall graph reasoning"
}

Adjacency Matrix Format

{
    "variables": ["A", "B", "C"],
    "adjacency_matrix": [
        [0.0, 0.9, 0.0],
        [0.0, 0.0, 0.85],
        [0.0, 0.0, 0.0]
    ],
    "reasoning": "Overall graph reasoning"
}

Values in the adjacency matrix represent confidence scores. A value at position [i][j] indicates an edge from variables[i] to variables[j].
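
This mapping can be sketched directly; nonzero entries become directed edges (matrix_to_edges is an illustrative helper, not part of the package):

```python
def matrix_to_edges(variables, matrix, threshold=0.0):
    """Turn an adjacency matrix of confidences into (source, target, confidence) tuples."""
    edges = []
    for i, source in enumerate(variables):
        for j, target in enumerate(variables):
            # Entry [i][j] is the confidence of an edge variables[i] -> variables[j].
            if matrix[i][j] > threshold:
                edges.append((source, target, matrix[i][j]))
    return edges

variables = ["A", "B", "C"]
matrix = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.85],
    [0.0, 0.0, 0.0],
]
edges = matrix_to_edges(variables, matrix)
# edges == [("A", "B", 0.9), ("B", "C", 0.85)]
```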

API Reference

Response models and parsing for LLM graph generation.

This module provides Pydantic models for representing LLM-generated causal graphs and functions for parsing LLM responses in both edge list and adjacency matrix formats.

Classes:

  • ProposedEdge – A proposed causal edge from LLM graph generation.
  • GenerationMetadata – Metadata about a graph generation request.
  • GeneratedGraph – A complete causal graph generated by an LLM.

Functions:

  • parse_graph_response – Parse an LLM response into a GeneratedGraph.

ProposedEdge

A proposed causal edge from LLM graph generation.

Represents a single directed edge in the proposed causal graph, with confidence score and optional reasoning.

Attributes:

  • source (str) –

    The name of the source variable (cause).

  • target (str) –

    The name of the target variable (effect).

  • confidence (float) –

    Confidence score from 0.0 to 1.0.

  • reasoning (Optional[str]) –

    Optional explanation for this specific edge.

Example

>>> edge = ProposedEdge(
...     source="smoking",
...     target="lung_cancer",
...     confidence=0.95,
... )
>>> print(f"{edge.source} -> {edge.target}: {edge.confidence}")
smoking -> lung_cancer: 0.95

Methods:

clamp_confidence classmethod

clamp_confidence(v: Any) -> float

Clamp confidence values to [0.0, 1.0] range.
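
A minimal sketch of what such clamping typically does (the package's validator may differ in detail, e.g. in how non-numeric input is handled):

```python
def clamp_confidence(v) -> float:
    """Coerce to float and clamp into the [0.0, 1.0] range."""
    return max(0.0, min(1.0, float(v)))

# Out-of-range values are pulled back to the nearest bound.
print(clamp_confidence(1.7))   # 1.0
print(clamp_confidence(-0.2))  # 0.0
print(clamp_confidence(0.95))  # 0.95
```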

GenerationMetadata dataclass

GenerationMetadata(
    model: str,
    provider: str = "",
    timestamp: datetime = field(default_factory=lambda: datetime.now(utc)),
    llm_timestamp: datetime = field(default_factory=lambda: datetime.now(utc)),
    llm_latency_ms: int = 0,
    input_tokens: int = 0,
    output_tokens: int = 0,
    from_cache: bool = False,
    messages: List[Dict[str, Any]] = field(default_factory=list),
    temperature: float = 0.1,
    max_tokens: int = 2000,
    finish_reason: str = "stop",
    llm_cost_usd: float = 0.0,
)

Metadata about a graph generation request.

Attributes:

  • model (str) –

    The LLM model used for generation.

  • provider (str) –

    The LLM provider (e.g., "groq", "gemini").

  • timestamp (datetime) –

    When this request was made (current request time).

  • llm_timestamp (datetime) –

    When the LLM originally responded (may differ from timestamp if the response came from cache).

  • llm_latency_ms (int) –

    Original LLM response latency in milliseconds.

  • input_tokens (int) –

    Number of input tokens used.

  • output_tokens (int) –

    Number of output tokens generated.

  • from_cache (bool) –

    Whether the response was from cache.

  • messages (List[Dict[str, Any]]) –

    The messages sent to the LLM.

  • temperature (float) –

    Sampling temperature used.

  • max_tokens (int) –

    Maximum tokens requested.

  • finish_reason (str) –

    Why generation stopped (stop, length, etc.).

  • llm_cost_usd (float) –

    Cost when the LLM request was originally made.

Methods:

  • to_dict

    Convert generation metadata to a dictionary.

initial_cost_usd property

initial_cost_usd: float

Alias for llm_cost_usd (backward compatibility).

latency_ms property

latency_ms: int

Alias for llm_latency_ms (backward compatibility).

to_dict

to_dict() -> Dict[str, Any]

Convert generation metadata to a dictionary.

Returns a dictionary suitable for JSON serialisation, containing all generation provenance information.

Returns:

  • Dict[str, Any]

    Dictionary with all metadata fields.

GeneratedGraph dataclass

GeneratedGraph(
    edges: List[ProposedEdge],
    variables: List[str],
    reasoning: str = "",
    metadata: Optional[GenerationMetadata] = None,
    raw_response: Optional[Dict[str, Any]] = None,
)

A complete causal graph generated by an LLM.

Represents the full output from an LLM graph generation query, including all proposed edges, metadata, and the LLM's reasoning.

Attributes:

  • edges (List[ProposedEdge]) –

    List of proposed causal edges.

  • variables (List[str]) –

    List of variable names in the graph.

  • reasoning (str) –

    Overall reasoning provided by the LLM.

  • metadata (Optional[GenerationMetadata]) –

    Generation metadata (model, timing, etc.).

  • raw_response (Optional[Dict[str, Any]]) –

    The original LLM response for debugging.

Example

>>> edge1 = ProposedEdge(
...     source="age", target="income", confidence=0.7
... )
>>> edge2 = ProposedEdge(
...     source="education", target="income", confidence=0.9
... )
>>> graph = GeneratedGraph(
...     edges=[edge1, edge2],
...     variables=["age", "education", "income"],
...     reasoning="Age and education both influence income.",
...     metadata=GenerationMetadata(model="llama-3.1-8b-instant"),
... )
>>> print(f"Generated {len(graph.edges)} edges")
Generated 2 edges

Methods:

filter_by_confidence

filter_by_confidence(threshold: float = 0.5) -> 'GeneratedGraph'

Return a new graph with only edges above the threshold.

Parameters:

  • threshold

    (float, default: 0.5) –

    Minimum confidence score to include.

Returns:

  • 'GeneratedGraph'

    New GeneratedGraph with filtered edges.
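
The filtering step itself reduces to a comprehension; a standalone sketch over plain (source, target, confidence) tuples (the package's method instead returns a new GeneratedGraph):

```python
def filter_edges(edges, threshold=0.5):
    """Keep only edges whose confidence meets the threshold (assumed inclusive)."""
    return [e for e in edges if e[2] >= threshold]

edges = [("age", "income", 0.7), ("noise", "income", 0.3)]
print(filter_edges(edges))  # [('age', 'income', 0.7)]
```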

get_adjacency_matrix

get_adjacency_matrix() -> List[List[float]]

Convert edges to an adjacency matrix.

Creates a square matrix where entry (i,j) represents the confidence that variable i causes variable j.

Returns:

  • List[List[float]]

    Square matrix of confidence scores.
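
The conversion can be sketched independently of the package's classes, again using plain (source, target, confidence) tuples (edges_to_matrix is an illustrative helper):

```python
def edges_to_matrix(variables, edges):
    """Build a square confidence matrix: entry (i, j) is the confidence
    that variables[i] causes variables[j]."""
    index = {name: i for i, name in enumerate(variables)}
    n = len(variables)
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, confidence in edges:
        matrix[index[source]][index[target]] = confidence
    return matrix

matrix = edges_to_matrix(["A", "B", "C"], [("A", "B", 0.9), ("B", "C", 0.85)])
# matrix == [[0.0, 0.9, 0.0], [0.0, 0.0, 0.85], [0.0, 0.0, 0.0]]
```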

get_edge_list

get_edge_list() -> List[tuple[str, str, float]]

Get edges as a list of tuples.

Returns:

  • List[tuple[str, str, float]]

    List of (source, target, confidence) tuples.

parse_graph_response

parse_graph_response(
    response_text: str, variables: List[str], output_format: str = "edge_list"
) -> GeneratedGraph

Parse an LLM response into a GeneratedGraph.

Handles JSON extraction from markdown code blocks and parses according to the specified output format.

Parameters:

  • response_text

    (str) –

    Raw text response from the LLM.

  • variables

    (List[str]) –

    List of valid variable names.

  • output_format

    (str, default: 'edge_list') –

    Expected format ("edge_list" or "adjacency_matrix").

Returns:

  • GeneratedGraph

    Parsed graph with proposed edges and any overall reasoning.

Raises:

  • ValueError

    If JSON parsing fails or format is invalid.