Response Models¶
The response module provides data models and parsing functions for LLM graph
generation responses. It handles both edge list and adjacency matrix formats
with robust JSON extraction.
Import Pattern¶
```python
from causaliq_knowledge.graph import (
    ProposedEdge,
    GeneratedGraph,
    GenerationMetadata,
    parse_graph_response,
)
```
Data Models¶
ProposedEdge¶
A Pydantic model representing a single proposed causal edge from an LLM.
```python
from causaliq_knowledge.graph import ProposedEdge

edge = ProposedEdge(
    source="smoking",
    target="lung_cancer",
    confidence=0.95,
    reasoning="Well-established causal relationship from epidemiological studies",
)
```
Attributes:
- `source` (str): Name of the source (cause) variable
- `target` (str): Name of the target (effect) variable
- `confidence` (float): Confidence score between 0.0 and 1.0
- `reasoning` (str, optional): LLM's reasoning for proposing this edge
GenerationMetadata¶
A dataclass containing metadata about the graph generation process.
```python
from causaliq_knowledge.graph import GenerationMetadata

metadata = GenerationMetadata(
    model="gemini-2.0-flash",
    provider="gemini",
    input_tokens=450,
    output_tokens=320,
)
```
Attributes:
- `model` (str): Name of the LLM model used
- `provider` (str): LLM provider (e.g., "groq", "gemini")
- `input_tokens` (int): Number of tokens in the prompt
- `output_tokens` (int): Number of tokens in the response
- `from_cache` (bool): Whether the response was served from cache
- `llm_cost_usd` (float): Cost of the original LLM request in USD
GeneratedGraph¶
A dataclass representing a complete generated causal graph.
```python
from causaliq_knowledge.graph import GeneratedGraph, ProposedEdge

graph = GeneratedGraph(
    edges=[
        ProposedEdge(source="A", target="B", confidence=0.9),
        ProposedEdge(source="B", target="C", confidence=0.85),
    ],
    variables=["A", "B", "C"],
    reasoning="Based on temporal ordering and domain knowledge...",
    metadata=None,
)
```
Attributes:
- `edges` (list[ProposedEdge]): List of proposed causal edges
- `variables` (list[str]): List of variable names in the graph
- `reasoning` (str, optional): Overall reasoning for the graph structure
- `metadata` (GenerationMetadata, optional): Generation metadata
Parsing Functions¶
parse_graph_response¶
Parse an LLM response string into a GeneratedGraph object.
````python
from causaliq_knowledge.graph import parse_graph_response

response_text = '''```json
{
    "edges": [
        {"source": "A", "target": "B", "confidence": 0.9},
        {"source": "B", "target": "C", "confidence": 0.85}
    ],
    "reasoning": "Based on causal principles..."
}
```'''

graph = parse_graph_response(
    response_text=response_text,
    variables=["A", "B", "C"],
    output_format="edge_list",
)
````
Parameters:
- `response_text` (str): Raw LLM response text (may include markdown)
- `variables` (list[str]): Expected variable names for validation
- `output_format` (str): Expected format ("edge_list" or "adjacency_matrix")
Returns: GeneratedGraph object
Raises: ValueError if JSON parsing fails or format is invalid
Response Formats¶
The module supports two response formats:
Edge List Format¶
```json
{
    "edges": [
        {
            "source": "variable_a",
            "target": "variable_b",
            "confidence": 0.9,
            "reasoning": "Optional per-edge reasoning"
        }
    ],
    "reasoning": "Overall graph reasoning"
}
```
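As a rough standalone illustration (independent of the library's own parser), an edge-list payload in this shape can be loaded with the standard `json` module and reduced to `(source, target, confidence)` tuples:

```python
import json

payload = '''
{
  "edges": [
    {"source": "variable_a", "target": "variable_b",
     "confidence": 0.9, "reasoning": "Optional per-edge reasoning"}
  ],
  "reasoning": "Overall graph reasoning"
}
'''

data = json.loads(payload)
# Reduce each edge object to a (source, target, confidence) tuple.
edges = [
    (e["source"], e["target"], e["confidence"])
    for e in data["edges"]
]
print(edges)  # [('variable_a', 'variable_b', 0.9)]
```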
Adjacency Matrix Format¶
```json
{
    "variables": ["A", "B", "C"],
    "adjacency_matrix": [
        [0.0, 0.9, 0.0],
        [0.0, 0.0, 0.85],
        [0.0, 0.0, 0.0]
    ],
    "reasoning": "Overall graph reasoning"
}
```
Values in the adjacency matrix represent confidence scores. A value at
position [i][j] indicates an edge from variables[i] to variables[j].
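The `[i][j]` convention can be sketched in plain Python (this is an illustration of the semantics, not the module's own code): every non-zero cell becomes a directed, confidence-weighted edge.

```python
variables = ["A", "B", "C"]
matrix = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.85],
    [0.0, 0.0, 0.0],
]

# A non-zero entry at [i][j] means an edge variables[i] -> variables[j],
# weighted by the confidence score stored in that cell.
edges = [
    (variables[i], variables[j], conf)
    for i, row in enumerate(matrix)
    for j, conf in enumerate(row)
    if conf > 0.0
]
print(edges)  # [('A', 'B', 0.9), ('B', 'C', 0.85)]
```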
API Reference¶
Response models and parsing for LLM graph generation.
This module provides Pydantic models for representing LLM-generated causal graphs and functions for parsing LLM responses in both edge list and adjacency matrix formats.
Classes:

- `ProposedEdge` – A proposed causal edge from LLM graph generation.
- `GenerationMetadata` – Metadata about a graph generation request.
- `GeneratedGraph` – A complete causal graph generated by an LLM.

Functions:

- `parse_graph_response` – Parse an LLM response into a GeneratedGraph.
ProposedEdge¶
A proposed causal edge from LLM graph generation.
Represents a single directed edge in the proposed causal graph, with confidence score and optional reasoning.
Attributes:

- `source` (str) – The name of the source variable (cause).
- `target` (str) – The name of the target variable (effect).
- `confidence` (float) – Confidence score from 0.0 to 1.0.
- `reasoning` (Optional[str]) – Optional explanation for this specific edge.
Example

```python
>>> edge = ProposedEdge(
...     source="smoking",
...     target="lung_cancer",
...     confidence=0.95,
... )
>>> print(f"{edge.source} -> {edge.target}: {edge.confidence}")
smoking -> lung_cancer: 0.95
```
Methods:

- `clamp_confidence` – Clamp confidence values to [0.0, 1.0] range.
clamp_confidence (classmethod)¶
Clamp confidence values to [0.0, 1.0] range.
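The effect of this validator can be approximated by a standalone helper (a sketch of the behaviour, not the library's implementation): out-of-range scores are pulled back to the nearest bound instead of raising an error.

```python
def clamp_confidence(value: float) -> float:
    """Clamp a confidence score into the [0.0, 1.0] range."""
    return max(0.0, min(1.0, value))

print(clamp_confidence(1.3))   # 1.0
print(clamp_confidence(-0.2))  # 0.0
print(clamp_confidence(0.95))  # 0.95
```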
GenerationMetadata (dataclass)¶

```python
GenerationMetadata(
    model: str,
    provider: str = "",
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc)),
    llm_timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc)),
    llm_latency_ms: int = 0,
    input_tokens: int = 0,
    output_tokens: int = 0,
    from_cache: bool = False,
    messages: List[Dict[str, Any]] = field(default_factory=list),
    temperature: float = 0.1,
    max_tokens: int = 2000,
    finish_reason: str = "stop",
    llm_cost_usd: float = 0.0,
)
```
Metadata about a graph generation request.
Attributes:

- `model` (str) – The LLM model used for generation.
- `provider` (str) – The LLM provider (e.g., "groq", "gemini").
- `timestamp` (datetime) – When this request was made (current request time).
- `llm_timestamp` (datetime) – When the LLM originally responded (may predate `timestamp` if the response came from cache).
- `llm_latency_ms` (int) – Original LLM response latency in milliseconds.
- `input_tokens` (int) – Number of input tokens used.
- `output_tokens` (int) – Number of output tokens generated.
- `from_cache` (bool) – Whether the response was served from cache.
- `messages` (List[Dict[str, Any]]) – The messages sent to the LLM.
- `temperature` (float) – Sampling temperature used.
- `max_tokens` (int) – Maximum tokens requested.
- `finish_reason` (str) – Why generation stopped (stop, length, etc.).
- `llm_cost_usd` (float) – Cost when the LLM request was originally made.
Methods:

- `to_dict` – Convert generation metadata to a dictionary.
GeneratedGraph (dataclass)¶

```python
GeneratedGraph(
    edges: List[ProposedEdge],
    variables: List[str],
    reasoning: str = "",
    metadata: Optional[GenerationMetadata] = None,
    raw_response: Optional[Dict[str, Any]] = None,
)
```
A complete causal graph generated by an LLM.
Represents the full output from an LLM graph generation query, including all proposed edges, metadata, and the LLM's reasoning.
Attributes:

- `edges` (List[ProposedEdge]) – List of proposed causal edges.
- `variables` (List[str]) – List of variable names in the graph.
- `reasoning` (str) – Overall reasoning provided by the LLM.
- `metadata` (Optional[GenerationMetadata]) – Generation metadata (model, timing, etc.).
- `raw_response` (Optional[Dict[str, Any]]) – The original LLM response for debugging.
Example

```python
>>> edge1 = ProposedEdge(
...     source="age", target="income", confidence=0.7
... )
>>> edge2 = ProposedEdge(
...     source="education", target="income", confidence=0.9
... )
>>> graph = GeneratedGraph(
...     edges=[edge1, edge2],
...     variables=["age", "education", "income"],
...     reasoning="Age and education both influence income.",
...     metadata=GenerationMetadata(model="llama-3.1-8b-instant"),
... )
>>> print(f"Generated {len(graph.edges)} edges")
Generated 2 edges
```
Methods:

- `filter_by_confidence` – Return a new graph with only edges above the threshold.
- `get_adjacency_matrix` – Convert edges to an adjacency matrix.
- `get_edge_list` – Get edges as a list of tuples.
filter_by_confidence¶

```python
filter_by_confidence(threshold: float = 0.5) -> 'GeneratedGraph'
```

Return a new graph with only edges above the threshold.

Parameters:

- `threshold` (float, default: 0.5) – Minimum confidence score to include.

Returns:

- `GeneratedGraph` – New GeneratedGraph with filtered edges.
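Representing edges as plain `(source, target, confidence)` tuples, the filtering logic amounts to the sketch below (not the library's code; whether the library's cut-off is inclusive of the threshold is an assumption here):

```python
def filter_by_confidence(edges, threshold=0.5):
    # Keep only edges whose confidence meets the threshold
    # (an inclusive cut-off is assumed for this illustration).
    return [e for e in edges if e[2] >= threshold]

edges = [("age", "income", 0.7), ("luck", "income", 0.3)]
print(filter_by_confidence(edges))  # [('age', 'income', 0.7)]
```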
get_adjacency_matrix¶

Convert edges to an adjacency matrix.

Creates a square matrix where entry (i, j) represents the confidence that variable i causes variable j.

Returns:

- `List[List[float]]` – Square matrix of confidence scores.
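This edge-list-to-matrix conversion can be sketched in standalone Python (an illustration of the documented semantics, not the library's implementation):

```python
def to_adjacency_matrix(edges, variables):
    """Build a square confidence matrix from (source, target, confidence) tuples."""
    index = {name: i for i, name in enumerate(variables)}
    n = len(variables)
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, confidence in edges:
        # Entry (i, j) holds the confidence that variables[i] causes variables[j].
        matrix[index[source]][index[target]] = confidence
    return matrix

matrix = to_adjacency_matrix(
    [("A", "B", 0.9), ("B", "C", 0.85)], ["A", "B", "C"]
)
print(matrix)  # [[0.0, 0.9, 0.0], [0.0, 0.0, 0.85], [0.0, 0.0, 0.0]]
```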
get_edge_list¶

Get edges as a list of tuples.

Returns:

- `List[tuple[str, str, float]]` – List of (source, target, confidence) tuples.
parse_graph_response¶

```python
parse_graph_response(
    response_text: str,
    variables: List[str],
    output_format: str = "edge_list",
) -> GeneratedGraph
```
Parse an LLM response into a GeneratedGraph.
Handles JSON extraction from markdown code blocks and parses according to the specified output format.
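One common way to strip markdown fences before handing the payload to `json.loads` is shown below, as an illustrative sketch rather than the module's actual extraction code (the fence string is built programmatically only to keep this example fence-safe):

```python
import json
import re

FENCE = "`" * 3  # the literal three-backtick markdown fence delimiter

def extract_json(text: str) -> dict:
    """Pull a JSON object out of a possibly fenced markdown response."""
    # Prefer the contents of a fenced (optionally json-tagged) code block;
    # fall back to treating the whole text as JSON.
    match = re.search(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text
    try:
        return json.loads(payload)
    except json.JSONDecodeError as exc:
        # json.JSONDecodeError subclasses ValueError, matching the
        # documented "raises ValueError" contract.
        raise ValueError(f"Could not parse LLM response: {exc}") from exc

response = FENCE + 'json\n{"edges": []}\n' + FENCE
print(extract_json(response))  # {'edges': []}
```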
Parameters:

- `response_text` (str) – Raw text response from the LLM.
- `variables` (List[str]) – List of valid variable names.
- `output_format` (str, default: "edge_list") – Expected format ("edge_list" or "adjacency_matrix").

Returns:

- `GeneratedGraph` – GeneratedGraph with parsed edges and metadata.

Raises:

- `ValueError` – If JSON parsing fails or format is invalid.