Network Context API Reference¶
Pydantic models for defining network context specifications in JSON format.
Overview¶
Network context specifications define the variables and metadata for a causal network, enabling LLMs to generate causal graphs with appropriate domain context.
from causaliq_knowledge.graph import (
    NetworkContext,
    NetworkLoadError,
    VariableSpec,
    VariableType,
    VariableRole,
    PromptDetails,
    ViewDefinition,
)
VariableType¶
Enumeration of supported variable types.
class VariableType(str, Enum):
    BINARY = "binary"  # Two states (e.g., yes/no)
    CATEGORICAL = "categorical"  # Multiple unordered states
    ORDINAL = "ordinal"  # Multiple ordered states
    CONTINUOUS = "continuous"  # Numeric values
Example:
from causaliq_knowledge.graph import VariableType
var_type = VariableType.BINARY
print(var_type.value) # "binary"
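Because `VariableType` mixes in `str`, a member can be looked up from the raw string found in a JSON document, and it compares equal to that string. The sketch below redefines the enum locally, copying the definition shown above, so that it runs without the package installed; the variable names are illustrative only.

```python
from enum import Enum


class VariableType(str, Enum):
    """Local copy of the enum shown above, so the sketch is self-contained."""

    BINARY = "binary"
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    CONTINUOUS = "continuous"


# Look a member up by the string value as it would appear in JSON.
parsed = VariableType("ordinal")

# The str mixin means members compare equal to their plain string values.
is_plain_string_equal = parsed == "ordinal"

# An unknown value raises ValueError at the enum level.
try:
    VariableType("numeric")
    unknown_accepted = True
except ValueError:
    unknown_accepted = False
```

How the package itself reports an invalid type when loading a file is not shown here; this only illustrates the behaviour of the enum.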
VariableRole¶
Enumeration of causal roles in the graph structure.
class VariableRole(str, Enum):
    EXOGENOUS = "exogenous"  # No parents (root cause)
    ENDOGENOUS = "endogenous"  # Has parents (caused by other variables)
    LATENT = "latent"  # Unobserved variable
Example:
from causaliq_knowledge.graph import VariableRole
role = VariableRole.EXOGENOUS
print(role.value) # "exogenous"
VariableSpec¶
Specification for a single variable in the causal model.
Attributes¶
| Attribute | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Benchmark/literature name for ground truth |
| llm_name | str | No | Name used for LLM queries (defaults to name) |
| type | VariableType | Yes | Variable type (binary, categorical, etc.) |
| display_name | str | No | Human-readable display name |
| aliases | list[str] | No | Alternative names for the variable |
| states | list[str] | No | Possible values for discrete variables |
| role | VariableRole | No | Causal role (exogenous, endogenous, latent) |
| category | str | No | Domain-specific category |
| short_description | str | No | Brief description of the variable |
| extended_description | str | No | Detailed description with domain context |
| base_rate | dict[str, float] | No | Prior probabilities for each state |
| conditional_rates | dict | No | Conditional probabilities |
| sensitivity_hints | str | No | Hints about causal relationships |
| related_domain_knowledge | list[str] | No | Domain knowledge statements |
| references | list[str] | No | Literature references |
LLM Name vs Benchmark Name¶
The name and llm_name fields enable semantic disguising to reduce LLM
memorisation of well-known benchmark networks:
- name: The benchmark/literature name used for ground truth evaluation
- llm_name: The name sent to the LLM (defaults to name if not specified)
Example: For the ASIA network's "Tuberculosis" variable:
VariableSpec(
    name="tub",  # Original benchmark name
    llm_name="HasTB",  # Meaningful but non-canonical name for LLM
    display_name="Tuberculosis Status",
    type=VariableType.BINARY,
)
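One consequence of disguising is that edges proposed by the LLM use llm_name values and need translating back to benchmark names before they can be compared with ground truth. The sketch below shows that translation with plain dictionaries. The `lung` variable, the record layout, and both helper functions are hypothetical; the library's own `get_llm_to_name_mapping()` may behave differently.

```python
# Hypothetical variable records; "llm_name" is optional and defaults to "name".
variables = [
    {"name": "tub", "llm_name": "HasTB"},
    {"name": "smoke", "llm_name": "tobacco_history"},
    {"name": "lung"},  # no llm_name, so the benchmark name is used as-is
]


def llm_to_name(variables: list[dict]) -> dict[str, str]:
    """Map each LLM-facing name back to its benchmark name."""
    return {v.get("llm_name", v["name"]): v["name"] for v in variables}


def to_benchmark_edges(edges, mapping):
    """Translate (source, target) pairs from LLM names to benchmark names."""
    return [(mapping[src], mapping[dst]) for src, dst in edges]


mapping = llm_to_name(variables)
llm_edges = [("tobacco_history", "lung"), ("HasTB", "lung")]
benchmark_edges = to_benchmark_edges(llm_edges, mapping)
```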
Example¶
from causaliq_knowledge.graph import VariableSpec, VariableType, VariableRole
smoking = VariableSpec(
    name="smoke",
    llm_name="tobacco_history",
    type=VariableType.BINARY,
    states=["never", "ever"],
    role=VariableRole.EXOGENOUS,
    short_description="Patient has history of tobacco smoking.",
    extended_description="Self-reported smoking history, known risk factor.",
    base_rate={"never": 0.7, "ever": 0.3},
)
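The states and base_rate fields in the example above are related: each state has a prior, and the priors sum to one. A consistency check along those lines could look like the sketch below. The `check_base_rate` helper is hypothetical and is not claimed to be what `validate_variables()` performs.

```python
import math


def check_base_rate(states: list[str], base_rate: dict[str, float]) -> list[str]:
    """Return human-readable problems; an empty list means consistent."""
    problems = []
    missing = [s for s in states if s not in base_rate]
    extra = [s for s in base_rate if s not in states]
    if missing:
        problems.append(f"states without a base rate: {missing}")
    if extra:
        problems.append(f"base rates for unknown states: {extra}")
    total = sum(base_rate.values())
    # Compare with a tolerance, since float sums are rarely exact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        problems.append(f"base rates sum to {total}, expected 1.0")
    return problems


ok = check_base_rate(["never", "ever"], {"never": 0.7, "ever": 0.3})
bad = check_base_rate(["never", "ever"], {"never": 0.7, "sometimes": 0.2})
```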
ViewDefinition¶
Configuration for a single context view level.
Attributes¶
| Attribute | Type | Required | Description |
|---|---|---|---|
| description | str | No | Human-readable description of the view |
| include_fields | list[str] | Yes | Variable fields to include in this view |
Example¶
from causaliq_knowledge.graph import ViewDefinition
minimal_view = ViewDefinition(
    description="Variable names only",
    include_fields=["name"],
)
standard_view = ViewDefinition(
    description="Names with basic metadata",
    include_fields=["name", "type", "short_description", "states"],
)
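A view's include_fields can be read as a whitelist applied to each variable before it is rendered into a prompt. The sketch below illustrates that reading with a plain dictionary. The `apply_view` helper is hypothetical, and the library's actual rendering may differ.

```python
def apply_view(variable: dict, include_fields: list[str]) -> dict:
    """Keep only the listed fields, in view order, skipping unset ones."""
    return {
        field: variable[field]
        for field in include_fields
        if variable.get(field) is not None
    }


smoking = {
    "name": "smoke",
    "type": "binary",
    "states": ["never", "ever"],
    "short_description": "Patient has history of tobacco smoking.",
    "base_rate": {"never": 0.7, "ever": 0.3},
}

minimal = apply_view(smoking, ["name"])
standard = apply_view(smoking, ["name", "type", "short_description", "states"])
```

Fields outside the view, such as base_rate here, are dropped from the result.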
PromptDetails¶
Container for the three standard prompt detail levels.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| minimal | ViewDefinition | Minimal context (names only) |
| standard | ViewDefinition | Standard context (names + descriptions) |
| rich | ViewDefinition | Rich context (full metadata) |
Default Prompt Details¶
If not specified, the following defaults are used:
PromptDetails(
    minimal=ViewDefinition(include_fields=["name"]),
    standard=ViewDefinition(
        include_fields=["name", "type", "short_description", "states"]
    ),
    rich=ViewDefinition(
        include_fields=[
            "name", "type", "role", "short_description",
            "extended_description", "states", "sensitivity_hints"
        ]
    ),
)
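The default levels are nested: every field in minimal also appears in standard, and every field in standard also appears in rich. The check below confirms this using the field lists from the defaults above, copied into plain Python lists so that it runs without the package.

```python
# Field lists copied from the default PromptDetails shown above.
minimal = ["name"]
standard = ["name", "type", "short_description", "states"]
rich = [
    "name", "type", "role", "short_description",
    "extended_description", "states", "sensitivity_hints",
]

# Each level is a strict superset of the one before it.
nested = set(minimal) < set(standard) < set(rich)

# Fields that only appear at the rich level.
rich_only = [f for f in rich if f not in standard]
```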
Provenance¶
Provenance information for the model specification.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| source_network | str | Name of the source benchmark network |
| source_reference | str | Citation for the original source |
| source_url | str | URL to the source data |
| disguise_strategy | str | Strategy used for variable name disguising |
| memorization_risk | str | Risk level for LLM memorisation |
| notes | str | Additional notes about the source |
LLMGuidance¶
Guidance for LLM interactions with the model.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| usage_notes | list[str] | Notes about using this model with LLMs |
| do_not_provide | list[str] | Information to withhold from LLMs |
| expected_difficulty | str | Expected difficulty level |
Constraints¶
Structural constraints on the causal graph.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| forbidden_edges | list[list[str]] | Edges that must not exist |
| required_edges | list[list[str]] | Edges that must exist |
| partial_order | list[list[str]] | Temporal ordering constraints |
| causal_principles | list[CausalPrinciple] | Domain causal principles |
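A proposed graph could be checked against forbidden_edges and required_edges roughly as sketched below. The table gives only the type list[list[str]], so treating each inner list as a [source, target] pair is an assumption, and the `check_edges` helper and variable names are hypothetical.

```python
def check_edges(proposed, forbidden_edges, required_edges):
    """Return (violations, missing) for a proposed list of [source, target] edges."""
    proposed_set = {tuple(e) for e in proposed}
    # Forbidden edges that the proposal nevertheless contains.
    violations = [e for e in forbidden_edges if tuple(e) in proposed_set]
    # Required edges that the proposal leaves out.
    missing = [e for e in required_edges if tuple(e) not in proposed_set]
    return violations, missing


proposed = [["smoking", "cancer"], ["cancer", "age"]]
forbidden_edges = [["cancer", "age"]]  # cancer must not cause age
required_edges = [["smoking", "cancer"], ["age", "cancer"]]

violations, missing = check_edges(proposed, forbidden_edges, required_edges)
```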
GroundTruth¶
Ground truth edges for evaluation.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| edges_expert | list[list[str]] | Expert-defined edges |
| edges_experiment | list[list[str]] | Experimentally-derived edges |
| edges_observational | list[list[str]] | Observationally-derived edges |
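One common way to use ground truth edges is to score a generated graph by precision and recall over directed edges. The sketch below assumes each edge is a [source, target] pair. The `edge_precision_recall` helper and the sample edges are hypothetical, and this is not presented as the metric the library uses.

```python
def edge_precision_recall(predicted, truth):
    """Precision and recall over directed edges given as [source, target] pairs."""
    predicted_set = {tuple(e) for e in predicted}
    truth_set = {tuple(e) for e in truth}
    hits = len(predicted_set & truth_set)
    # Guard against empty inputs to avoid division by zero.
    precision = hits / len(predicted_set) if predicted_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


edges_expert = [["smoking", "cancer"], ["age", "cancer"]]
predicted = [["smoking", "cancer"], ["cancer", "age"]]  # second edge is reversed

precision, recall = edge_precision_recall(predicted, edges_expert)
```

A reversed edge counts as a miss here because edges are compared as ordered pairs.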
NetworkContext¶
Complete network context for LLM-based causal graph generation.
Provides domain and variable information needed to generate causal graphs using LLMs. This is not the network itself, but the context required to generate one.
Attributes¶
| Attribute | Type | Required | Description |
|---|---|---|---|
| schema_version | str | No | Schema version (default: "2.0") |
| network | str | Yes | Network identifier (e.g., "asia") |
| domain | str | Yes | Domain of the model (e.g., "pulmonary_oncology") |
| purpose | str | No | Purpose of the context specification |
| variables | list[VariableSpec] | Yes | List of variable specifications |
| provenance | Provenance | No | Source and provenance information |
| llm_guidance | LLMGuidance | No | Guidance for LLM interactions |
| prompt_details | PromptDetails | No | Prompt detail definitions (uses defaults if omitted) |
| constraints | Constraints | No | Structural constraints |
| causal_principles | list[CausalPrinciple] | No | Domain causal principles |
| ground_truth | GroundTruth | No | Ground truth for evaluation |
Class Methods¶
load(path: str | Path) -> NetworkContext¶
Load a network context from a JSON file.
Raises: NetworkLoadError - If the file cannot be read or validation fails.
from_dict(data: dict, source_path: Path | None = None) -> NetworkContext¶
Create a network context from a dictionary.
context = NetworkContext.from_dict({
    "network": "test",
    "domain": "testing",
    "variables": [{"name": "X", "type": "binary"}],
})
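Since `from_dict` takes a plain dictionary, a specification can first be parsed from JSON text with the standard library. The sketch below does that and checks for the three top-level keys marked as required in the attributes table. The `missing_required_keys` helper is hypothetical, and the call to `from_dict` is left commented out so that the snippet runs without the package installed.

```python
import json

# Top-level keys marked "Required: Yes" in the NetworkContext attributes table.
REQUIRED_KEYS = ("network", "domain", "variables")

spec_text = """
{
  "network": "test",
  "domain": "testing",
  "variables": [{"name": "X", "type": "binary"}]
}
"""


def missing_required_keys(data: dict) -> list[str]:
    """List required top-level keys that are absent from the parsed spec."""
    return [key for key in REQUIRED_KEYS if key not in data]


data = json.loads(spec_text)
missing = missing_required_keys(data)

# With the package installed, the parsed dict could then be passed on:
# context = NetworkContext.from_dict(data)
```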
load_and_validate(path: str | Path) -> tuple[NetworkContext, list[str]]¶
Load and fully validate a network context, returning warnings.
context, warnings = NetworkContext.load_and_validate("model.json")
for warning in warnings:
    print(f"Warning: {warning}")
Instance Methods¶
get_variable_names() -> list[str]¶
Return a list of all benchmark variable names.
context = NetworkContext.load("model.json")
names = context.get_variable_names()
# ["smoking", "cancer", "age"]
get_llm_names() -> list[str]¶
Return a list of all LLM variable names.
context = NetworkContext.load("model.json")
llm_names = context.get_llm_names()
# ["tobacco_use", "malignancy", "patient_age"]
get_variable(name: str) -> VariableSpec | None¶
Get a variable specification by name.
get_llm_to_name_mapping() -> dict[str, str]¶
Get mapping from LLM names to benchmark names.
uses_distinct_llm_names() -> bool¶
Check if any variable has a different llm_name from name.
validate_variables() -> list[str]¶
Validate variable specifications and return warnings.
Example¶
from causaliq_knowledge.graph import (
    NetworkContext,
    VariableSpec,
    VariableType,
    VariableRole,
)
context = NetworkContext(
    network="smoking_cancer",
    domain="epidemiology",
    purpose="Causal model for smoking and cancer",
    variables=[
        VariableSpec(
            name="smoking",
            llm_name="tobacco_use",
            type=VariableType.BINARY,
            role=VariableRole.EXOGENOUS,
            short_description="Smoking status",
        ),
        VariableSpec(
            name="cancer",
            llm_name="malignancy",
            type=VariableType.BINARY,
            role=VariableRole.ENDOGENOUS,
            short_description="Cancer diagnosis",
        ),
    ],
)
NetworkLoadError¶
Exception raised when network context loading fails.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| message | str | Error description |
| path | Path \| str \| None | Path to the file that failed |
| details | str \| None | Additional error details |
Example¶
from causaliq_knowledge.graph import NetworkContext, NetworkLoadError
try:
    context = NetworkContext.load("nonexistent.json")
except NetworkLoadError as e:
    print(f"Failed to load: {e.message}")
    if e.path:
        print(f"File: {e.path}")
JSON Schema¶
Network context specifications are typically stored as JSON files. See Network Context Format for the complete JSON schema and examples.