Network Context API Reference¶

Pydantic models for defining network context specifications in JSON format.

Overview¶

Network context specifications define the variables and metadata for a causal network, enabling LLMs to generate causal graphs with appropriate domain context.

from causaliq_knowledge.graph import (
    NetworkContext,
    NetworkLoadError,
    VariableSpec,
    VariableType,
    VariableRole,
    PromptDetails,
    ViewDefinition,
)

VariableType¶

Enumeration of supported variable types.

class VariableType(str, Enum):
    BINARY = "binary"           # Two states (e.g., yes/no)
    CATEGORICAL = "categorical" # Multiple unordered states
    ORDINAL = "ordinal"         # Multiple ordered states
    CONTINUOUS = "continuous"   # Numeric values

Example:

from causaliq_knowledge.graph import VariableType

var_type = VariableType.BINARY
print(var_type.value)  # "binary"
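
Because `VariableType` subclasses `str`, a member can be looked up from the string stored in a JSON specification and compares equal to that string. The sketch below redefines the enum locally, copying the definition above, so that it runs without the package installed; it illustrates standard `Enum` behaviour rather than anything specific to the library.

```python
from enum import Enum


class VariableType(str, Enum):
    # Local copy of the enum listed above, so the sketch runs standalone.
    BINARY = "binary"
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    CONTINUOUS = "continuous"


# Look a member up by the string value found in a JSON file.
parsed = VariableType("ordinal")
is_member = parsed is VariableType.ORDINAL

# str-based members compare equal to their plain string values.
equals_string = VariableType.BINARY == "binary"

# Strings that match no member raise ValueError.
try:
    VariableType("boolean")
    rejected = False
except ValueError:
    rejected = True
```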

VariableRole¶

Enumeration of causal roles in the graph structure.

class VariableRole(str, Enum):
    EXOGENOUS = "exogenous"   # No parents (root cause)
    ENDOGENOUS = "endogenous" # Has parents (caused by other variables)
    LATENT = "latent"         # Unobserved variable

Example:

from causaliq_knowledge.graph import VariableRole

role = VariableRole.EXOGENOUS
print(role.value)  # "exogenous"

VariableSpec¶

Specification for a single variable in the causal model.

Attributes¶

Attribute	Type	Required	Description
`name`	`str`	Yes	Benchmark/literature name for ground truth
`llm_name`	`str`	No	Name used for LLM queries (defaults to name)
`type`	`VariableType`	Yes	Variable type (binary, categorical, etc.)
`display_name`	`str`	No	Human-readable display name
`aliases`	`list[str]`	No	Alternative names for the variable
`states`	`list[str]`	No	Possible values for discrete variables
`role`	`VariableRole`	No	Causal role (exogenous, endogenous, latent)
`category`	`str`	No	Domain-specific category
`short_description`	`str`	No	Brief description of the variable
`extended_description`	`str`	No	Detailed description with domain context
`base_rate`	`dict[str, float]`	No	Prior probabilities for each state
`conditional_rates`	`dict`	No	Conditional probabilities
`sensitivity_hints`	`str`	No	Hints about causal relationships
`related_domain_knowledge`	`list[str]`	No	Domain knowledge statements
`references`	`list[str]`	No	Literature references

LLM Name vs Benchmark Name¶

The `name` and `llm_name` fields enable semantic disguising to reduce LLM memorisation of well-known benchmark networks:

`name`: The benchmark/literature name used for ground truth evaluation
`llm_name`: The name sent to the LLM (defaults to `name` if not specified)

Example: For the ASIA network's "Tuberculosis" variable:

VariableSpec(
    name="tub",              # Original benchmark name
    llm_name="HasTB",        # Meaningful but non-canonical name for LLM
    display_name="Tuberculosis Status",
    type=VariableType.BINARY,
)
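
Edges proposed by an LLM use the disguised names, so they need translating back to benchmark names before comparison with ground truth. A minimal sketch of that step follows, assuming a plain dictionary of the kind `get_llm_to_name_mapping()` is documented to return. The helper `to_benchmark_edges` and the sample edge are hypothetical and not part of the library.

```python
# Hypothetical mapping from LLM names to benchmark names.
llm_to_name = {"HasTB": "tub", "tobacco_history": "smoke"}

# An edge as an LLM might propose it (illustrative, not necessarily correct).
llm_edges = [("tobacco_history", "HasTB")]


def to_benchmark_edges(edges, mapping):
    """Translate (cause, effect) pairs from LLM names to benchmark names.

    Names missing from the mapping are passed through unchanged.
    """
    return [(mapping.get(a, a), mapping.get(b, b)) for a, b in edges]


benchmark_edges = to_benchmark_edges(llm_edges, llm_to_name)
```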

Example¶

from causaliq_knowledge.graph import VariableSpec, VariableType, VariableRole

smoking = VariableSpec(
    name="smoke",
    llm_name="tobacco_history",
    type=VariableType.BINARY,
    states=["never", "ever"],
    role=VariableRole.EXOGENOUS,
    short_description="Patient has history of tobacco smoking.",
    extended_description="Self-reported smoking history, known risk factor.",
    base_rate={"never": 0.7, "ever": 0.3},
)
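
When `states` and `base_rate` are both given, it is natural to expect the keys of `base_rate` to match the states and the probabilities to sum to 1. The sketch below shows one way to check this on plain Python values. Whether the library performs such a check itself is not stated here, and `check_base_rate` is a hypothetical helper.

```python
import math

states = ["never", "ever"]
base_rate = {"never": 0.7, "ever": 0.3}


def check_base_rate(states, base_rate, tol=1e-9):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    if set(base_rate) != set(states):
        problems.append("base_rate keys do not match states")
    if not math.isclose(sum(base_rate.values()), 1.0, abs_tol=tol):
        problems.append("base_rate probabilities do not sum to 1")
    return problems


problems = check_base_rate(states, base_rate)
```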

ViewDefinition¶

Configuration for a single context view level.

Attributes¶

Attribute	Type	Required	Description
`description`	`str`	No	Human-readable description of the view
`include_fields`	`list[str]`	Yes	Variable fields to include in this view

Example¶

from causaliq_knowledge.graph import ViewDefinition

minimal_view = ViewDefinition(
    description="Variable names only",
    include_fields=["name"]
)

standard_view = ViewDefinition(
    description="Names with basic metadata",
    include_fields=["name", "type", "short_description", "states"]
)
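
To illustrate what `include_fields` selects, the sketch below filters a variable, represented as a plain dictionary, down to the listed fields. It is an assumption about how a view might be applied, not the library's implementation, and `apply_view` is a hypothetical helper.

```python
variable = {
    "name": "smoke",
    "type": "binary",
    "states": ["never", "ever"],
    "short_description": "Patient has history of tobacco smoking.",
    "extended_description": "Self-reported smoking history, known risk factor.",
}


def apply_view(variable, include_fields):
    """Keep only the listed fields, in view order, skipping absent ones."""
    return {f: variable[f] for f in include_fields if f in variable}


minimal = apply_view(variable, ["name"])
standard = apply_view(
    variable, ["name", "type", "short_description", "states"]
)
```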

PromptDetails¶

Container for the three standard prompt detail levels.

Attributes¶

Attribute	Type	Description
`minimal`	`ViewDefinition`	Minimal context (names only)
`standard`	`ViewDefinition`	Standard context (names with basic metadata)
`rich`	`ViewDefinition`	Rich context (full metadata)

Default Prompt Details¶

If not specified, the following defaults are used:

PromptDetails(
    minimal=ViewDefinition(include_fields=["name"]),
    standard=ViewDefinition(
        include_fields=["name", "type", "short_description", "states"]
    ),
    rich=ViewDefinition(
        include_fields=[
            "name", "type", "role", "short_description",
            "extended_description", "states", "sensitivity_hints"
        ]
    ),
)
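
Each entry in `include_fields` is expected to name a `VariableSpec` attribute. The sketch below checks the default views against the attribute names from the `VariableSpec` table and confirms that each level extends the previous one. Both collections are copied by hand from this page, and `unknown_fields` is a hypothetical helper, so treat this as a consistency check on the documentation rather than on the library.

```python
# Attribute names copied from the VariableSpec table.
VARIABLE_SPEC_FIELDS = {
    "name", "llm_name", "type", "display_name", "aliases", "states", "role",
    "category", "short_description", "extended_description", "base_rate",
    "conditional_rates", "sensitivity_hints", "related_domain_knowledge",
    "references",
}

# Default include_fields lists copied from the defaults above.
DEFAULT_VIEWS = {
    "minimal": ["name"],
    "standard": ["name", "type", "short_description", "states"],
    "rich": [
        "name", "type", "role", "short_description",
        "extended_description", "states", "sensitivity_hints",
    ],
}


def unknown_fields(include_fields):
    """Return the fields that are not VariableSpec attributes."""
    return [f for f in include_fields if f not in VARIABLE_SPEC_FIELDS]


unknown = {level: unknown_fields(f) for level, f in DEFAULT_VIEWS.items()}

# Each level's fields are a subset of the next level's fields.
nested = (
    set(DEFAULT_VIEWS["minimal"])
    <= set(DEFAULT_VIEWS["standard"])
    <= set(DEFAULT_VIEWS["rich"])
)
```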

Provenance¶

Provenance information for the model specification.

Attributes¶

Attribute	Type	Description
`source_network`	`str`	Name of the source benchmark network
`source_reference`	`str`	Citation for the original source
`source_url`	`str`	URL to the source data
`disguise_strategy`	`str`	Strategy used for variable name disguising
`memorization_risk`	`str`	Risk level for LLM memorisation
`notes`	`str`	Additional notes about the source

LLMGuidance¶

Guidance for LLM interactions with the model.

Attributes¶

Attribute	Type	Description
`usage_notes`	`list[str]`	Notes about using this model with LLMs
`do_not_provide`	`list[str]`	Information to withhold from LLMs
`expected_difficulty`	`str`	Expected difficulty level

Constraints¶

Structural constraints on the causal graph.

Attributes¶

Attribute	Type	Description
`forbidden_edges`	`list[list[str]]`	Edges that must not exist
`required_edges`	`list[list[str]]`	Edges that must exist
`partial_order`	`list[list[str]]`	Temporal ordering constraints
`causal_principles`	`list[CausalPrinciple]`	Domain causal principles
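
A generated graph can be checked against `forbidden_edges` and `required_edges` with simple set membership. The sketch below assumes each edge is a two-element `[cause, effect]` list, which this page does not state explicitly. `check_constraints` and the sample edges are hypothetical.

```python
def check_constraints(edges, forbidden_edges, required_edges):
    """Return (violations, missing) for a proposed edge list.

    violations: forbidden edges that appear in the proposal.
    missing: required edges that are absent from the proposal.
    """
    proposed = {tuple(e) for e in edges}
    violations = [list(e) for e in forbidden_edges if tuple(e) in proposed]
    missing = [list(e) for e in required_edges if tuple(e) not in proposed]
    return violations, missing


proposed = [["smoking", "cancer"], ["cancer", "age"]]
violations, missing = check_constraints(
    proposed,
    forbidden_edges=[["cancer", "age"]],
    required_edges=[["age", "cancer"]],
)
```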

GroundTruth¶

Ground truth edges for evaluation.

Attributes¶

Attribute	Type	Description
`edges_expert`	`list[list[str]]`	Expert-defined edges
`edges_experiment`	`list[list[str]]`	Experimentally-derived edges
`edges_observational`	`list[list[str]]`	Observationally-derived edges
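
One common way to use ground truth edges is to score a generated graph by edge precision and recall. The sketch below does this for directed edges, again assuming `[cause, effect]` pairs. The function `edge_precision_recall` and the sample data are hypothetical, and the library may evaluate graphs differently.

```python
def edge_precision_recall(predicted, truth):
    """Score predicted directed edges against ground truth edges."""
    pred = {tuple(e) for e in predicted}
    true = {tuple(e) for e in truth}
    hits = len(pred & true)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(true) if true else 0.0
    return precision, recall


edges_expert = [["smoking", "cancer"], ["age", "cancer"]]
predicted = [["smoking", "cancer"], ["cancer", "age"]]

# A reversed edge counts as a miss because edges are directed.
precision, recall = edge_precision_recall(predicted, edges_expert)
```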

NetworkContext¶

Complete network context for LLM-based causal graph generation.

Provides domain and variable information needed to generate causal graphs using LLMs. This is not the network itself, but the context required to generate one.

Attributes¶

Attribute	Type	Required	Description
`schema_version`	`str`	No	Schema version (default: "2.0")
`network`	`str`	Yes	Network identifier (e.g., "asia")
`domain`	`str`	Yes	Domain of the model (e.g., "pulmonary_oncology")
`purpose`	`str`	No	Purpose of the context specification
`variables`	`list[VariableSpec]`	Yes	List of variable specifications
`provenance`	`Provenance`	No	Source and provenance information
`llm_guidance`	`LLMGuidance`	No	Guidance for LLM interactions
`prompt_details`	`PromptDetails`	No	Prompt detail definitions (uses defaults if omitted)
`constraints`	`Constraints`	No	Structural constraints
`causal_principles`	`list[CausalPrinciple]`	No	Domain causal principles
`ground_truth`	`GroundTruth`	No	Ground truth for evaluation

Class Methods¶

`load(path: str | Path) -> NetworkContext`¶

Load a network context from a JSON file.

context = NetworkContext.load("models/cancer.json")

Raises: NetworkLoadError - If the file cannot be read or validation fails.

`from_dict(data: dict, source_path: Path | None = None) -> NetworkContext`¶

Create a network context from a dictionary.

context = NetworkContext.from_dict({
    "network": "test",
    "domain": "testing",
    "variables": [{"name": "X", "type": "binary"}]
})
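
The dictionary passed to `from_dict` has the same shape as the content of a JSON specification file. The sketch below uses only the standard library to write that minimal dictionary to a temporary file and read it back. The file name is arbitrary, and no library call is made.

```python
import json
import tempfile
from pathlib import Path

spec = {
    "network": "test",
    "domain": "testing",
    "variables": [{"name": "X", "type": "binary"}],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test.json"
    path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    # Reading the file back yields a plain dict of the same shape.
    loaded = json.loads(path.read_text(encoding="utf-8"))

# The fields marked as required in the NetworkContext table are present.
required_present = all(
    key in loaded for key in ("network", "domain", "variables")
)
```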

`load_and_validate(path: str | Path) -> tuple[NetworkContext, list[str]]`¶

Load and fully validate a network context, returning warnings.

context, warnings = NetworkContext.load_and_validate("model.json")
for warning in warnings:
    print(f"Warning: {warning}")

Instance Methods¶

`get_variable_names() -> list[str]`¶

Return list of all benchmark variable names.

context = NetworkContext.load("model.json")
names = context.get_variable_names()
# ["smoking", "cancer", "age"]

`get_llm_names() -> list[str]`¶

Return list of all LLM variable names.

context = NetworkContext.load("model.json")
llm_names = context.get_llm_names()
# ["tobacco_use", "malignancy", "patient_age"]

`get_variable(name: str) -> VariableSpec | None`¶

Get a variable specification by name.

context = NetworkContext.load("model.json")
smoking = context.get_variable("smoking")

`get_llm_to_name_mapping() -> dict[str, str]`¶

Get mapping from LLM names to benchmark names.

mapping = context.get_llm_to_name_mapping()
# {"tobacco_use": "smoking", "malignancy": "cancer"}

`uses_distinct_llm_names() -> bool`¶

Check whether any variable's `llm_name` differs from its `name`.

if context.uses_distinct_llm_names():
    print("Context uses LLM name disguising")

`validate_variables() -> list[str]`¶

Validate variable specifications and return warnings.

warnings = context.validate_variables()
for warning in warnings:
    print(f"Warning: {warning}")

Example¶

from causaliq_knowledge.graph import (
    NetworkContext,
    VariableSpec,
    VariableType,
    VariableRole,
)

context = NetworkContext(
    network="smoking_cancer",
    domain="epidemiology",
    purpose="Causal model for smoking and cancer",
    variables=[
        VariableSpec(
            name="smoking",
            llm_name="tobacco_use",
            type=VariableType.BINARY,
            role=VariableRole.EXOGENOUS,
            short_description="Smoking status",
        ),
        VariableSpec(
            name="cancer",
            llm_name="malignancy",
            type=VariableType.BINARY,
            role=VariableRole.ENDOGENOUS,
            short_description="Cancer diagnosis",
        ),
    ],
)

NetworkLoadError¶

Exception raised when network context loading fails.

Attributes¶

Attribute	Type	Description
`message`	`str`	Error description
`path`	`Path | str | None`	Path to the file that failed
`details`	`str | None`	Additional error details

Example¶

from causaliq_knowledge.graph import NetworkContext, NetworkLoadError

try:
    context = NetworkContext.load("nonexistent.json")
except NetworkLoadError as e:
    print(f"Failed to load: {e.message}")
    if e.path:
        print(f"File: {e.path}")

JSON Schema¶

Network context specifications are typically stored as JSON files. See Network Context Format for the complete JSON schema and examples.
