Network Context API Reference

Pydantic models for defining network context specifications in JSON format.

Overview

Network context specifications define the variables and metadata for a causal network, enabling LLMs to generate causal graphs with appropriate domain context.

from causaliq_knowledge.graph import (
    NetworkContext,
    NetworkLoadError,
    VariableSpec,
    VariableType,
    VariableRole,
    PromptDetails,
    ViewDefinition,
)

VariableType

Enumeration of supported variable types.

class VariableType(str, Enum):
    BINARY = "binary"           # Two states (e.g., yes/no)
    CATEGORICAL = "categorical" # Multiple unordered states
    ORDINAL = "ordinal"         # Multiple ordered states
    CONTINUOUS = "continuous"   # Numeric values

Example:

from causaliq_knowledge.graph import VariableType

var_type = VariableType.BINARY
print(var_type.value)  # "binary"
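
Because VariableType mixes in str, its members behave like their string values, which is what allows a JSON string such as "binary" to be turned into a member and back. The sketch below re-declares the enum locally (copied from the definition above) so that it runs without the package installed. It illustrates standard Python enum behaviour, not anything specific to causaliq_knowledge.

```python
import json
from enum import Enum


# Local copy of the enum shown above, so the sketch runs standalone.
class VariableType(str, Enum):
    BINARY = "binary"
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    CONTINUOUS = "continuous"


# Look a member up from its string value, e.g. a value read from JSON.
parsed = VariableType("ordinal")

# Members compare equal to their plain-string values.
is_same = parsed == "ordinal"

# The str mixin also makes members directly JSON-serialisable.
encoded = json.dumps({"type": VariableType.BINARY})

# A value that is not one of the four members raises ValueError.
try:
    VariableType("boolean")
    rejected = False
except ValueError:
    rejected = True
```

The same reasoning applies to VariableRole, which is declared with the same str and Enum bases.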

VariableRole

Enumeration of causal roles in the graph structure.

class VariableRole(str, Enum):
    EXOGENOUS = "exogenous"   # No parents (root cause)
    ENDOGENOUS = "endogenous" # Has parents (caused by other variables)
    LATENT = "latent"         # Unobserved variable

Example:

from causaliq_knowledge.graph import VariableRole

role = VariableRole.EXOGENOUS
print(role.value)  # "exogenous"

VariableSpec

Specification for a single variable in the causal model.

Attributes

| Attribute | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | Benchmark/literature name for ground truth |
| llm_name | str | No | Name used for LLM queries (defaults to name) |
| type | VariableType | Yes | Variable type (binary, categorical, etc.) |
| display_name | str | No | Human-readable display name |
| aliases | list[str] | No | Alternative names for the variable |
| states | list[str] | No | Possible values for discrete variables |
| role | VariableRole | No | Causal role (exogenous, endogenous, latent) |
| category | str | No | Domain-specific category |
| short_description | str | No | Brief description of the variable |
| extended_description | str | No | Detailed description with domain context |
| base_rate | dict[str, float] | No | Prior probabilities for each state |
| conditional_rates | dict | No | Conditional probabilities |
| sensitivity_hints | str | No | Hints about causal relationships |
| related_domain_knowledge | list[str] | No | Domain knowledge statements |
| references | list[str] | No | Literature references |

LLM Name vs Benchmark Name

The name and llm_name fields enable semantic disguising to reduce LLM memorisation of well-known benchmark networks:

  • name: The benchmark/literature name used for ground truth evaluation
  • llm_name: The name sent to the LLM (defaults to name if not specified)

Example: For the ASIA network's "Tuberculosis" variable:

VariableSpec(
    name="tub",              # Original benchmark name
    llm_name="HasTB",        # Meaningful but non-canonical name for LLM
    display_name="Tuberculosis Status",
    type=VariableType.BINARY,
)
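
The defaulting rule for llm_name can be sketched with plain dictionaries. The helper name build_llm_to_name and the dict-based variables are illustrative assumptions, not part of the package; real code would work with VariableSpec objects.

```python
def build_llm_to_name(variables: list[dict]) -> dict[str, str]:
    """Map each LLM-facing name to its benchmark name.

    Hypothetical helper: falls back to ``name`` when ``llm_name`` is absent,
    mirroring the default described above.
    """
    return {var.get("llm_name", var["name"]): var["name"] for var in variables}


variables = [
    {"name": "tub", "llm_name": "HasTB"},  # disguised for the LLM
    {"name": "smoke"},                     # no llm_name: "smoke" is used
]
mapping = build_llm_to_name(variables)
```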

Example

from causaliq_knowledge.graph import VariableSpec, VariableType, VariableRole

smoking = VariableSpec(
    name="smoke",
    llm_name="tobacco_history",
    type=VariableType.BINARY,
    states=["never", "ever"],
    role=VariableRole.EXOGENOUS,
    short_description="Patient has history of tobacco smoking.",
    extended_description="Self-reported smoking history, known risk factor.",
    base_rate={"never": 0.7, "ever": 0.3},
)
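
Since base_rate is described as prior probabilities for each state, a quick consistency check between states and base_rate can catch typos before a specification is used. The helper below is a hypothetical sketch; the source does not say whether the library performs such a check itself.

```python
import math


def check_base_rate(states: list[str], base_rate: dict[str, float]) -> list[str]:
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    missing = set(states) - set(base_rate)
    extra = set(base_rate) - set(states)
    if missing:
        problems.append(f"no base rate for states: {sorted(missing)}")
    if extra:
        problems.append(f"base rate for unknown states: {sorted(extra)}")
    # Assumption: the rates are meant to form a probability distribution.
    if not math.isclose(sum(base_rate.values()), 1.0, abs_tol=1e-9):
        problems.append("base rates do not sum to 1")
    return problems


ok = check_base_rate(["never", "ever"], {"never": 0.7, "ever": 0.3})
bad = check_base_rate(["never", "ever"], {"never": 0.7, "sometimes": 0.2})
```

Here ok is an empty list, while bad reports a missing state, an unknown state, and a sum that is not 1.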

ViewDefinition

Configuration for a single context view level.

Attributes

| Attribute | Type | Required | Description |
| --- | --- | --- | --- |
| description | str | No | Human-readable description of the view |
| include_fields | list[str] | Yes | Variable fields to include in this view |

Example

from causaliq_knowledge.graph import ViewDefinition

minimal_view = ViewDefinition(
    description="Variable names only",
    include_fields=["name"]
)

standard_view = ViewDefinition(
    description="Names with basic metadata",
    include_fields=["name", "type", "short_description", "states"]
)

PromptDetails

Container for the three standard prompt detail levels.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| minimal | ViewDefinition | Minimal context (names only) |
| standard | ViewDefinition | Standard context (names, types, short descriptions, states) |
| rich | ViewDefinition | Rich context (full metadata) |

Default Prompt Details

If not specified, the following defaults are used:

PromptDetails(
    minimal=ViewDefinition(include_fields=["name"]),
    standard=ViewDefinition(
        include_fields=["name", "type", "short_description", "states"]
    ),
    rich=ViewDefinition(
        include_fields=[
            "name", "type", "role", "short_description",
            "extended_description", "states", "sensitivity_hints"
        ]
    ),
)
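
One way to picture what include_fields does is as a projection of each variable onto the listed fields. The function project_fields below is a hypothetical illustration using plain dictionaries; how the library actually renders a view into a prompt is not described here.

```python
def project_fields(variable: dict, include_fields: list[str]) -> dict:
    """Keep only the listed fields that are present and not None."""
    return {
        field: variable[field]
        for field in include_fields
        if variable.get(field) is not None
    }


smoking = {
    "name": "smoke",
    "type": "binary",
    "states": ["never", "ever"],
    "short_description": "Patient has history of tobacco smoking.",
    "base_rate": {"never": 0.7, "ever": 0.3},
}

# Field lists taken from the default minimal and standard views above.
minimal = project_fields(smoking, ["name"])
standard = project_fields(
    smoking, ["name", "type", "short_description", "states"]
)
```

With these inputs, minimal holds only the name, and standard omits base_rate because it is not in that view's include_fields.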

Provenance

Provenance information for the model specification.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| source_network | str | Name of the source benchmark network |
| source_reference | str | Citation for the original source |
| source_url | str | URL to the source data |
| disguise_strategy | str | Strategy used for variable name disguising |
| memorization_risk | str | Risk level for LLM memorisation |
| notes | str | Additional notes about the source |

LLMGuidance

Guidance for LLM interactions with the model.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| usage_notes | list[str] | Notes about using this model with LLMs |
| do_not_provide | list[str] | Information to withhold from LLMs |
| expected_difficulty | str | Expected difficulty level |

Constraints

Structural constraints on the causal graph.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| forbidden_edges | list[list[str]] | Edges that must not exist |
| required_edges | list[list[str]] | Edges that must exist |
| partial_order | list[list[str]] | Temporal ordering constraints |
| causal_principles | list[CausalPrinciple] | Domain causal principles |
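
As a rough illustration of how forbidden_edges and required_edges could be applied to a generated graph, the sketch below checks a candidate edge list against both. It assumes each edge is a [source, target] pair, which the list[list[str]] type suggests but the source does not state; check_constraints is a hypothetical helper, and partial_order is left out because its exact semantics are not given.

```python
def check_constraints(
    edges: list[list[str]],
    forbidden_edges: list[list[str]],
    required_edges: list[list[str]],
) -> dict[str, list[tuple[str, ...]]]:
    """Report forbidden edges that are present and required edges that are absent."""
    proposed = {tuple(edge) for edge in edges}
    forbidden = {tuple(edge) for edge in forbidden_edges}
    required = {tuple(edge) for edge in required_edges}
    return {
        "forbidden_present": sorted(proposed & forbidden),
        "required_missing": sorted(required - proposed),
    }


report = check_constraints(
    edges=[["smoking", "cancer"], ["cancer", "age"]],
    forbidden_edges=[["cancer", "age"]],
    required_edges=[["smoking", "cancer"], ["age", "cancer"]],
)
```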

GroundTruth

Ground truth edges for evaluation.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| edges_expert | list[list[str]] | Expert-defined edges |
| edges_experiment | list[list[str]] | Experimentally-derived edges |
| edges_observational | list[list[str]] | Observationally-derived edges |
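
A ground-truth edge list is typically compared against the edges an LLM proposes. The sketch below computes directed-edge precision and recall as one possible comparison. The function name and the choice of metric are assumptions made for illustration, as is the [source, target] reading of each edge; the source does not say which metrics the package uses.

```python
def edge_precision_recall(
    proposed: list[list[str]], truth: list[list[str]]
) -> tuple[float, float]:
    """Directed-edge precision and recall; 0.0 when a denominator is empty."""
    proposed_set = {tuple(edge) for edge in proposed}
    truth_set = {tuple(edge) for edge in truth}
    hits = len(proposed_set & truth_set)
    precision = hits / len(proposed_set) if proposed_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


edges_expert = [["smoking", "cancer"], ["age", "cancer"]]
proposed = [["smoking", "cancer"], ["cancer", "age"]]  # second edge is reversed
precision, recall = edge_precision_recall(proposed, edges_expert)
```

Because the comparison is directional, the reversed edge counts as both a false positive and a miss.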

NetworkContext

Complete network context for LLM-based causal graph generation.

Provides domain and variable information needed to generate causal graphs using LLMs. This is not the network itself, but the context required to generate one.

Attributes

| Attribute | Type | Required | Description |
| --- | --- | --- | --- |
| schema_version | str | No | Schema version (default: "2.0") |
| network | str | Yes | Network identifier (e.g., "asia") |
| domain | str | Yes | Domain of the model (e.g., "pulmonary_oncology") |
| purpose | str | No | Purpose of the context specification |
| variables | list[VariableSpec] | Yes | List of variable specifications |
| provenance | Provenance | No | Source and provenance information |
| llm_guidance | LLMGuidance | No | Guidance for LLM interactions |
| prompt_details | PromptDetails | No | Prompt detail definitions (uses defaults if omitted) |
| constraints | Constraints | No | Structural constraints |
| causal_principles | list[CausalPrinciple] | No | Domain causal principles |
| ground_truth | GroundTruth | No | Ground truth for evaluation |

Class Methods

load(path: str | Path) -> NetworkContext

Load a network context from a JSON file.

context = NetworkContext.load("models/cancer.json")

Raises: NetworkLoadError - If the file cannot be read or validation fails.

from_dict(data: dict, source_path: Path | None = None) -> NetworkContext

Create a network context from a dictionary.

context = NetworkContext.from_dict({
    "network": "test",
    "domain": "testing",
    "variables": [{"name": "X", "type": "binary"}]
})

load_and_validate(path: str | Path) -> tuple[NetworkContext, list[str]]

Load and fully validate a network context, returning warnings.

context, warnings = NetworkContext.load_and_validate("model.json")
for warning in warnings:
    print(f"Warning: {warning}")

Instance Methods

get_variable_names() -> list[str]

Return a list of all benchmark variable names.

context = NetworkContext.load("model.json")
names = context.get_variable_names()
# ["smoking", "cancer", "age"]

get_llm_names() -> list[str]

Return a list of all LLM variable names.

context = NetworkContext.load("model.json")
llm_names = context.get_llm_names()
# ["tobacco_use", "malignancy", "patient_age"]

get_variable(name: str) -> VariableSpec | None

Get a variable specification by name. As the return type indicates, the result is None when no variable matches.

context = NetworkContext.load("model.json")
smoking = context.get_variable("smoking")

get_llm_to_name_mapping() -> dict[str, str]

Get mapping from LLM names to benchmark names.

mapping = context.get_llm_to_name_mapping()
# {"tobacco_use": "smoking", "malignancy": "cancer", "patient_age": "age"}
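
A typical use of this mapping is translating edges proposed by the LLM back to benchmark names before evaluation. The sketch below uses a literal dict as a stand-in for the return value of get_llm_to_name_mapping(), and to_benchmark_edges is a hypothetical helper, not a function provided by the package.

```python
def to_benchmark_edges(
    llm_edges: list[tuple[str, str]], llm_to_name: dict[str, str]
) -> list[tuple[str, str]]:
    """Rewrite (source, target) pairs from LLM names to benchmark names."""
    return [(llm_to_name[src], llm_to_name[dst]) for src, dst in llm_edges]


# Stand-in for the dict returned by context.get_llm_to_name_mapping().
llm_to_name = {"tobacco_use": "smoking", "malignancy": "cancer"}

llm_edges = [("tobacco_use", "malignancy")]
benchmark_edges = to_benchmark_edges(llm_edges, llm_to_name)
```

In this sketch an LLM name that is missing from the mapping raises KeyError, which makes unexpected variable names in an LLM response visible early.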

uses_distinct_llm_names() -> bool

Check whether any variable has an llm_name that differs from its name.

if context.uses_distinct_llm_names():
    print("Context uses LLM name disguising")

validate_variables() -> list[str]

Validate variable specifications and return warnings.

warnings = context.validate_variables()
for warning in warnings:
    print(f"Warning: {warning}")

Example

from causaliq_knowledge.graph import (
    NetworkContext,
    VariableSpec,
    VariableType,
    VariableRole,
)

context = NetworkContext(
    network="smoking_cancer",
    domain="epidemiology",
    purpose="Causal model for smoking and cancer",
    variables=[
        VariableSpec(
            name="smoking",
            llm_name="tobacco_use",
            type=VariableType.BINARY,
            role=VariableRole.EXOGENOUS,
            short_description="Smoking status",
        ),
        VariableSpec(
            name="cancer",
            llm_name="malignancy",
            type=VariableType.BINARY,
            role=VariableRole.ENDOGENOUS,
            short_description="Cancer diagnosis",
        ),
    ],
)

NetworkLoadError

Exception raised when network context loading fails.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| message | str | Error description |
| path | Path \| str \| None | Path to the file that failed |
| details | str \| None | Additional error details |

Example

from causaliq_knowledge.graph import NetworkContext, NetworkLoadError

try:
    context = NetworkContext.load("nonexistent.json")
except NetworkLoadError as e:
    print(f"Failed to load: {e.message}")
    if e.path:
        print(f"File: {e.path}")

JSON Schema

Network context specifications are typically stored as JSON files. See Network Context Format for the complete JSON schema and examples.
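
For orientation, a minimal document containing only the fields marked as required in the tables above can be sketched with the standard json module. It mirrors the from_dict example and is not a substitute for the full schema; the pre-check at the end is an illustrative assumption, not part of the package.

```python
import json

# Only the fields marked "Yes" in the Required columns above.
minimal_spec = {
    "network": "test",
    "domain": "testing",
    "variables": [{"name": "X", "type": "binary"}],
}

# Serialise as it might appear in a .json file on disk.
text = json.dumps(minimal_spec, indent=2)

# Cheap structural pre-check before handing the data to the library.
data = json.loads(text)
missing = [key for key in ("network", "domain", "variables") if key not in data]
```

An empty missing list only shows that the top-level required keys are present; full validation is left to NetworkContext.load or from_dict.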