Graph PDG Module¶

The PDG (Probabilistic Dependency Graph) class represents a probability distribution over edge states between node pairs. Unlike SDG, which stores a single deterministic edge type, PDG stores a probability for each possible edge state.

Overview¶

PDG is designed for:

Graph averaging: Combining multiple structure learning runs
Uncertainty representation: Representing structural uncertainty in causal graphs
LLM fusion: Integrating LLM-generated graphs with statistical methods

The PDG class is independent of the SDG class hierarchy (not a subclass) as it represents a fundamentally different concept: a distribution over graphs rather than a single graph.
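As an illustration of the graph-averaging use case, the sketch below turns several structure learning runs into per-pair state frequencies. The input format (`Run`) and the helper `average_runs` are hypothetical and are not part of the documented API; pairs missing from a run are assumed to mean "no edge".

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical input format: each run maps a canonical (source, target)
# pair to one of the four edge states. A pair absent from a run is
# treated as "none" (an assumption made for this sketch).
Run = Dict[Tuple[str, str], str]
STATES = ("forward", "backward", "undirected", "none")


def average_runs(runs: List[Run]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Return the relative frequency of each edge state per node pair."""
    pairs = {pair for run in runs for pair in run}
    averaged = {}
    for pair in sorted(pairs):
        counts = Counter(run.get(pair, "none") for run in runs)
        averaged[pair] = {state: counts[state] / len(runs) for state in STATES}
    return averaged


runs = [
    {("A", "B"): "forward"},
    {("A", "B"): "forward", ("A", "C"): "undirected"},
    {("A", "B"): "backward"},
    {("A", "B"): "forward"},
]
averaged = average_runs(runs)
print(averaged[("A", "B")])
```

Each resulting dictionary sums to 1.0 and uses the same four keys as the `EdgeProbabilities` constructor, so it could plausibly be passed on as `EdgeProbabilities(**averaged[pair])`.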

Classes¶

`EdgeProbabilities`¶

Probability distribution over edge states between two nodes.

Stores probabilities for each possible edge state:

forward: P(source -> target) directed edge in stored direction
backward: P(target -> source) directed edge opposite to stored
undirected: P(source -- target) undirected edge
none: P(no edge between source and target)

Properties and methods:

p_exist: Probability that any edge exists (sum of forward, backward, undirected)
p_directed: Probability of a directed edge (sum of forward, backward)
most_likely_state(): Returns the most probable edge state

Usage:

from causaliq_core.graph import EdgeProbabilities

# Create edge probability distribution
probs = EdgeProbabilities(
    forward=0.6,
    backward=0.2,
    undirected=0.1,
    none=0.1
)

# Query properties
print(probs.p_exist)      # 0.9 (probability edge exists)
print(probs.p_directed)   # 0.8 (probability edge is directed)
print(probs.most_likely_state())  # "forward"

`PDG`¶

Probabilistic Dependency Graph - distribution over SDG structures.

Features:

Stores probability distributions for each node pair
Supports threshold-based graph extraction
GraphML I/O for serialisation

Usage:

from causaliq_core.graph import PDG, EdgeProbabilities

# Create a PDG
nodes = ["A", "B", "C"]
edges = {
    ("A", "B"): EdgeProbabilities(forward=0.8, none=0.2),
    ("A", "C"): EdgeProbabilities(forward=0.6, backward=0.3, none=0.1),
}
pdg = PDG(nodes, edges)

# Query edge probabilities
probs = pdg.get_probabilities("A", "B")
print(probs.forward)  # 0.8

# Extract graph at threshold
pdag = pdg.to_pdag(threshold=0.5)
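This page does not describe the rule that `to_pdag` applies. Purely as an illustration of what threshold-based extraction could look like, the sketch below keeps a pair when its probability of existence reaches the threshold and then picks the most probable of the three edge-present states. The helper `threshold_edge` and the plain-dictionary representation are assumptions, not the library's implementation.

```python
from typing import Dict, Optional, Tuple

Probs = Dict[str, float]


def threshold_edge(probs: Probs, threshold: float) -> Optional[str]:
    """Return the chosen edge state, or None when the edge is dropped.

    Assumed rule: keep the pair if P(any edge) >= threshold, then choose
    the most probable state among forward, backward and undirected.
    """
    present = ("forward", "backward", "undirected")
    p_exist = sum(probs[state] for state in present)
    if p_exist < threshold:
        return None
    return max(present, key=lambda state: probs[state])


edges: Dict[Tuple[str, str], Probs] = {
    ("A", "B"): {"forward": 0.8, "backward": 0.0, "undirected": 0.0, "none": 0.2},
    ("A", "C"): {"forward": 0.6, "backward": 0.3, "undirected": 0.0, "none": 0.1},
    ("B", "C"): {"forward": 0.1, "backward": 0.1, "undirected": 0.1, "none": 0.7},
}
extracted = {pair: threshold_edge(probs, 0.5) for pair, probs in edges.items()}
print(extracted)
```

Other rules are equally defensible, for example thresholding each state separately, so the actual behaviour should be checked against the library itself.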

Reference¶

Probabilistic Dependency Graph (PDG)

This module provides PDG (Probabilistic Dependency Graph), which represents a probability distribution over edge states between node pairs. Unlike SDG, which stores a single deterministic edge type, PDG stores a probability for each possible edge state.

PDG is designed for:

- Graph averaging from multiple structure learning runs
- Fusing LLM-generated graphs with statistical structure learning
- Representing uncertainty in causal graph structure

The PDG class is independent of the SDG class hierarchy (not a subclass) as it represents uncertainty over graphs rather than a single graph.

Classes:

EdgeProbabilities –

Probability distribution over edge states between two nodes.
PDG –

Probabilistic Dependency Graph - distribution over SDG structures.

Classes¶

EdgeProbabilities `dataclass` ¶

EdgeProbabilities(
    forward: float = 0.0,
    backward: float = 0.0,
    undirected: float = 0.0,
    none: float = 1.0,
)

Probability distribution over edge states between two nodes.

Stores probabilities for each possible edge state. The edge is stored with source node alphabetically before target node (canonical form).

Attributes:

forward (float) –

P(source -> target) directed edge in stored direction.
backward (float) –

P(target -> source) directed edge opposite to stored.
undirected (float) –

P(source -- target) undirected edge.
none (float) –

P(no edge between source and target).

Raises:

ValueError –

If probabilities do not sum to 1.0 (within tolerance).

Example

>>> probs = EdgeProbabilities(
...     forward=0.6, backward=0.2, undirected=0.1, none=0.1
... )
>>> probs.p_exist
0.9
>>> probs.p_directed
0.8

Methods:

__post_init__ –

Validate probabilities sum to 1.0.
most_likely_state –

Return the most likely edge state.

Attributes¶

p_exist `property` ¶

p_exist: float

Probability that any edge exists between the nodes.

Returns:

float –

Sum of forward, backward, and undirected probabilities.

p_directed `property` ¶

p_directed: float

Probability of a directed edge (either direction).

Returns:

float –

Sum of forward and backward probabilities.

Functions¶

__post_init__ ¶

__post_init__() -> None

Validate probabilities sum to 1.0.
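A minimal sketch of how such validation can be written in a dataclass is shown below. `EdgeProbsSketch` is a hypothetical stand-in that mirrors the documented fields and defaults; the tolerance value is an assumption, since this page only says "within tolerance".

```python
import math
from dataclasses import dataclass


@dataclass
class EdgeProbsSketch:
    """Hypothetical stand-in for EdgeProbabilities, for illustration only."""

    forward: float = 0.0
    backward: float = 0.0
    undirected: float = 0.0
    none: float = 1.0

    def __post_init__(self) -> None:
        total = self.forward + self.backward + self.undirected + self.none
        # Tolerance of 1e-9 is an assumption made for this sketch.
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError(f"probabilities sum to {total}, expected 1.0")

    @property
    def p_exist(self) -> float:
        # Probability that any edge exists.
        return self.forward + self.backward + self.undirected

    @property
    def p_directed(self) -> float:
        # Probability of a directed edge in either direction.
        return self.forward + self.backward


ok = EdgeProbsSketch(forward=0.6, backward=0.2, undirected=0.1, none=0.1)

try:
    # none keeps its default of 1.0 here, so the total is 1.8.
    EdgeProbsSketch(forward=0.8)
    rejected = False
except ValueError:
    rejected = True
```

With `none` defaulting to 1.0, a constructor call that sets only the other states would fail a check like this one, which is consistent with the examples on this page always passing `none` explicitly.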

most_likely_state ¶

most_likely_state() -> str

Return the most likely edge state.

Returns:

str –

One of "forward", "backward", "undirected", or "none".
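One way to pick the most likely state is an arg-max over the four probabilities, sketched here on a plain dictionary. How the library breaks ties is not stated on this page; the fixed ordering in `STATE_ORDER` is an assumption.

```python
from typing import Dict

# Order used to break ties; this ordering is an assumption.
STATE_ORDER = ("forward", "backward", "undirected", "none")


def most_likely_state(probs: Dict[str, float]) -> str:
    """Return the state with the highest probability."""
    # max() returns the first maximal item, so ties resolve in STATE_ORDER.
    return max(STATE_ORDER, key=lambda state: probs[state])


example = {"forward": 0.6, "backward": 0.2, "undirected": 0.1, "none": 0.1}
tied = {"forward": 0.5, "backward": 0.5, "undirected": 0.0, "none": 0.0}
print(most_likely_state(example), most_likely_state(tied))
```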

PDG ¶

PDG(nodes: List[str], edges: Optional[Dict[Tuple[str, str], EdgeProbabilities]] = None)

Probabilistic Dependency Graph - distribution over SDG structures.

Represents uncertainty over causal graph structure by storing probability distributions for each possible edge between node pairs. Unlike SDG, PDAG, and DAG which represent single deterministic graphs, PDG captures structural uncertainty.

PDG is not a subclass of SDG because it represents a fundamentally different concept: a distribution over graphs rather than a single graph.

Parameters:

nodes ¶
(List[str]) –

List of node names in the graph.
edges ¶
(Optional[Dict[Tuple[str, str], EdgeProbabilities]], default: None ) –

Dictionary mapping (source, target) pairs to EdgeProbabilities. Node pairs should be in canonical order (source < target alphabetically).

Attributes:

nodes (List[str]) –

Graph nodes in alphabetical order.
edges (Dict[Tuple[str, str], EdgeProbabilities]) –

Edge probabilities {(source, target): EdgeProbabilities}.

Raises:

TypeError –

If nodes or edges have invalid types.
ValueError –

If edge keys are not in canonical order or reference unknown nodes.

Example

>>> from causaliq_core.graph.pdg import PDG, EdgeProbabilities
>>> nodes = ["A", "B", "C"]
>>> edges = {
...     ("A", "B"): EdgeProbabilities(forward=0.8, none=0.2),
...     ("A", "C"): EdgeProbabilities(forward=0.6, backward=0.3,
...                                   none=0.1),
... }
>>> pdg = PDG(nodes, edges)
>>> pdg.get_probabilities("A", "B").forward
0.8

Parameters:

nodes ¶
(List[str]) –

List of node names.
edges ¶
(Optional[Dict[Tuple[str, str], EdgeProbabilities]], default: None ) –

Optional dictionary of edge probabilities. Keys must be tuples (source, target) where source < target alphabetically.

Methods:

get_probabilities –

Get edge probabilities between two nodes.
set_probabilities –

Set edge probabilities between two nodes.
node_pairs –

Iterate over all possible node pairs in canonical order.
existing_edges –

Iterate over node pairs with non-zero edge probability.
__len__ –

Return number of node pairs with explicit probabilities.
__eq__ –

Check equality with another PDG.
__str__ –

Return human-readable description of the PDG.
__repr__ –

Return detailed representation of the PDG.
compress –

Compress PDG to compact binary representation.
decompress –

Decompress PDG from compact binary representation.

Functions¶

get_probabilities ¶

get_probabilities(node_a: str, node_b: str) -> EdgeProbabilities

Get edge probabilities between two nodes.

Handles node ordering automatically - returns probabilities with forward/backward relative to alphabetical ordering.

Parameters:

node_a ¶ (str) –

First node name.
node_b ¶ (str) –

Second node name.

Returns:

EdgeProbabilities –

EdgeProbabilities for the node pair. If no explicit probabilities are stored, returns EdgeProbabilities(none=1.0).

Raises:

ValueError –

If either node is not in the graph.
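The lookup behaviour described above can be sketched with a canonical key and a default. The helpers `canonical_pair` and `lookup` are hypothetical, and probabilities are held in plain dictionaries so the example runs without the library.

```python
from typing import Dict, List, Tuple

Probs = Dict[str, float]
NO_EDGE: Probs = {"forward": 0.0, "backward": 0.0, "undirected": 0.0, "none": 1.0}


def canonical_pair(node_a: str, node_b: str) -> Tuple[str, str]:
    """Order the pair so that source sorts before target."""
    return (node_a, node_b) if node_a < node_b else (node_b, node_a)


def lookup(
    nodes: List[str],
    edges: Dict[Tuple[str, str], Probs],
    node_a: str,
    node_b: str,
) -> Probs:
    """Return stored probabilities, or the no-edge default if none stored."""
    for node in (node_a, node_b):
        if node not in nodes:
            raise ValueError(f"unknown node: {node}")
    # forward/backward stay relative to the alphabetical ordering,
    # regardless of the order in which the caller names the nodes.
    return edges.get(canonical_pair(node_a, node_b), dict(NO_EDGE))


nodes = ["A", "B", "C"]
edges = {("A", "B"): {"forward": 0.8, "backward": 0.0, "undirected": 0.0, "none": 0.2}}
print(lookup(nodes, edges, "B", "A")["forward"])
```

In this sketch, `forward` in the result for `("B", "A")` still refers to A -> B, which matches the statement that probabilities are returned relative to alphabetical ordering.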

set_probabilities ¶

set_probabilities(node_a: str, node_b: str, probs: EdgeProbabilities) -> None

Set edge probabilities between two nodes.

Handles node ordering automatically.

Parameters:

node_a ¶ (str) –

First node name.
node_b ¶ (str) –

Second node name.
probs ¶ (EdgeProbabilities) –

Edge probabilities to set.

Raises:

ValueError –

If either node is not in the graph.
TypeError –

If probs is not EdgeProbabilities.

node_pairs ¶

node_pairs() -> Iterator[Tuple[str, str]]

Iterate over all possible node pairs in canonical order.

Yields:

Tuple[str, str] –

Tuples (source, target) where source < target alphabetically.
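For reference, canonical pairs of this kind can be generated with `itertools.combinations` over the sorted node list. This standalone function is a sketch of the idea rather than the method's actual implementation.

```python
from itertools import combinations
from typing import Iterator, List, Tuple


def node_pairs(nodes: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield every unordered pair once, with source < target."""
    # Sorting first guarantees each yielded tuple is in canonical order.
    yield from combinations(sorted(nodes), 2)


pairs = list(node_pairs(["C", "A", "B"]))
print(pairs)
```

A graph with n nodes has n(n-1)/2 such pairs, so three nodes give three pairs.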

existing_edges ¶

existing_edges() -> Iterator[Tuple[str, str, EdgeProbabilities]]

Iterate over node pairs with non-zero edge probability.

Yields:

Tuple[str, str, EdgeProbabilities] –

Tuples (source, target, probs) where p_exist > 0.
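The filter described here can be sketched as follows, again on plain dictionaries. Read literally, a pair stored explicitly with `none=1.0` would be skipped by this iteration while still counting as a pair with explicit probabilities; that reading is an inference from this page and should be confirmed against the library.

```python
from typing import Dict, Iterator, Tuple

Probs = Dict[str, float]


def existing_edges(
    edges: Dict[Tuple[str, str], Probs],
) -> Iterator[Tuple[str, str, Probs]]:
    """Yield (source, target, probs) for pairs where p_exist > 0."""
    for source, target in sorted(edges):
        probs = edges[(source, target)]
        p_exist = probs["forward"] + probs["backward"] + probs["undirected"]
        if p_exist > 0:
            yield source, target, probs


edges = {
    ("A", "B"): {"forward": 0.8, "backward": 0.0, "undirected": 0.0, "none": 0.2},
    ("A", "C"): {"forward": 0.0, "backward": 0.0, "undirected": 0.0, "none": 1.0},
}
found = [(source, target) for source, target, _ in existing_edges(edges)]
print(found, len(edges))
```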

__len__ ¶

__len__() -> int

Return number of node pairs with explicit probabilities.

__eq__ ¶

__eq__(other: object) -> bool

Check equality with another PDG.

__str__ ¶

__str__() -> str

Return human-readable description of the PDG.

__repr__ ¶

__repr__() -> str

Return detailed representation of the PDG.

compress ¶

compress() -> bytes

Compress PDG to compact binary representation.

Format:

- 2 bytes: number of nodes (uint16, big-endian)
- For each node: 2 bytes name length + UTF-8 encoded name
- 2 bytes: number of edge pairs with probabilities (uint16)
- For each edge pair:
    - 2 bytes: source node index (uint16)
    - 2 bytes: target node index (uint16)
    - 3 bytes: p_forward (4 s.f. mantissa + exponent)
    - 3 bytes: p_backward (4 s.f. mantissa + exponent)
    - 3 bytes: p_undirected (4 s.f. mantissa + exponent)

Probabilities are encoded with 4 significant figures using a mantissa (0-9999) and exponent format: value = mantissa × 10^exp. The p_none value is derived as 1.0 - (forward + backward + undirected).

Returns:

bytes –

Compact binary representation of the PDG.

Raises:

ValueError –

If graph has more than 65535 nodes or edge pairs.
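To make the 3-byte probability encoding concrete, here is one possible layout: a big-endian uint16 mantissa followed by a signed 8-bit exponent. The split of the three bytes between mantissa and exponent is an assumption, as are the helper names `encode_probability` and `decode_probability`; the page only states that value = mantissa × 10^exp with a mantissa in 0-9999.

```python
import struct


def encode_probability(value: float) -> bytes:
    """Pack a probability into 3 bytes at 4 significant figures."""
    if value == 0.0:
        return struct.pack(">Hb", 0, 0)
    # "%.3e" rounds to 4 significant figures, e.g. 0.8 -> "8.000e-01".
    digits, exponent = f"{value:.3e}".split("e")
    mantissa = int(digits.replace(".", ""))  # "8.000" -> 8000
    # The mantissa carries 3 decimal places, so shift the exponent by 3.
    return struct.pack(">Hb", mantissa, int(exponent) - 3)


def decode_probability(data: bytes) -> float:
    """Invert encode_probability: value = mantissa x 10^exp."""
    mantissa, exponent = struct.unpack(">Hb", data)
    return float(f"{mantissa}e{exponent}")


encoded = encode_probability(0.8)
forward = decode_probability(encoded)
backward = decode_probability(encode_probability(0.123456))
undirected = decode_probability(encode_probability(0.0))
# p_none is not stored; it is derived from the other three values.
p_none = 1.0 - (forward + backward + undirected)
print(len(encoded), forward, backward, undirected)
```

Values are rounded to 4 significant figures on the way in, so a compress/decompress round trip is lossy beyond that precision. Under the layout listed above, each edge pair occupies 2 + 2 + 3 + 3 + 3 = 13 bytes.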

decompress `classmethod` ¶

decompress(data: bytes) -> PDG

Decompress PDG from compact binary representation.

Parameters:

data ¶ (bytes) –

Binary data from PDG.compress().

Returns:

PDG –

Reconstructed PDG instance.

Raises:

TypeError –

If data is not bytes.
ValueError –

If data is invalid or corrupted.
