CPT (Conditional Probability Table)
The CPT class represents conditional probability tables for discrete/categorical variables in Bayesian Networks. It stores and manages probability distributions that map parent value combinations to child variable probabilities.
Overview
A Conditional Probability Table (CPT) defines the local probability distribution for a discrete node given its parent values. It consists of:
- Values: The possible states/categories for this variable
- Table: Probability values organized by parent combinations
- Parents: Names of parent variables (if any)
- Normalization: Ensures probabilities sum to 1 for each parent combination
Key Features
- Flexible Parent Support: Handle any number of discrete parent variables
- Efficient Storage: Compact representation of probability tables
- Automatic Validation: Ensures probability constraints are satisfied
- Missing Data Handling: Robust handling of incomplete data during learning
- Fast Lookup: Optimized probability queries
Table Organization
For a node with parents, the probability table is organized as:
- Rows: Each parent value combination
- Columns: Each possible value of the child variable
- Entries: P(child=value | parents=combination)
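The row and column layout above can be sketched in plain Python. This is an illustration only, not the library's storage format: the names `parent_values`, `child_values`, and `table` are hypothetical, and the probabilities are the Weather/Sprinkler/Grass figures used elsewhere on this page.

```python
from itertools import product
from math import isclose

# Hypothetical parents and child, for illustration only.
parent_values = {"Weather": ["Sunny", "Rainy"], "Sprinkler": ["On", "Off"]}
child_values = ["Wet", "Dry"]

# One row per parent value combination, one column per child value.
rows = list(product(*parent_values.values()))
table = {
    ("Sunny", "On"): [0.9, 0.1],
    ("Sunny", "Off"): [0.2, 0.8],
    ("Rainy", "On"): [0.95, 0.05],
    ("Rainy", "Off"): [0.8, 0.2],
}

# Every combination needs a row, and every row must sum to 1.
assert set(table) == set(rows)
for combination, probs in table.items():
    assert isclose(sum(probs), 1.0)

# Entry lookup: P(Grass=Wet | Weather=Rainy, Sprinkler=Off)
p_wet = table[("Rainy", "Off")][child_values.index("Wet")]
```

The number of rows is the product of the parents' state counts, so each added parent multiplies the table size.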
Example Usage
from causaliq_core.bn.dist import CPT

# Simple unconditional distribution: a pmf of {value: prob}
weather = CPT({'Sunny': 0.8, 'Rainy': 0.2})

# Conditional distribution with one parent:
# a list of ({parent: value}, {value: prob}) tuples
sprinkler = CPT([
    ({'Weather': 'Sunny'}, {'On': 0.1, 'Off': 0.9}),
    ({'Weather': 'Rainy'}, {'On': 0.7, 'Off': 0.3}),
])

# Conditional distribution with multiple parents
grass = CPT([
    ({'Weather': 'Sunny', 'Sprinkler': 'On'}, {'Wet': 0.9, 'Dry': 0.1}),
    ({'Weather': 'Sunny', 'Sprinkler': 'Off'}, {'Wet': 0.2, 'Dry': 0.8}),
    ({'Weather': 'Rainy', 'Sprinkler': 'On'}, {'Wet': 0.95, 'Dry': 0.05}),
    ({'Weather': 'Rainy', 'Sprinkler': 'Off'}, {'Wet': 0.8, 'Dry': 0.2}),
])

# Access probabilities: cdist() returns the pmf for the given parental values
pmf = sprinkler.cdist({'Weather': 'Rainy'})
print(f"P(Sprinkler=On | Weather=Rainy) = {pmf['On']}")
Data Learning
CPTs can learn parameters from data:
import pandas as pd
from causaliq_core.bn.dist import CPT

# Training data
data = pd.DataFrame({
    'Weather': ['Sunny', 'Rainy', 'Sunny', 'Rainy', 'Sunny'],
    'Sprinkler': ['Off', 'On', 'On', 'Off', 'Off']
})

# Learn CPT from data with the fit() classmethod.
# Note: the API reference types data as Union[BNFit, Any], so a raw
# DataFrame may first need wrapping in a BNFit-compatible object.
(cnd_class, cpt_spec), estimated_pmfs = CPT.fit(
    node='Sprinkler',
    parents=('Weather',),
    data=data,
)
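To make the idea of learning parameters from data concrete, the sketch below estimates the same conditional distribution by simple relative-frequency counting. It is a minimal illustration under stated assumptions, not the library's fitting routine: `estimate_cpt` is a hypothetical helper, the frozenset keys of (parent, value) pairs are an assumed layout, and no smoothing or handling of unseen parent combinations is attempted.

```python
from collections import Counter, defaultdict

def estimate_cpt(rows, node, parents):
    """Hypothetical helper: relative-frequency estimate of P(node | parents)."""
    counts = defaultdict(Counter)
    for row in rows:
        # Group observations by their parent value assignment.
        key = frozenset((p, row[p]) for p in parents)
        counts[key][row[node]] += 1
    # Normalise each group of counts so that it sums to 1.
    return {
        key: {value: n / sum(counter.values()) for value, n in counter.items()}
        for key, counter in counts.items()
    }

rows = [
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Rainy", "Sprinkler": "On"},
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Rainy", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
]
cpt = estimate_cpt(rows, "Sprinkler", ["Weather"])
sunny = cpt[frozenset({("Weather", "Sunny")})]
rainy = cpt[frozenset({("Weather", "Rainy")})]
```

With these five rows, the Sunny group holds one On and two Off observations, and the Rainy group holds one of each.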
NodeValueCombinations Utility
The module also provides the NodeValueCombinations utility class for handling parent value combinations:
from causaliq_core.bn.dist import NodeValueCombinations

# Create combinations for multiple parents from {node: [values]}
nvc = NodeValueCombinations(
    node_values={
        'Weather': ['Sunny', 'Rainy'],
        'Season': ['Summer', 'Winter'],
    },
    sort=True,
)

# Iterate to get all combinations; each one is a dict {node: value}
combinations = list(nvc)
# Result: four dicts, one per pairing of a Weather value with a Season value,
# e.g. {'Season': 'Summer', 'Weather': 'Sunny'}
API Reference
cpt
Classes:
- CPT: Base class for conditional probability tables.
- NodeValueCombinations: Iterable over all combinations of node values.
CPT
CPT(
    pmfs: Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]],
    estimated: int = 0,
)
Base class for conditional probability tables.
Parameters:
- pmfs (Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]]): A pmf of {value: prob} for parentless nodes OR a list of tuples ({parent: value}, {value: prob}).
- estimated (int, default: 0): How many PMFs were estimated.
Attributes:
- cpt: Internal representation of the CPT. {node_values: prob} for a parentless node, otherwise {parental_values as frozenset: {node_values: prob}}.
- estimated: Number of PMFs that were estimated.
- values: Values which the node can take.
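The frozenset-keyed representation described for the `cpt` attribute can be illustrated with a plain dictionary. This is a sketch under an assumption: the reference does not say what the frozenset contains, so (parent, value) pairs are assumed here, and `lookup` is a hypothetical helper rather than a library function.

```python
# Assumed shape only: frozenset of (parent, value) pairs -> pmf over node values.
cpt = {
    frozenset({("Weather", "Sunny")}): {"On": 0.1, "Off": 0.9},
    frozenset({("Weather", "Rainy")}): {"On": 0.7, "Off": 0.3},
}

def lookup(cpt, parental_values):
    """Hypothetical helper: fetch the pmf for one parent value assignment."""
    return cpt[frozenset(parental_values.items())]

pmf = lookup(cpt, {"Weather": "Rainy"})
```

A frozenset is hashable and unordered, so the same key is produced regardless of the order in which the parents are listed.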
Raises:
- TypeError: If arguments are of the wrong type.
- ValueError: If arguments have invalid or conflicting values.
Methods:
- __eq__: Return whether two CPTs are the same, allowing for probability rounding errors.
- __str__: Human-friendly description of the contents of the CPT.
- cdist: Return conditional probabilities of node values for specified parental values.
- fit: Constructs a CPT (Conditional Probability Table) from data.
- node_values: Return node values (states) of node CPT relates to.
- param_ratios: Returns distribution of parameter ratios across all parental values for each combination of possible node values.
- parents: Return parents of node CPT relates to.
- random_value: Generate a random value for a node given the value of its parents.
- to_spec: Returns external specification format of CPT, renaming nodes according to a name map.
- validate_cnds: Checks that all CNDs in graph are consistent with one another and with graph structure.
- validate_parents: Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.
__eq__
Return whether two CPTs are the same, allowing for probability rounding errors.
Parameters:
- other (CPT): CPT to compare with self.
Returns:
- bool: Whether the CPTs are practically the same.
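Why a tolerance matters for this comparison can be shown with a small sketch. The helper `pmfs_practically_equal` and its `abs_tol` value are hypothetical; the tolerance the library actually applies is not stated in this reference.

```python
from math import isclose

def pmfs_practically_equal(a, b, abs_tol=1e-10):
    """Hypothetical helper: same values, probabilities equal within abs_tol."""
    return a.keys() == b.keys() and all(
        isclose(a[k], b[k], rel_tol=0.0, abs_tol=abs_tol) for k in a
    )

exact = {"On": 0.3, "Off": 0.7}
rounded = {"On": 0.1 + 0.2, "Off": 0.7}  # 0.1 + 0.2 is not exactly 0.3 in floating point
different = {"On": 0.31, "Off": 0.69}
```

Here `exact == rounded` is false under strict dictionary equality, while the tolerant comparison treats the two pmfs as the same and still rejects `different`.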
__str__
Human-friendly description of the contents of the CPT.
Returns:
- str: String representation of the CPT contents.
cdist
cdist(parental_values: Optional[Dict[str, str]] = None) -> Dict[str, float]
Return conditional probabilities of node values for specified parental values.
Parameters:
- parental_values (Optional[Dict[str, str]], default: None): Parental values for which the pmf is required.
Raises:
- TypeError: If args are of the wrong type.
- ValueError: If args have invalid or conflicting values.
fit (classmethod)
fit(
    node: str,
    parents: Optional[Tuple[str, ...]],
    data: Union[BNFit, Any],
    autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]
Constructs a CPT (Conditional Probability Table) from data.
Parameters:
- node (str): Node that CPT applies to.
- parents (Optional[Tuple[str, ...]]): Parents of node.
- data (Union[BNFit, Any]): Data to fit CPT to.
- autocomplete (bool, default: True): Whether to ensure CPT data contains entries for
Returns:
- Tuple[Tuple[type, Dict[str, Any]], Optional[int]]: Tuple of (cnd_spec, estimated_pmfs), where cnd_spec is (CPT class, cpt_spec for CPT()) and estimated_pmfs is an int giving the number of estimated pmfs.
node_values
Return node values (states) of node CPT relates to.
Returns:
- List[str]: Node values in alphabetical order.
param_ratios
Returns distribution of parameter ratios across all parental values for each combination of possible node values.
Returns:
- dict: {(node value pair): (param ratios across parents)}.
parents
Return parents of node CPT relates to.
Returns:
- List[str]: Parent node names in alphabetical order.
random_value
random_value(pvs: Optional[Dict[str, str]]) -> str
Generate a random value for a node given the value of its parents.
Parameters:
- pvs (Optional[Dict[str, str]]): Parental values, {parent1: value1, ...}.
Returns:
- str: Random value for node.
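Drawing a value according to a pmf can be sketched with the standard library. This is an illustration of the general technique, not the library's sampling code: `sample_value` is a hypothetical helper, and the pmf is assumed to have already been selected for the relevant parental values.

```python
import random

def sample_value(pmf, rng):
    """Hypothetical helper: draw one value with probability given by pmf."""
    values = list(pmf)
    weights = [pmf[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded so the sketch is repeatable
pmf = {"On": 0.7, "Off": 0.3}  # e.g. P(Sprinkler | Weather=Rainy)
draws = [sample_value(pmf, rng) for _ in range(10_000)]
share_on = draws.count("On") / len(draws)
```

Over many draws, `share_on` should settle close to 0.7, the probability assigned to On.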
to_spec
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]
Returns external specification format of CPT, renaming nodes according to a name map.
Parameters:
- name_map (Dict[str, str]): Map of node names {old: new}.
Returns:
- Dict[str, Any]: CPT specification with renamed nodes.
Raises:
- TypeError: If bad arg type.
- ValueError: If bad arg value, e.g. coeff keys not in map.
validate_cnds (classmethod)
Checks that all CNDs in graph are consistent with one another and with graph structure.
Parameters:
- nodes (list): BN nodes.
- cnds (dict): Set of CNDs for the BN, {node: cnd}.
- parents (dict): Parents of non-orphan nodes, {node: parents}.
Raises:
- TypeError: If invalid types used in arguments.
- ValueError: If any inconsistent values found.
validate_parents
validate_parents(
    node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None
Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.
Parameters:
- node (str): Name of node.
- parents (Dict[str, List[str]]): Parents of all nodes, {node: parents}.
- node_values (Dict[str, List[str]]): Values of each categorical node, {node: values}.
Raises:
- ValueError: If parent mismatch or missing parental
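One kind of consistency check in this spirit is confirming that a table has an entry for every parent value combination and no extras. The sketch below is a hypothetical illustration of that idea only; `check_parental_combinations` is not a library function, and the frozenset keys of (parent, value) pairs are an assumed layout.

```python
from itertools import product

def check_parental_combinations(cpt_keys, parents, node_values):
    """Hypothetical check: each parent value combination has an entry, with no extras."""
    expected = {
        frozenset(zip(parents, combo))
        for combo in product(*(node_values[p] for p in parents))
    }
    missing = expected - set(cpt_keys)
    unexpected = set(cpt_keys) - expected
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")

node_values = {"Weather": ["Rainy", "Sunny"]}
keys = [frozenset({("Weather", "Sunny")}), frozenset({("Weather", "Rainy")})]
check_parental_combinations(keys, ["Weather"], node_values)  # complete, so no error
```

Dropping either key from `keys` makes the check raise ValueError, because one parent value would then have no distribution.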
NodeValueCombinations
Iterable over all combinations of node values.
Parameters:
- node_values (dict): Allowed values for each node, {node: [values]}.
- sort (bool): Whether to sort node names and values into alphabetical order.
Methods:
__iter__
__iter__() -> NodeValueCombinations
Returns the initialised iterator.
Returns:
- NodeValueCombinations: The iterator.
__next__
Generate the next node value combination.
Returns:
- dict: Next node value combination, {node: value}.
Raises:
- StopIteration: When all combinations have been returned.
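The iterator protocol described above can be sketched with a small stand-in class built on `itertools.product`. `ValueCombinations` is hypothetical and only mirrors the documented behaviour (iterate, yield {node: value} dicts, raise StopIteration at the end). The order in which it yields combinations is a property of this sketch and may differ from the library's.

```python
from itertools import product

class ValueCombinations:
    """Hypothetical stand-in illustrating the iterator protocol described above."""

    def __init__(self, node_values, sort=True):
        names = sorted(node_values) if sort else list(node_values)
        self._names = names
        self._values = [
            sorted(node_values[n]) if sort else list(node_values[n]) for n in names
        ]
        self._product = None

    def __iter__(self):
        # (Re)initialise the underlying product and return the iterator itself.
        self._product = product(*self._values)
        return self

    def __next__(self):
        if self._product is None:
            self._product = product(*self._values)
        # next() raises StopIteration once all combinations have been returned.
        return dict(zip(self._names, next(self._product)))

combos = list(
    ValueCombinations({"Weather": ["Sunny", "Rainy"], "Season": ["Summer", "Winter"]})
)
```

Two parents with two values each give four combinations, each a dict keyed by node name.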