CPT (Conditional Probability Table)

The CPT class represents conditional probability tables for discrete/categorical variables in Bayesian Networks. It stores and manages probability distributions that map parent value combinations to child variable probabilities.

Overview

A Conditional Probability Table (CPT) defines the local probability distribution for a discrete node given its parent values. It consists of:

  • Values: The possible states/categories for this variable
  • Table: Probability values organized by parent combinations
  • Parents: Names of parent variables (if any)
  • Normalization: Ensures probabilities sum to 1 for each parent combination

Key Features

  • Flexible Parent Support: Handle any number of discrete parent variables
  • Efficient Storage: Compact representation of probability tables
  • Automatic Validation: Ensures probability constraints are satisfied
  • Missing Data Handling: Robust handling of incomplete data during learning
  • Fast Lookup: Optimized probability queries

Table Organization

For a node with parents, the probability table is organized as:

  • Rows: Each parent value combination
  • Columns: Each possible value of the child variable
  • Entries: P(child=value | parents=combination)
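To make the row and column layout concrete, here is a minimal plain-Python sketch that does not use the CPT class. The `table` dict and the `is_normalised` helper are hypothetical names chosen for illustration only; each row is keyed by a parent value combination and should sum to 1.

```python
import math

# Rows are parent value combinations, columns are child values.
# Each entry is P(child=value | parents=combination).
table = {
    ("Sunny",): {"On": 0.1, "Off": 0.9},
    ("Rainy",): {"On": 0.7, "Off": 0.3},
}


def is_normalised(rows, tol=1e-9):
    """Return True if every row's probabilities sum to 1 within tol."""
    return all(
        math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
        for pmf in rows.values()
    )


print(is_normalised(table))
print(table[("Rainy",)]["On"])  # P(On | Weather=Rainy)
```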

Example Usage

from causaliq_core.bn.dist import CPT

# Simple unconditional distribution: a single PMF of {value: prob}
weather = CPT({'Sunny': 0.8, 'Rainy': 0.2})

# Conditional distribution with one parent:
# a list of ({parent: value}, {value: prob}) tuples, one per parent value
sprinkler = CPT([
    ({'Weather': 'Sunny'}, {'On': 0.1, 'Off': 0.9}),
    ({'Weather': 'Rainy'}, {'On': 0.7, 'Off': 0.3}),
])

# Conditional distribution with multiple parents:
# one tuple per combination of parent values
grass = CPT([
    ({'Weather': 'Sunny', 'Sprinkler': 'On'}, {'Wet': 0.9, 'Dry': 0.1}),
    ({'Weather': 'Sunny', 'Sprinkler': 'Off'}, {'Wet': 0.2, 'Dry': 0.8}),
    ({'Weather': 'Rainy', 'Sprinkler': 'On'}, {'Wet': 0.95, 'Dry': 0.05}),
    ({'Weather': 'Rainy', 'Sprinkler': 'Off'}, {'Wet': 0.8, 'Dry': 0.2}),
])

# Access probabilities: cdist() returns the PMF {value: prob}
# for the specified parental values
pmf = sprinkler.cdist({'Weather': 'Rainy'})
print(f"P(Sprinkler=On | Weather=Rainy) = {pmf['On']}")

Data Learning

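CPTs can be fitted to data with the fit classmethod, which returns a specification for constructing the CPT together with the number of estimated PMFs:

import pandas as pd
from causaliq_core.bn.dist import CPT

# Training data
data = pd.DataFrame({
    'Weather': ['Sunny', 'Rainy', 'Sunny', 'Rainy', 'Sunny'],
    'Sprinkler': ['Off', 'On', 'On', 'Off', 'Off']
})

# Fit CPT parameters to the data.
# Note: the signature types `data` as Union[BNFit, Any]; whether a raw
# DataFrame is accepted directly or must first be wrapped is not stated here.
(cpt_class, cpt_spec), estimated_pmfs = CPT.fit(
    node='Sprinkler',
    parents=('Weather',),
    data=data,
)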

NodeValueCombinations Utility

The module also provides the NodeValueCombinations utility class, an iterable over all combinations of node values, which is useful for enumerating parent value combinations:

from causaliq_core.bn.dist import NodeValueCombinations

# Allowed values for each parent, {node: [values]}
nvc = NodeValueCombinations({
    'Weather': ['Sunny', 'Rainy'],
    'Season': ['Summer', 'Winter'],
})

# Iterate to get every combination as a {node: value} dict
combinations = list(nvc)
# Four combinations in total, e.g. {'Season': 'Summer', 'Weather': 'Rainy'}

API Reference

cpt

Classes:

  • CPT

    Base class for conditional probability tables.

  • NodeValueCombinations

    Iterable over all combinations of node values

CPT

CPT(
    pmfs: Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]],
    estimated: int = 0,
)

Base class for conditional probability tables.

Parameters:

  • pmfs
    (Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]]) –

    A pmf of {value: prob} for parentless nodes OR list of tuples ({parent: value}, {value: prob}).

  • estimated
    (int, default: 0 ) –

    How many PMFs were estimated.

Attributes:

  • cpt

    Internal representation of the CPT. {node_values: prob} for parentless node, otherwise {parental_values as frozenset: {node_values: prob}}.

  • estimated

    Number of PMFs that were estimated.

  • values

    Values which node can take.
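
The frozenset-keyed layout described for the cpt attribute can be illustrated with a small standalone sketch. It assumes that each key is a frozenset of (parent, value) pairs, which the description above does not spell out; `to_lookup` is a hypothetical helper and not part of the library.

```python
def to_lookup(pmfs):
    """Convert [({parent: value}, {value: prob}), ...] into a dict keyed
    by frozenset of (parent, value) pairs (an assumed key format)."""
    return {frozenset(pvs.items()): dict(pmf) for pvs, pmf in pmfs}


pmfs = [
    ({"Weather": "Sunny", "Sprinkler": "On"}, {"Wet": 0.9, "Dry": 0.1}),
    ({"Weather": "Rainy", "Sprinkler": "On"}, {"Wet": 0.95, "Dry": 0.05}),
]
lookup = to_lookup(pmfs)

# Frozensets are unordered, so the order in which parents are listed
# in the query does not affect the lookup.
key = frozenset({"Sprinkler": "On", "Weather": "Rainy"}.items())
print(lookup[key]["Wet"])
```

One attraction of this kind of key is that a query does not need to list parents in any canonical order.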

Raises:

  • TypeError

    If arguments are of wrong type.

  • ValueError

    If arguments have invalid or conflicting values.

Methods:

  • __eq__

    Return whether two CPTs are the same, allowing for probability rounding errors.

  • __str__

    Human-friendly description of the contents of the CPT.

  • cdist

    Return conditional probabilities of node values for specified parental values.

  • fit

    Constructs a CPT (Conditional Probability Table) from data.

  • node_values

    Return node values (states) of node CPT relates to.

  • param_ratios

    Returns distribution of parameter ratios across all parental values for each combination of possible node values.

  • parents

    Return parents of node CPT relates to.

  • random_value

    Generate a random value for a node given the value of its parents.

  • to_spec

    Returns external specification format of CPT, renaming nodes according to a name map.

  • validate_cnds

    Checks that all CNDs in graph are consistent with one another and with graph structure.

  • validate_parents

    Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.

__eq__
__eq__(other: object) -> bool

Return whether two CPTs are the same, allowing for probability rounding errors.

Parameters:

  • other
    (CPT) –

    CPT to compare with self.

Returns:

  • bool

    Whether the CPTs are practically the same.
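
A comparison that tolerates rounding errors might look like the sketch below. It is an illustration only: `pmfs_close` is a hypothetical helper and the tolerances are assumed values, not the ones the library uses.

```python
import math


def pmfs_close(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if two PMFs have the same values and their
    probabilities agree within the given (assumed) tolerances."""
    if a.keys() != b.keys():
        return False
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol)
        for k in a
    )


# 0.1 + 0.2 is not exactly 0.3 in floating point, but is close enough.
print(pmfs_close({"On": 0.1 + 0.2, "Off": 0.7}, {"On": 0.3, "Off": 0.7}))
```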

__str__
__str__() -> str

Human-friendly description of the contents of the CPT.

Returns:

  • str

    String representation of the CPT contents.

cdist
cdist(parental_values: Optional[Dict[str, str]] = None) -> Dict[str, float]

Return conditional probabilities of node values for specified parental values.

Parameters:

  • parental_values
    (Optional[Dict[str, str]], default: None ) –

    Parental values for which the PMF is required.

Raises:

  • TypeError

    If args are of wrong type.

  • ValueError

    If args have invalid or conflicting values.

fit classmethod
fit(
    node: str,
    parents: Optional[Tuple[str, ...]],
    data: Union[BNFit, Any],
    autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]

Constructs a CPT (Conditional Probability Table) from data.

Parameters:

  • node
    (str) –

    Node that CPT applies to.

  • parents
    (Optional[Tuple[str, ...]]) –

    Parents of node.

  • data
    (Union[BNFit, Any]) –

    Data to fit CPT to.

  • autocomplete
    (bool, default: True ) –

    Whether to ensure the CPT data contains entries for every combination of parental values.

Returns:

  • Tuple[Tuple[type, Dict[str, Any]], Optional[int]]

    Tuple of (cnd_spec, estimated_pmfs), where cnd_spec is (CPT class, cpt_spec for CPT()) and estimated_pmfs is an int giving the number of estimated PMFs.
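
The general idea of fitting a CPT by counting can be sketched in plain Python as follows. This is not the library's implementation: `fit_counts` is a hypothetical function, it takes a list of row dicts rather than a BNFit object, and filling unseen parent combinations with a uniform PMF is an assumption made here purely to illustrate what "estimated" PMFs could mean.

```python
from collections import Counter
from itertools import product


def fit_counts(node, parents, rows, values, autocomplete=True):
    """Estimate P(node | parents) by relative frequency.

    rows is a list of {variable: value} dicts and values maps each
    variable to its allowed values. Returns (pmfs, n_estimated), where
    pmfs is a list of ({parent: value}, {value: prob}) tuples.
    """
    counts = {}
    for row in rows:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, Counter())[row[node]] += 1

    if autocomplete:
        combos = product(*(values[p] for p in parents))
    else:
        combos = counts.keys()

    pmfs, n_estimated = [], 0
    for key in combos:
        seen = counts.get(key)
        if seen:
            total = sum(seen.values())
            pmf = {v: seen[v] / total for v in values[node]}
        else:
            # Assumption: unseen parent combinations get a uniform PMF.
            pmf = {v: 1 / len(values[node]) for v in values[node]}
            n_estimated += 1
        pmfs.append((dict(zip(parents, key)), pmf))
    return pmfs, n_estimated


rows = [
    {"Weather": w, "Sprinkler": s}
    for w, s in zip(
        ["Sunny", "Rainy", "Sunny", "Rainy", "Sunny"],
        ["Off", "On", "On", "Off", "Off"],
    )
]
# "Cloudy" is a hypothetical extra value that never occurs in the rows.
values = {"Weather": ["Sunny", "Rainy", "Cloudy"], "Sprinkler": ["On", "Off"]}
pmfs, n_estimated = fit_counts("Sprinkler", ("Weather",), rows, values)
for parent_values, pmf in pmfs:
    print(parent_values, pmf)
print(n_estimated)
```

The list of tuples produced here has the same shape as the pmfs argument accepted by the CPT constructor.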

node_values
node_values() -> List[str]

Return node values (states) of node CPT relates to.

Returns:

  • List[str]

    Node values in alphabetical order.

param_ratios
param_ratios() -> None

Returns distribution of parameter ratios across all parental values for each combination of possible node values.

Returns:

  • dict

    {(node value pair): (param ratios across parents)}.

parents
parents() -> List[str]

Return parents of node CPT relates to.

Returns:

  • List[str]

    Parent node names in alphabetical order.

random_value
random_value(pvs: Optional[Dict[str, str]]) -> str

Generate a random value for a node given the value of its parents.

Parameters:

  • pvs
    (Optional[Dict[str, str]]) –

    Parental values, {parent1: value1, ...}.

Returns:

  • str

    Random value for node.
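
Drawing a random value from a PMF can be sketched with the standard library alone. `sample_value` is a hypothetical helper, and the way the library seeds or manages its random number generator is not covered here.

```python
import random


def sample_value(pmf, rng=random):
    """Draw one value from a {value: prob} PMF, weighted by probability."""
    values = list(pmf)
    weights = [pmf[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


# With many draws, the observed frequency should approach P(On) = 0.7.
rng = random.Random(0)
draws = [sample_value({"On": 0.7, "Off": 0.3}, rng) for _ in range(1000)]
print(draws.count("On") / len(draws))
```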

to_spec
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]

Returns external specification format of CPT, renaming nodes according to a name map.

Parameters:

  • name_map
    (Dict[str, str]) –

    Map of node names {old: new}.

Returns:

  • Dict[str, Any]

    CPT specification with renamed nodes.

Raises:

  • TypeError

    If bad arg type.

  • ValueError

    If bad arg value, e.g. coeff keys not in map.
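
Renaming nodes according to a name map can be illustrated on the list-of-tuples pmfs format. The actual layout of the specification returned by to_spec is not shown in this reference, so the sketch below is an assumption-laden illustration; `rename_pmfs` is a hypothetical helper.

```python
def rename_pmfs(pmfs, name_map):
    """Rename parent nodes in [({parent: value}, {value: prob}), ...].

    Raises ValueError if a parent name has no entry in name_map.
    """
    renamed = []
    for parent_values, pmf in pmfs:
        missing = set(parent_values) - set(name_map)
        if missing:
            raise ValueError(f"names not in map: {sorted(missing)}")
        new_parents = {name_map[p]: v for p, v in parent_values.items()}
        renamed.append((new_parents, dict(pmf)))
    return renamed


pmfs = [
    ({"Weather": "Sunny"}, {"On": 0.1, "Off": 0.9}),
    ({"Weather": "Rainy"}, {"On": 0.7, "Off": 0.3}),
]
print(rename_pmfs(pmfs, {"Weather": "W"}))
```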

validate_cnds classmethod
validate_cnds(
    nodes: List[str], cnds: Dict[str, CND], parents: Dict[str, List[str]]
) -> None

Checks that all CNDs in graph are consistent with one another and with graph structure.

Parameters:

  • nodes
    (list) –

    BN nodes.

  • cnds
    (dict) –

    Set of CNDs for the BN, {node: cnd}.

  • parents
    (dict) –

    Parents of non-orphan nodes, {node: parents}.

Raises:

  • TypeError

    If invalid types used in arguments.

  • ValueError

    If any inconsistent values found.

validate_parents
validate_parents(
    node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None

Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.

Parameters:

  • node
    (str) –

    Name of node.

  • parents
    (Dict[str, List[str]]) –

    Parents of all nodes {node: parents}.

  • node_values
    (Dict[str, List[str]]) –

    Values of each categorical node, {node: values}.

Raises:

  • ValueError

    If parent mismatch or missing parental values.
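
The kind of consistency check described above can be sketched as follows: every combination of the parents' allowed values should appear exactly once among the CPT's entries. This is an illustration under assumptions, not the library's logic; `check_parent_combos` is a hypothetical helper.

```python
from itertools import product


def check_parent_combos(pmfs, parent_values):
    """Raise ValueError unless pmfs covers exactly the combinations of
    the allowed parent values given in parent_values."""
    parents = sorted(parent_values)
    expected = {
        frozenset(zip(parents, combo))
        for combo in product(*(parent_values[p] for p in parents))
    }
    actual = set()
    for pvs, _ in pmfs:
        if set(pvs) != set(parents):
            raise ValueError(f"parent mismatch: {sorted(pvs)} vs {parents}")
        actual.add(frozenset(pvs.items()))
    if expected - actual:
        raise ValueError("missing parental value combinations")
    if actual - expected:
        raise ValueError("unknown parental values")


pmfs = [
    ({"Weather": "Sunny"}, {"On": 0.1, "Off": 0.9}),
    ({"Weather": "Rainy"}, {"On": 0.7, "Off": 0.3}),
]
check_parent_combos(pmfs, {"Weather": ["Sunny", "Rainy"]})  # passes silently
```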

NodeValueCombinations

NodeValueCombinations(node_values: Dict[str, List[str]], sort: bool = True)

Iterable over all combinations of node values.

Parameters:

  • node_values
    (Dict[str, List[str]]) –

    Allowed values for each node, {node: [values]}.

  • sort
    (bool, default: True ) –

    Whether to sort node names and values into alphabetical order.

Methods:

  • __iter__

    Returns the initialised iterator

  • __next__

    Generate the next node value combination

__iter__
__iter__() -> NodeValueCombinations

Returns the initialised iterator.

Returns:

  • NodeValueCombinations

    The iterator.

__next__
__next__() -> Dict[str, str]

Generate the next node value combination.

Returns:

  • Dict[str, str]

    Next node value combination, {node: value}.

Raises:

  • StopIteration

    When all combinations have been returned.
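
The behaviour of such an iterable can be approximated with itertools.product. The generator below is a standalone sketch: `node_value_combinations` is a hypothetical function, and the iteration order it produces (sorted node names, then sorted values) is an assumption about what sort=True implies.

```python
from itertools import product


def node_value_combinations(node_values, sort=True):
    """Yield every combination of node values as a {node: value} dict."""
    nodes = sorted(node_values) if sort else list(node_values)
    value_lists = [
        sorted(node_values[n]) if sort else list(node_values[n])
        for n in nodes
    ]
    for combo in product(*value_lists):
        yield dict(zip(nodes, combo))


combos = list(node_value_combinations({
    "Weather": ["Sunny", "Rainy"],
    "Season": ["Summer", "Winter"],
}))
print(len(combos))
print(combos[0])
```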