Distribution Classes¶
The distribution module provides conditional node distribution (CND) classes for representing local probability distributions in Bayesian Networks. These classes define how each node's values depend on its parents in the network.
Overview¶
The distribution classes implement different types of conditional probability distributions:
- CPT: Discrete conditional probability tables for categorical variables
- LinGauss: Linear Gaussian distributions for continuous variables
- CND: Abstract base class defining the common interface
Distribution Types¶
Conditional Probability Tables (CPT)¶
Used for discrete/categorical variables. Stores probability tables that map parent value combinations to child value probabilities.
Key Features:
- Support for multiple discrete parent variables
- Efficient storage and lookup of probability values
- Automatic normalization and validation
- Missing data handling
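The lookup described above can be pictured with a plain dictionary keyed by the combination of parent values. This is a standalone sketch in plain Python, not the library's implementation; `lookup_pmf` and the example table are hypothetical names introduced only for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical table for a "Sprinkler" node with one parent, "Weather".
# Each key is a frozenset of (parent, value) pairs; each value is a pmf.
Key = FrozenSet[Tuple[str, str]]
sprinkler_table: Dict[Key, Dict[str, float]] = {
    frozenset({("Weather", "Sunny")}): {"On": 0.1, "Off": 0.9},
    frozenset({("Weather", "Rainy")}): {"On": 0.8, "Off": 0.2},
}


def lookup_pmf(
    table: Dict[Key, Dict[str, float]], parental_values: Dict[str, str]
) -> Dict[str, float]:
    """Return the pmf stored for one combination of parent values."""
    key = frozenset(parental_values.items())
    if key not in table:
        raise ValueError(f"No pmf for parental values {parental_values}")
    return table[key]


pmf = lookup_pmf(sprinkler_table, {"Weather": "Rainy"})
```

Using a frozenset as the key makes the lookup independent of the order in which parents are listed.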
Linear Gaussian (LinGauss)¶
Used for continuous variables that follow normal distributions with linear dependencies on parents.
Key Features:
- Linear regression relationships with parent variables
- Support for both continuous and discrete parent variables
- Gaussian noise modeling
- Parameter estimation from data
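For intuition, a linear Gaussian node is normally distributed around a mean that is a linear function of its parents' values. The sketch below is plain Python written for this page under that textbook definition; `linear_gaussian_params` and the "Altitude" parent are hypothetical and are not part of the library.

```python
from typing import Dict, Tuple


def linear_gaussian_params(
    mean: float,
    sd: float,
    coeffs: Dict[str, float],
    parental_values: Dict[str, float],
) -> Tuple[float, float]:
    """Return (mean, sd) of the child given its parents' values."""
    if set(coeffs) != set(parental_values):
        raise ValueError("parental values must match coefficient keys")
    # Conditional mean = intercept + sum of coeff * parent value
    cond_mean = mean + sum(c * parental_values[p] for p, c in coeffs.items())
    return cond_mean, sd


# Hypothetical node: Temperature = 20.0 + 0.5 * Altitude + noise with sd 2.0
cond = linear_gaussian_params(20.0, 2.0, {"Altitude": 0.5}, {"Altitude": 4.0})
```

The standard deviation does not depend on the parents in this model; only the mean shifts.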
Base Class (CND)¶
Abstract base class that defines the common interface for all conditional node distributions.
Key Features:
- Common methods for probability computation
- Standardized parameter access
- Consistent serialization interface
- Type checking and validation
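The idea of a shared abstract interface can be sketched with Python's standard `abc` module. The classes below are hypothetical stand-ins written for this page and cover only two of the methods; they are not the library's `CND`.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class SketchCND(ABC):
    """Hypothetical stand-in for an abstract conditional distribution."""

    @abstractmethod
    def cdist(self, parental_values: Optional[Dict[str, Any]] = None) -> Any:
        """Return the distribution for the given parental values."""

    @abstractmethod
    def parents(self) -> List[str]:
        """Return the names of the node's parents."""


class SketchRoot(SketchCND):
    """Concrete subclass for a parentless discrete node."""

    def __init__(self, pmf: Dict[str, float]) -> None:
        self.pmf = pmf

    def cdist(self, parental_values: Optional[Dict[str, Any]] = None) -> Any:
        return self.pmf

    def parents(self) -> List[str]:
        return []


root = SketchRoot({"Sunny": 0.7, "Rainy": 0.3})
try:
    SketchCND()  # abstract classes cannot be instantiated
    instantiated = True
except TypeError:
    instantiated = False
```

Because the abstract methods are declared on the base class, a subclass that forgets one of them fails at instantiation rather than at first use.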
Common Operations¶
All distribution classes support common operations:
- Probability Evaluation: Computing P(child | parents)
- Sampling: Generating samples from the distribution
- Parameter Access: Getting and setting distribution parameters
- Validation: Checking parameter consistency
Example Usage¶
from causaliq_core.bn.dist import CPT, LinGauss
# Create discrete distribution (CPT) for a node with no parents
weather_dist = CPT(
    {'Sunny': 0.7, 'Rainy': 0.3}  # P(Weather=Sunny)=0.7, P(Weather=Rainy)=0.3
)
# Create conditional discrete distribution: one pmf per parent value
# (rows assumed to follow the order Sunny, then Rainy)
sprinkler_dist = CPT([
    ({'Weather': 'Sunny'}, {'On': 0.1, 'Off': 0.9}),
    ({'Weather': 'Rainy'}, {'On': 0.8, 'Off': 0.2}),
])
# Create continuous distribution
temperature_dist = LinGauss({
    'mean': 20.0,  # Base temperature (intercept)
    'sd': 2.0,  # Standard deviation of the noise
    'coeffs': {},  # No parent dependencies in this example
})
Distribution Selection¶
Choose the appropriate distribution type based on your variable characteristics:
- Use CPT for categorical/discrete variables (e.g., weather conditions, disease status)
- Use LinGauss for continuous variables with linear relationships (e.g., temperature, measurements)
API Reference¶
dist
¶
Distribution classes for Bayesian Network nodes.
This module contains conditional node distribution (CND) implementations including the abstract base class and concrete implementations like Linear Gaussian distributions and Conditional Probability Tables.
Classes:
-
CND–Conditional Node Distribution for a node conditional on parental values.
-
CPT–Base class for conditional probability tables.
-
LinGauss–Conditional Linear Gaussian Distribution.
-
NodeValueCombinations–Iterable over all combinations of node values
CND
¶
Conditional Node Distribution for a node conditional on parental values.
Concrete subclasses support specific kinds of distributions, for example CPT (multinomial) and LinGauss (linear Gaussian).
Attributes:
-
has_parents(bool) –Whether CND is for a node with parents.
-
free_params(int) –Number of free params in CND.
Methods:
-
__eq__–Return whether two CNDs are the same allowing for probability rounding errors.
-
__str__–Human-friendly description of the contents of the CND.
-
cdist–Return conditional distribution for specified parental values.
-
fit–Constructs a CND (Conditional Node Distribution) from data.
-
parents–Return parents of node CND relates to.
-
random_value–Generate a random value for a node given the value of its parents.
-
to_spec–Returns external specification format of CND, renaming nodes according to a name map.
-
validate_cnds–Checks that all CNDs in graph are consistent with one another and with graph structure.
-
validate_parents–Checks every CND's parents and (categorical) parental values are consistent.
cdist
abstractmethod
¶
cdist(parental_values: Optional[Dict[str, Any]] = None) -> Any
Return conditional distribution for specified parental values.
Parameters:
-
(parental_values¶dict, default:None) –Parental values for which the distribution is required; needed only for nodes that have parents.
Raises:
-
TypeError–If args are of wrong type.
-
ValueError–If args have invalid or conflicting values.
fit
abstractmethod
classmethod
¶
fit(
node: str, parents: Optional[Tuple[str, ...]], data: Any, autocomplete: bool = True
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]
Constructs a CND (Conditional Node Distribution) from data.
Parameters:
-
(node¶str) –Node that CND applies to.
-
(parents¶tuple) –Parents of node.
-
(data¶Data) –Data to fit CND to.
-
(autocomplete¶bool, default:True) –Whether to complete CPT tables.
Returns:
-
tuple(Tuple[Tuple[type, Dict[str, Any]], Optional[int]]) –(cnd_spec, estimated_pmfs), where cnd_spec is (CPT class, cpt_spec for CPT()) and estimated_pmfs is an int for CPTs and None otherwise.
parents
abstractmethod
¶
Return parents of node CND relates to.
Returns:
-
list(List[str]) –Parent node names in alphabetical order.
random_value
abstractmethod
¶
random_value(pvs: Optional[Dict[str, Any]]) -> Union[str, float]
Generate a random value for a node given the value of its parents.
Parameters:
-
(pvs¶dict) –Parental values, {parent1: value1, ...}.
Returns:
-
Union[str, float]–Random value for node.
to_spec
abstractmethod
¶
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]
Returns external specification format of CND, renaming nodes according to a name map.
Parameters:
-
(name_map¶dict) –Map of node names {old: new}.
Returns:
-
dict(Dict[str, Any]) –CND specification with renamed nodes.
validate_cnds
classmethod
¶
Checks that all CNDs in graph are consistent with one another and with graph structure.
Parameters:
-
(nodes¶list) –BN nodes.
-
(cnds¶dict) –Set of CNDs for the BN, {node: cnd}.
-
(parents¶dict) –Parents of non-orphan nodes, {node: parents}.
Raises:
-
TypeError–If invalid types used in arguments.
-
ValueError–If any inconsistent values found.
validate_parents
abstractmethod
¶
validate_parents(
node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None
Checks every CND's parents and (categorical) parental values are consistent.
Validates consistency with the other relevant CNDs and the DAG structure.
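Parameters:
-
(node¶str) –Name of node.
-
(parents¶Dict[str, List[str]]) –Parents of all nodes {node: parents}.
-
(node_values¶Dict[str, List[str]]) –Values of each categorical node {node: values}.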
CPT
¶
CPT(
pmfs: Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]],
estimated: int = 0,
)
Base class for conditional probability tables.
Parameters:
-
(pmfs¶Union[Dict[str, float], List[Tuple[Dict[str, str], Dict[str, float]]]]) –A pmf of {value: prob} for parentless nodes OR list of tuples ({parent: value}, {value: prob}).
-
(estimated¶int, default:0) –How many PMFs were estimated.
Attributes:
-
cpt–Internal representation of the CPT. {node_values: prob} for parentless node, otherwise {parental_values as frozenset: {node_values: prob}}.
-
estimated–Number of PMFs that were estimated.
-
values–Values which node can take.
Raises:
-
TypeError–If arguments are of wrong type.
-
ValueError–If arguments have invalid or conflicting values.
Methods:
-
__eq__–Return whether two CPTs are the same allowing for probability rounding errors.
-
__str__–Human-friendly description of the contents of the CPT.
-
cdist–Return conditional probabilities of node values for specified parental values.
-
fit–Constructs a CPT (Conditional Probability Table) from data.
-
node_values–Return node values (states) of node CPT relates to.
-
param_ratios–Returns distribution of parameter ratios across all parental values for each combination of possible node values.
-
parents–Return parents of node CPT relates to.
-
random_value–Generate a random value for a node given the value of its parents.
-
to_spec–Returns external specification format of CPT, renaming nodes according to a name map.
-
validate_cnds–Checks that all CNDs in graph are consistent with one another and with graph structure.
-
validate_parents–Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.
__eq__
¶
Return whether two CPTs are the same allowing for probability rounding errors.
Parameters:
-
(other¶CPT) –CPT to compare to self.
Returns:
-
bool–Whether CPTs are PRACTICALLY the same.
__str__
¶
Human-friendly description of the contents of the CPT.
Returns:
-
str–String representation of the CPT contents.
cdist
¶
cdist(parental_values: Optional[Dict[str, str]] = None) -> Dict[str, float]
Return conditional probabilities of node values for specified parental values.
Parameters:
-
(parental_values¶Optional[Dict[str, str]], default:None) –Parental values for which the pmf is required.
Raises:
-
TypeError–If args are of wrong type.
-
ValueError–If args have invalid or conflicting values.
fit
classmethod
¶
fit(
node: str,
parents: Optional[Tuple[str, ...]],
data: Union[BNFit, Any],
autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]
Constructs a CPT (Conditional Probability Table) from data.
Parameters:
-
(node¶str) –Node that CPT applies to.
-
(parents¶Optional[Tuple[str, ...]]) –Parents of node.
-
(data¶Union[BNFit, Any]) –Data to fit CPT to.
-
(autocomplete¶bool, default:True) –Whether to ensure CPT data contains entries for
Returns:
-
Tuple[Tuple[type, Dict[str, Any]], Optional[int]]–Tuple of (cnd_spec, estimated_pmfs) where cnd_spec is (CPT class, cpt_spec for CPT()) and estimated_pmfs is an int, the number of estimated pmfs.
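One common way to obtain such a table from data is to take relative frequencies within each combination of parent values. The sketch below assumes that estimator and works on a plain list of dictionaries; it does not use the library's data classes, does not implement `autocomplete`, and `estimate_pmfs` is a hypothetical name.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, List, Tuple

Key = FrozenSet[Tuple[str, str]]


def estimate_pmfs(
    node: str, parents: Tuple[str, ...], rows: List[Dict[str, str]]
) -> Dict[Key, Dict[str, float]]:
    """Estimate P(node | parents) by relative frequency in rows."""
    counts: Dict[Key, Counter] = defaultdict(Counter)
    for row in rows:
        key = frozenset((p, row[p]) for p in parents)
        counts[key][row[node]] += 1
    # Normalise the counts for each parent combination into a pmf
    return {
        key: {value: n / sum(c.values()) for value, n in c.items()}
        for key, c in counts.items()
    }


rows = [
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Rainy", "Sprinkler": "Off"},
]
fitted = estimate_pmfs("Sprinkler", ("Weather",), rows)
```

Note that a node value never observed for some parent combination is simply absent from that pmf here, which is the kind of gap a completion step would have to fill.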
node_values
¶
Return node values (states) of node CPT relates to.
Returns:
-
List[str]–Node values in alphabetical order.
param_ratios
¶
Returns distribution of parameter ratios across all parental values for each combination of possible node values.
Returns:
-
dict–{(node value pair): (param ratios across parents)}.
parents
¶
Return parents of node CPT relates to.
Returns:
-
List[str]–Parent node names in alphabetical order.
random_value
¶
random_value(pvs: Optional[Dict[str, str]]) -> str
Generate a random value for a node given the value of its parents.
Parameters:
-
(pvs¶Optional[Dict[str, str]]) –Parental values, {parent1: value1, ...}.
Returns:
-
str–Random value for node.
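Drawing a random value from a pmf can be done with the standard library's weighted choice. This is an illustrative sketch only; `sample_value` is a hypothetical helper and the library may generate values differently.

```python
import random
from typing import Dict


def sample_value(pmf: Dict[str, float], rng: random.Random) -> str:
    """Draw one value with probability proportional to its pmf entry."""
    values = list(pmf)
    weights = [pmf[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so that runs are repeatable
draws = [sample_value({"On": 0.8, "Off": 0.2}, rng) for _ in range(1000)]
share_on = draws.count("On") / len(draws)
```

With 1000 draws the share of "On" should land close to 0.8, though the exact figure depends on the seed.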
to_spec
¶
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]
Returns external specification format of CPT, renaming nodes according to a name map.
Parameters:
-
(name_map¶Dict[str, str]) –Map of node names {old: new}.
Returns:
-
Dict[str, Any]–CPT specification with renamed nodes.
Raises:
-
TypeError–If bad arg type.
-
ValueError–If bad arg value, e.g. node names not in map.
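Renaming through a name map amounts to rewriting every node name that appears in the specification. The sketch below assumes a specification shaped like the `pmfs` constructor argument, a list of ({parent: value}, {value: prob}) tuples; the actual external format returned by `to_spec` is not shown on this page, and `rename_parents` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Entry = Tuple[Dict[str, str], Dict[str, float]]


def rename_parents(entries: List[Entry], name_map: Dict[str, str]) -> List[Entry]:
    """Return a copy of entries with parent names replaced via name_map."""
    renamed = []
    for parental_values, pmf in entries:
        missing = set(parental_values) - set(name_map)
        if missing:
            raise ValueError(f"names not in map: {sorted(missing)}")
        new_parents = {name_map[p]: v for p, v in parental_values.items()}
        renamed.append((new_parents, dict(pmf)))
    return renamed


entries = [({"Weather": "Sunny"}, {"On": 0.1, "Off": 0.9})]
spec = rename_parents(entries, {"Weather": "W", "Sprinkler": "S"})
```

The input list is left unchanged, so the same table can be exported under several naming schemes.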
validate_cnds
classmethod
¶
Checks that all CNDs in graph are consistent with one another and with graph structure.
Parameters:
-
(nodes¶list) –BN nodes.
-
(cnds¶dict) –Set of CNDs for the BN, {node: cnd}.
-
(parents¶dict) –Parents of non-orphan nodes, {node: parents}.
Raises:
-
TypeError–If invalid types used in arguments.
-
ValueError–If any inconsistent values found.
validate_parents
¶
validate_parents(
node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None
Checks every CPT's parents and parental values are consistent with other relevant CPTs and the DAG structure.
Parameters:
-
(node¶str) –Name of node.
-
(parents¶Dict[str, List[str]]) –Parents of all nodes {node: parents}.
-
(node_values¶Dict[str, List[str]]) –Values of each cat. node {node: values}.
Raises:
-
ValueError–If parent mismatch or missing parental
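One part of such a consistency check is confirming that a table has an entry for every combination of parent values. The following sketch shows that single check in plain Python, assuming keys are frozensets of (parent, value) pairs; `missing_parent_combinations` is a hypothetical helper, not the library's method.

```python
from itertools import product
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Key = FrozenSet[Tuple[str, str]]


def missing_parent_combinations(
    cpt_keys: Iterable[Key],
    parents: List[str],
    node_values: Dict[str, List[str]],
) -> Set[Key]:
    """Return parent-value combinations that the table does not cover."""
    expected = {
        frozenset(zip(parents, combo))
        for combo in product(*(node_values[p] for p in parents))
    }
    return expected - set(cpt_keys)


keys = [frozenset({("Weather", "Sunny")})]  # the "Rainy" row is absent
gaps = missing_parent_combinations(
    keys, ["Weather"], {"Weather": ["Sunny", "Rainy"]}
)
```

An empty result means the table is complete with respect to the declared parent values.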
LinGauss
¶
LinGauss(lg: Dict[str, Any])
Conditional Linear Gaussian Distribution.
Parameters:
-
(lg¶Dict[str, Any]) –Specification of Linear Gaussian in following form: {'coeffs': {node: coeff}, 'mean': mean, 'sd': sd}.
Attributes:
-
coeffs–Linear coefficient of parents {parent: coeff}.
-
mean–Mean of Gaussian noise (aka intercept, mu).
-
sd–S.D. of Gaussian noise (aka sigma).
Raises:
-
TypeError–If called with bad arg types.
-
ValueError–If called with bad arg values.
Methods:
-
__eq__–Return whether two CNDs are the same allowing for probability rounding errors.
-
__str__–Human-friendly formula description of the Linear Gaussian.
-
cdist–Return conditional distribution for specified parental values.
-
fit–Fit a Linear Gaussian to data.
-
parents–Return parents of node CND relates to.
-
random_value–Generate a random value for a node given the value of its parents.
-
to_spec–Returns external specification format of LinGauss, renaming nodes according to a name map.
-
validate_parents–Check LinGauss coeff keys consistent with parents in DAG.
__eq__
¶
Return whether two CNDs are the same allowing for probability rounding errors.
Parameters:
-
(other¶CND) –CND to compare to self.
Returns:
-
bool–Whether LinGauss objects are the same up to 10 sf.
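Comparing numbers "up to 10 sf" can be approximated by rounding both to ten significant figures before testing equality. The sketch below is one possible reading of that rule, not the library's code; `round_sf` and `same_to_sf` are hypothetical helpers.

```python
def round_sf(x: float, sf: int = 10) -> float:
    """Round x to sf significant figures via scientific notation."""
    return float(f"{x:.{sf - 1}e}")


def same_to_sf(a: float, b: float, sf: int = 10) -> bool:
    """Return whether a and b agree once rounded to sf significant figures."""
    return round_sf(a, sf) == round_sf(b, sf)


tiny_gap = same_to_sf(0.1 + 0.2, 0.3)    # differ only around the 17th digit
real_gap = same_to_sf(1.0, 1.000000001)  # differ in the 10th digit
```

A rounding-based test like this can still separate two very close values that fall on opposite sides of a rounding boundary, which is a known limitation of the approach.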
__str__
¶
Human-friendly formula description of the Linear Gaussian.
Returns:
-
str–String representation of the Linear Gaussian formula.
cdist
¶
cdist(parental_values: Optional[Dict[str, float]] = None) -> Tuple[float, float]
Return conditional distribution for specified parental values.
Parameters:
-
(parental_values¶Optional[Dict[str, float]], default:None) –Parental values for which dist. required
Returns:
-
Tuple[float, float]–Tuple of (mean, sd) of child Gaussian distribution.
Raises:
-
TypeError–If args are of wrong type.
-
ValueError–If args have invalid or conflicting values.
fit
classmethod
¶
fit(
node: str,
parents: Optional[Tuple[str, ...]],
data: Union[Pandas, BNFit],
autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]
Fit a Linear Gaussian to data.
Parameters:
-
(node¶str) –Node that Linear Gaussian applies to.
-
(parents¶Optional[Tuple[str, ...]]) –Parents of node.
-
(data¶Union[Pandas, BNFit]) –Data to fit Linear Gaussian to.
-
(autocomplete¶bool, default:True) –Not used for Linear Gaussian.
Returns:
-
Tuple[Tuple[type, Dict[str, Any]], Optional[int]]–Tuple of (cnd_spec, None) where cnd_spec is (LinGauss class, lg_spec).
Raises:
-
TypeError–With bad arg types.
-
ValueError–With bad arg values.
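Fitting a linear Gaussian is typically a least-squares regression of the node on its parents. The page does not state which estimator the library uses, so the sketch below simply assumes ordinary least squares with a single parent and a noise standard deviation that divides by n; `fit_one_parent` is a hypothetical helper operating on plain lists.

```python
from math import sqrt
from typing import List, Tuple


def fit_one_parent(xs: List[float], ys: List[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = mean + coeff * x; returns (mean, coeff, sd)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    coeff = sxy / sxx
    mean = y_bar - coeff * x_bar  # intercept
    residuals = [y - (mean + coeff * x) for x, y in zip(xs, ys)]
    sd = sqrt(sum(r * r for r in residuals) / n)  # divides by n, not n - 2
    return mean, coeff, sd


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # exactly y = 1 + 2x, so the residual sd is zero
mean, coeff, sd = fit_one_parent(xs, ys)
```

The three results map onto the 'mean', 'coeffs' and 'sd' entries of the LinGauss specification shown above.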
parents
¶
Return parents of node CND relates to.
Returns:
-
List[str]–Parent node names in alphabetical order.
random_value
¶
random_value(pvs: Optional[Dict[str, float]]) -> float
Generate a random value for a node given the value of its parents.
Parameters:
-
(pvs¶Optional[Dict[str, float]]) –Parental values, {parent1: value1, ...}.
Returns:
-
float–Random value for node.
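Sampling from a linear Gaussian node means computing the conditional mean from the parents' values and then drawing from a normal distribution with that mean. The following sketch uses only the standard library; `sample_linear_gaussian` and the "Altitude" parent are hypothetical.

```python
import random
from typing import Dict


def sample_linear_gaussian(
    mean: float,
    sd: float,
    coeffs: Dict[str, float],
    pvs: Dict[str, float],
    rng: random.Random,
) -> float:
    """Draw one value from N(mean + sum(coeff * parent value), sd^2)."""
    cond_mean = mean + sum(coeffs[p] * pvs[p] for p in coeffs)
    return rng.gauss(cond_mean, sd)


rng = random.Random(42)  # seeded so that runs are repeatable
samples = [
    sample_linear_gaussian(20.0, 2.0, {"Altitude": 0.5}, {"Altitude": 4.0}, rng)
    for _ in range(5000)
]
sample_mean = sum(samples) / len(samples)
```

The sample mean should sit near the conditional mean of 22.0, with the spread governed by `sd`.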
to_spec
¶
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]
Returns external specification format of LinGauss, renaming nodes according to a name map.
Parameters:
-
(name_map¶Dict[str, str]) –Map of node names {old: new}.
Returns:
-
Dict[str, Any]–LinGauss specification with renamed nodes.
Raises:
-
TypeError–If bad arg type.
-
ValueError–If bad arg value, e.g. coeff keys not in map.
validate_parents
¶
validate_parents(
node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None
Check LinGauss coeff keys consistent with parents in DAG.
Parameters:
-
(node¶str) –Name of node.
-
(parents¶dict) –Parents of all nodes defined in DAG.
-
(node_values¶dict) –Values of each cat. node [UNUSED].
NodeValueCombinations
¶
Iterable over all combinations of node values
Parameters:
-
(node_values¶dict) –Allowed values for each node {node: [values]}.
-
(sort¶bool) –Whether to sort node names and values into alphabetic order.
Methods:
__iter__
¶
__iter__() -> NodeValueCombinations
Returns the initialised iterator.
Returns:
-
NodeValueCombinations–The iterator.
__next__
¶
Generate the next node value combination.
Returns:
-
dict–Next node value combination {node: value}.
Raises:
-
StopIteration–When all combinations have been returned.
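The same kind of iteration can be sketched with `itertools.product` from the standard library. This generator is an illustration only; the order in which the real class yields combinations is not documented here, and `node_value_combinations` is a hypothetical function.

```python
from itertools import product
from typing import Dict, Iterator, List


def node_value_combinations(
    node_values: Dict[str, List[str]], sort: bool = True
) -> Iterator[Dict[str, str]]:
    """Yield every {node: value} combination of the allowed values."""
    nodes = sorted(node_values) if sort else list(node_values)
    value_lists = [
        sorted(node_values[n]) if sort else node_values[n] for n in nodes
    ]
    for combo in product(*value_lists):
        yield dict(zip(nodes, combo))


combos = list(
    node_value_combinations({"W": ["Sunny", "Rainy"], "S": ["On", "Off"]})
)
```

Two nodes with two values each give four combinations; with `sort=True` the first one yielded is `{"S": "Off", "W": "Rainy"}`.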