BN Class¶
The BN class is the main class for representing Bayesian Networks in CausalIQ Core. It combines a directed acyclic graph (DAG) structure with conditional probability distributions to create a complete probabilistic model.
Overview¶
A Bayesian Network consists of:
- A DAG structure defining the conditional independence relationships
- Conditional Node Distributions (CNDs) for each node specifying local probability distributions
- Parameters that can be learned from data or specified manually
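The way these pieces fit together can be written in one line. The DAG fixes the parent set of each node, and the CNDs supply the local factors whose product is the joint distribution. The notation below is the standard Bayesian network factorization in generic symbols, not notation taken from CausalIQ Core itself:

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

For a three-node network with edges Weather → Sprinkler, Weather → Grass and Sprinkler → Grass, this reads P(Weather, Sprinkler, Grass) = P(Weather) · P(Sprinkler | Weather) · P(Grass | Weather, Sprinkler).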
Key Features¶
- Probabilistic Inference: Compute marginal and conditional probabilities
- Parameter Learning: Fit distributions from data using the fit() method
- Multiple Distribution Types: Support for both discrete (CPT) and continuous (LinGauss) distributions
- Caching: Efficient computation with cached marginals
- Serialization: Save and load networks in multiple formats
Class Methods¶
The BN class provides several key methods:
- Construction: Initialize from DAG and conditional distribution specifications
- Inference: Compute marginals and conditional probabilities
- Learning: Fit parameters from data
- Validation: Check network consistency and parameter validity
Example Usage¶
from causaliq_core.bn import BN, CPT
from causaliq_core.graph import DAG
# Create DAG structure
dag = DAG(['Weather', 'Sprinkler', 'Grass'],
          [('Weather', 'Sprinkler'), ('Weather', 'Grass'), ('Sprinkler', 'Grass')])
# Define conditional distributions
cnd_specs = {
    'Weather': CPT(values=['Sunny', 'Rainy'], table=[0.7, 0.3]),
    'Sprinkler': CPT(values=['On', 'Off'],
                     table=[0.1, 0.9, 0.8, 0.2],
                     parents=['Weather']),
    'Grass': CPT(values=['Wet', 'Dry'],
                 table=[0.95, 0.05, 0.8, 0.2, 0.9, 0.1, 0.05, 0.95],
                 parents=['Weather', 'Sprinkler'])
}
# Create Bayesian Network
bn = BN(dag, cnd_specs)
# Compute marginal probabilities
marginals = bn.marginals(['Weather', 'Grass'])
print(marginals)
# Compute conditional probability
conditional = bn.conditional(['Grass'], ['Weather'], 'Rainy')
print(conditional)
API Reference¶
BN
¶
BN(dag: DAG, cnd_specs: Dict[str, Any], estimated_pmfs: Dict[str, Any] = {})
Base class for Bayesian Networks.
Bayesian Networks have a DAG and an associated probability distribution defined by the conditional node distributions (CNDs) of its nodes, such as CPTs.
Parameters:
-
(dag¶DAG) –DAG for the Bayesian Network.
-
(cnd_specs¶Dict[str, Any]) –Specification of each conditional node distribution.
-
(estimated_pmfs¶Dict[str, Any], default:{}) –Number of PMFs that had to be estimated for each node.
Attributes:
-
dag–BN's DAG.
-
cnds–Conditional distributions for each node {node: CND}.
-
free_params–Total number of free parameters in BN.
-
estimated_pmfs–Number of estimated pmfs for each node.
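To make the free_params attribute concrete, the sketch below counts free parameters for a purely discrete network using the usual rule: a node with r values and q parent value combinations contributes (r - 1) · q parameters. The helper count_free_params and its inputs are hypothetical names for illustration. Whether BN computes free_params in exactly this way, and how LinGauss nodes are counted, is not stated here.

```python
def count_free_params(cardinalities, parents):
    """Count free parameters of a discrete network.

    cardinalities: {node: number of values the node can take}
    parents: {node: list of parent node names}
    """
    total = 0
    for node, r in cardinalities.items():
        # q = number of distinct parent value combinations (1 if no parents)
        q = 1
        for parent in parents[node]:
            q *= cardinalities[parent]
        # Each parent combination needs r - 1 numbers; the last one is
        # implied because probabilities sum to one.
        total += (r - 1) * q
    return total


# Structure of the three-node binary network from the usage example
cardinalities = {"Weather": 2, "Sprinkler": 2, "Grass": 2}
parents = {
    "Weather": [],
    "Sprinkler": ["Weather"],
    "Grass": ["Weather", "Sprinkler"],
}
total = count_free_params(cardinalities, parents)
print(total)  # 1 + 2 + 4 = 7
```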
Raises:
-
TypeError–If arguments have invalid types.
-
ValueError–If arguments have invalid values.
Methods:
-
__eq__–Compare another BN with this one.
-
fit–Alternative instantiation of BN using data to implicitly define the conditional probability data.
-
generate_cases–Generate specified number of random data cases for this BN.
-
global_distribution–Generate the global probability distribution for the BN.
-
lnprob_case–Return log of probability of set of node values (case) occurring.
-
marginal_distribution–Generate a marginal probability distribution for a specified node and its parents.
-
marginals–Return marginal distribution for specified nodes.
-
rename–Rename nodes in place according to name map.
__eq__
¶
__eq__(other: object) -> bool
Compare another BN with this one.
Parameters:
-
(other¶object) –The other BN to compare with this one.
Returns:
-
bool–True, if other BN is same as this one.
fit
classmethod
¶
Alternative instantiation of BN using data to implicitly define the conditional probability data.
Parameters:
Returns:
-
BN–A new BN instance fitted to the data.
Raises:
-
TypeError–If arguments have invalid types.
-
ValueError–If arguments have invalid values.
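The idea of letting data define the conditional probabilities can be sketched without the library. The code below estimates a conditional probability table by relative frequencies (maximum likelihood), which is one common way to fit discrete parameters. It is an illustration only: estimate_cpt and the sample rows are hypothetical, and the estimator that fit actually uses is not documented here.

```python
from collections import Counter, defaultdict


def estimate_cpt(rows, node, parents):
    """Estimate P(node | parents) from rows by relative frequency.

    Returns {parent_values_tuple: {node_value: probability}}.
    """
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[p] for p in parents)
        counts[key][row[node]] += 1
    cpt = {}
    for key, counter in counts.items():
        n = sum(counter.values())
        cpt[key] = {value: c / n for value, c in counter.items()}
    return cpt


# Made-up data cases, one dict per case
rows = [
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Rainy", "Sprinkler": "On"},
    {"Weather": "Rainy", "Sprinkler": "Off"},
]

weather_cpt = estimate_cpt(rows, "Weather", [])
sprinkler_cpt = estimate_cpt(rows, "Sprinkler", ["Weather"])
print(weather_cpt)
print(sprinkler_cpt)
```

Parent value combinations that never occur in the data get no entry in this sketch, so a real implementation has to decide how to fill them in.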
generate_cases
¶
Generate specified number of random data cases for this BN.
Parameters:
-
(n¶int) –Number of cases to generate.
-
(outfile¶Optional[str], default:None) –Name of file to write instance to.
-
(pseudo¶bool, default:True) –Whether to produce pseudo-random (i.e. repeatable) cases; otherwise truly random cases are produced.
Returns:
-
DataFrame–Random data cases.
Raises:
-
TypeError–If arguments not of correct type.
-
ValueError–If invalid number of rows requested.
-
FileNotFoundError–If outfile in nonexistent folder.
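Random cases for a Bayesian network are typically drawn by ancestral (forward) sampling: visit nodes in an order where parents come first, and draw each node from its distribution given the parent values already drawn. The sketch below shows this for a two-node network, with a fixed seed standing in for the repeatable behaviour that the pseudo flag describes. The function names and the reading of the Sprinkler table as one row per Weather value are assumptions, and this is not necessarily how generate_cases is implemented.

```python
import random

# P(Weather) and P(Sprinkler | Weather); illustrative numbers
P_WEATHER = {"Sunny": 0.7, "Rainy": 0.3}
P_SPRINKLER = {
    "Sunny": {"On": 0.1, "Off": 0.9},
    "Rainy": {"On": 0.8, "Off": 0.2},
}


def draw(rng, pmf):
    """Draw one value from a {value: probability} mapping."""
    values = list(pmf)
    weights = [pmf[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_cases(n, seed=None):
    """Return n cases; a fixed seed makes the sequence repeatable."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        weather = draw(rng, P_WEATHER)                # parent first
        sprinkler = draw(rng, P_SPRINKLER[weather])   # then child
        cases.append({"Weather": weather, "Sprinkler": sprinkler})
    return cases


first = sample_cases(5, seed=42)
second = sample_cases(5, seed=42)
print(first == second)  # same seed, same cases
```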
global_distribution
¶
Generate the global probability distribution for the BN.
Returns:
-
DataFrame–Global distribution in descending probability (and then by ascending values).
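As a rough picture of what a global distribution contains, the sketch below enumerates every value combination of a two-node network, multiplies the local probabilities, and sorts by descending probability and then by ascending values. It returns a plain list of tuples rather than a DataFrame, and the names and numbers are illustrative assumptions rather than the library's implementation.

```python
from itertools import product

# P(Weather) and P(Sprinkler | Weather); illustrative numbers
P_WEATHER = {"Sunny": 0.7, "Rainy": 0.3}
P_SPRINKLER = {
    "Sunny": {"On": 0.1, "Off": 0.9},
    "Rainy": {"On": 0.8, "Off": 0.2},
}


def enumerate_joint():
    """Return [(weather, sprinkler, probability), ...] for all cases."""
    rows = []
    for weather, sprinkler in product(P_WEATHER, ("On", "Off")):
        prob = P_WEATHER[weather] * P_SPRINKLER[weather][sprinkler]
        rows.append((weather, sprinkler, prob))
    # Descending probability, ties broken by ascending node values
    rows.sort(key=lambda r: (-r[2], r[0], r[1]))
    return rows


dist = enumerate_joint()
for row in dist:
    print(row)
```

The number of rows grows exponentially with the number of nodes, so full enumeration of this kind is only practical for small networks.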
lnprob_case
¶
lnprob_case(case_values: Dict[str, Any], base: Union[int, str] = 10) -> Optional[float]
Return log of probability of set of node values (case) occurring.
Parameters:
-
(case_values¶Dict[str, Any]) –Value for each node {node: value}.
-
(base¶Union[int, str], default:10) –Logarithm base to use - 2, 10 or 'e'.
Returns:
-
Optional[float]–Log of probability of case occurring, or None if case has zero probability.
Raises:
-
TypeError–If arguments wrong type.
-
ValueError–If arguments have invalid values.
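The behaviour described above can be sketched as follows: the log probability of a case is the sum of the logs of its local factors, and a zero factor yields None because the log of zero is undefined. The function log_prob_case and the tables are hypothetical (the Rainy row is chosen to contain a zero); only the base options 2, 10 and 'e' and the None return value come from the documentation above.

```python
import math

# Illustrative tables; the Rainy row includes a zero on purpose
P_WEATHER = {"Sunny": 0.7, "Rainy": 0.3}
P_SPRINKLER = {
    "Sunny": {"On": 0.1, "Off": 0.9},
    "Rainy": {"On": 0.0, "Off": 1.0},
}
LOGS = {2: math.log2, 10: math.log10, "e": math.log}


def log_prob_case(case, base=10):
    """Return log P(case) in the given base, or None if P(case) is zero."""
    if base not in LOGS:
        raise ValueError("base must be 2, 10 or 'e'")
    factors = [
        P_WEATHER[case["Weather"]],
        P_SPRINKLER[case["Weather"]][case["Sprinkler"]],
    ]
    if any(f == 0 for f in factors):
        return None
    log = LOGS[base]
    # log of a product is the sum of the logs
    return sum(log(f) for f in factors)


possible = log_prob_case({"Weather": "Sunny", "Sprinkler": "Off"})
impossible = log_prob_case({"Weather": "Rainy", "Sprinkler": "On"})
print(possible, impossible)
```

Summing logs rather than multiplying probabilities avoids numerical underflow when a case involves many nodes.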
marginal_distribution
¶
Generate a marginal probability distribution for a specified node and its parents, in the same format as that returned by the pandas crosstab function.
Parameters:
-
(node¶str) –Node for which distribution required.
-
(parents¶Optional[List[str]], default:None) –Parents of node.
Returns:
-
DataFrame–Marginal distribution with parental value combos as columns, and node values as rows.
marginals
¶
marginals(nodes: List[str]) -> DataFrame
Return marginal distribution for specified nodes.
Parameters:
-
(nodes¶List[str]) –Nodes for which marginal distribution required.
Returns:
-
DataFrame–Marginal distribution in the same format as that returned by the pandas crosstab function.
Raises:
-
TypeError–If arguments have bad type.
-
ValueError–If arguments contain bad values.
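Conceptually, a marginal distribution over a subset of nodes is obtained by summing the joint distribution over all nodes that are left out. The sketch below does this over a small hand-written joint table. It returns a plain dictionary rather than a crosstab-style DataFrame, and the names JOINT, NODES and marginal are illustrative assumptions, not part of the BN API.

```python
from collections import defaultdict

NODES = ("Weather", "Sprinkler")
# Joint distribution over (Weather, Sprinkler); illustrative numbers
JOINT = {
    ("Sunny", "On"): 0.07,
    ("Sunny", "Off"): 0.63,
    ("Rainy", "On"): 0.24,
    ("Rainy", "Off"): 0.06,
}


def marginal(nodes):
    """Sum the joint over every node not listed in nodes."""
    # tuple.index raises ValueError for an unknown node name
    keep = [NODES.index(n) for n in nodes]
    result = defaultdict(float)
    for values, prob in JOINT.items():
        result[tuple(values[i] for i in keep)] += prob
    return dict(result)


sprinkler = marginal(["Sprinkler"])
weather = marginal(["Weather"])
print(sprinkler)
print(weather)
```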