BN Class¶

The BN class is the main class for representing Bayesian Networks in CausalIQ Core. It combines a directed acyclic graph (DAG) structure with conditional probability distributions to create a complete probabilistic model.

Overview¶

A Bayesian Network consists of:

A DAG structure defining the conditional independence relationships
Conditional Node Distributions (CNDs) for each node specifying local probability distributions
Parameters that can be learned from data or specified manually
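As a minimal sketch of how the DAG and the local distributions combine, the joint probability of a full assignment is the product of each node's probability given its parents' values. The snippet below uses plain Python dictionaries rather than the CausalIQ classes; the names `parents`, `cpts` and `joint_probability`, and all the numbers, are illustrative assumptions.

```python
from itertools import product

# Hypothetical network: structure, node values and local distributions.
parents = {"Weather": [], "Sprinkler": ["Weather"], "Grass": ["Weather", "Sprinkler"]}
values = {"Weather": ["Sunny", "Rainy"], "Sprinkler": ["On", "Off"], "Grass": ["Wet", "Dry"]}
# Each local distribution is keyed by the tuple of parent values.
cpts = {
    "Weather": {(): {"Sunny": 0.7, "Rainy": 0.3}},
    "Sprinkler": {
        ("Sunny",): {"On": 0.8, "Off": 0.2},
        ("Rainy",): {"On": 0.1, "Off": 0.9},
    },
    "Grass": {
        ("Sunny", "On"): {"Wet": 0.9, "Dry": 0.1},
        ("Sunny", "Off"): {"Wet": 0.05, "Dry": 0.95},
        ("Rainy", "On"): {"Wet": 0.95, "Dry": 0.05},
        ("Rainy", "Off"): {"Wet": 0.8, "Dry": 0.2},
    },
}

def joint_probability(case):
    """Multiply each node's probability given the values of its parents."""
    prob = 1.0
    for node, pars in parents.items():
        key = tuple(case[p] for p in pars)
        prob *= cpts[node][key][case[node]]
    return prob

# Summing over every combination of values should give 1.
nodes = list(parents)
total = sum(
    joint_probability(dict(zip(nodes, combo)))
    for combo in product(*(values[n] for n in nodes))
)
p_example = joint_probability({"Weather": "Rainy", "Sprinkler": "Off", "Grass": "Wet"})
```

Here `p_example` is 0.3 × 0.9 × 0.8, the product of one entry from each local table.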

Key Features¶

Probabilistic Inference: Compute marginal and conditional probabilities
Parameter Learning: Fit distributions from data using the fit() method
Multiple Distribution Types: Support for both discrete (CPT) and continuous (LinGauss) distributions
Caching: Efficient computation with cached marginals
Serialization: Save and load networks in multiple formats

Class Methods¶

The BN class provides several key methods:

Construction: Initialize from DAG and conditional distribution specifications
Inference: Compute marginals and conditional probabilities
Learning: Fit parameters from data
Validation: Check network consistency and parameter validity

Example Usage¶

from causaliq_core.bn import BN, CPT
from causaliq_core.graph import DAG

# Create DAG structure
dag = DAG(['Weather', 'Sprinkler', 'Grass'], 
          [('Weather', 'Sprinkler'), ('Weather', 'Grass'), ('Sprinkler', 'Grass')])

# Define conditional distributions  
cnd_specs = {
    'Weather': CPT(values=['Sunny', 'Rainy'], table=[0.7, 0.3]),
    'Sprinkler': CPT(values=['On', 'Off'], 
                     table=[0.1, 0.9, 0.8, 0.2], 
                     parents=['Weather']),
    'Grass': CPT(values=['Wet', 'Dry'],
                 table=[0.95, 0.05, 0.8, 0.2, 0.9, 0.1, 0.05, 0.95],
                 parents=['Weather', 'Sprinkler'])
}

# Create Bayesian Network
bn = BN(dag, cnd_specs)

# Compute marginal probabilities
marginals = bn.marginals(['Weather', 'Grass'])
print(marginals)

# Distribution of Grass for each combination of its parents' values
distribution = bn.marginal_distribution('Grass', ['Weather', 'Sprinkler'])
print(distribution)

API Reference¶

BN ¶

BN(dag: DAG, cnd_specs: Dict[str, Any], estimated_pmfs: Dict[str, Any] = {})

Base class for Bayesian Networks.

Bayesian Networks have a DAG and an associated probability distribution defined by CPTs.

Parameters:

dag ¶
(DAG) –

DAG for the Bayesian Network.
cnd_specs ¶
(Dict[str, Any]) –

Specification of each conditional node distribution.
estimated_pmfs ¶
(Dict[str, Any], default: {} ) –

Number of PMFs that had to be estimated for each node.

Attributes:

dag –

BN's DAG.
cnds –

Conditional distributions for each node {node: CND}.
free_params –

Total number of free parameters in BN.
estimated_pmfs –

Number of estimated pmfs for each node.
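For discrete nodes, a common convention is that each node contributes (number of values − 1) free parameters per combination of parent values. The sketch below applies that convention to a hypothetical three-node network; whether `free_params` follows exactly this convention, and how continuous nodes are counted, is not stated here, so treat it as an assumption.

```python
from math import prod

# Hypothetical discrete network (not built with the CausalIQ classes).
values = {"Weather": ["Sunny", "Rainy"], "Sprinkler": ["On", "Off"], "Grass": ["Wet", "Dry"]}
parents = {"Weather": [], "Sprinkler": ["Weather"], "Grass": ["Weather", "Sprinkler"]}

def count_free_params(values, parents):
    """Sum (cardinality - 1) x (number of parent value combinations) over nodes."""
    total = 0
    for node, pars in parents.items():
        parent_combos = prod(len(values[p]) for p in pars)  # empty product is 1
        total += (len(values[node]) - 1) * parent_combos
    return total

n_params = count_free_params(values, parents)  # 1 + 2 + 4
```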

Raises:

TypeError –

If arguments have invalid types.
ValueError –

If arguments have invalid values.

Methods:

__eq__ –

Compare another BN with this one.
fit –

Alternative instantiation of BN using data to implicitly define the conditional probability data.
generate_cases –

Generate specified number of random data cases for this BN.
global_distribution –

Generate the global probability distribution for the BN.
lnprob_case –

Return log of probability of set of node values (case) occurring.
marginal_distribution –

Generate a marginal probability distribution for a specified node and its parents.
marginals –

Return marginal distribution for specified nodes.
rename –

Rename nodes in place according to name map.

__eq__ ¶

__eq__(other: object) -> bool

Compare another BN with this one.

Parameters:

other ¶
(object) –

The other BN to compare with this one.

Returns:

bool –

True, if other BN is same as this one.

fit `classmethod` ¶

fit(dag: DAG, data: BNFit) -> BN

Alternative instantiation of BN using data to implicitly define the conditional probability data.

Parameters:

dag ¶
(DAG) –

DAG for the Bayesian Network.
data ¶
(BNFit) –

Data to fit CPTs to.

Returns:

BN –

A new BN instance fitted to the data.

Raises:

TypeError –

If arguments have invalid types.
ValueError –

If arguments have invalid values.
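One simple way to derive a conditional probability table from data is to use relative frequencies within each combination of parent values. The sketch below shows that idea on a list of dictionaries; it is an assumption about the general technique, not a description of how `fit` or `BNFit` work internally, and `fit_cpt` is a hypothetical helper.

```python
from collections import Counter, defaultdict

def fit_cpt(rows, node, parents):
    """Estimate P(node | parents) as relative frequencies in the data."""
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[p] for p in parents)
        counts[key][row[node]] += 1
    return {
        key: {value: n / sum(counter.values()) for value, n in counter.items()}
        for key, counter in counts.items()
    }

# Tiny illustrative data set.
rows = [
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Rainy", "Sprinkler": "Off"},
]
sprinkler_cpt = fit_cpt(rows, "Sprinkler", ["Weather"])
```

Parent value combinations that never occur in the data get no entry in this sketch. The `estimated_pmfs` attribute suggests the library tracks such cases, but how it fills them in is not described here.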

generate_cases ¶

generate_cases(
    n: int, outfile: Optional[str] = None, pseudo: bool = True
) -> DataFrame

Generate specified number of random data cases for this BN.

Parameters:

n ¶
(int) –

Number of cases to generate.
outfile ¶
(Optional[str], default: None ) –

Name of file to write instance to.
pseudo ¶
(bool, default: True ) –

If True, pseudo-random (i.e. repeatable) cases are produced; otherwise truly random cases are produced.

Returns:

DataFrame –

Random data cases.

Raises:

TypeError –

If arguments are not of the correct type.
ValueError –

If invalid number of rows requested.
FileNotFoundError –

If outfile in nonexistent folder.
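Random cases for a Bayesian Network are typically drawn by forward sampling: visit nodes so that parents come before children, and draw each value from the node's distribution given the values already drawn. The sketch below shows this with the standard library; the `seed` argument is a hypothetical stand-in for repeatable generation, and how `pseudo` is actually implemented is an assumption not confirmed here.

```python
import random

# Hypothetical two-node network; nodes are listed parents-first.
order = ["Weather", "Sprinkler"]
parents = {"Weather": [], "Sprinkler": ["Weather"]}
cpts = {
    "Weather": {(): {"Sunny": 0.7, "Rainy": 0.3}},
    "Sprinkler": {
        ("Sunny",): {"On": 0.8, "Off": 0.2},
        ("Rainy",): {"On": 0.1, "Off": 0.9},
    },
}

def sample_cases(n, seed=None):
    """Forward-sample n cases; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)  # seed=None draws the seed from system entropy
    cases = []
    for _ in range(n):
        case = {}
        for node in order:
            pmf = cpts[node][tuple(case[p] for p in parents[node])]
            case[node] = rng.choices(list(pmf), weights=list(pmf.values()))[0]
        cases.append(case)
    return cases

first = sample_cases(5, seed=42)
second = sample_cases(5, seed=42)  # same seed, same cases
```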

global_distribution ¶

global_distribution() -> DataFrame

Generate the global probability distribution for the BN.

Returns:

DataFrame –

Global distribution in descending probability (and then by ascending values).
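To illustrate the ordering described above, the sketch below enumerates every combination of values for a hypothetical two-node network, computes each combination's probability, and sorts by descending probability and then by ascending values. It uses plain tuples rather than a DataFrame, and the numbers are illustrative assumptions.

```python
from itertools import product

# Hypothetical network: Weather -> Sprinkler.
values = {"Weather": ["Sunny", "Rainy"], "Sprinkler": ["On", "Off"]}
p_weather = {"Sunny": 0.7, "Rainy": 0.3}
p_sprinkler = {
    "Sunny": {"On": 0.8, "Off": 0.2},
    "Rainy": {"On": 0.1, "Off": 0.9},
}

# One row per combination: (weather, sprinkler, probability).
rows = [
    (w, s, p_weather[w] * p_sprinkler[w][s])
    for w, s in product(values["Weather"], values["Sprinkler"])
]

# Descending probability first, then ascending values as a tie-breaker.
rows.sort(key=lambda row: (-row[2], row[0], row[1]))
```

Enumerating every combination grows exponentially with the number of nodes, so this approach is only practical for small networks.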

lnprob_case ¶

lnprob_case(case_values: Dict[str, Any], base: Union[int, str] = 10) -> Optional[float]

Return log of probability of set of node values (case) occurring.

Parameters:

case_values ¶
(Dict[str, Any]) –

Value for each node {node: value}.
base ¶
(Union[int, str], default: 10 ) –

Logarithm base to use - 2, 10 or 'e'.

Returns:

Optional[float] –

Log of probability of case occurring, or None if case has zero probability.

Raises:

TypeError –

If arguments have the wrong type.
ValueError –

If arguments have invalid values.
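The handling of the `base` argument and of zero-probability cases can be sketched with the standard `math` module. The helper below, `log_probability`, is hypothetical and takes an already-computed probability; it only illustrates the documented behaviour (bases 2, 10 or 'e', and None for zero probability), not the library's implementation.

```python
import math

# Map each supported base to the matching standard-library function.
LOG_FUNCTIONS = {2: math.log2, 10: math.log10, "e": math.log}

def log_probability(prob, base=10):
    """Return the log of prob in the given base, or None if prob is zero."""
    if base not in LOG_FUNCTIONS:
        raise ValueError("base must be 2, 10 or 'e'")
    if prob == 0:
        return None  # the log of zero is undefined
    return LOG_FUNCTIONS[base](prob)

log10_value = log_probability(0.001)      # base 10 by default
log2_value = log_probability(0.25, base=2)
impossible = log_probability(0.0)
```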

marginal_distribution ¶

marginal_distribution(node: str, parents: Optional[List[str]] = None) -> DataFrame

Generate a marginal probability distribution for a specified node and its parents, in the same format as returned by the pandas crosstab function.

Parameters:

node ¶
(str) –

Node for which the distribution is required.
parents ¶
(Optional[List[str]], default: None ) –

Parents of node.

Returns:

DataFrame –

Marginal distribution with parental value combos as columns, and node values as rows.

marginals ¶

marginals(nodes: List[str]) -> DataFrame

Return marginal distribution for specified nodes.

Parameters:

nodes ¶
(List[str]) –

Nodes for which the marginal distribution is required.

Returns:

DataFrame –

Marginal distribution in the same format as returned by the pandas crosstab function.

Raises:

TypeError –

If arguments have bad type.
ValueError –

If arguments contain bad values.
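Conceptually, a marginal distribution over a subset of nodes is obtained by summing the joint distribution over all the other nodes. The sketch below does this by brute force on a small hypothetical joint table. It returns a dictionary rather than a crosstab-style DataFrame, and it says nothing about the algorithm or caching the library actually uses.

```python
from collections import defaultdict

# Hypothetical joint distribution over (Weather, Sprinkler).
node_order = ("Weather", "Sprinkler")
joint = {
    ("Sunny", "On"): 0.56,
    ("Sunny", "Off"): 0.14,
    ("Rainy", "On"): 0.03,
    ("Rainy", "Off"): 0.27,
}

def marginal(keep):
    """Sum the joint over every node that is not listed in keep."""
    positions = [node_order.index(node) for node in keep]
    result = defaultdict(float)
    for combo, prob in joint.items():
        result[tuple(combo[i] for i in positions)] += prob
    return dict(result)

sprinkler = marginal(["Sprinkler"])
weather = marginal(["Weather"])
```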

rename ¶

rename(name_map: Dict[str, str]) -> None

Rename nodes in place according to name map.

Parameters:

name_map ¶
(Dict[str, str]) –

Name mapping {name: new name}.

Raises:

TypeError –

If the argument has an invalid type.
ValueError –

If the argument has invalid values, e.g. unknown node names.
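Renaming has to be applied consistently: both the node names themselves and every reference to them as a parent must change. The sketch below shows this on a plain parent mapping. `rename_nodes` is a hypothetical helper that returns a new mapping, whereas the documented method renames in place.

```python
def rename_nodes(parents, name_map):
    """Return a copy of the parent mapping with nodes renamed via name_map."""
    unknown = set(name_map) - set(parents)
    if unknown:
        raise ValueError(f"unknown node names: {sorted(unknown)}")

    def new_name(node):
        # Nodes missing from name_map keep their current name.
        return name_map.get(node, node)

    return {
        new_name(node): [new_name(p) for p in pars]
        for node, pars in parents.items()
    }

# Hypothetical structure: Weather -> Sprinkler.
structure = {"Weather": [], "Sprinkler": ["Weather"]}
renamed = rename_nodes(structure, {"Weather": "Sky"})
```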
