BN Class

The BN class is the main class for representing Bayesian Networks in CausalIQ Core. It combines a directed acyclic graph (DAG) structure with conditional probability distributions to create a complete probabilistic model.

Overview

A Bayesian Network consists of:

  • A DAG structure defining the conditional independence relationships
  • Conditional Node Distributions (CNDs) for each node specifying local probability distributions
  • Parameters that can be learned from data or specified manually
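
Together, the DAG and the CNDs determine the joint distribution through the standard Bayesian network factorization. In the formula below, Pa(X_i) denotes the parents of node X_i in the DAG; this notation is introduced here for illustration and does not appear in the library itself. The second line applies it to the three-node network used in the usage example.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

P(\text{Weather}, \text{Sprinkler}, \text{Grass})
  = P(\text{Weather})\, P(\text{Sprinkler} \mid \text{Weather})\,
    P(\text{Grass} \mid \text{Weather}, \text{Sprinkler})
```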

Key Features

  • Probabilistic Inference: Compute marginal and conditional probabilities
  • Parameter Learning: Fit distributions from data using the fit() method
  • Multiple Distribution Types: Support for both discrete (CPT) and continuous (LinGauss) distributions
  • Caching: Efficient computation with cached marginals
  • Serialization: Save and load networks in multiple formats

Class Methods

The BN class provides several key methods:

  • Construction: Initialize from DAG and conditional distribution specifications
  • Inference: Compute marginals and conditional probabilities
  • Learning: Fit parameters from data
  • Validation: Check network consistency and parameter validity

Example Usage

from causaliq_core.bn import BN, CPT
from causaliq_core.graph import DAG

# Create DAG structure
dag = DAG(['Weather', 'Sprinkler', 'Grass'], 
          [('Weather', 'Sprinkler'), ('Weather', 'Grass'), ('Sprinkler', 'Grass')])

# Define conditional distributions  
cnd_specs = {
    'Weather': CPT(values=['Sunny', 'Rainy'], table=[0.7, 0.3]),
    'Sprinkler': CPT(values=['On', 'Off'], 
                     table=[0.1, 0.9, 0.8, 0.2], 
                     parents=['Weather']),
    'Grass': CPT(values=['Wet', 'Dry'],
                 table=[0.95, 0.05, 0.8, 0.2, 0.9, 0.1, 0.05, 0.95],
                 parents=['Weather', 'Sprinkler'])
}

# Create Bayesian Network
bn = BN(dag, cnd_specs)

# Compute marginal probabilities
marginals = bn.marginals(['Weather', 'Grass'])
print(marginals)

# Compute conditional probability
conditional = bn.conditional(['Grass'], ['Weather'], 'Rainy')
print(conditional)
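
To make the two queries concrete, the following plain-Python sketch computes a marginal and a conditional by hand, without the library. It assumes that each consecutive pair of numbers in a table is the distribution for one parent configuration, listed in the order (Sunny, On), (Sunny, Off), (Rainy, On), (Rainy, Off). That layout is a guess rather than something this page states, and all names in the sketch are illustrative.

```python
from itertools import product

# Assumed reading of the CPT tables from the example above.
p_weather = {"Sunny": 0.7, "Rainy": 0.3}
p_sprinkler = {  # P(Sprinkler | Weather)
    "Sunny": {"On": 0.1, "Off": 0.9},
    "Rainy": {"On": 0.8, "Off": 0.2},
}
p_grass = {  # P(Grass | Weather, Sprinkler)
    ("Sunny", "On"): {"Wet": 0.95, "Dry": 0.05},
    ("Sunny", "Off"): {"Wet": 0.8, "Dry": 0.2},
    ("Rainy", "On"): {"Wet": 0.9, "Dry": 0.1},
    ("Rainy", "Off"): {"Wet": 0.05, "Dry": 0.95},
}


def joint(weather, sprinkler, grass):
    """Joint probability from the chain-rule factorization over the DAG."""
    return (p_weather[weather]
            * p_sprinkler[weather][sprinkler]
            * p_grass[(weather, sprinkler)][grass])


# Marginal P(Grass): sum the joint over Weather and Sprinkler.
p_grass_marginal = {
    g: sum(joint(w, s, g) for w, s in product(p_weather, ["On", "Off"]))
    for g in ["Wet", "Dry"]
}

# Conditional P(Grass | Weather = Rainy): restrict to Rainy, then renormalize.
p_rainy = sum(joint("Rainy", s, g) for s in ["On", "Off"] for g in ["Wet", "Dry"])
p_grass_given_rainy = {
    g: sum(joint("Rainy", s, g) for s in ["On", "Off"]) / p_rainy
    for g in ["Wet", "Dry"]
}

print(p_grass_marginal)
print(p_grass_given_rainy)
```

Under the assumed table layout, P(Grass = Wet) comes to 0.7895 and P(Grass = Wet | Weather = Rainy) to 0.73.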

API Reference

BN

BN(dag: DAG, cnd_specs: Dict[str, Any], estimated_pmfs: Dict[str, Any] = {})

Base class for Bayesian Networks.

Bayesian Networks have a DAG and an associated probability distribution defined by CPTs.

Parameters:

  • dag

    (DAG) –

    DAG for the Bayesian Network.

  • cnd_specs

    (Dict[str, Any]) –

    Specification of each conditional node distribution.

  • estimated_pmfs

    (Dict[str, Any], default: {} ) –

    Number of PMFs that had to be estimated for each node.

Attributes:

  • dag

    BN's DAG.

  • cnds

    Conditional distributions for each node {node: CND}.

  • free_params

    Total number of free parameters in BN.

  • estimated_pmfs

    Number of estimated pmfs for each node.
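
For discrete networks, the number of free parameters is conventionally counted as (number of node values - 1) multiplied by the number of parent value combinations, summed over all nodes. The sketch below applies that convention to the Weather/Sprinkler/Grass example. It assumes free_params follows this convention; how the library counts parameters for LinGauss nodes is not covered here.

```python
from math import prod

# Number of values per node and parent lists for the example network.
cardinality = {"Weather": 2, "Sprinkler": 2, "Grass": 2}
parents = {
    "Weather": [],
    "Sprinkler": ["Weather"],
    "Grass": ["Weather", "Sprinkler"],
}


def node_free_params(node):
    """(k - 1) free probabilities for each combination of parent values."""
    parent_configs = prod(cardinality[p] for p in parents[node])  # 1 if no parents
    return (cardinality[node] - 1) * parent_configs


total_free_params = sum(node_free_params(node) for node in cardinality)
print(total_free_params)  # 1 + 2 + 4 = 7
```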

Raises:

  • TypeError

    If arguments have invalid types.

  • ValueError

    If arguments have invalid values.

Methods:

  • __eq__

    Compare another BN with this one.

  • fit

    Alternative instantiation of BN using data to implicitly define the conditional probability data.

  • generate_cases

    Generate specified number of random data cases for this BN.

  • global_distribution

    Generate the global probability distribution for the BN.

  • lnprob_case

    Return log of probability of set of node values (case) occurring.

  • marginal_distribution

    Generate a marginal probability distribution for a specified node and its parents.

  • marginals

    Return marginal distribution for specified nodes.

  • rename

    Rename nodes in place according to name map.

__eq__

__eq__(other: object) -> bool

Compare another BN with this one.

Parameters:

  • other
    (object) –

    The other BN to compare with this one.

Returns:

  • bool

    True if the other BN is the same as this one.

fit classmethod

fit(dag: DAG, data: BNFit) -> BN

Alternative instantiation of BN using data to implicitly define the conditional probability data.

Parameters:

  • dag
    (DAG) –

    DAG for the Bayesian Network.

  • data
    (BNFit) –

    Data to fit CPTs to.

Returns:

  • BN

    A new BN instance fitted to the data.

Raises:

  • TypeError

    If arguments have invalid types.

  • ValueError

    If arguments have invalid values.
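
For discrete nodes, fitting a CPT to data typically comes down to counting how often each node value occurs with each combination of parent values, then normalizing within each combination. The sketch below shows that idea in plain Python. It assumes simple maximum-likelihood estimation; whether BN.fit applies smoothing, or how it treats parent combinations absent from the data (compare estimated_pmfs), is not specified on this page. The helper name and data are hypothetical.

```python
from collections import Counter

# Toy data: one dict per observed case.
rows = [
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "Off"},
    {"Weather": "Sunny", "Sprinkler": "On"},
    {"Weather": "Rainy", "Sprinkler": "Off"},
]


def estimate_cpt(rows, node, parents):
    """Maximum-likelihood estimate of P(node | parents) by counting."""
    joint_counts = Counter()
    parent_counts = Counter()
    for row in rows:
        config = tuple(row[p] for p in parents)
        joint_counts[(config, row[node])] += 1
        parent_counts[config] += 1
    # Normalize each count by the total for its parent configuration.
    return {
        (config, value): count / parent_counts[config]
        for (config, value), count in joint_counts.items()
    }


cpt = estimate_cpt(rows, "Sprinkler", ["Weather"])
for key, prob in sorted(cpt.items()):
    print(key, round(prob, 3))
```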

generate_cases

generate_cases(
    n: int, outfile: Optional[str] = None, pseudo: bool = True
) -> DataFrame

Generate specified number of random data cases for this BN.

Parameters:

  • n
    (int) –

    Number of cases to generate.

  • outfile
    (Optional[str], default: None ) –

    Name of file to write instance to.

  • pseudo
    (bool, default: True ) –

    If True, produce pseudo-random (i.e. repeatable) cases; otherwise produce truly random cases.

Returns:

  • DataFrame

    Random data cases.

Raises:

  • TypeError

    If arguments not of correct type.

  • ValueError

    If invalid number of rows requested.

  • FileNotFoundError

    If outfile in nonexistent folder.
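
A common way to generate cases from a Bayesian network is forward (ancestral) sampling: visit the nodes in topological order and draw each value from its distribution given the parent values already sampled. The sketch below illustrates this on a two-node network. Whether generate_cases works this way, and how the pseudo flag maps to seeding, is not stated on this page; the seed argument and all names here are assumptions.

```python
import random

# Nodes in topological order (parents before children).
order = ["Weather", "Sprinkler"]
parents = {"Weather": [], "Sprinkler": ["Weather"]}

# CPTs keyed by tuple of parent values, then by node value.
cpts = {
    "Weather": {(): {"Sunny": 0.7, "Rainy": 0.3}},
    "Sprinkler": {
        ("Sunny",): {"On": 0.1, "Off": 0.9},
        ("Rainy",): {"On": 0.8, "Off": 0.2},
    },
}


def sample_cases(n, seed=None):
    """Draw n cases; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        case = {}
        for node in order:
            config = tuple(case[p] for p in parents[node])
            dist = cpts[node][config]
            case[node] = rng.choices(list(dist), weights=list(dist.values()))[0]
        cases.append(case)
    return cases


print(sample_cases(3, seed=42))
```

Calling sample_cases twice with the same seed returns identical cases, which is the kind of repeatability the pseudo option describes.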

global_distribution

global_distribution() -> DataFrame

Generate the global probability distribution for the BN.

Returns:

  • DataFrame

    Global distribution in descending probability (and then by ascending values).
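
The global distribution assigns a probability to every combination of node values. As a rough illustration of the ordering described above, the following sketch enumerates all combinations for a two-node network and sorts them by descending probability, breaking ties by ascending values. It uses plain tuples rather than a DataFrame, and the names are illustrative only.

```python
from itertools import product

p_weather = {"Sunny": 0.7, "Rainy": 0.3}
p_sprinkler = {  # P(Sprinkler | Weather)
    "Sunny": {"On": 0.1, "Off": 0.9},
    "Rainy": {"On": 0.8, "Off": 0.2},
}

# One (values, probability) entry per combination of node values.
rows = [
    ((w, s), p_weather[w] * p_sprinkler[w][s])
    for w, s in product(["Rainy", "Sunny"], ["Off", "On"])
]

# Descending probability first, then ascending values.
rows.sort(key=lambda row: (-row[1], row[0]))

for values, prob in rows:
    print(values, round(prob, 4))
```

The number of rows grows exponentially with the number of nodes, so full enumeration like this is only practical for small networks.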

lnprob_case

lnprob_case(case_values: Dict[str, Any], base: Union[int, str] = 10) -> Optional[float]

Return log of probability of set of node values (case) occurring.

Parameters:

  • case_values
    (Dict[str, Any]) –

    Value for each node {node: value}.

  • base
    (Union[int, str], default: 10 ) –

    Logarithm base to use - 2, 10 or 'e'.

Returns:

  • Optional[float]

    Log of probability of case occurring, or None if case has zero probability.

Raises:

  • TypeError

    If arguments wrong type.

  • ValueError

    If arguments have invalid values.
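
Because the joint probability of a case is a product of one conditional probability per node, its logarithm is the sum of the individual logs. The sketch below shows that calculation with the three documented bases and a None result for zero-probability cases. It takes the per-node probabilities directly as a list; the function name and that input format are assumptions for illustration, not the library's interface.

```python
import math

LOG_FUNCS = {2: math.log2, 10: math.log10, "e": math.log}


def log_prob_of_case(factors, base=10):
    """Sum of logs of per-node probabilities; None if any factor is zero."""
    if base not in LOG_FUNCS:
        raise ValueError("base must be 2, 10 or 'e'")
    if any(p == 0 for p in factors):
        return None  # log of zero is undefined
    log = LOG_FUNCS[base]
    return sum(log(p) for p in factors)


# P(Weather=Rainy), P(Sprinkler=On | Rainy), P(Grass=Wet | Rainy, On)
print(log_prob_of_case([0.3, 0.8, 0.9]))
print(log_prob_of_case([0.5, 0.25], base=2))
print(log_prob_of_case([0.3, 0.0]))
```

Summing logs rather than taking the log of the product avoids floating-point underflow when a network has many nodes.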

marginal_distribution

marginal_distribution(node: str, parents: Optional[List[str]] = None) -> DataFrame

Generate a marginal probability distribution for a specified node and its parents, in the same format as returned by the pandas crosstab function.

Parameters:

  • node
    (str) –

    Node for which distribution required.

  • parents
    (Optional[List[str]], default: None ) –

    Parents of node.

Returns:

  • DataFrame

    Marginal distribution with parental value combos as columns, and node values as rows.
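
For readers unfamiliar with the crosstab layout, the sketch below builds a comparable table directly with pandas from a handful of cases: node values as rows, parent values as columns. The normalize="columns" setting, which makes each column sum to 1, is a choice made for this sketch; whether marginal_distribution returns probabilities normalized in this way is not stated on this page.

```python
import pandas as pd

# Toy cases for a node (Sprinkler) and its single parent (Weather).
cases = pd.DataFrame({
    "Weather": ["Sunny", "Sunny", "Sunny", "Rainy", "Rainy", "Rainy"],
    "Sprinkler": ["Off", "Off", "On", "On", "On", "Off"],
})

# Rows: node values. Columns: parent values. Each column sums to 1.
table = pd.crosstab(cases["Sprinkler"], cases["Weather"], normalize="columns")
print(table)
```

With several parents, passing a list of columns to crosstab produces one column per combination of parent values.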

marginals

marginals(nodes: List[str]) -> DataFrame

Return marginal distribution for specified nodes.

Parameters:

  • nodes
    (List[str]) –

    Nodes for which marginal distribution required.

Returns:

  • DataFrame

    Marginal distribution in same format returned by pandas crosstab function.

Raises:

  • TypeError

    If arguments have bad type.

  • ValueError

    If arguments contain bad values.

rename

rename(name_map: Dict[str, str]) -> None

Rename nodes in place according to name map.

Parameters:

  • name_map
    (Dict[str, str]) –

    Name mapping {name: new name}.

Raises:

  • TypeError

    With bad arg type.

  • ValueError

    With bad arg values e.g. unknown node names.
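
Renaming a node affects every place the name appears: the node itself and each parent list that refers to it. The sketch below illustrates this on a plain {node: parents} dictionary, modifying it in place and rejecting unknown names. It is a simplified stand-in, not the library's implementation, and the function name is hypothetical.

```python
def rename_nodes(parents, name_map):
    """Rename nodes in a {node: [parent, ...]} mapping in place."""
    unknown = set(name_map) - set(parents)
    if unknown:
        raise ValueError(f"unknown node names: {sorted(unknown)}")
    # Apply the mapping to both the keys and the parent references.
    renamed = {
        name_map.get(node, node): [name_map.get(p, p) for p in node_parents]
        for node, node_parents in parents.items()
    }
    parents.clear()
    parents.update(renamed)


structure = {
    "Weather": [],
    "Sprinkler": ["Weather"],
    "Grass": ["Weather", "Sprinkler"],
}
rename_nodes(structure, {"Weather": "W", "Grass": "G"})
print(structure)
```

The sketch does not check for collisions, such as mapping a node to a name already in use; a fuller version would need to reject those as well.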