LinGauss (Linear Gaussian Distribution)

The LinGauss class represents Linear Gaussian distributions for continuous variables in Bayesian Networks. It models variables that follow normal distributions with linear dependencies on their parent variables.

Overview

A Linear Gaussian distribution models a continuous variable as:

X = μ + Σ βᵢ * Parentᵢ + ε

Where:

  • μ (mean): Base mean value when all parents are zero
  • βᵢ (coefficients): Linear coefficients for each parent variable
  • σ (sd): Standard deviation of the Gaussian noise term ε
  • ε: Zero-mean Gaussian noise with variance σ²
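
As a minimal sketch of the formula above, written in plain Python and independent of the LinGauss class (the helper names `conditional_mean` and `draw` are hypothetical, not part of the library), the conditional mean is the intercept plus the weighted parent values, and a sample adds zero-mean Gaussian noise to it:

```python
import random


def conditional_mean(mean, coeffs, parent_values):
    """Return mu + sum(beta_i * parent_i) for the given parent values."""
    return mean + sum(
        coeff * parent_values[name] for name, coeff in coeffs.items()
    )


def draw(mean, sd, coeffs, parent_values, rng=random):
    """Draw one value: conditional mean plus zero-mean Gaussian noise."""
    return rng.gauss(conditional_mean(mean, coeffs, parent_values), sd)


coeffs = {'OutdoorTemp': 0.4, 'HeatingLevel': 2.0}
parent_values = {'OutdoorTemp': 10.0, 'HeatingLevel': 5.0}

# 18 + 0.4 * 10 + 2.0 * 5
mu = conditional_mean(18.0, coeffs, parent_values)
value = draw(18.0, 1.5, coeffs, parent_values, random.Random(42))
print(mu, value)
```

Passing a seeded `random.Random` instance keeps the draw reproducible; the noise term is assumed to be independent of the parents.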

Key Features

  • Linear Relationships: Models linear dependencies on parent variables
  • Mixed Parents: Support for both continuous and discrete parent variables
  • Parameter Learning: Efficient estimation from data using regression
  • Probabilistic Inference: Exact inference for Gaussian networks
  • Robust Estimation: Handles missing data and numerical stability issues

Model Specification

The distribution is fully specified by:

  • Mean (μ): Baseline value of the variable
  • Standard Deviation (σ): Noise level/uncertainty
  • Coefficients: Linear weights for each parent variable
  • Parents: Names of parent variables in the network

Example Usage

from statistics import NormalDist

from causaliq_core.bn.dist import LinGauss

# The constructor takes a single specification dict with the keys
# 'coeffs', 'mean' and 'sd' (see the API Reference below); parent names
# are given by the keys of 'coeffs'.

# Simple unconditional Gaussian
temperature = LinGauss({
    'mean': 20.0,   # 20°C baseline
    'sd': 3.0,      # 3°C standard deviation
    'coeffs': {}    # No parents
})

# Linear dependency on one continuous parent
indoor_temp = LinGauss({
    'mean': 15.0,                    # Base indoor temperature
    'sd': 2.0,                       # Noise level
    'coeffs': {'OutdoorTemp': 0.7}   # 0.7 * OutdoorTemp
})

# Multiple parent dependencies
room_temp = LinGauss({
    'mean': 18.0,
    'sd': 1.5,
    'coeffs': {
        'OutdoorTemp': 0.4,    # Outdoor influence
        'HeatingLevel': 2.0    # Heating system effect
    }
})

# Sample from the distribution given parent values
sample = room_temp.random_value({'OutdoorTemp': 10.0, 'HeatingLevel': 5.0})
print(f"Room temperature sample: {sample:.1f}°C")

# Conditional distribution given parent values, returned as (mean, sd)
mu, sigma = room_temp.cdist({'OutdoorTemp': 15.0, 'HeatingLevel': 3.0})

# Compute probability density from the conditional (mean, sd)
density = NormalDist(mu, sigma).pdf(22.0)
print(f"PDF at 22°C: {density:.3g}")

Learning from Data

LinGauss distributions can learn parameters from data using linear regression:

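import pandas as pd
from causaliq_core.bn.dist import LinGauss

# Training data (HeatingLevel is deliberately not an exact linear
# function of OutdoorTemp, otherwise the regression is ill-posed)
data = pd.DataFrame({
    'OutdoorTemp': [10.0, 15.0, 20.0, 25.0, 30.0],
    'HeatingLevel': [8.0, 5.0, 5.0, 2.0, 1.0],
    'IndoorTemp': [18.0, 19.0, 21.0, 23.0, 24.0]
})

# Learn the distribution from data. fit() is documented as taking a
# Pandas or BNFit data object (see the API Reference below), so `data`
# may need wrapping in one of those before this call.
(lg_class, lg_spec), _ = LinGauss.fit(
    node='IndoorTemp',
    parents=('OutdoorTemp', 'HeatingLevel'),
    data=data
)
learned_dist = lg_class(lg_spec)

print(f"Learned mean: {learned_dist.mean}")
print(f"Learned coefficients: {learned_dist.coeffs}")
print(f"Learned standard deviation: {learned_dist.sd}")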
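
To illustrate what fitting by linear regression involves, here is a minimal sketch of ordinary least squares for a single parent in plain Python. The function name `fit_single_parent` is hypothetical, and the choice of n - 2 degrees of freedom for the residual standard deviation is an assumption; the library's own estimator may differ.

```python
from math import sqrt


def fit_single_parent(xs, ys):
    """Least-squares fit of y = mean + coeff * x; returns (mean, coeff, sd)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    coeff = sxy / sxx                 # slope
    mean = y_bar - coeff * x_bar      # intercept
    residuals = [y - (mean + coeff * x) for x, y in zip(xs, ys)]
    # Assumption: unbiased residual variance with n - 2 degrees of freedom
    sd = sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return mean, coeff, sd


outdoor = [10.0, 15.0, 20.0, 25.0, 30.0]
indoor = [18.0, 19.0, 21.0, 23.0, 24.0]
mean, coeff, sd = fit_single_parent(outdoor, indoor)
print(f"mean={mean:.2f}, coeff={coeff:.2f}, sd={sd:.3f}")
```

With several parents the same idea generalises to solving the normal equations, which requires that no parent is an exact linear combination of the others.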

Discrete Parents

LinGauss can also handle discrete parent variables by treating them as indicator variables:

# Distribution with discrete parent
energy_usage = LinGauss({
    'mean': 100.0,      # Base energy usage
    'sd': 10.0,
    'coeffs': {
        'Season': 20.0,      # Extra usage in winter
        'OutdoorTemp': -2.0  # Decrease with higher temp
    }
})  # Season is discrete, OutdoorTemp continuous

# Season is encoded as 0.0 (Summer) or 1.0 (Winter)
usage = energy_usage.random_value({'Season': 1.0, 'OutdoorTemp': 5.0})  # Winter, 5°C

Inference Properties

Linear Gaussian networks have special properties:

  • Exact Inference: Marginal and conditional distributions remain Gaussian
  • Efficient Computation: No approximation needed for probabilistic queries
  • Analytical Solutions: Closed-form expressions for most operations
  • Numerical Stability: Well-conditioned linear algebra operations
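
The closure property in the first bullet can be sketched for the single-parent case. If the parent P is Gaussian with mean m and standard deviation s, and X = μ + β * P + ε with independent noise ε, then X is Gaussian with mean μ + β * m and variance β² * s² + σ². The function name `marginal_of_child` below is hypothetical and only illustrates this closed form:

```python
from math import sqrt


def marginal_of_child(parent_mean, parent_sd, mean, coeff, sd):
    """Marginal (mean, sd) of X = mean + coeff * P + noise.

    Assumes P ~ N(parent_mean, parent_sd^2) and noise independent of P.
    """
    child_mean = mean + coeff * parent_mean
    child_sd = sqrt(coeff ** 2 * parent_sd ** 2 + sd ** 2)
    return child_mean, child_sd


m, s = marginal_of_child(
    parent_mean=10.0, parent_sd=4.0, mean=15.0, coeff=0.75, sd=4.0
)
print(m, s)  # 22.5 5.0
```

Applying this rule node by node in topological order gives the marginal of every variable in a chain without any sampling.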

API Reference

lingauss

Classes:

  • LinGauss

    Conditional Linear Gaussian Distribution.

LinGauss

LinGauss(lg: Dict[str, Any])

Conditional Linear Gaussian Distribution.

Parameters:

  • lg
    (Dict[str, Any]) –

    Specification of Linear Gaussian in following form: {'coeffs': {node: coeff}, 'mean': mean, 'sd': sd}.

Attributes:

  • coeffs

    Linear coefficient of parents {parent: coeff}.

  • mean

    Intercept of the linear model (aka mu): the mean of the node when all parents are zero.

  • sd

    S.D. of Gaussian noise (aka sigma).

Raises:

  • TypeError

    If called with bad arg types.

  • ValueError

    If called with bad arg values.

Methods:

  • __eq__

    Return whether two CNDs are the same, allowing for probability rounding errors.

  • __str__

    Human-friendly formula description of the Linear Gaussian.

  • cdist

    Return conditional distribution for specified parental values.

  • fit

    Fit a Linear Gaussian to data.

  • parents

    Return parents of node CND relates to.

  • random_value

    Generate a random value for a node given the value of its parents.

  • to_spec

    Returns external specification format of LinGauss, renaming nodes according to a name map.

  • validate_parents

    Check LinGauss coeff keys consistent with parents in DAG.

__eq__
__eq__(other: object) -> bool

Return whether two CNDs are the same, allowing for probability rounding errors.

Parameters:

  • other
    (object) –

    CND to compare to self.

Returns:

  • bool

    Whether the LinGauss objects are the same up to 10 significant figures.
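
One way to compare parameters up to 10 significant figures is to round each value through scientific notation before testing equality. This is only a sketch of the idea: the helper names `round_sf` and `same_spec` are hypothetical, and the library's actual comparison may be implemented differently.

```python
def round_sf(value, sf=10):
    """Round a float to `sf` significant figures via scientific notation."""
    return float(f"{value:.{sf - 1}e}")


def same_spec(spec_a, spec_b, sf=10):
    """Compare two {'coeffs', 'mean', 'sd'} dicts up to `sf` sig. figures."""
    if set(spec_a['coeffs']) != set(spec_b['coeffs']):
        return False
    pairs = [(spec_a['mean'], spec_b['mean']), (spec_a['sd'], spec_b['sd'])]
    pairs += [(c, spec_b['coeffs'][p]) for p, c in spec_a['coeffs'].items()]
    return all(round_sf(a, sf) == round_sf(b, sf) for a, b in pairs)


a = {'coeffs': {'OutdoorTemp': 0.1 + 0.2}, 'mean': 15.0, 'sd': 2.0}
b = {'coeffs': {'OutdoorTemp': 0.3}, 'mean': 15.0, 'sd': 2.0}
c = {'coeffs': {'OutdoorTemp': 0.300001}, 'mean': 15.0, 'sd': 2.0}
print(same_spec(a, b))  # True: differs only by floating-point error
print(same_spec(a, c))  # False: differs in the 6th significant figure
```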

__str__
__str__() -> str

Human-friendly formula description of the Linear Gaussian.

Returns:

  • str

    String representation of the Linear Gaussian formula.

cdist
cdist(parental_values: Optional[Dict[str, float]] = None) -> Tuple[float, float]

Return conditional distribution for specified parental values.

Parameters:

  • parental_values
    (Optional[Dict[str, float]], default: None ) –

    Parental values for which the conditional distribution is required.

Returns:

  • Tuple[float, float]

    Tuple of (mean, sd) of child Gaussian distribution.

Raises:

  • TypeError

    If args are of wrong type.

  • ValueError

    If args have invalid or conflicting values.
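
Because the result is a plain (mean, sd) tuple, it can be handed to the standard library's `statistics.NormalDist` for probability queries. The sketch below uses hard-coded numbers as a stand-in for a tuple returned by cdist, so the values are illustrative only:

```python
from statistics import NormalDist

# Hypothetical stand-in for the (mean, sd) tuple returned by cdist
mean, sd = 24.0, 1.5

dist = NormalDist(mu=mean, sigma=sd)

# Probability that the node exceeds 25.5 (one s.d. above the mean)
prob_above = 1.0 - dist.cdf(25.5)

# Central 95% interval for the node given these parent values
low, high = dist.inv_cdf(0.025), dist.inv_cdf(0.975)

print(f"P(X > 25.5) = {prob_above:.4f}")
print(f"95% interval: [{low:.2f}, {high:.2f}]")
```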

fit classmethod
fit(
    node: str,
    parents: Optional[Tuple[str, ...]],
    data: Union[Pandas, BNFit],
    autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]

Fit a Linear Gaussian to data.

Parameters:

  • node
    (str) –

    Node that Linear Gaussian applies to.

  • parents
    (Optional[Tuple[str, ...]]) –

    Parents of node.

  • data
    (Union[Pandas, BNFit]) –

    Data to fit Linear Gaussian to.

  • autocomplete
    (bool, default: True ) –

    Not used for Linear Gaussian.

Returns:

  • Tuple[Tuple[type, Dict[str, Any]], Optional[int]]

    Tuple of (lg, None) where lg is (LinGauss class, lg_spec).

Raises:

  • TypeError

    With bad arg types.

  • ValueError

    With bad arg values.

parents
parents() -> List[str]

Return parents of node CND relates to.

Returns:

  • List[str]

    Parent node names in alphabetical order.

random_value
random_value(pvs: Optional[Dict[str, float]]) -> float

Generate a random value for a node given the value of its parents.

Parameters:

  • pvs
    (Optional[Dict[str, float]]) –

    Parental values, {parent1: value1, ...}.

Returns:

  • float

    Random value for node.

to_spec
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]

Returns external specification format of LinGauss, renaming nodes according to a name map.

Parameters:

  • name_map
    (Dict[str, str]) –

    Map of node names {old: new}.

Returns:

  • Dict[str, Any]

    LinGauss specification with renamed nodes.

Raises:

  • TypeError

    If bad arg type.

  • ValueError

    If bad arg value, e.g. coeff keys not in map.
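
The renaming step can be sketched on a plain specification dict. The function `rename_spec` is hypothetical and only mirrors the behaviour described above (rename coefficient keys, reject keys missing from the map); the real method may apply further checks.

```python
def rename_spec(spec, name_map):
    """Return a copy of a {'coeffs', 'mean', 'sd'} spec with parents renamed."""
    missing = set(spec['coeffs']) - set(name_map)
    if missing:
        raise ValueError(f"coeff keys not in name map: {sorted(missing)}")
    return {
        'coeffs': {name_map[p]: c for p, c in spec['coeffs'].items()},
        'mean': spec['mean'],
        'sd': spec['sd'],
    }


spec = {'coeffs': {'A': 0.5, 'B': -1.0}, 'mean': 2.0, 'sd': 1.0}
renamed = rename_spec(spec, {'A': 'X1', 'B': 'X2'})
print(renamed)
```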

validate_parents
validate_parents(
    node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None

Check LinGauss coeff keys consistent with parents in DAG.

Parameters:

  • node
    (str) –

    Name of node.

  • parents
    (Dict[str, List[str]]) –

    Parents of all nodes defined in DAG.

  • node_values
    (Dict[str, List[str]]) –

    Values of each categorical node (unused).
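
A consistency check of this kind can be sketched as a set comparison between the coefficient keys and the node's parents in the DAG. The function `check_coeff_keys` is hypothetical, and two details are assumptions rather than documented behaviour: that nodes without parents may be absent from the `parents` dict, and that an inconsistency is reported with a ValueError.

```python
def check_coeff_keys(node, coeffs, parents):
    """Raise ValueError unless coeff keys match the node's parents in the DAG."""
    # Assumption: a node with no parents may be missing from `parents`
    dag_parents = set(parents.get(node, []))
    if set(coeffs) != dag_parents:
        raise ValueError(
            f"{node}: coeff keys {sorted(coeffs)} "
            f"do not match DAG parents {sorted(dag_parents)}"
        )


dag = {'IndoorTemp': ['HeatingLevel', 'OutdoorTemp']}

# Consistent: returns None without raising
result = check_coeff_keys(
    'IndoorTemp', {'OutdoorTemp': 0.4, 'HeatingLevel': 2.0}, dag
)
print(result)
```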