LinGauss (Linear Gaussian Distribution)¶
The LinGauss class represents Linear Gaussian distributions for continuous variables in Bayesian Networks. It models variables that follow normal distributions with linear dependencies on their parent variables.
Overview¶
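A Linear Gaussian distribution models a continuous variable X with parents P₁, …, Pₙ as:
$$X = \mu + \sum_{i=1}^{n} \beta_i P_i + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)$$
Where: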
- μ (mean): Base mean value when all parents are zero
- βᵢ (coefficients): Linear coefficients for each parent variable
- σ (sd): Standard deviation of the Gaussian noise term ε
- ε: Zero-mean Gaussian noise with variance σ²
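The model above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the library's implementation; the helper names `conditional_params` and `draw` and the numeric values are hypothetical.

```python
import random
from typing import Dict, Tuple


def conditional_params(
    mean: float,
    sd: float,
    coeffs: Dict[str, float],
    parent_values: Dict[str, float],
) -> Tuple[float, float]:
    """Return (mean, sd) of X given its parents: mu + sum(beta_i * P_i), sigma."""
    cond_mean = mean + sum(b * parent_values[p] for p, b in coeffs.items())
    return cond_mean, sd


def draw(
    mean: float,
    sd: float,
    coeffs: Dict[str, float],
    parent_values: Dict[str, float],
    rng: random.Random,
) -> float:
    """Draw one value of X by adding Gaussian noise to the conditional mean."""
    cond_mean, cond_sd = conditional_params(mean, sd, coeffs, parent_values)
    return rng.gauss(cond_mean, cond_sd)


# Illustrative model: 10 + 0.4 * OutdoorTemp + 2.0 * HeatingLevel, noise s.d. 1.5
params = conditional_params(
    10.0,
    1.5,
    {"OutdoorTemp": 0.4, "HeatingLevel": 2.0},
    {"OutdoorTemp": 10.0, "HeatingLevel": 5.0},
)
print(params)  # (24.0, 1.5)
```

Note that the parents shift only the mean; the standard deviation is the same for every combination of parent values.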
Key Features¶
- Linear Relationships: Models linear dependencies on parent variables
- Mixed Parents: Support for both continuous and discrete parent variables
- Parameter Learning: Efficient estimation from data using regression
- Probabilistic Inference: Exact inference for Gaussian networks
- Robust Estimation: Handles missing data and guards against numerical instability
Model Specification¶
The distribution is fully specified by:
- Mean (μ): Baseline value of the variable
- Standard Deviation (σ): Noise level/uncertainty
- Coefficients: Linear weights for each parent variable
- Parents: Names of parent variables in the network (the keys of the coefficients mapping)
Example Usage¶
from statistics import NormalDist

from causaliq_core.bn.dist import LinGauss

# Simple unconditional Gaussian
temperature = LinGauss({
    'mean': 20.0,   # 20°C baseline
    'sd': 3.0,      # 3°C standard deviation
    'coeffs': {}    # No parents
})

# Linear dependency on one continuous parent
indoor_temp = LinGauss({
    'mean': 15.0,                    # Base indoor temperature
    'sd': 2.0,                       # Noise level
    'coeffs': {'OutdoorTemp': 0.7}   # 0.7 * OutdoorTemp
})

# Multiple parent dependencies
room_temp = LinGauss({
    'mean': 10.0,               # Baseline with no heating at 0°C outdoors
    'sd': 1.5,
    'coeffs': {
        'OutdoorTemp': 0.4,     # Outdoor influence
        'HeatingLevel': 2.0     # Heating system effect
    }
})

# Sample from distribution
sample = room_temp.random_value({'OutdoorTemp': 10.0, 'HeatingLevel': 5.0})
print(f"Room temperature sample: {sample:.1f}°C")

# Compute probability density from the conditional (mean, sd)
mean, sd = room_temp.cdist({'OutdoorTemp': 15.0, 'HeatingLevel': 3.0})
density = NormalDist(mean, sd).pdf(22.0)
print(f"PDF at 22°C: {density:.4f}")
Learning from Data¶
LinGauss distributions can learn parameters from data using linear regression:
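import pandas as pd

from causaliq_core.bn.dist import LinGauss

# Training data (HeatingLevel is not an exact linear function of OutdoorTemp,
# so the two coefficients can be estimated separately)
df = pd.DataFrame({
    'OutdoorTemp': [10, 15, 20, 25, 30],
    'HeatingLevel': [8, 5, 5, 2, 1],
    'IndoorTemp': [18, 19, 21, 23, 24]
})

# fit() expects a Pandas or BNFit data object rather than a raw DataFrame,
# so df must first be wrapped in one of those types (construction not shown).
data = ...  # Pandas or BNFit object built from df

# Learn distribution from data: fit() returns ((LinGauss class, spec), None)
(dist_class, spec), _ = LinGauss.fit(
    node='IndoorTemp',
    parents=('OutdoorTemp', 'HeatingLevel'),
    data=data
)
learned_dist = dist_class(spec)

print(f"Learned mean: {learned_dist.mean}")
print(f"Learned coefficients: {learned_dist.coeffs}")
print(f"Learned standard deviation: {learned_dist.sd}")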
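How regression recovers the parameters can be illustrated with ordinary least squares in NumPy. This is a sketch under the assumption that the intercept and coefficients are the least-squares solution and that the standard deviation is the residual spread; the library's own fitting may differ in details such as the degrees-of-freedom correction. `fit_linear_gaussian` is a hypothetical helper, and the data are synthetic.

```python
import numpy as np


def fit_linear_gaussian(y, parents):
    """Least-squares estimate of (mean, coeffs, sd) for y = mean + sum(b_i * p_i) + noise.

    `parents` maps each parent name to a sequence of values aligned with `y`.
    """
    names = sorted(parents)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones for the intercept, then one column per parent.
    columns = [np.ones(len(y))] + [np.asarray(parents[n], dtype=float) for n in names]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    # Residual s.d. with n minus (number of fitted parameters) degrees of freedom.
    dof = len(y) - X.shape[1]
    sd = float(np.sqrt(residuals @ residuals / dof))
    return float(beta[0]), dict(zip(names, map(float, beta[1:]))), sd


# Noise-free synthetic data generated from y = 5 + 2 * A - 1 * B
a = [0.0, 1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [5 + 2 * ai - bi for ai, bi in zip(a, b)]

mean, coeffs, sd = fit_linear_gaussian(y, {"A": a, "B": b})
print(mean, coeffs, sd)
```

With noise-free data the estimates match the generating values up to floating-point error and the residual standard deviation is close to zero. If one parent is an exact linear function of another, the design matrix loses rank and the individual coefficients are no longer identifiable.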
Discrete Parents¶
LinGauss can also handle discrete parent variables by treating them as indicator variables:
# Distribution with discrete parent
energy_usage = LinGauss({
    'mean': 100.0,              # Base energy usage
    'sd': 10.0,
    'coeffs': {
        'Season': 20.0,         # Extra usage in winter
        'OutdoorTemp': -2.0     # Decrease with higher temp
    }
})  # Season is discrete, OutdoorTemp continuous

# Season is encoded as 0 (Summer) or 1 (Winter)
usage = energy_usage.random_value({'Season': 1, 'OutdoorTemp': 5.0})  # Winter, 5°C
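The effect of the 0/1 encoding on the conditional mean can be checked by hand. The sketch below reuses the coefficients from the example above in plain Python; `SEASON_CODES` and `expected_usage` are hypothetical names, and the encoding is the one assumed in that example.

```python
from typing import Dict

SEASON_CODES = {"Summer": 0, "Winter": 1}  # assumed indicator encoding


def expected_usage(season: str, outdoor_temp: float) -> float:
    """Conditional mean of energy usage: 100 + 20 * Season - 2 * OutdoorTemp."""
    coeffs: Dict[str, float] = {"Season": 20.0, "OutdoorTemp": -2.0}
    values = {"Season": float(SEASON_CODES[season]), "OutdoorTemp": outdoor_temp}
    return 100.0 + sum(coeffs[name] * values[name] for name in coeffs)


winter = expected_usage("Winter", 5.0)
summer = expected_usage("Summer", 5.0)
print(winter, summer, winter - summer)  # 110.0 90.0 20.0
```

With an indicator parent, the coefficient is the shift in the mean between the two levels while the other parents are held fixed.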
Inference Properties¶
Linear Gaussian networks have special properties:
- Exact Inference: Marginal and conditional distributions remain Gaussian
- Efficient Computation: No approximation needed for probabilistic queries
- Analytical Solutions: Closed-form expressions for most operations
- Numerical Stability: Well-conditioned linear algebra operations
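As an illustration of the closed-form property, consider a two-node network A → B with A ~ N(μ_A, σ_A²) and B = μ + β·A + ε. The marginal of B is again Gaussian, with mean μ + β·μ_A and variance β²·σ_A² + σ². The sketch below computes it directly; `marginal_of_child` is a hypothetical helper, not a library function, and the numbers are illustrative.

```python
import math
from typing import Tuple


def marginal_of_child(
    parent_mean: float, parent_sd: float, mean: float, coeff: float, sd: float
) -> Tuple[float, float]:
    """Marginal (mean, sd) of B = mean + coeff * A + noise, with A ~ N(parent_mean, parent_sd^2)."""
    marginal_mean = mean + coeff * parent_mean
    # Variances add because the noise term is independent of the parent.
    marginal_sd = math.sqrt(coeff ** 2 * parent_sd ** 2 + sd ** 2)
    return marginal_mean, marginal_sd


b_mean, b_sd = marginal_of_child(
    parent_mean=10.0, parent_sd=4.0, mean=15.0, coeff=0.75, sd=4.0
)
print(b_mean, b_sd)  # 22.5 5.0
```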
API Reference¶
lingauss¶
Classes:
- LinGauss – Conditional Linear Gaussian Distribution.
LinGauss¶
LinGauss(lg: Dict[str, Any])
Conditional Linear Gaussian Distribution.
Parameters:
- lg (Dict[str, Any]) – Specification of the Linear Gaussian in the following form: {'coeffs': {node: coeff}, 'mean': mean, 'sd': sd}.
Attributes:
- coeffs – Linear coefficients of the parents, {parent: coeff}.
- mean – Intercept (mu): mean of the node when all parents are zero.
- sd – Standard deviation of the Gaussian noise (sigma).
Raises:
- TypeError – If called with bad arg types.
- ValueError – If called with bad arg values.
Methods:
- __eq__ – Return whether two CNDs are the same, allowing for probability rounding errors.
- __str__ – Human-friendly formula description of the Linear Gaussian.
- cdist – Return conditional distribution for specified parental values.
- fit – Fit a Linear Gaussian to data.
- parents – Return parents of the node the CND relates to.
- random_value – Generate a random value for a node given the values of its parents.
- to_spec – Return external specification format of LinGauss, renaming nodes according to a name map.
- validate_parents – Check that LinGauss coeff keys are consistent with parents in DAG.
__eq__¶
Return whether two CNDs are the same, allowing for probability rounding errors.
Parameters:
- other (CND) – CND to compare to self.
Returns:
- bool – Whether the LinGauss objects are the same up to 10 significant figures.
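One way to compare parameters "up to 10 significant figures" is to round each value before testing equality. The sketch below takes that approach on plain specification dictionaries; it is an assumption about how such a comparison could work, not the library's actual `__eq__`, and `round_sf` and `same_params` are hypothetical helpers.

```python
from typing import Any, Dict


def round_sf(x: float, sf: int = 10) -> float:
    """Round x to `sf` significant figures via scientific-notation formatting."""
    return float(f"{x:.{sf - 1}e}")


def same_params(a: Dict[str, Any], b: Dict[str, Any], sf: int = 10) -> bool:
    """Compare two {'coeffs', 'mean', 'sd'} specs up to `sf` significant figures."""
    if set(a["coeffs"]) != set(b["coeffs"]):
        return False
    return (
        round_sf(a["mean"], sf) == round_sf(b["mean"], sf)
        and round_sf(a["sd"], sf) == round_sf(b["sd"], sf)
        and all(
            round_sf(a["coeffs"][k], sf) == round_sf(b["coeffs"][k], sf)
            for k in a["coeffs"]
        )
    )


# 0.1 + 0.2 differs from 0.3 only in the 17th significant figure
first = {"coeffs": {"A": 2.0}, "mean": 0.1 + 0.2, "sd": 1.0}
second = {"coeffs": {"A": 2.0}, "mean": 0.3, "sd": 1.0}
print(same_params(first, second))  # True
```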
__str__¶
Human-friendly formula description of the Linear Gaussian.
Returns:
- str – String representation of the Linear Gaussian formula.
cdist¶
cdist(parental_values: Optional[Dict[str, float]] = None) -> Tuple[float, float]
Return conditional distribution for specified parental values.
Parameters:
- parental_values (Optional[Dict[str, float]], default: None) – Parental values for which the distribution is required.
Returns:
- Tuple[float, float] – Tuple of (mean, sd) of child Gaussian distribution.
Raises:
- TypeError – If args are of wrong type.
- ValueError – If args have invalid or conflicting values.
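Because `cdist` returns a `(mean, sd)` pair, standard Gaussian tools can be applied to the result. The sketch below uses Python's `statistics.NormalDist` on such a pair; the numeric values are illustrative and merely stand in for a `cdist` result.

```python
from statistics import NormalDist

mean, sd = 22.0, 1.5  # stand-in for the tuple returned by cdist(...)
child = NormalDist(mu=mean, sigma=sd)

# Density at the conditional mean
density_at_mean = child.pdf(22.0)

# P(20.5 < X < 23.5), i.e. within one standard deviation of the mean
prob_between = child.cdf(23.5) - child.cdf(20.5)

# Central 95% interval for the child variable
lower, upper = child.inv_cdf(0.025), child.inv_cdf(0.975)

print(f"{density_at_mean:.4f} {prob_between:.4f} [{lower:.2f}, {upper:.2f}]")
```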
fit classmethod¶
fit(
    node: str,
    parents: Optional[Tuple[str, ...]],
    data: Union[Pandas, BNFit],
    autocomplete: bool = True,
) -> Tuple[Tuple[type, Dict[str, Any]], Optional[int]]
Fit a Linear Gaussian to data.
Parameters:
- node (str) – Node that the Linear Gaussian applies to.
- parents (Optional[Tuple[str, ...]]) – Parents of node.
- data (Union[Pandas, BNFit]) – Data to fit the Linear Gaussian to.
- autocomplete (bool, default: True) – Not used for Linear Gaussian.
Returns:
- Tuple[Tuple[type, Dict[str, Any]], Optional[int]] – Tuple of (lg, None), where lg is (LinGauss class, lg_spec).
Raises:
- TypeError – With bad arg types.
- ValueError – With bad arg values.
parents¶
Return parents of the node the CND relates to.
Returns:
- List[str] – Parent node names in alphabetical order.
random_value¶
random_value(pvs: Optional[Dict[str, float]]) -> float
Generate a random value for a node given the values of its parents.
Parameters:
- pvs (Optional[Dict[str, float]]) – Parental values, {parent1: value1, ...}.
Returns:
- float – Random value for node.
to_spec¶
to_spec(name_map: Dict[str, str]) -> Dict[str, Any]
Returns external specification format of LinGauss, renaming nodes according to a name map.
Parameters:
- name_map (Dict[str, str]) – Map of node names {old: new}.
Returns:
- Dict[str, Any] – LinGauss specification with renamed nodes.
Raises:
- TypeError – If bad arg type.
- ValueError – If bad arg value, e.g. coeff keys not in map.
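The renaming behaviour described here can be sketched on a plain specification dictionary. The function below is a hypothetical stand-in that follows the documented contract (rename coefficient keys, raise `ValueError` when a key is missing from the map); it is not the library's code.

```python
from typing import Any, Dict


def rename_spec(spec: Dict[str, Any], name_map: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of a {'coeffs', 'mean', 'sd'} spec with parent names renamed."""
    missing = sorted(set(spec["coeffs"]) - set(name_map))
    if missing:
        raise ValueError(f"coeff keys not in name map: {missing}")
    return {
        "coeffs": {name_map[parent]: c for parent, c in spec["coeffs"].items()},
        "mean": spec["mean"],
        "sd": spec["sd"],
    }


spec = {"coeffs": {"A": 0.5, "B": -1.0}, "mean": 2.0, "sd": 1.0}
renamed = rename_spec(spec, {"A": "X1", "B": "X2"})
print(renamed)  # {'coeffs': {'X1': 0.5, 'X2': -1.0}, 'mean': 2.0, 'sd': 1.0}
```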
validate_parents¶
validate_parents(
    node: str, parents: Dict[str, List[str]], node_values: Dict[str, List[str]]
) -> None
Check that LinGauss coeff keys are consistent with parents in DAG.
Parameters:
- node (str) – Name of node.
- parents (Dict[str, List[str]]) – Parents of all nodes defined in DAG.
- node_values (Dict[str, List[str]]) – Values of each categorical node [UNUSED].
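The consistency check can be pictured as a set comparison between the coefficient keys and the node's parents in the DAG. The sketch below is a hypothetical stand-in: the function name `check_parents` and the choice to raise `ValueError` on a mismatch are assumptions, since the reference above does not state how a failure is reported.

```python
from typing import Dict, List


def check_parents(
    node: str, coeffs: Dict[str, float], parents: Dict[str, List[str]]
) -> None:
    """Raise ValueError unless the coeff keys equal the node's parents in the DAG."""
    expected = set(parents.get(node, []))  # a node absent from the map has no parents
    actual = set(coeffs)
    if actual != expected:
        raise ValueError(
            f"{node}: coeff keys {sorted(actual)} do not match "
            f"DAG parents {sorted(expected)}"
        )


dag_parents = {"RoomTemp": ["HeatingLevel", "OutdoorTemp"]}

# Consistent: returns None without raising
result = check_parents(
    "RoomTemp", {"OutdoorTemp": 0.4, "HeatingLevel": 2.0}, dag_parents
)
```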