Architecture Overview
CausalIQ Ecosystem
causaliq-core is a foundational component of the overall CausalIQ ecosystem architecture, providing core utilities and patterns used across CausalIQ projects.
Design Philosophy
The architecture emphasises reproducibility, performance, and cross-platform compatibility. It pursues these goals through fixed random sequences, cached environment detection, opt-in timing, and persistent caching.
Key Architectural Features
Deterministic Random Number Generation
Design Goal: Ensure reproducible results across platforms and Python versions.
Implementation:
- Uses an embedded list of pre-generated random numbers (STABLE_RANDOM_SEQUENCE)
- 10,000+ values generated from a fixed seed for cross-platform repeatability
- Avoids platform-specific random number generator differences
- Supports stable experiment randomisation for scientific reproducibility
# Stable across all platforms and Python versions
from causaliq_core.utils.random import stable_random
values = stable_random(seed=42, count=100) # Always identical results
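The embedded-sequence idea can be sketched as follows. This is a minimal illustration rather than the package's implementation: the table `_SEQUENCE` holds only eight made-up values (STABLE_RANDOM_SEQUENCE is described above as holding 10,000+), and the helper `sequence_slice`, together with its rule of using the seed as a starting offset, is an assumption made for the example.

```python
# Hypothetical sketch: values are read from a literal table, so no
# platform-specific generator is involved at run time.
_SEQUENCE = (0.6394, 0.0250, 0.2750, 0.2232, 0.7365, 0.6767, 0.8922, 0.0869)


def sequence_slice(seed: int, count: int) -> list[float]:
    """Return `count` values, starting at an offset derived from `seed`."""
    start = seed % len(_SEQUENCE)
    # Wrap around the table so any count can be served.
    return [_SEQUENCE[(start + i) % len(_SEQUENCE)] for i in range(count)]
```

Because the numbers are stored rather than computed, the same seed yields the same values wherever the code runs.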
Intelligent Environment Caching
Design Goal: Minimise expensive system queries while maintaining fresh data.
Implementation:
- 24-hour cache expiration for hardware/software environment detection
- Platform-appropriate cache directories (follows OS conventions)
- Graceful fallback on cache corruption or permission errors
- JSON-based cache storage for human readability and debugging
# First call queries system, subsequent calls use cache
env = environment() # May take 100ms+ on first call
env = environment() # Returns cached data instantly
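A time-limited JSON cache with graceful fallback might look like the sketch below. The function `cached_query`, the `saved_at` and `data` field names, and the caller-supplied cache path are all assumptions for illustration; the package's own cache layout may differ.

```python
import json
import time
from pathlib import Path
from typing import Callable

MAX_AGE_SECONDS = 24 * 60 * 60  # 24-hour expiry, as described above


def cached_query(cache_file: Path, query: Callable[[], dict]) -> dict:
    """Return cached data if it is fresh, otherwise re-query and store it."""
    try:
        cached = json.loads(cache_file.read_text())
        if time.time() - cached["saved_at"] < MAX_AGE_SECONDS:
            return cached["data"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # Missing, unreadable, or corrupt cache: fall through to a fresh query

    data = query()
    try:
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(json.dumps({"saved_at": time.time(), "data": data}))
    except OSError:
        pass  # A read-only location should not break the caller
    return data
```

Both failure paths end in a fresh query, so a damaged or unwritable cache costs time but never raises an error for the caller.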
Enhanced Enumeration Pattern
Design Goal: Extend Python enums with additional attributes while preserving enum semantics.
Implementation:
- EnumWithAttrs base class for enums with human-readable labels
- Maintains enum value integrity while adding metadata
- Supports extensible attribute patterns for domain-specific needs
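One standard way to attach attributes to enum members is to override `__new__`, as sketched here. The names `LabelledEnum` and `Mark` are hypothetical and are not taken from the package; how EnumWithAttrs itself is implemented is not shown in this document.

```python
from enum import Enum


class LabelledEnum(Enum):
    """Hypothetical base class: each member is declared as (value, label)."""

    def __new__(cls, value, label):
        member = object.__new__(cls)
        member._value_ = value  # Keeps normal lookup by value working
        member.label = label    # Extra human-readable attribute
        return member


class Mark(LabelledEnum):
    HEAD = (">", "arrowhead")
    TAIL = ("-", "tail")
    CIRCLE = ("o", "circle")
```

Members still behave as ordinary enum members: `Mark(">")` returns `Mark.HEAD`, and `Mark.HEAD.label` gives the attached label.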
Performance-Aware Timing Infrastructure
Design Goal: Non-intrusive performance measurement for production use.
Implementation:
- Singleton pattern for centralised timing collection
- Thread-safe timeout decorators and context managers
- Optional activation to eliminate overhead in production
- Hierarchical timing with action/scale categorisation
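The combination of a single shared collector, optional activation, and action/scale keys can be sketched as below. The class `Timer`, its `measure` context manager, and its `summary` method are assumptions for illustration, and the sketch leaves out the timeout decorators mentioned above.

```python
import threading
import time
from contextlib import contextmanager


class Timer:
    """Hypothetical process-wide collector; all state lives on the class."""

    active = False            # Off by default, so inactive use costs very little
    _lock = threading.Lock()  # Guards the shared totals across threads
    _totals = {}              # (action, scale) -> [count, total_seconds]

    @classmethod
    @contextmanager
    def measure(cls, action, scale):
        if not cls.active:
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with cls._lock:
                entry = cls._totals.setdefault((action, scale), [0, 0.0])
                entry[0] += 1
                entry[1] += elapsed

    @classmethod
    def summary(cls):
        with cls._lock:
            return {key: tuple(value) for key, value in cls._totals.items()}
```

Keeping the state on the class gives every caller the same collector without passing an instance around, which is one simple way to obtain singleton behaviour.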
Mathematical Precision Controls
Design Goal: Consistent numerical formatting across scientific applications.
Implementation:
- Significant figure rounding with exact legacy behaviour preservation
- Configurable zero thresholds for scientific notation edge cases
- String-based output for precise display formatting control
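Significant-figure rounding with a zero threshold and string output can be sketched as follows. The function `round_sig` and its default threshold are assumptions for the example; it does not claim to reproduce the exact legacy behaviour of the package's rndsf function.

```python
import math


def round_sig(value: float, sf: int = 3, zero: float = 1e-10) -> str:
    """Round `value` to `sf` significant figures and return it as a string."""
    if abs(value) < zero:
        return "0"  # Treat tiny magnitudes as exactly zero
    exponent = math.floor(math.log10(abs(value)))
    decimals = sf - 1 - exponent      # Digits to keep after the decimal point
    rounded = round(value, decimals)  # Negative `decimals` rounds to tens, hundreds, ...
    return f"{rounded:.{max(decimals, 0)}f}"
```

Returning a string fixes the displayed digits at the point of rounding, so later formatting cannot reintroduce floating-point noise.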
SQLite-Backed Token Cache
Design Goal: Efficient, persistent caching with compression support for workflow data and computed results.
Implementation:
- SQLite-backed storage with concurrency support via database locking
- In-memory mode (:memory:) for fast, non-persistent caching
- Pluggable compressor architecture for data compression
- Shared token dictionary for cross-entry deduplication
- Hit counting and access tracking for cache analytics
from causaliq_core.cache import TokenCache
with TokenCache(":memory:") as cache:
    cache.put("key123", b"data")
    data = cache.get("key123")
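To illustrate how SQLite storage, compression, and hit counting fit together, here is a small stand-in built only on the standard library. The class `MiniCache`, its table layout, and the use of zlib are assumptions; it is not TokenCache and omits the shared token dictionary.

```python
import sqlite3
import zlib


class MiniCache:
    """Hypothetical SQLite cache with zlib compression and hit counting."""

    def __init__(self, path: str = ":memory:"):
        self._path = path
        self._conn = None

    def __enter__(self):
        self._conn = sqlite3.connect(self._path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entries ("
            "key TEXT PRIMARY KEY, data BLOB NOT NULL, "
            "hits INTEGER NOT NULL DEFAULT 0)"
        )
        return self

    def __exit__(self, exc_type, exc, tb):
        self._conn.close()

    def put(self, key: str, data: bytes) -> None:
        with self._conn:  # Commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO entries (key, data, hits) VALUES (?, ?, 0)",
                (key, zlib.compress(data)),
            )

    def get(self, key: str):
        row = self._conn.execute(
            "SELECT data FROM entries WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        with self._conn:  # Record the hit for cache analytics
            self._conn.execute(
                "UPDATE entries SET hits = hits + 1 WHERE key = ?", (key,)
            )
        return zlib.decompress(row[0])

    def hits(self, key: str) -> int:
        row = self._conn.execute(
            "SELECT hits FROM entries WHERE key = ?", (key,)
        ).fetchone()
        return 0 if row is None else row[0]
```

Passing a file path instead of `":memory:"` would make the same sketch persistent, which mirrors the two modes listed above.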
Action Provider Framework
Design Goal: Standardised interface for workflow components that expose multiple related actions.
Implementation:
- CausalIQActionProvider abstract base class defining the action interface
- CoreActionProvider with GraphML and JSON compression/decompression
- Structured input/output specifications via ActionInput and ActionOutput
- Action validation and execution error handling
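The general shape of an abstract provider that validates an action name before dispatching can be sketched like this. The classes `ActionProvider` and `EchoProvider`, the `supported_actions` attribute, and the `run`/`_execute` split are hypothetical; the real CausalIQActionProvider interface is not reproduced here.

```python
from abc import ABC, abstractmethod


class ActionProvider(ABC):
    """Hypothetical base class: subclasses list their actions and implement them."""

    supported_actions: frozenset = frozenset()

    def run(self, action: str, parameters: dict) -> dict:
        # Validate before dispatching so callers get a clear error
        if action not in self.supported_actions:
            raise ValueError(f"Unsupported action: {action!r}")
        return self._execute(action, parameters)

    @abstractmethod
    def _execute(self, action: str, parameters: dict) -> dict:
        """Carry out a validated action and return its outputs."""


class EchoProvider(ActionProvider):
    supported_actions = frozenset({"echo", "upper"})

    def _execute(self, action, parameters):
        text = parameters["text"]
        return {"text": text.upper() if action == "upper" else text}
```

Putting validation in the base class means every provider rejects unknown actions in the same way, while each subclass only supplies the work itself.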
Module Organisation
causaliq_core/
├── __init__.py              # Package metadata and constants
├── action.py                # Action provider framework
├── cli.py                   # Command-line interface
├── bn/                      # Bayesian Networks
│   ├── bn.py                # BN class (network structure)
│   ├── bnfit.py             # BNFit class (fitted parameters)
│   ├── dist/                # Conditional distributions (CPT, LinGauss)
│   └── io/                  # BN file I/O (DSC, XDSL formats)
├── cache/                   # Caching infrastructure
│   ├── token_cache.py       # SQLite-backed TokenCache
│   └── compressors/         # Pluggable compression backends
│       ├── base.py          # Compressor interface
│       └── json_compressor.py
├── graph/                   # Graph structures and algorithms
│   ├── sdg.py               # Simple Dependency Graphs (base class)
│   ├── dag.py               # Directed Acyclic Graphs
│   ├── pdag.py              # Partially Directed Acyclic Graphs
│   ├── enums.py             # EdgeType, EdgeMark enumerations
│   ├── convert.py           # Graph type conversions
│   └── io/                  # Graph I/O formats
│       ├── common.py        # Unified read_graph/write_graph
│       ├── graphml.py       # GraphML format support
│       ├── bayesys.py       # BayeSys format support
│       └── tetrad.py        # Tetrad format support
└── utils/                   # Core utilities
    ├── __init__.py          # EnumWithAttrs and public exports
    ├── environment.py       # System environment detection
    ├── io.py                # File and DataFrame I/O utilities
    ├── math.py              # rndsf, ln mathematical functions
    ├── random.py            # Stable random generation
    ├── same.py              # Precision-aware value comparison
    └── timing.py            # Performance measurement
Graph Module Architecture
Design Goal: Flexible representation of dependency graphs supporting multiple graph types used in causal discovery.
Implementation:
- SDG (Simple Dependency Graph): Base class supporting arbitrary endpoint marks (head >, tail -, circle o)
- DAG: Directed Acyclic Graph with parent/child relationships
- PDAG: Partially Directed Acyclic Graph (CPDAGs, equivalence classes)
- Conversion Functions: dag_to_pdag, pdag_to_cpdag, extend_pdag
- Multi-Format I/O: GraphML, BayeSys, and Tetrad format support
Supported graph types:
- Markov Graphs
- Directed Acyclic Graphs (DAGs)
- Partially Directed Acyclic Graphs (PDAGs)
- Maximal Ancestral Graphs (MAGs)
- Partial Ancestral Graphs (PAGs)
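What separates a DAG from a general directed graph is the absence of cycles. One common way to check this is Kahn's algorithm, sketched below; the function `is_acyclic` is a hypothetical helper for illustration and is not the validation code used by the package's DAG class.

```python
from collections import deque


def is_acyclic(nodes, edges):
    """Return True if the directed edges (parent, child) contain no cycle."""
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no remaining parents
    ready = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # Nodes left over can only be those trapped in a cycle
    return removed == len(nodes)
```

For example, the chain A to B to C passes the check, while adding an edge from C back to A makes it fail.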
Bayesian Networks Architecture
Design Goal: Modular probabilistic modelling with multiple distribution types and file format support.
Implementation:
- BN Core: Main BN and BNFit classes for network representation
- Distribution Module: Pluggable conditional distributions (CPT for discrete, LinGauss for continuous)
- I/O Layer: Format-agnostic interface with DSC and XDSL backend support
- Graph Integration: Built on causaliq_core.graph.DAG foundation
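The idea of pluggable conditional distributions can be sketched with one shared interface and two implementations, one tabular and one linear Gaussian. The classes `ConditionalDist`, `TableDist`, and `LinearGaussianDist` and the `density` method are assumptions for the example; they do not describe the package's CPT or LinGauss classes.

```python
import math
from abc import ABC, abstractmethod


class ConditionalDist(ABC):
    """Hypothetical interface shared by discrete and continuous distributions."""

    @abstractmethod
    def density(self, value, parents: dict) -> float:
        """Probability (discrete) or density (continuous) given parent values."""


class TableDist(ConditionalDist):
    def __init__(self, parent_names, table):
        self.parent_names = parent_names
        self.table = table  # {parent value tuple: {node value: probability}}

    def density(self, value, parents):
        key = tuple(parents[name] for name in self.parent_names)
        return self.table[key][value]


class LinearGaussianDist(ConditionalDist):
    def __init__(self, intercept, coeffs, sd):
        self.intercept, self.coeffs, self.sd = intercept, coeffs, sd

    def density(self, value, parents):
        # Mean is a linear function of the parent values
        mean = self.intercept + sum(c * parents[p] for p, c in self.coeffs.items())
        z = (value - mean) / self.sd
        return math.exp(-0.5 * z * z) / (self.sd * math.sqrt(2 * math.pi))
```

Code that walks the network can then call `density` on any node without knowing whether it is discrete or continuous.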
Value Comparison Utilities
Design Goal: Precision-aware comparison of numeric values, dictionaries, and probability distributions.
Implementation:
- values_same: Compare numbers to specified significant figures
- dicts_same: Deep dictionary comparison with numeric precision
- dists_same: Probability distribution comparison
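Comparison to a given number of significant figures can be sketched by formatting both numbers in scientific notation and comparing the strings. The helpers `same_to_sf` and `same_dicts` are hypothetical stand-ins written for this example; the signatures and rounding rules of values_same and dicts_same may differ.

```python
def same_to_sf(a: float, b: float, sf: int = 10) -> bool:
    """True if `a` and `b` agree once both are rounded to `sf` significant figures."""
    fmt = f".{sf - 1}e"  # Scientific notation keeps exactly `sf` significant digits
    return format(a, fmt) == format(b, fmt)


def same_dicts(first: dict, second: dict, sf: int = 10) -> bool:
    """Compare nested dictionaries, applying the numeric precision to leaf numbers."""
    if first.keys() != second.keys():
        return False
    for key, left in first.items():
        right = second[key]
        if isinstance(left, dict) and isinstance(right, dict):
            if not same_dicts(left, right, sf):
                return False
        elif isinstance(left, (int, float)) and isinstance(right, (int, float)):
            if not same_to_sf(left, right, sf):
                return False
        elif left != right:
            return False
    return True
```

With this approach 1.2345 and 1.2349 count as the same at three significant figures but differ at five.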
Package Organisation Rationale
The package structure divides functionality as follows:
- Main package: Constants and metadata (lightweight imports)
- Cache package: Caching infrastructure with compressor plugins
- Graph package: Graph representations and I/O formats
- BN package: Bayesian Network modelling
- Utils package: Mathematical functions and core utilities
- Action module: Workflow integration framework
This structure supports both convenience imports
(from causaliq_core.utils import rndsf) and modular usage patterns while
maintaining backward compatibility.