CausalIQ Analysis Trace
This module implements detailed tracing of structure learning processes, allowing researchers to record, analyze, and compare the step-by-step evolution of causal graphs during algorithm execution.
⚠️ NOTE: this functionality will be superseded by a more open format based on the CSV and GraphML standards.
Core Classes
Trace
Trace(context: Optional[Dict[str, Any]] = None)
Class encapsulating detailed structure learning trace.
Parameters:
- context (dict, default: None) – Description of learning context.
Attributes:
- context (dict) – Learning context.
- trace (list) – Iteration-by-iteration structure learning trace.
- start (float) – Time at which tracing started.
- result (SDG) – Learnt graph.
- treestats (TreeStats) – Statistics from a tree search.
Raises:
- TypeError – If arguments have invalid types.
- ValueError – If invalid context fields are provided.
Methods:
- __eq__ – Test if other Trace is identical to this one.
- __str__ – Return details of Trace in human-readable printable format.
- add – Add an entry to the structure learning trace.
- context_string – Return a trace context as a human-readable string.
- diffs_from – Find differences of trace from reference trace.
- get – Return the trace information.
- read – Read set of Traces matching partial_id from serialised file.
- rename – Rename nodes in trace in place according to name map.
- save – Save the trace to a composite serialised (pickle) file.
- set_result – Set the result of the learning GraphAction.
- set_treestats – Set the statistics of a tree learning GraphAction.
- update_scores – Update score in all traces of a series.
__str__
Return details of Trace in human-readable printable format.
Returns:
- str – Trace in printable form.
add
add(activity: GraphAction, details: Dict[GraphActionDetail, Any]) -> Trace
Add an entry to the structure learning trace.
Parameters:
- activity (GraphAction) – Action, e.g. initialisation, add arc.
- details (dict) – Supplementary details relevant to activity.
Returns:
- Trace – Returns trace after entry added.
Raises:
- TypeError – If arguments have invalid types.
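Because add returns the trace itself, calls can be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, MiniTrace, which is not the library's implementation: the string activity names and the "id", "arc" and "score" keys are assumptions made for illustration, whereas the real method takes GraphAction and GraphActionDetail values.

```python
import time
from typing import Any, Dict, List, Optional


class MiniTrace:
    """Hypothetical stand-in for Trace, used only to show call chaining."""

    def __init__(self, context: Optional[Dict[str, Any]] = None) -> None:
        self.context = context or {}
        self.start = time.time()  # time at which tracing started
        self.entries: List[Dict[str, Any]] = []

    def add(self, activity: str, details: Dict[str, Any]) -> "MiniTrace":
        # Mirror the documented TypeError for badly typed arguments.
        if not isinstance(activity, str) or not isinstance(details, dict):
            raise TypeError("activity must be str and details must be dict")
        self.entries.append({"activity": activity, **details})
        return self  # returning the trace is what allows chaining


trace = (
    MiniTrace({"id": "demo/run1"})
    .add("init", {"score": -120.5})
    .add("add", {"arc": ("A", "B"), "score": -101.2})
)
print(len(trace.entries))  # 2
```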
context_string
classmethod
diffs_from
Find differences of trace from reference trace.
Parameters:
- ref (Trace) – Reference trace to compare this one to.
- strict (bool, default: True) – Whether floats must be exactly equal (True) or only approximately equal (False).
Returns:
- Optional[Tuple[Dict[Any, Any], List[int], str]] – Tuple of (major differences, minor differences, textual summary), or None if the traces are identical.
Raises:
- TypeError – If ref is not of type Trace.
- ValueError – If either trace is invalid.
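The effect of a strict flag on float comparison can be sketched as follows. This is a minimal illustration, not the library's code: the helper name values_match and the relative tolerance of 1e-9 are assumptions, since the tolerance actually used by diffs_from is not documented here.

```python
import math


def values_match(ours, theirs, strict=True, rel_tol=1e-9):
    """Compare two recorded values; rel_tol is an illustrative assumption."""
    both_floats = isinstance(ours, float) and isinstance(theirs, float)
    if both_floats and not strict:
        # Non-strict mode: accept floats that differ only by rounding noise.
        return math.isclose(ours, theirs, rel_tol=rel_tol)
    # Strict mode, or non-float values: require exact equality.
    return ours == theirs


computed = 0.1 + 0.2  # 0.30000000000000004 in binary floating point
print(values_match(computed, 0.3, strict=True))   # False
print(values_match(computed, 0.3, strict=False))  # True
```

A tolerant comparison of this kind is useful when the same algorithm is run on different platforms or library versions, where scores may differ in the last few bits.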
get
Return the trace information.
Returns:
- DataFrame – Trace as a pandas DataFrame.
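read
classmethod
read(partial_id: str, root_dir: str) -> Optional[Dict[str, Trace]]
Read set of Traces matching partial_id from serialised file.
Parameters:
- partial_id (str) – Partial id that the returned traces must match.
- root_dir (str) – Root directory holding the serialised trace files.
Returns:
- Optional[Dict[str, Trace]] – Dictionary {key: Trace} of the traces matching the partial id, or None.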
Raises:
- TypeError – If arguments are not strings.
- FileNotFoundError – If root_dir doesn't exist.
- ValueError – If partial_id is empty or the serialised file is not a dictionary of traces.
rename
rename(name_map: Dict[str, str]) -> None
Rename nodes in trace in place according to name map.
Parameters:
- name_map (dict) – Name mapping {name: new name}.
Raises:
- TypeError – If the argument has an invalid type.
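In-place renaming with a {name: new name} map can be sketched as below. The entry layout (a list of dicts with an "arc" tuple) and the helper name rename_nodes are assumptions for illustration, as is the choice to leave unmapped nodes unchanged. The real Trace stores its entries differently.

```python
from typing import Any, Dict, List


def rename_nodes(entries: List[Dict[str, Any]], name_map: Dict[str, str]) -> None:
    """Rename arc endpoints in place (hypothetical entry layout)."""
    if not isinstance(name_map, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in name_map.items()
    ):
        raise TypeError("name_map must be a dict mapping str to str")
    for entry in entries:
        arc = entry.get("arc")
        if arc is not None:
            # Each endpoint is looked up once, so swaps such as
            # {"A": "B", "B": "A"} are handled correctly.
            entry["arc"] = tuple(name_map.get(node, node) for node in arc)


entries = [
    {"activity": "init", "arc": None},
    {"activity": "add", "arc": ("A", "B")},
]
rename_nodes(entries, {"A": "Smoking", "B": "Cancer"})
print(entries[1]["arc"])  # ('Smoking', 'Cancer')
```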
save
save(root_dir: str) -> None
Save the trace to a composite serialised (pickle) file.
Parameters:
- root_dir (str) – Root directory under which pickle files are saved.
Raises:
- TypeError – If bad argument types.
- ValueError – If no id is defined for the trace.
- FileNotFoundError – If root_dir does not exist.
set_result
set_treestats
update_scores
classmethod
update_scores(
    series: str,
    networks: List[str],
    score: str,
    root_dir: str,
    save: bool = False,
    test: bool = False,
) -> Dict[Tuple[str, str], Tuple[Optional[float], float]]
Update score in all traces of a series.
Parameters:
- series (str) – Series to update traces for.
- networks (list) – List of networks to update.
- score (str) – Score to update, e.g. 'bic', 'loglik'.
- root_dir (str) – Root directory holding trace files.
- save (bool, default: False) – Whether to save updated scores in trace file.
- test (bool, default: False) – Whether score should be evaluated on test data.
Raises:
- ValueError – If any argument has an invalid value.
Trace is the main class for recording and managing structure learning traces, with detailed context, iteration-by-iteration recording, and comparison capabilities.
DiffType
Enumeration defining the different types of differences that can be detected when comparing traces.
CompatibilityUnpickler
Custom unpickler that handles module path changes for backward compatibility.
Maps specific classes that have been moved between modules.
Classes:
- PlaceholderEnum – Placeholder class for missing enum-like classes from legacy modules.
Methods:
- find_class – Override find_class to handle module path changes.
PlaceholderEnum
Placeholder class for missing enum-like classes from legacy modules.
This handles unpickling of enum instances that reference classes not available in the current environment (e.g., learn.hc_worker.Prefer).
The issue: legacy pickle files contain references to EnumWithAttrs classes from the 'learn' module, which does not exist in standalone causaliq_analysis. When unpickling, Python tries to instantiate these enums with arguments, but a simple empty class fails with a "takes no arguments" error.
This placeholder accepts any arguments during construction to allow successful unpickling while maintaining compatibility with the discovery repo.
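The failure mode and the fix can be reproduced with the standard library alone. In this sketch every name is hypothetical: the module legacy_learn, the classes AnyArgsPlaceholder and EmptyPlaceholder, and the helper load_legacy are all invented for illustration. The payload is a hand-written protocol-0 pickle that asks for legacy_learn.Prefer to be called with one argument, which stands in for an enum member stored in an old trace file.

```python
import io
import pickle


class AnyArgsPlaceholder:
    """Hypothetical placeholder: accepts whatever the pickle passes in."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class EmptyPlaceholder:
    """A plain empty class; calling it with arguments raises TypeError."""


# Hand-written protocol-0 pickle: call legacy_learn.Prefer('high').
LEGACY_PAYLOAD = b"clegacy_learn\nPrefer\n(S'high'\ntR."


class _Unpickler(pickle.Unpickler):
    def __init__(self, file, placeholder):
        super().__init__(file)
        self._placeholder = placeholder

    def find_class(self, module, name):
        # Substitute the placeholder for anything from the missing module.
        if module == "legacy_learn":
            return self._placeholder
        return super().find_class(module, name)


def load_legacy(placeholder):
    return _Unpickler(io.BytesIO(LEGACY_PAYLOAD), placeholder).load()


restored = load_legacy(AnyArgsPlaceholder)
print(restored.args)  # ('high',)

try:
    load_legacy(EmptyPlaceholder)
except TypeError as err:
    print("empty class failed:", err)
```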
CompatibilityUnpickler is the custom unpickler that provides backward compatibility when loading trace files written before module paths changed.
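Remapping moved classes is typically done by overriding pickle.Unpickler.find_class, which the unpickler calls with the module and class name recorded in the file. The following is a minimal sketch of that technique under stated assumptions: the mapping table, the module name legacy_pkg.containers and the class RemappingUnpickler are hypothetical and do not reflect the mappings CompatibilityUnpickler actually contains.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical table: (old module, class name) -> (new module, class name).
MOVED_CLASSES = {
    ("legacy_pkg.containers", "OrderedDict"): ("collections", "OrderedDict"),
}


class RemappingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect known old locations; leave everything else untouched.
        module, name = MOVED_CLASSES.get((module, name), (module, name))
        return super().find_class(module, name)


# Hand-written protocol-0 pickle: call legacy_pkg.containers.OrderedDict().
payload = b"clegacy_pkg.containers\nOrderedDict\n(tR."

restored = RemappingUnpickler(io.BytesIO(payload)).load()
print(type(restored).__name__)  # OrderedDict
```

Loading the same payload with plain pickle.loads would raise ModuleNotFoundError, because legacy_pkg cannot be imported.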
Utility Functions
load_with_compatibility
load_with_compatibility(
    file_handle: Any, compression: str = "gzip", **kwargs: Any
) -> Any
Load trace files with backward compatibility support for older module structures.
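A loader of this shape can be sketched by wrapping the open file handle in a gzip reader and passing the result to an unpickler. The function below is a hypothetical stand-in named load_compat, not the library's implementation. Accepting compression=None for uncompressed input is an assumption, and the extra keyword arguments of the documented signature are omitted because their meaning is not described here.

```python
import gzip
import io
import pickle
from typing import Any, BinaryIO, Optional


def load_compat(file_handle: BinaryIO, compression: Optional[str] = "gzip") -> Any:
    """Hypothetical loader: decompress if requested, then unpickle."""
    if compression == "gzip":
        stream = gzip.GzipFile(fileobj=file_handle, mode="rb")
    elif compression is None:
        stream = file_handle
    else:
        raise ValueError(f"unsupported compression: {compression!r}")
    # A pickle.Unpickler subclass that overrides find_class could be
    # substituted here to remap classes from older module layouts.
    return pickle.Unpickler(stream).load()


# Round trip through an in-memory gzip-compressed pickle.
buffer = io.BytesIO(gzip.compress(pickle.dumps({"demo/run1": [1, 2, 3]})))
traces = load_compat(buffer)
print(traces)  # {'demo/run1': [1, 2, 3]}
```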
Overview
Structure Learning Tracing
The Trace class provides functionality for recording and analyzing causal graph structure learning:
Recording Capabilities:
- Context Management: Store learning algorithm parameters, data characteristics, and environment information
- Iteration Tracking: Record each step of the learning process with detailed action information
- Result Storage: Capture final learned graphs and search statistics
- Flexible Storage: Support for compressed file formats and cross-platform compatibility
Analysis Features:
- Trace Comparison: Compare two traces to identify differences in learning paths
- Score Updates: Retroactively update or recalculate scores using different metrics
- Variable Renaming: Handle variable name changes across different datasets
- Statistical Summaries: Generate summaries of learning algorithm behavior
Key Methods
Creation and Management:
- __init__(): Initialize a new trace with optional context information
- add(): Record a new iteration with action details and graph state
- read(): Load trace data from files with compatibility support
- save(): Save trace data to compressed files
Analysis and Comparison:
- diffs_from(): Compare this trace with another to identify differences
- update_scores(): Recalculate scores using different scoring functions
- get(): Export trace data as a pandas DataFrame for analysis
- rename(): Apply variable name mappings throughout the trace
Results and Context:
- set_result(): Set the final learned graph
- set_treestats(): Add tree search statistics
- context_string(): Generate human-readable context descriptions
Trace Differences
The tracing system can identify several types of differences between learning runs:
- MISSING/EXTRA: Iterations present in one trace but not the other
- ACTION: Different actions taken at corresponding iterations
- ARC: Different arcs modified in corresponding actions
- SCORE: Different scores recorded for the same actions
- DETAILS: Differences in recorded action details
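The way such categories might be assigned can be sketched by walking two traces in step. Everything here is an illustrative assumption rather than the behaviour of diffs_from: the enum DiffKind, the dict-based entry layout, the order in which checks are applied, and the reading of MISSING as "only in the reference" and EXTRA as "only in this trace". The DETAILS category is omitted for brevity.

```python
import math
from enum import Enum
from itertools import zip_longest


class DiffKind(Enum):
    MISSING = "missing"  # assumption: iteration only in the reference trace
    EXTRA = "extra"      # assumption: iteration only in this trace
    ACTION = "action"
    ARC = "arc"
    SCORE = "score"


def _same_score(a, b, strict):
    return a == b if strict else math.isclose(a, b, rel_tol=1e-9)


def classify_diffs(trace, ref, strict=True):
    """Return {iteration index: DiffKind} for entries that differ."""
    diffs = {}
    for i, (ours, theirs) in enumerate(zip_longest(trace, ref)):
        if ours is None:
            diffs[i] = DiffKind.MISSING
        elif theirs is None:
            diffs[i] = DiffKind.EXTRA
        elif ours["action"] != theirs["action"]:
            diffs[i] = DiffKind.ACTION
        elif ours["arc"] != theirs["arc"]:
            diffs[i] = DiffKind.ARC
        elif not _same_score(ours["score"], theirs["score"], strict):
            diffs[i] = DiffKind.SCORE
    return diffs


ref = [
    {"action": "add", "arc": ("A", "B"), "score": -10.0},
    {"action": "add", "arc": ("B", "C"), "score": -8.0},
    {"action": "delete", "arc": ("A", "B"), "score": -7.5},
]
trace = [
    {"action": "add", "arc": ("A", "B"), "score": -10.0},
    {"action": "reverse", "arc": ("B", "C"), "score": -8.0},
]
print(classify_diffs(trace, ref))
# {1: <DiffKind.ACTION: 'action'>, 2: <DiffKind.MISSING: 'missing'>}
```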
File Compatibility
The module includes several backward-compatibility features:
- Module Migration: Handles changes in module structure over time
- Class Mapping: Automatically maps old class locations to new ones
- Version Tracking: Records software versions for reproducibility
- Compressed Storage: Efficient storage using gzip compression
Integration with Graph Actions
Traces work with the graph action enumerations (GraphAction and GraphActionDetail) to provide detailed records of:
- Which specific arcs were added, deleted, or reversed
- Score changes resulting from each modification
- Alternative actions that were considered but not taken
- Statistical constraints and prior knowledge influences
- Debugging information for algorithm development
This tracing capability supports:
- Algorithm development and debugging
- Reproducible research in causal discovery
- Performance analysis and optimization
- Comparative studies of different learning approaches
- Educational demonstrations of structure learning behavior