CausalIQ Analysis Trace

This module implements detailed tracing of structure learning processes, allowing researchers to record, analyze, and compare the step-by-step evolution of causal graphs during algorithm execution.

⚠️ NOTE: this functionality will be superseded by a more open format based on the CSV and GraphML standards.

Core Classes

Trace

Trace(context: Optional[Dict[str, Any]] = None)

Class encapsulating detailed structure learning trace.

Parameters:

  • context

    (dict, default: None ) –

    Description of learning context.

Attributes:

  • context (dict) –

    Learning context.

  • trace (list) –

    Iteration by iteration structure learning trace.

  • start (float) –

    Time at which tracing started.

  • result (SDG) –

    Learnt graph.

  • treestats (TreeStats) –

    Statistics from a tree search.

Raises:

  • TypeError

    If arguments have invalid types.

  • ValueError

    If invalid context fields provided.

Methods:

  • __eq__

    Test if other Trace is identical to this one.

  • __str__

    Return details of Trace in human-readable printable format.

  • add

    Add an entry to the structure learning trace.

  • context_string

    Return a trace context as a human readable string.

  • diffs_from

    Find differences of trace from reference trace.

  • get

    Return the trace information.

  • read

    Read set of Traces matching partial_id from serialised file.

  • rename

    Rename nodes in trace in place according to name map.

  • save

    Save the trace to a composite serialised (pickle) file.

  • set_result

    Set the result of the learning GraphAction.

  • set_treestats

    Set the statistics of a tree learning GraphAction.

  • update_scores

    Update score in all traces of a series.

__eq__

__eq__(other: Any) -> bool

Test if other Trace is identical to this one.

Parameters:

  • other
    (Trace) –

    Trace to compare with self.

Returns:

  • bool ( bool ) –

    True if other is identical to self.

__str__

__str__() -> str

Return details of Trace in human-readable printable format.

Returns:

  • str ( str ) –

    Trace in printable form.

add

Add an entry to the structure learning trace.

Parameters:

  • activity
    (GraphAction) –

    Action e.g. initialisation, add arc.

  • details
    (dict) –

    Supplementary details relevant to activity.

Returns:

  • Trace ( Trace ) –

    Returns trace after entry added.

Raises:

  • TypeError

    If arguments have invalid types.

context_string classmethod

context_string(context: Dict[str, Any], start: float) -> str

Return a trace context as a human readable string.

Parameters:

  • context
    (dict) –

    Individual context information.

  • start
    (float) –

    Start time information.

Returns:

  • str ( str ) –

    Context information in readable form.

diffs_from

diffs_from(
    ref: Trace, strict: bool = True
) -> Optional[Tuple[Dict[Any, Any], List[int], str]]

Find differences of trace from reference trace.

Parameters:

  • ref
    (Trace) –

    Reference trace to compare this one to.

  • strict
    (bool, default: True ) –

    Whether floats are tested to be strictly the same or just reasonably similar. Defaults to True.

Returns:

  • Optional[Tuple[Dict[Any, Any], List[int], str]]

    tuple or None: (major differences, minor differences, textual summary) or None if identical.

Raises:

  • TypeError

    If ref is not of type Trace.

  • ValueError

    If either trace is invalid.
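
The effect of strict can be pictured with a small standalone sketch. This is not the library's implementation: the helper name scores_match and the relative tolerance of 1e-10 are assumptions chosen for illustration, since the tolerance that diffs_from applies is not stated here.

```python
import math


def scores_match(a: float, b: float, strict: bool = True,
                 rel_tol: float = 1e-10) -> bool:
    """Compare two scores exactly (strict) or within a relative tolerance.

    Hypothetical helper: rel_tol is an illustrative value, not one taken
    from the library.
    """
    if strict:
        return a == b
    return math.isclose(a, b, rel_tol=rel_tol)


# Two scores that differ only by floating-point rounding noise
x = 0.1 + 0.2
y = 0.3
print(scores_match(x, y, strict=True))   # False
print(scores_match(x, y, strict=False))  # True
```

A non-strict comparison of this kind is useful when the same learning run is repeated on different platforms, where scores may differ in the last few digits.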

get

get() -> DataFrame

Return the trace information.

Returns:

  • DataFrame ( DataFrame ) –

    Trace as Pandas data frame.

read classmethod

read(partial_id: str, root_dir: str) -> Optional[Dict[str, Trace]]

Read set of Traces matching partial_id from serialised file.

Parameters:

  • partial_id
    (str) –

    Partial_id of Trace.

  • root_dir
    (str) –

    Root directory holding trace files.

Returns:

  • Optional[Dict[str, Trace]]

    dict or None: {key: Trace} of traces matching partial id.

Raises:

  • TypeError

    If arguments are not strings.

  • FileNotFoundError

    If root_dir doesn't exist.

  • ValueError

    If partial_id is empty or the serialised file is not a dictionary of traces.

rename

rename(name_map: Dict[str, str]) -> None

Rename nodes in trace in place according to name map.

Parameters:

  • name_map
    (dict) –

    Name mapping {name: new name}.

Raises:

  • TypeError

    If arguments have invalid types.
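
As an illustration of in-place renaming with a {name: new name} map, the sketch below applies a map to a plain list of arcs. The helper rename_arcs and the list-of-tuples layout are hypothetical and do not reflect how Trace stores its entries. Leaving unmapped nodes unchanged is also an assumption, as the behaviour of rename for such nodes is not documented here.

```python
from typing import Dict, List, Optional, Tuple

Arc = Optional[Tuple[str, str]]


def rename_arcs(arcs: List[Arc], name_map: Dict[str, str]) -> None:
    """Rename node names in a list of arcs in place.

    Hypothetical helper: nodes missing from name_map keep their name.
    """
    if not isinstance(name_map, dict):
        raise TypeError("name_map must be a dict")
    for i, arc in enumerate(arcs):
        if arc is not None:  # e.g. an initialisation entry has no arc
            arcs[i] = (name_map.get(arc[0], arc[0]),
                       name_map.get(arc[1], arc[1]))


arcs = [None, ("A", "B"), ("B", "C")]
rename_arcs(arcs, {"A": "X1", "B": "X2"})
print(arcs)  # [None, ('X1', 'X2'), ('X2', 'C')]
```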

save

save(root_dir: str) -> None

Save the trace to a composite serialised (pickle) file.

Parameters:

  • root_dir
    (str) –

    Root directory under which pickle files saved.

Raises:

  • TypeError

    If arguments have invalid types.

  • ValueError

    If no id defined for trace.

  • FileNotFoundError

    If root_dir does not exist.

set_result

set_result(result: SDG) -> Trace

Set the result of the learning GraphAction.

Parameters:

  • result
    (SDG) –

    Graph result from learning activity.

Returns:

  • Trace ( Trace ) –

    Current Trace to support chaining.

Raises:

  • TypeError

    If result argument is not a SDG.
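
Because add, set_result and set_treestats each return the Trace, calls can be chained. The sketch below shows only that pattern using MiniTrace, a hypothetical stand-in class. Plain strings and dicts replace the GraphAction and SDG objects that the real methods expect.

```python
from typing import Any, Dict, List, Optional


class MiniTrace:
    """Hypothetical stand-in for Trace, showing only the chaining pattern."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []
        self.result: Optional[Any] = None

    def add(self, activity: str, details: Dict[str, Any]) -> "MiniTrace":
        self.entries.append({"activity": activity, **details})
        return self  # returning self is what enables chaining

    def set_result(self, result: Any) -> "MiniTrace":
        self.result = result
        return self


trace = (
    MiniTrace()
    .add("init", {"score": -120.5})
    .add("add", {"arc": ("A", "B"), "score": -110.2})
    .set_result("A->B")
)
print(len(trace.entries), trace.result)  # 2 A->B
```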

set_treestats

set_treestats(treestats: Any) -> Trace

Set the statistics of a tree learning GraphAction.

Parameters:

  • treestats
    (TreeStats) –

    Statistics from tree learning activity.

Returns:

  • Trace ( Trace ) –

    Current Trace to support chaining.

Raises:

  • TypeError

    If treestats argument is incorrect type.

update_scores classmethod

update_scores(
    series: str,
    networks: List[str],
    score: str,
    root_dir: str,
    save: bool = False,
    test: bool = False,
) -> Dict[Tuple[str, str], Tuple[Optional[float], float]]

Update score in all traces of a series.

Parameters:

  • series
    (str) –

    Series to update traces for.

  • networks
    (list) –

    List of networks to update.

  • score
    (str) –

    Score to update e.g. 'bic', 'loglik'.

  • root_dir
    (str) –

    Root directory holding trace files.

  • save
    (bool, default: False ) –

    Whether to save updated scores in trace file. Defaults to False.

  • test
    (bool, default: False ) –

    Whether score should be evaluated on test data. Defaults to False.

Raises:

  • ValueError

    If arguments have invalid values.

Main class for recording and managing structure learning traces with detailed context, iteration-by-iteration recording, and comparison capabilities.

DiffType

Enumeration defining the different types of differences that can be detected when comparing traces.

CompatibilityUnpickler

Custom unpickler that handles module path changes for backward compatibility.

Maps specific classes that have been moved between modules.

Methods:

  • find_class

    Override find_class to handle module path changes.

find_class

find_class(module: str, name: str) -> Any

Override find_class to handle module path changes.

Parameters:

  • module
    (str) –

    Original module name from pickle.

  • name
    (str) –

    Class name.

Returns:

  • Any

    The class object from the new location.

Custom unpickler for handling backward compatibility when loading trace files with module path changes.
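
The general technique is to subclass pickle.Unpickler and override find_class, as in the minimal sketch below. The class name RemappingUnpickler, the MOVED table and the old_pkg.numbers path are illustrative assumptions. The mappings that CompatibilityUnpickler actually applies are not listed here.

```python
import io
import pickle
from typing import Any, Dict, Tuple

# Hypothetical mapping: (old module, class name) -> (new module, class name)
MOVED: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("old_pkg.numbers", "Fraction"): ("fractions", "Fraction"),
}


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects moved classes to their new location."""

    def find_class(self, module: str, name: str) -> Any:
        # Fall back to the recorded location when no mapping exists
        module, name = MOVED.get((module, name), (module, name))
        return super().find_class(module, name)


# Resolve a class recorded under its old (no longer importable) module path
unpickler = RemappingUnpickler(io.BytesIO(b""))
print(unpickler.find_class("old_pkg.numbers", "Fraction"))
```

Classes without an entry in the table are resolved exactly as the standard unpickler would resolve them, so files written with the current module layout load unchanged.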

Utility Functions

load_with_compatibility

load_with_compatibility(
    file_handle: Any, compression: str = "gzip", **kwargs: Any
) -> Any

Load pickled data with module compatibility handling.

Parameters:

  • file_handle

    (Any) –

    File handle to read from.

  • compression

    (str, default: 'gzip' ) –

    Compression type.

Returns:

  • Any

    Unpickled object.

Load trace files with backward compatibility support for older module structures.
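
A loader of this shape can be sketched with the standard library alone. The function below is a hypothetical illustration, not the library's code. The unpickler_cls parameter is an addition for the example, and treating any compression value other than "gzip" as uncompressed is an assumption.

```python
import gzip
import io
import pickle
from typing import Any, BinaryIO, Type


def load_pickle(file_handle: BinaryIO, compression: str = "gzip",
                unpickler_cls: Type[pickle.Unpickler] = pickle.Unpickler
                ) -> Any:
    """Unpickle from a binary handle, optionally decompressing with gzip.

    Hypothetical helper; unpickler_cls lets a custom Unpickler subclass
    (e.g. one overriding find_class) be plugged in.
    """
    if compression == "gzip":
        with gzip.GzipFile(fileobj=file_handle, mode="rb") as stream:
            return unpickler_cls(stream).load()
    return unpickler_cls(file_handle).load()


# Round trip through an in-memory gzip-compressed pickle
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as stream:
    pickle.dump({"run-1": [1, 2, 3]}, stream)
buffer.seek(0)
print(load_pickle(buffer))  # {'run-1': [1, 2, 3]}
```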

Overview

Structure Learning Tracing

The Trace class provides comprehensive functionality for recording and analyzing causal graph structure learning:

Recording Capabilities:

  • Context Management: Store learning algorithm parameters, data characteristics, and environment information
  • Iteration Tracking: Record each step of the learning process with detailed action information
  • Result Storage: Capture final learned graphs and search statistics
  • Flexible Storage: Support for compressed file formats and cross-platform compatibility

Analysis Features:

  • Trace Comparison: Compare two traces to identify differences in learning paths
  • Score Updates: Retroactively update or recalculate scores using different metrics
  • Variable Renaming: Handle variable name changes across different datasets
  • Statistical Summaries: Generate summaries of learning algorithm behavior

Key Methods

Creation and Management:

  • __init__(): Initialize a new trace with optional context information
  • add(): Record a new iteration with action details and graph state
  • read(): Load trace data from files with compatibility support
  • save(): Save trace data to compressed files

Analysis and Comparison:

  • diffs_from(): Compare this trace with another to identify differences
  • update_scores(): Recalculate scores using different scoring functions
  • get(): Export trace data as a pandas DataFrame for analysis
  • rename(): Apply variable name mappings throughout the trace

Results and Context:

  • set_result(): Set the final learned graph
  • set_treestats(): Add tree search statistics
  • context_string(): Generate human-readable context descriptions

Trace Differences

The tracing system can identify several types of differences between learning runs:

  • MISSING/EXTRA: Iterations present in one trace but not another
  • ACTION: Different actions taken at corresponding iterations
  • ARC: Different arcs modified in corresponding actions
  • SCORE: Different scores recorded for the same actions
  • DETAILS: Differences in recorded action details
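
The categories above can be illustrated with a toy classifier over two lists of entries. Everything in the sketch is hypothetical: the Diff enum, the classify function and the activity/arc/score keys are invented for the example and are not the DiffType members or the Trace storage format. Treating MISSING as "only in the reference" and EXTRA as "only in this trace" is an assumption, and the DETAILS category is omitted for brevity.

```python
from enum import Enum
from typing import Any, Dict, List, Tuple


class Diff(Enum):
    """Hypothetical difference categories mirroring the list above."""
    MISSING = "missing"
    EXTRA = "extra"
    ACTION = "action"
    ARC = "arc"
    SCORE = "score"


Entry = Dict[str, Any]


def classify(trace: List[Entry], ref: List[Entry]) -> List[Tuple[int, Diff]]:
    """Report the first difference found at each iteration of two traces."""
    diffs: List[Tuple[int, Diff]] = []
    for i in range(max(len(trace), len(ref))):
        if i >= len(trace):
            diffs.append((i, Diff.MISSING))   # only in the reference
        elif i >= len(ref):
            diffs.append((i, Diff.EXTRA))     # only in this trace
        elif trace[i]["activity"] != ref[i]["activity"]:
            diffs.append((i, Diff.ACTION))
        elif trace[i]["arc"] != ref[i]["arc"]:
            diffs.append((i, Diff.ARC))
        elif trace[i]["score"] != ref[i]["score"]:
            diffs.append((i, Diff.SCORE))
    return diffs


ref = [
    {"activity": "init", "arc": None, "score": -10.0},
    {"activity": "add", "arc": ("A", "B"), "score": -8.0},
    {"activity": "add", "arc": ("B", "C"), "score": -7.0},
]
trace = [
    {"activity": "init", "arc": None, "score": -10.0},
    {"activity": "add", "arc": ("A", "C"), "score": -8.5},
]
print(classify(trace, ref))
```

Here the second iteration is reported as an ARC difference, because the arc is checked before the score, and the third as MISSING, because it appears only in the reference.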

File Compatibility

The module includes robust backward compatibility features:

  • Module Migration: Handles changes in module structure over time
  • Class Mapping: Automatically maps old class locations to new ones
  • Version Tracking: Records software versions for reproducibility
  • Compressed Storage: Efficient storage using gzip compression

Integration with Graph Actions

Traces use the graph action enumerations to provide detailed records of:

  • Which specific arcs were added, deleted, or reversed
  • Score changes resulting from each modification
  • Alternative actions that were considered but not taken
  • Statistical constraints and prior knowledge influences
  • Debugging information for algorithm development

This level of tracing supports:

  • Algorithm development and debugging
  • Reproducible research in causal discovery
  • Performance analysis and optimization
  • Comparative studies of different learning approaches
  • Educational demonstrations of structure learning behavior