Action Auto-Discovery Design
Overview
The auto-discovery system implements a plugin architecture using Python entry points where actions are automatically found and registered without any configuration files. This design eliminates the need for manual registry management while providing a seamless plugin ecosystem for causal discovery workflows.
The system uses lazy loading to avoid circular imports: entry points are discovered at startup (metadata only), but the action classes themselves are imported only when first used.
How Auto-Discovery Works: The Complete Journey
Phase 1: System Initialisation
Step 1: Workflow Engine Starts
When a user runs causaliq-workflow experiment.yml:
- WorkflowExecutor initialisation: Creates a new WorkflowExecutor instance
- ActionRegistry creation: Instantiates ActionRegistry(), which triggers discovery
- Entry point discovery: The registry scans for causaliq.actions entry points
Step 2: Entry Point Discovery (Lazy Loading)
The system discovers actions without importing them:
- Entry point enumeration: Uses importlib.metadata.entry_points() to find all packages that declare causaliq.actions entry points
- Metadata recording: Stores entry point metadata (name, module path) without loading
- Deferred loading: Actual imports happen only when an action is first used
This lazy loading approach is critical for avoiding circular imports, since
action packages (like causaliq-knowledge) depend on causaliq-workflow.
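The deferral can be seen in isolation with a small sketch. It assumes nothing about causaliq-workflow itself: the module name lazy_demo_plugin and the action name lazy-demo are hypothetical, and the entry point is built by hand rather than read from an installed package.

```python
import sys
import tempfile
from importlib.metadata import EntryPoint
from pathlib import Path

# Write a throwaway plugin module to a temporary directory (hypothetical name).
plugin_dir = tempfile.mkdtemp()
Path(plugin_dir, "lazy_demo_plugin.py").write_text(
    "class ActionProvider:\n"
    "    name = 'lazy-demo'\n"
)
sys.path.insert(0, plugin_dir)

# An EntryPoint is only metadata: a name, a "module:attr" value, and a group.
ep = EntryPoint(
    name="lazy-demo",
    value="lazy_demo_plugin:ActionProvider",
    group="causaliq.actions",
)

# Holding the metadata does not import the module.
imported_before = "lazy_demo_plugin" in sys.modules

# load() performs the import and returns the referenced object.
action_class = ep.load()
imported_after = "lazy_demo_plugin" in sys.modules

print(imported_before, imported_after, action_class.name)
```

The module only appears in sys.modules after load() is called, which is the property the registry relies on.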
Phase 2: Action Loading (On Demand)
Step 3: First Use Triggers Loading
When a workflow references an action:
steps:
  - name: "Generate Graph"
    uses: "causaliq-knowledge"  # Triggers lazy load
    with:
      action: "generate_graph"
The system:
- Entry point lookup: Finds the entry point for "causaliq-knowledge"
- Class loading: Calls entry_point.load() to import the action class
- Type validation: Verifies it's a valid BaseActionProvider subclass
- Caching: Stores the loaded class for future use
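The type validation step deserves some care, because an entry point can resolve to any object, not only a class. The following is a minimal sketch of one way to check it. BaseActionProvider here is a stand-in defined for the example, and rejecting the base class itself is a choice made in this sketch, not confirmed behaviour of the real registry.

```python
import inspect


class BaseActionProvider:
    """Stand-in for the framework base class (assumption for this sketch)."""

    def run(self, inputs, **kwargs):
        raise NotImplementedError


def is_valid_action_class(obj: object) -> bool:
    """Return True only for proper subclasses of BaseActionProvider."""
    # inspect.isclass guards issubclass, which raises TypeError for non-classes.
    return (
        inspect.isclass(obj)
        and issubclass(obj, BaseActionProvider)
        and obj is not BaseActionProvider
    )


class GoodAction(BaseActionProvider):
    def run(self, inputs, **kwargs):
        return {"status": "complete"}


print(is_valid_action_class(GoodAction))    # a real subclass
print(is_valid_action_class(GoodAction()))  # an instance, not a class
print(is_valid_action_class(dict))          # an unrelated class
```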
Step 4: Action Execution
Once loaded:
- Parameter preparation: Collects parameters from the with block
- Input validation: Action validates its own inputs using schemas
- Execution: Calls action.run(inputs) with validated parameters
- Output handling: Processes and stores action outputs
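The four steps above can be sketched as a single function. This is an illustration under stated assumptions, not the executor's actual code: execute_step, DoubleAction, and the stand-in BaseActionProvider are hypothetical, and the hand-written type check stands in for whatever schema validation the real action performs.

```python
from typing import Any, Dict, Type


class BaseActionProvider:
    """Stand-in for the framework base class (assumption for this sketch)."""

    def run(self, inputs: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
        raise NotImplementedError


class DoubleAction(BaseActionProvider):
    """Hypothetical action that doubles a numeric input."""

    def run(self, inputs: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
        # Input validation: the action checks its own inputs.
        if not isinstance(inputs.get("value"), (int, float)):
            raise ValueError("'value' must be a number")
        return {"status": "complete", "result": inputs["value"] * 2}


def execute_step(
    step: Dict[str, Any], action_class: Type[BaseActionProvider]
) -> Dict[str, Any]:
    """Run one workflow step with an already-loaded action class."""
    # Parameter preparation: copy the step's "with" block.
    inputs = dict(step.get("with", {}))
    # Execution: instantiate the action and call run().
    outputs = action_class().run(inputs)
    # Output handling: this sketch only checks the return type.
    if not isinstance(outputs, dict):
        raise TypeError("run() must return a dictionary")
    return outputs


step = {"name": "Double it", "uses": "double", "with": {"value": 21}}
outputs = execute_step(step, DoubleAction)
print(outputs)
```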
Phase 3: Fallback Module Scanning
For backwards compatibility and development scenarios, the registry also supports module scanning:
- Module enumeration: Scans sys.modules for already-imported modules
- Convention detection: Looks for modules exporting ActionProvider
- Registration: Adds discovered actions to the registry
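A minimal sketch of such a scan follows. It is an assumption about how the fallback might look, not the registry's actual code: scan_loaded_modules and the stand-in BaseActionProvider are hypothetical, and keying the result by module name is a choice made for this example.

```python
import sys
import types
from typing import Dict, Type


class BaseActionProvider:
    """Stand-in for the framework base class (assumption for this sketch)."""


def scan_loaded_modules(
    attr_name: str = "ActionProvider",
) -> Dict[str, Type[BaseActionProvider]]:
    """Find already-imported modules that export a valid action class."""
    found: Dict[str, Type[BaseActionProvider]] = {}
    # Copy the items first: sys.modules can change while we iterate.
    for module_name, module in list(sys.modules.items()):
        if not isinstance(module, types.ModuleType):
            continue
        # Read the namespace directly so no module-level __getattr__ hook runs.
        candidate = vars(module).get(attr_name)
        if (
            isinstance(candidate, type)
            and issubclass(candidate, BaseActionProvider)
            and candidate is not BaseActionProvider
        ):
            found[module_name] = candidate
    return found


# Usage: simulate an already-imported action module (hypothetical name).
demo = types.ModuleType("demo_action_pkg")
demo.ActionProvider = type(
    "ActionProvider", (BaseActionProvider,), {"name": "demo-action"}
)
sys.modules["demo_action_pkg"] = demo

found = scan_loaded_modules()
print(found["demo_action_pkg"].name)
```

Because the scan only sees modules that have already been imported, it complements entry point discovery and cannot replace it.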
The Plugin Developer Experience
Creating an Action Package: Step-by-Step
Step 1: Standard Python Package Setup
Developers start with familiar Python packaging:
mkdir causaliq-pc-algorithm
cd causaliq-pc-algorithm
File: pyproject.toml
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "causaliq-pc-algorithm"
version = "1.0.0"
description = "PC algorithm implementation for CausalIQ workflows"
dependencies = [
"causaliq-workflow>=0.1.1",
"networkx>=2.8.0",
"pandas>=1.5.0"
]
# CRITICAL: Register your action as an entry point
[project.entry-points."causaliq.actions"]
causaliq-pc-algorithm = "causaliq_pc_algorithm:ActionProvider"
The entry point declaration tells causaliq-workflow where to find your
action class. The format is:
action-name = "module.path:ClassName"
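To see how the declaration is interpreted, the standard library's EntryPoint class can parse the same string. The sketch below builds the entry point by hand using the values from the pyproject.toml above; no package needs to be installed because nothing is loaded.

```python
from importlib.metadata import EntryPoint

# Same name, value, and group as the pyproject.toml declaration.
ep = EntryPoint(
    name="causaliq-pc-algorithm",
    value="causaliq_pc_algorithm:ActionProvider",
    group="causaliq.actions",
)

# The part before the colon is the module to import;
# the part after it is the attribute looked up in that module.
print(ep.name)    # causaliq-pc-algorithm
print(ep.module)  # causaliq_pc_algorithm
print(ep.attr)    # ActionProvider
```

The entry point name (left of the equals sign) is what workflows reference in uses:, while the module path uses the importable package name with underscores.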
Step 2: Implement the Action Class
File: src/causaliq_pc_algorithm/__init__.py
"""PC algorithm action for CausalIQ workflows."""
import networkx as nx
import pandas as pd

from causaliq_workflow.action import BaseActionProvider


# The class name can be anything, but must match the entry point
class ActionProvider(BaseActionProvider):
    """PC algorithm for causal structure learning."""

    name = "causaliq-pc-algorithm"
    version = "1.0.0"
    description = "PC algorithm for causal structure learning"

    def run(self, inputs, **kwargs):
        """Execute the PC algorithm.

        Args:
            inputs: Dictionary containing:
                - dataset: Path to CSV data file
                - alpha: Significance level (default: 0.05)
                - output_path: Where to save results

        Returns:
            Dictionary with graph_path, nodes, and edges count.
        """
        # Read data
        data = pd.read_csv(inputs["dataset"])

        # Run PC algorithm
        alpha = inputs.get("alpha", 0.05)
        graph = self._pc_algorithm(data, alpha)

        # Save results
        output_path = inputs["output_path"]
        graph_path = f"{output_path}/graph.xml"
        nx.write_graphml(graph, graph_path)

        return {
            "graph_path": graph_path,
            "nodes": len(graph.nodes),
            "edges": len(graph.edges),
        }

    def _pc_algorithm(self, data, alpha):
        """PC algorithm implementation; must return a networkx graph."""
        # Implementation here. Raising makes the placeholder fail loudly
        # instead of returning None and breaking run() later.
        raise NotImplementedError("PC algorithm not implemented in this example")
Step 3: Installation and Immediate Availability
# Install the package (editable mode for development)
pip install -e .
# Action is immediately available in workflows
causaliq-workflow my-experiment.yml
File: my-experiment.yml
description: "PC Algorithm Test"
id: pc-test
steps:
  - name: "Learn Structure"
    uses: "causaliq-pc-algorithm"  # Automatically discovered via entry point
    with:
      dataset: "/data/asia.csv"
      alpha: 0.01
      output_path: "/results/pc"
Entry Point Discovery Implementation
The Lazy Loading Pattern
import sys
from importlib.metadata import entry_points
from typing import Any, Dict, Optional, Type

from causaliq_workflow.action import BaseActionProvider

# ActionRegistryError is the registry's own exception type (definition not shown).


class ActionRegistry:
    def __init__(self):
        self._actions: Dict[str, Type[BaseActionProvider]] = {}
        self._entry_points: Dict[str, Any] = {}  # Metadata only
        self._discover_entry_points()

    def _discover_entry_points(self) -> None:
        """Discover entry points WITHOUT importing them."""
        # Python 3.10+ API
        if sys.version_info >= (3, 10):
            eps = entry_points(group="causaliq.actions")
        else:
            # Python 3.9 compatibility: entry_points() returns a dict of groups
            all_eps = entry_points()
            eps = all_eps.get("causaliq.actions", [])

        for ep in eps:
            # Store metadata only - no imports yet!
            self._entry_points[ep.name] = ep

    def _load_entry_point(self, name: str) -> Optional[Type[BaseActionProvider]]:
        """Load an entry point on first use."""
        if name not in self._entry_points:
            return None

        ep = self._entry_points[name]
        action_class = ep.load()  # Import happens here

        # Validate and cache. The isinstance check comes first because
        # issubclass() raises TypeError when given a non-class object.
        if isinstance(action_class, type) and issubclass(
            action_class, BaseActionProvider
        ):
            self._actions[name] = action_class
            return action_class
        return None

    def get_action_class(self, name: str) -> Type[BaseActionProvider]:
        """Get action class, loading from entry point if needed."""
        # Return cached action
        if name in self._actions:
            return self._actions[name]

        # Try lazy loading from entry point
        if name in self._entry_points:
            action_class = self._load_entry_point(name)
            if action_class:
                return action_class

        raise ActionRegistryError(f"Action '{name}' not found")
Why Lazy Loading?
Lazy loading solves a critical circular import problem:
causaliq-workflow (provides BaseActionProvider base class)
↑
└── causaliq-knowledge (imports BaseActionProvider, exports action)
If causaliq-workflow eagerly imported causaliq-knowledge during registry
initialisation, it would fail because causaliq-knowledge needs to import
from causaliq-workflow first.
By deferring the import until the action is actually used, we ensure:
1. causaliq-workflow initialises completely first
2. causaliq-knowledge can import from it safely
3. The action class is then loaded on demand
Ecosystem Integration Patterns
Distribution Strategy
Action packages follow standard Python distribution:
- PyPI publishing: pip install causaliq-tetrad-bridge
- GitHub releases: Direct installation from repositories
- Local development: pip install -e . for development packages
- Version management: Standard semantic versioning
Dependency Management
Actions declare their dependencies naturally:
# Action package dependencies
dependencies = [
"causaliq-workflow>=0.1.1", # Framework dependency
"rpy2>=3.5.0", # R interface (if needed)
"jpype1>=1.4.0", # Java interface (if needed)
"scikit-learn>=1.2.0" # Algorithm dependencies
]
# REQUIRED: Entry point registration
[project.entry-points."causaliq.actions"]
my-action = "my_action:ActionProvider"
Cross-Language Bridge Pattern
For R and Java integrations:
class RBridgeAction(BaseActionProvider):
    """Bridge to R-based causal discovery algorithms."""

    name = "r-bridge-action"
    version = "1.0.0"
    description = "Execute R algorithms via rpy2"

    def __init__(self):
        super().__init__()
        # Initialise R environment
        self.r_session = self._setup_r_environment()

    def run(self, inputs, **kwargs):
        # Execute R code through rpy2
        result = self.r_session.run_algorithm(inputs)
        return self._convert_r_output(result)
Benefits of Entry Point Discovery
For Users
- Zero configuration: Install and use immediately
- No registry management: No config files to maintain
- Standard workflow: Familiar pip install → use pattern
- Automatic updates: New action versions are available as soon as the package is upgraded
For Developers
- Standard Python packaging: Use pyproject.toml entry points
- Simple registration: One line in pyproject.toml
- Full Python ecosystem: Use any Python dependencies
- Independent development: No coordination with core framework needed
- No circular imports: Lazy loading handles dependencies correctly
For the Ecosystem
- Organic growth: Easy to create and share actions
- Version management: Standard semantic versioning
- Quality control: Actions are regular Python packages with testing
- Documentation: Standard Python documentation tools apply
Quick Reference: Creating a New Action
Minimal Example
pyproject.toml:
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "my-causaliq-action"
version = "1.0.0"
dependencies = ["causaliq-workflow>=0.1.1"]
[project.entry-points."causaliq.actions"]
my-causaliq-action = "my_causaliq_action:ActionProvider"
[tool.setuptools.packages.find]
where = ["src"]
src/my_causaliq_action/__init__.py:
from causaliq_workflow.action import BaseActionProvider


class ActionProvider(BaseActionProvider):
    name = "my-causaliq-action"
    version = "1.0.0"
    description = "My custom action"

    def run(self, inputs, **kwargs):
        # Your implementation here
        return {"status": "complete", "result": inputs.get("value", 0) * 2}
Usage in workflow.yaml:
description: "Test my action"
id: test-my-action
steps:
  - name: "Run my action"
    uses: "my-causaliq-action"
    with:
      value: 42
Checklist
- [ ] Create pyproject.toml with entry point registration
- [ ] Implement action class inheriting from BaseActionProvider
- [ ] Set name, version, description class attributes
- [ ] Implement run(self, inputs, **kwargs) method
- [ ] Return a dictionary from run()
- [ ] Install with pip install -e . for development
- [ ] Test with causaliq-workflow --dry-run workflow.yaml
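Before running a workflow, it can help to confirm that the entry point is visible to Python at all. The sketch below is a hypothetical helper, not part of causaliq-workflow: find_action_entry_point is a name chosen for this example, and it only inspects installed package metadata without importing the action.

```python
import sys
from importlib.metadata import entry_points


def find_action_entry_point(name, group="causaliq.actions"):
    """Return the installed entry point called `name`, or None if absent."""
    if sys.version_info >= (3, 10):
        eps = entry_points(group=group)
    else:
        # Python 3.9: entry_points() returns a dict keyed by group.
        eps = entry_points().get(group, [])
    for ep in eps:
        if ep.name == name:
            return ep
    return None


ep = find_action_entry_point("my-causaliq-action")
if ep is None:
    print("Not registered: check pyproject.toml and reinstall the package")
else:
    print(f"Registered: {ep.name} -> {ep.value}")
```

If the helper reports that the action is not registered after an editable install, reinstalling the package usually refreshes the entry point metadata.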