Action Auto-Discovery Design
Overview
The auto-discovery system implements a convention-over-configuration approach where actions are automatically found and registered without any configuration files. This design eliminates the need for manual registry management while providing a seamless plugin ecosystem for causal discovery workflows.
How Auto-Discovery Works: The Complete Journey
Phase 1: System Initialization
Step 1: Workflow Engine Starts
When a user runs causaliq-workflow experiment.yml:
- WorkflowExecutor initialization: Creates a new
WorkflowExecutorinstance - ActionRegistry creation: Instantiates
ActionRegistry()which triggers discovery - Discovery activation: The registry immediately begins scanning for actions
Step 2: Python Environment Scanning
The system methodically searches the Python environment:
- Module enumeration: Uses
pkgutil.iter_modules()to list all importable modules - Safe import attempts: Tries to import each module with comprehensive error handling
- Module filtering: Focuses only on modules that successfully import
Phase 2: Action Detection and Registration
Step 3: Convention-Based Detection
For each successfully imported module, the system applies the discovery convention:
- Attribute inspection: Checks if the module has an attribute named 'CausalIQAction'
- Type validation: Verifies that the 'CausalIQAction' attribute is a class
- Inheritance verification: Confirms it inherits from
causaliq_workflow.action.Action - Action validation: Ensures the class implements required abstract methods
Step 4: Automatic Registration
When an action is detected:
- Name extraction: Uses the module name as the action identifier
- Registry storage: Stores mapping:
{module_name: CausalIQAction_class} - Metadata collection: Gathers action name, version, description from the class
- Availability confirmation: Marks the action as ready for workflow use
Phase 3: Workflow Execution Integration
Step 5: YAML Parsing and Action Resolution
When processing a workflow file:
steps:
- name: "Custom Analysis"
uses: "my_custom_action" # This triggers action lookup
with:
parameter: "value"
The system follows this resolution path:
- Action name extraction: Parses
uses: "my_custom_action" - Registry lookup: Searches registered actions for "my_custom_action"
- Class retrieval: Gets the corresponding CausalIQAction class
- Instance creation: Instantiates the action with
CausalIQAction()
Step 6: Dynamic Action Execution
During step execution:
- Parameter preparation: Collects parameters from the
withblock - Input validation: Action validates its own inputs using schemas
- Execution: Calls
action.run(inputs)with validated parameters - Output handling: Processes and stores action outputs
The Plugin Developer Experience
Creating an Action Package: Step-by-Step
Step 1: Standard Python Package Setup
Developers start with familiar Python packaging:
mkdir causaliq-pc-algorithm
cd causaliq-pc-algorithm
File: pyproject.toml
[project]
name = "causaliq-pc-algorithm"
version = "1.0.0"
description = "PC algorithm implementation for CausalIQ workflows"
dependencies = [
"causaliq-workflow>=0.1.0",
"networkx>=2.8.0",
"pandas>=1.5.0"
]
Step 2: Implement the Action Convention
File: causaliq_pc_algorithm/init.py
from causaliq_workflow.action import Action
import pandas as pd
import networkx as nx
# Must be named 'CausalIQAction' for auto-discovery
class CausalIQAction(Action):
name = "causaliq-pc-algorithm"
version = "1.0.0"
description = "PC algorithm for causal structure learning"
def run(self, inputs):
# Read data
data = pd.read_csv(inputs['dataset'])
# Run PC algorithm
alpha = inputs.get('alpha', 0.05)
graph = self.pc_algorithm(data, alpha)
# Save results
output_path = inputs['output_path']
nx.write_graphml(graph, f"{output_path}/graph.xml")
return {
"graph_path": f"{output_path}/graph.xml",
"nodes": len(graph.nodes),
"edges": len(graph.edges)
}
def pc_algorithm(self, data, alpha):
# PC algorithm implementation
pass
Step 3: Installation and Immediate Availability
# Install the package
pip install causaliq-pc-algorithm
# Action is immediately available in workflows
causaliq-workflow my-experiment.yml
File: my-experiment.yml
name: "PC Algorithm Test"
steps:
- name: "Learn Structure"
uses: "causaliq-pc-algorithm" # Automatically discovered
with:
dataset: "/data/asia.csv"
alpha: 0.01
output_path: "/results/pc"
Discovery Process Implementation Details
Module Introspection Strategy
The system uses Python's built-in introspection capabilities:
import pkgutil
import importlib
from typing import Dict, Type
class ActionRegistry:
def __init__(self):
self.actions: Dict[str, Type[Action]] = {}
self._discover_actions()
def _discover_actions(self):
"""Automatically discover and register all available actions."""
# Scan all importable modules
for finder, module_name, ispkg in pkgutil.iter_modules():
try:
# Safely attempt to import module
module = importlib.import_module(module_name)
# Check for CausalIQAction class using naming convention
if hasattr(module, 'CausalIQAction'):
action_class = getattr(module, 'CausalIQAction')
# Verify it's actually an Action subclass
if (isinstance(action_class, type) and
issubclass(action_class, Action) and
action_class != Action):
# Register using module name
self.actions[module_name] = action_class
except ImportError:
# Skip modules that can't be imported
continue
Error Handling and Resilience
The discovery process is designed to be robust:
- Import error tolerance: Failed module imports are silently skipped
- Type safety: Strict validation of action classes
- Namespace isolation: Each action operates in its own namespace
- Graceful degradation: System continues working even if some actions fail to load
Performance Considerations
The auto-discovery system is optimized for efficiency:
- Lazy loading: Actions are only instantiated when needed
- One-time discovery: Module scanning happens once during registry creation
- Minimal overhead: Discovery adds negligible startup time
- Cached results: Action registry is reused across workflow steps
Ecosystem Integration Patterns
Distribution Strategy
Action packages follow standard Python distribution:
- PyPI publishing:
pip install causaliq-tetrad-bridge - GitHub releases: Direct installation from repositories
- Local development:
pip install -e .for development packages - Version management: Standard semantic versioning
Dependency Management
Actions declare their dependencies naturally:
# Action package dependencies
dependencies = [
"causaliq-workflow>=0.1.0", # Framework dependency
"rpy2>=3.5.0", # R interface (if needed)
"jpype1>=1.4.0", # Java interface (if needed)
"scikit-learn>=1.2.0" # Algorithm dependencies
]
Cross-Language Bridge Pattern
For R and Java integrations:
class RBridgeAction(Action):
def __init__(self):
# Initialize R environment
self.r_session = self._setup_r_environment()
def run(self, inputs):
# Execute R code through rpy2
result = self.r_session.run_algorithm(inputs)
return self._convert_r_output(result)
Benefits of the Auto-Discovery Approach
For Users
- Zero configuration: Install and use immediately
- No registry management: No config files to maintain
- Standard workflow: Familiar pip install → use pattern
- Automatic updates: New action versions available immediately
For Developers
- Standard Python packaging: No custom build systems
- Simple convention: Just export an 'Action' class
- Full Python ecosystem: Use any Python dependencies
- Independent development: No coordination with core framework needed
For the Ecosystem
- Organic growth: Easy to create and share actions
- Version management: Standard semantic versioning
- Quality control: Actions are regular Python packages with testing
- Documentation: Standard Python documentation tools apply