CausalIQ Workflow - Development Roadmap
Last updated: November 20, 2025
Current release: CLI Implementation Complete
🎯 Current Status
**✅ COMPLETED: Release 0.3 - Basic CLI
Latest commit: 36b16d8 feat: implement CLI with real-time workflow execution feedback
Current Capabilities:
- Complete command-line interface:
causaliq-workflow run [--dry-run] <workflow> - Real-time workflow execution with step-by-step feedback
- Action registry with plugin architecture
- 100% test coverage (471/471 lines) with full quality compliance
- Working end-to-end execution from YAML configuration
Next Release: 0.4 Enahnced Workflow
Target: 1.0 Production Workflow
✅ Completed Implementation
See Git commit history for detailed implementation progress
Key Commits:
ce41487- Action registry with auto-discovery plugin system302b70a- WorkflowExecutor with YAML parsing and matrix expansiona2c01da- Action framework with dummy structure learnerb9c9c81- Schema validation using JSON Schema36b16d8- CLI with real-time workflow execution feedback
Current Architecture:
- 📋 CLI:
causaliq-workflow run [--dry-run] <workflow> - 🔌 Action Registry: Auto-discovery plugin system
- ⚙️ WorkflowExecutor: YAML parsing, matrix expansion, step execution
- 📊 Schema Validation: JSON Schema with error reporting
- 🧩 Testing: 100% coverage, 201 tests passing
🛣️ Upcoming Implementation
Release 0.3: Enhanced Workflow
Key Deliverables: Conservative execution and dry-run capability
Commit 0.3.1: Basic Task Logging Infrastructure
- [ ] log_task() method - Implement formatted message output with status/runtime/files
- [ ] Message formatting - Standardized format: timestamp, action, status, description
- [ ] Comprehensive testing - All status types with various input/output scenarios
Commit 0.3.2: Action Output File Interface
- [ ] get_output_files() method - Add to Action base class for file discovery
- [ ] Default implementation - Empty list for actions without specific outputs
- [ ] Test integration - Implement in test actions for validation
Commit 0.3.3: FileManager Foundation
- [ ] FileManager class - File existence and comparison utilities
- [ ] Traditional file logic - Basic exists/missing detection for replace-semantics files
- [ ] Isolated testing - File operations without workflow integration
Commit 0.3.4: Skip Logic Implementation
- [ ] should_skip_action() method - Determine if action can skip based on existing outputs
- [ ] Traditional files only - Skip logic for replace-semantics files (no append-semantics yet)
- [ ] Comprehensive scenarios - Test various file existence and modification patterns
Commit 0.3.5: ActionExecutor Wrapper
- [ ] ActionExecutor class - Wrapper for action execution with status determination
- [ ] Status logic - EXECUTES vs SKIPS for traditional files in run mode
- [ ] Mock integration - Test execution wrapper without WorkflowExecutor changes
Commit 0.3.6: Dry-Run Status Logic
- [ ] WOULD_EXECUTE status - Implement dry-run equivalent of EXECUTES
- [ ] WOULD_SKIP status - Implement dry-run equivalent of SKIPS
- [ ] Mode differentiation - Proper status based on run vs dry-run mode
Commit 0.3.7: WorkflowExecutor Integration
- [ ] Logger creation - WorkflowExecutor creates and configures WorkflowLogger
- [ ] ActionExecutor usage - Replace direct action calls with ActionExecutor wrapper
- [ ] Regression testing - Ensure all existing workflows continue to pass
Release 0.4: Progress and Summary
Key deliverables: Real-time progress tracking and execution summary
Commit 0.4.1: Runtime Estimation Interface
- [ ] estimate_runtime() method - Add to Action base class for progress calculation
- [ ] Default estimation - 1-second default for actions without specific estimates
- [ ] Progress foundation - Basic estimation without user interface
Commit 0.4.2: Progress Calculation Engine
- [ ] Progress calculation - Aggregate runtime estimates for workflow progress tracking
- [ ] Background tracking - Progress computation without user interface display
- [ ] Accuracy testing - Validate progress calculation with various workflow scenarios
Commit 0.4.3: ProgressReporter Foundation
- [ ] ProgressReporter class - Click integration for progress bar display
- [ ] Basic structure - Progress bar initialization and configuration
- [ ] Static progress - Progress structure without real-time updates yet
Commit 0.4.4: Live Progress Integration
- [ ] Real-time updates - Connect progress reporter to workflow execution
- [ ] Action completion - Update progress as actions complete
- [ ] Optional display - Toggle progress bars based on CLI parameters
Commit 0.4.5: Status Aggregation & Summary
- [ ] Status aggregation - Count tasks by status type (EXECUTES, SKIPS, etc.)
- [ ] Summary formatting - Clear report with counts, runtime, resource usage
- [ ] Report accuracy - Comprehensive testing for summary calculation
Commit 0.4.6: Enhanced Error Reporting
- [ ] FAILED status formatting - User-friendly error messages with actionable suggestions
- [ ] INVALID_* status details - Clear parameter validation error reporting
- [ ] Error summary - Aggregate error information for debugging
Commit 0.4.7: CLI Enhancement & Testing
- [ ] Enhanced CLI options - Improve CLI based on real-world testing feedback
- [ ] Better error messages - Refine error handling discovered during external package testing
- [ ] Path resolution improvements - Handle edge cases found during real usage
- [ ] Complete CLI testing - All logging features with file output verification
- [ ] Performance validation - Logging overhead measurement and optimization
Release 0.5: Advanced Features
Key deliverables: Metadata, compare mode, timeouts, estimated completion.
Commit 0.5.1: Append-Semantics File Support
- [ ] get_output_contribution_key() - Action method for append-semantics identification
- [ ] has_existing_contribution() - Check if action's section exists in append-semantics files
- [ ] FileManager enhancement - Handle metadata.json style files with action-specific sections
Commit 0.5.2: File Comparison Foundation
- [ ] Comparison utilities - Basic file diff and comparison logic in FileManager
- [ ] Text file diffs - Generate meaningful comparisons for various file types
- [ ] Isolated testing - File comparison without execution integration
Commit 0.5.3: Compare Mode Status Logic
- [ ] IDENTICAL status - Implement when re-execution produces same outputs
- [ ] DIFFERENT status - Implement when re-execution produces changed outputs
- [ ] Compare mode execution - New execution path for output comparison
Commit 0.5.4: Resource Monitoring Infrastructure
- [ ] Memory monitoring - Track memory usage during action execution
- [ ] CPU monitoring - Track CPU utilization and report in log messages
- [ ] Resource reporting - Include resource usage in status messages
Commit 0.5.5: Timeout Handling
- [ ] Timeout configuration - Per-action timeout settings and monitoring
- [ ] TIMED_OUT status - Graceful termination with timeout status reporting
- [ ] Cleanup logic - Proper resource cleanup when actions exceed timeout
Commit 0.5.6: Advanced Progress Features
- [ ] Estimated completion - Real-time estimates based on action progress
- [ ] Resource display - Memory/CPU usage in progress indicators
- [ ] Smart updates - Adaptive progress update frequency based on action complexity
Commit 0.5.7: Integration Testing & Optimization
- [ ] End-to-end validation - Complete logging system integration testing
- [ ] Matrix workflows - Multi-action workflow testing with all status types
- [ ] 100% coverage - Maintain comprehensive test coverage for all logging features
- [ ] Compare mode testing - Complete integration testing for output comparison
- [ ] Resource monitoring validation - Accuracy testing for memory/CPU tracking
- [ ] Performance optimization - Final performance tuning for large workflows
- [ ] Documentation updates - Complete documentation for all advanced logging features
Release 0.5: Algorithm Integration Foundation
Target: Robust testing infrastructure with concrete action implementations
Commit 0.5.1: Test Action Fixtures
- [ ] Concrete test actions - Real algorithm implementations in tests/functional/fixtures
- [ ] PC algorithm test action - Simple structure learning with actual causal discovery logic
- [ ] Multiple test algorithms - Different algorithms to test various scenarios (GES, constraint-based)
- [ ] Data processing - Handle real CSV data and generate actual GraphML outputs
Commit 0.5.2: Algorithm Testing Infrastructure
- [ ] Output standardization - GraphML files with proper causal graph representation
- [ ] Parameter validation - Algorithm-specific parameter handling and validation
- [ ] Data fixtures - Test datasets for consistent algorithm validation
- [ ] Results validation - Verify GraphML output structure and content
Commit 0.5.3: End-to-End Validation
- [ ] Complete workflow testing - CLI → ActionRegistry → Test actions → Results
- [ ] Matrix workflow validation - Multi-algorithm, multi-dataset test scenarios
- [ ] Performance benchmarking - Execution time and resource usage with real algorithms
- [ ] Documentation examples - Complete usage examples using test action fixtures
🚀 Possible Future Features
External Algorithm Integration (After robust test infrastructure):
- Multi-language workflows (R bnlearn, Java Tetrad, Python causal-learn)
- External CausalIQ package integration (discovery, analysis)
- Matrix-driven algorithm comparisons across datasets
- Automatic dataset download and preprocessing
Production Features:**
- 📋 Workflow queuing - CI-style runner management
- 📊 Monitoring dashboard - Real-time execution tracking
- 🗺 Artifacts & caching - Persistent storage, result reuse
- 🔒 Security & isolation - Secrets management, containers
- 📈 Performance optimization - Resource limits, scheduling
Research Platform:
- 🤖 LLM integration - Model averaging, hypothesis generation
- 🌐 Web interface - Browser-based workflow designer
- 🚀 Cloud deployment - AWS/GCP/Azure runners
- 👥 Collaboration - Multi-researcher workflows
- 📚 Publication workflows - Reproducible research outputs
Advanced Capabilities:
- Workflow marketplace - Sharing and discovering research workflow templates
- Interactive notebooks - Jupyter integration with workflow execution
- Multi-machine execution - Distributed workflows across compute clusters
- AI-assisted optimization - Automated hyperparameter and workflow tuning
- Integration ecosystem - Plugins for major research tools and platforms
This roadmap leverages Git commit history for completed work, provides detailed release-based planning for upcoming functionality, and outlines future possibilities.