Abstract
Causal inference from observational data remains a fundamental challenge in statistics and machine learning. This work proposes a novel framework combining doubly robust estimation with modern machine learning techniques.
Background
Traditional causal inference methods rely on strong parametric assumptions. We aim to leverage the flexibility of machine learning while maintaining the theoretical guarantees of causal inference.
Proposed Method
Our approach, ML-Causal, combines:
- Random forests for propensity score estimation
- Gradient boosting for outcome modeling
- Cross-fitting to reduce overfitting bias
- Doubly robust estimation for robustness
Experiments
We evaluated our method on:
- Synthetic data: Known ground truth for validation
- Healthcare data: Estimating treatment effects in electronic health records
- Economic data: Policy evaluation studies
Key Findings
- Reduced bias compared to traditional methods
- Better coverage of confidence intervals
- Scalability to large datasets
Discussion
The integration of machine learning with causal inference opens new possibilities for data-driven decision making in complex domains where randomized experiments are infeasible.
Future Directions
- Extension to time-varying treatments
- Integration with deep learning
- Development of sensitivity analysis tools