We study the crucial problem of delay prediction in railway networks, with the goal of improving real-time passenger information and enabling operators to proactively manage the network.
Instead of using traditional approaches that directly fit a regression model to predict delays from the current network state, we take a simulation-driven approach: given the current state of the network, we train a model to predict the state 30 seconds into the future, using historical data. By repeatedly rolling this model forward, we obtain delay predictions. This forces the model to learn the underlying dynamics of the network, leading to more robust predictions and enabling uncertainty estimation.
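To make the roll-forward idea concrete, here is a minimal sketch of how a one-step simulator could be turned into delay predictions with an uncertainty estimate. It is not the model from this work: `toy_step` is a hypothetical stand-in for the learned 30-second transition model, the state is reduced to a list of per-train delays in seconds, and the decay and noise values are arbitrary assumptions. Only the 30-second step and the 30-minute horizon come from the text.

```python
import random
import statistics

STEP_SECONDS = 30  # one-step prediction interval stated in the text


def toy_step(delays, rng):
    """Hypothetical stand-in for the learned one-step model.

    Takes per-train delays (seconds) and returns sampled delays 30 s later.
    The 0.98 decay and the 5 s noise scale are arbitrary illustration values.
    """
    return [max(0.0, 0.98 * d + rng.gauss(0.0, 5.0)) for d in delays]


def rollout(step_fn, delays, horizon_seconds, rng):
    """Apply the one-step model repeatedly until the horizon is reached."""
    for _ in range(horizon_seconds // STEP_SECONDS):
        delays = step_fn(delays, rng)
    return delays


def predict(step_fn, delays, horizon_seconds, n_samples=200, seed=0):
    """Run several stochastic rollouts; return per-train mean and spread."""
    rng = random.Random(seed)
    samples = [
        rollout(step_fn, delays, horizon_seconds, rng) for _ in range(n_samples)
    ]
    per_train = list(zip(*samples))  # regroup samples by train
    means = [statistics.fmean(s) for s in per_train]
    stds = [statistics.stdev(s) for s in per_train]
    return means, stds


# A 30-minute horizon corresponds to 60 steps of 30 seconds.
means, stds = predict(toy_step, [120.0, 0.0, 300.0], horizon_seconds=30 * 60)
```

Because each rollout samples its own trajectory, the spread across rollouts gives a simple uncertainty estimate, which a direct regression model producing a single number does not provide by default.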
On the methodological side, we introduce a new imitation learning scheme, Drift-Corrected Imitation Learning, designed to better handle covariate shift while remaining easy to train in practice.
Empirically, on three years of nation-wide operational data, we show that this simulation-based method outperforms direct regression for delay predictions at a 30-minute horizon, for both Multi-Layer Perceptron (MLP) and Transformer architectures. Remarkably, an MLP trained with our simulation approach outperforms a Transformer trained with standard regression, despite using 14× fewer parameters. This suggests that problem formulation and modeling matter more than sheer model size for this task.
This work was done jointly with Jesse Read, Sonia Vanier, and Albert Bifet, as part of the “AI and Optimisation for Mobility” research chair between École Polytechnique and SNCF. This contribution highlights the importance of academia–industry partnerships as a powerful driver of meaningful, real-world impact in AI.