Anomaly detection
for the NYC Subway,
every 30 seconds.
A streaming system that ingests live GTFS-Realtime feeds from the MTA, computes per-stop headway features, and scores anomalies with an online ML pipeline that retrains continuously. Operators triage incidents on a live Mapbox command center — ranked, contextualized, and color-coded by severity.
From GTFS feed to live incident
Streaming · Stateful · 30s cadence
Collect
Worker polls seven MTA GTFS-Realtime protobuf feeds every ~30 seconds, deduplicates trip updates, and emits normalized stop events.
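As a rough illustration of the dedupe step, the sketch below drops a stop event when the same trip reports the same arrival time for the same stop as in an earlier poll. The `StopEvent` fields and the `Deduplicator` class are hypothetical names chosen for this example, not the worker's actual schema, and protobuf parsing of the feeds is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class StopEvent:
    # Hypothetical normalized shape; the real worker's fields may differ.
    route_id: str
    trip_id: str
    stop_id: str
    arrival_ts: int  # POSIX seconds


class Deduplicator:
    """Remembers the last arrival time emitted per (trip, stop) pair."""

    def __init__(self) -> None:
        self._last: dict[tuple[str, str], int] = {}

    def filter(self, events: Iterable[StopEvent]) -> Iterator[StopEvent]:
        for ev in events:
            key = (ev.trip_id, ev.stop_id)
            if self._last.get(key) == ev.arrival_ts:
                continue  # unchanged since an earlier poll, skip it
            self._last[key] = ev.arrival_ts
            yield ev


dedupe = Deduplicator()
# Two consecutive polls: trip t1 is unchanged, trip t2 slipped by 30 seconds.
poll_1 = [StopEvent("A", "t1", "A15N", 1000), StopEvent("A", "t2", "A15N", 1300)]
poll_2 = [StopEvent("A", "t1", "A15N", 1000), StopEvent("A", "t2", "A15N", 1330)]
first = list(dedupe.filter(poll_1))
second = list(dedupe.filter(poll_2))
```

In this sketch only the revised prediction for trip t2 survives the second poll. A long-running worker would also need to evict keys for completed trips, which is omitted here.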
Persist
TimescaleDB stores stop-event time-series with hypertables for route/stop indexing, enabling fast range queries over observed vs predicted headway.
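To make the headway comparison concrete, here is a minimal in-memory sketch of the arithmetic such a range query would feed: the headway at a stop is the gap between consecutive arrivals, and the deviation is observed minus predicted. The function names and the sample timestamps are assumptions for illustration; in the deployed system this would presumably run as queries against the database rather than in Python.

```python
from collections import defaultdict


def headways(arrivals):
    """Map each stop_id to the gaps (in seconds) between consecutive arrivals.

    `arrivals` is an iterable of (stop_id, arrival_ts) pairs in any order.
    """
    by_stop = defaultdict(list)
    for stop_id, ts in arrivals:
        by_stop[stop_id].append(ts)
    gaps = {}
    for stop_id, times in by_stop.items():
        times.sort()
        gaps[stop_id] = [later - earlier for earlier, later in zip(times, times[1:])]
    return gaps


def headway_deviation(observed, predicted):
    """Element-wise observed minus predicted headway, in seconds."""
    return [o - p for o, p in zip(observed, predicted)]


# Made-up sample data: three arrivals at one stop, one at another.
sample = [("A15N", 1000), ("A15N", 1960), ("A15N", 1300), ("B04S", 1100)]
observed = headways(sample)
deviation = headway_deviation(observed["A15N"], [300, 300])
```

A stop with a single arrival yields an empty list, since a headway needs two arrivals.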
Score
Online ML with River retrains continuously on sliding windows. A shadow deep model runs in parallel for A/B quality telemetry and drift tracking.
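The score-then-learn loop behind online anomaly detection can be sketched with the standard library alone. The example below is a simple stand-in, a z-score over a sliding window of recent headways, and is not the River model the pipeline uses; the class name, window size, and sample values are assumptions. The method names mirror River's `score_one` / `learn_one` convention.

```python
import statistics
from collections import deque


class SlidingZScore:
    """Scores a value by its distance from the mean of a recent window."""

    def __init__(self, window: int = 120) -> None:
        # Old observations fall out automatically once the window is full.
        self.values = deque(maxlen=window)

    def score_one(self, x: float) -> float:
        if len(self.values) < 2:
            return 0.0  # not enough history to estimate spread
        mean = statistics.fmean(self.values)
        std = statistics.stdev(self.values)
        return abs(x - mean) / std if std > 0 else 0.0

    def learn_one(self, x: float) -> None:
        self.values.append(x)


detector = SlidingZScore(window=120)
scores = []
# Made-up headways in seconds: steady service, then one long gap.
for headway in [300, 310, 290, 305, 295, 900]:
    scores.append(detector.score_one(headway))  # score before learning
    detector.learn_one(headway)
```

Scoring each value before learning from it keeps an anomaly from influencing its own score; the 900-second gap stands out sharply against the preceding window.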
Visualize
Next.js command center with Mapbox renders live heatmaps, ranked incident table, and model telemetry. Auto-refresh every 10s with optimistic KPIs.
Real-time anomaly stream
Auto-refresh · 15m window
- awaiting next worker cycle · feed refreshes every 10s
Six services, one dashboard
Docker Compose · VPS deployment
- worker: MTA poll + dedupe
- db: TimescaleDB
- trainer: River online
- dl_shadow: PyTorch A/B
- api: FastAPI
- telemetry: drift metrics
- ui: Next.js 14
- map: Mapbox GL
- nginx: SSL + subpath
Open the live command center.
Real map of New York City. Real MTA feeds. Real anomalies as they happen — ranked, scored, and visualized for operator triage.