About the Data
MetroPulse Baku is committed to transparency about what our data shows, what can be inferred from it, and what it cannot support. This page explains our official open data sources, our station-demand methodology, and the limits of the product.
Simple methodology in one line
Official daily station-entry totals + weather features + temporal patterns = station-demand forecast with confidence bounds.
Important: Station-Entry Data Only
The Baku Metro dataset used by this product contains station-entry / validation counts only. It records how many passengers entered a station on a given day.
- ✗ Does NOT include where passengers traveled to
- ✗ Does NOT support origin-destination trip reconstruction
- ✗ Does NOT reveal individual passenger paths
- ✓ Shows how many entries each station received per day
- ✓ Supports demand forecasting at the station level
- ✓ Enables comparison of stations by busyness
Data Sources
Daily Station Demand
Official · Daily
Daily station-entry / validation counts for Baku Metro stations for 2025–2026. Published by the Baku Metropolitan. Each row represents the total number of passengers who validated entry at a station on that date.
Fields: date, station_id, entries, source_year
Station Exit Coordinates
Official · Updated when exits change
Exit-level geospatial data for Baku Metro stations, including exit number, label, address, and coordinates. Used for best-exit recommendations.
Fields: exit_no, exit_label, address, lat, lon, accessible
Weather Observations
City Center · Every 3 hours
Weather observations for Baku City Center, updated every 3 hours. Used as exogenous features in our demand forecasting model. Includes temperature, humidity, wind speed, and precipitation.
Fields: temp_c, feels_like_c, humidity, wind_speed, precipitation, condition
Forecasting Methodology
Model
Station-level demand forecasts are generated using a LightGBM gradient-boosted model trained on historical daily station-entry data, enriched with weather observations and temporal features.
Features used
- Day of week, is_weekend, month, day_of_year
- lag_1 (yesterday's entries), lag_7 (same day last week)
- rolling_avg_7, rolling_avg_14, rolling_std_7
- temperature, humidity, precipitation, wind_speed
- Station identity and station type
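As an illustration of the temporal features listed above, here is a minimal sketch in plain Python. It is not the production pipeline. The function name `build_features` and the date-to-entries dictionary layout are hypothetical, and it assumes that every window ends on the day before the target date, that weekends are Saturday and Sunday, and that `rolling_std_7` is a sample standard deviation. It needs at least 14 prior days of history for the station.

```python
from datetime import date, timedelta
from statistics import mean, stdev


def build_features(history, target_date):
    """Build temporal features for one station and one target date.

    history: dict mapping date -> daily entries (hypothetical layout).
    """
    def window(n):
        # Entries for the n days before target_date, most recent first.
        return [history[target_date - timedelta(days=k)] for k in range(1, n + 1)]

    last_7, last_14 = window(7), window(14)
    return {
        "day_of_week": target_date.weekday(),       # Monday = 0
        "is_weekend": target_date.weekday() >= 5,   # assumption: Sat/Sun
        "month": target_date.month,
        "day_of_year": target_date.timetuple().tm_yday,
        "lag_1": last_7[0],                         # yesterday's entries
        "lag_7": last_7[6],                         # same day last week
        "rolling_avg_7": mean(last_7),
        "rolling_avg_14": mean(last_14),
        "rolling_std_7": stdev(last_7),             # assumption: sample std
    }
```

Ending each window on the previous day keeps the target day's own count out of its features, which would otherwise leak the answer into the model.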
Output
Each forecast includes a point estimate (predicted entries), uncertainty bounds (lower/upper), a confidence level, and a weather effect score (−1 to +1).
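One way to picture a single forecast record is the sketch below. The class name `StationForecast` and its field names are hypothetical, not the product's actual schema. It also assumes that the confidence level is stored as a fraction (for example 0.8) and that the point estimate always lies between the two bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StationForecast:
    station_id: str
    predicted_entries: float   # point estimate
    lower: float               # lower uncertainty bound
    upper: float               # upper uncertainty bound
    confidence_level: float    # assumption: a fraction such as 0.8
    weather_effect: float      # score between -1 and +1

    def __post_init__(self):
        # Assumption: the point estimate sits inside its own bounds.
        if not self.lower <= self.predicted_entries <= self.upper:
            raise ValueError("point estimate must lie within the bounds")
        if not -1.0 <= self.weather_effect <= 1.0:
            raise ValueError("weather effect score must be in [-1, +1]")
```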
Simple flow (demo explanation)
- Load official station-entry history for each station.
- Build temporal features and join recent weather observations.
- Predict upcoming station entries and compute confidence bounds.
- Label demand as low, normal, high, or surge against baseline.
Baseline
Demand is compared against a rolling 7-day average and a day-of-week baseline. The delta is used to classify stations as quieter than usual, normal, busier than usual, or surge.
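The classification step can be sketched as a threshold check on the relative delta. The function name `classify_demand` and the cut-off values below are hypothetical, chosen only for illustration. The sketch also takes a single baseline number, since this page does not specify how the rolling 7-day average and the day-of-week baseline are combined.

```python
def classify_demand(forecast, baseline, quiet=-0.15, busy=0.15, surge=0.40):
    """Label a forecast by its relative delta from a baseline.

    The thresholds are illustrative assumptions, not product values.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    delta = (forecast - baseline) / baseline
    if delta >= surge:
        return "surge"
    if delta >= busy:
        return "busier than usual"
    if delta <= quiet:
        return "quieter than usual"
    return "normal"
```

With these example thresholds, a forecast of 1,500 entries against a baseline of 1,000 is a delta of +50% and is labeled as a surge.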
Estimated Intraday Profiles
Because the official dataset is daily, not hourly, this product has no access to observed hourly entry counts.
Instead, we use an estimated intraday profile system. Each station is assigned a profile type (e.g. commuter-heavy, central, residential), which maps to a typical hourly share distribution. This distribution is multiplied by the day's forecast total to produce estimated hourly figures.
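The multiplication described above can be sketched as follows. The hourly shares in `COMMUTER_HEAVY` are invented for illustration and are not the product's actual profile values, and `estimate_hourly` is a hypothetical name. Hours missing from the profile are treated as having zero share.

```python
# Hypothetical hourly shares for one profile type (hour of day -> share).
COMMUTER_HEAVY = {
    7: 0.10, 8: 0.18, 9: 0.12,
    12: 0.05, 13: 0.05,
    17: 0.15, 18: 0.20, 19: 0.10, 20: 0.05,
}


def estimate_hourly(daily_forecast, profile):
    """Spread a daily forecast total across hours using a share profile."""
    total_share = sum(profile.values())
    # Normalize so the hourly estimates always sum to the daily total.
    return {
        hour: daily_forecast * share / total_share
        for hour, share in sorted(profile.items())
    }
```

For a forecast of 20,000 daily entries, this profile puts an estimated 4,000 entries in the 18:00 hour. These are modeled figures derived from daily totals, not observed hourly counts.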
Language used for estimated intraday data
- ✓ “Estimated intraday demand profile”
- ✓ “Expected crowd window”
- ✓ “Likely quieter period”
- ✓ “Modeled from daily totals and station profile patterns”
- ✗ NOT “Exact observed hourly count”
- ✗ NOT “Real-time occupancy”
What MetroPulse Can and Cannot Claim
We CAN show
- ✓ How many passengers entered a station per day
- ✓ Which stations are busiest on average
- ✓ Demand forecast for the next several days
- ✓ How today compares to the typical baseline
- ✓ Which exits are near your destination
- ✓ Estimated rush windows based on station profile
- ✓ How weather may shift demand levels
We CANNOT show
- ✗ Where individual passengers traveled to
- ✗ Complete origin-destination trip data
- ✗ Real-time carriage or platform occupancy
- ✗ Exact observed hourly counts
- ✗ Passenger path or route reconstruction
- ✗ Dwell time or journey duration
- ✗ Transfer patterns between lines
Official Open Data Usage
MetroPulse Baku uses official open station-entry data published by public sources. This data is aggregated at station/day level and is suitable for station-demand intelligence.
Because source data is aggregated, MetroPulse does not infer who traveled where, and does not provide origin-destination reconstruction.
Update Schedule