thoughts on time series and big data

Will I be late for work?

Introduction

In this post we will take a quick look at the delays of selected buses in the Stavanger region during the morning and afternoon rush hours.

If you are interested, you can find the IPython notebook with the code I used for this analysis on GitHub. There you can also find a 7-day data sample. The analysis below is based on 1.5 months of data, so your results might differ if you use a smaller sample.

Morning and afternoon delays

Most bus passengers travel to work in the morning and back home in the afternoon. This foreseeable increase in demand is not easy to address, because car (and bus) traffic also increases during these hours. What is more, passengers traveling during these periods are the most sensitive to delays: they risk being late for work in the morning and late picking up their kids from kindergarten in the afternoon.

The next two graphs show expected delays for the morning and afternoon periods, combined for two bus lines. In the morning these are X60, going from downtown Stavanger through the University to Forus, and X76, going from Randaberg through the University to Forus. In the afternoon the same lines run in the opposite direction. Morning rush hours usually fall between 7:00 and 8:30, and afternoon rush hours between 15:00 and 16:30.
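To make the two rush-hour windows concrete, here is a minimal sketch of how a departure time could be assigned to a window and to a quarter-hour slot. The names `RUSH_WINDOWS`, `rush_period`, and `quarter_hour` are hypothetical, and treating the end of each window as exclusive is my assumption; none of this is taken from the notebook.

```python
from datetime import time

# Rush-hour windows as stated in the text.
# Assumption: start is inclusive, end is exclusive.
RUSH_WINDOWS = {
    "morning": (time(7, 0), time(8, 30)),
    "afternoon": (time(15, 0), time(16, 30)),
}

def rush_period(departure):
    """Return 'morning', 'afternoon', or None for a datetime.time departure."""
    for name, (start, end) in RUSH_WINDOWS.items():
        if start <= departure < end:
            return name
    return None

def quarter_hour(departure):
    """Round a departure time down to the start of its quarter hour."""
    return time(departure.hour, departure.minute - departure.minute % 15)
```

For example, a departure at 7:52 would fall into the morning window and the 7:45 slot.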

Morning delays in Stavanger/Randaberg, UiS, and Forus per quarter hour

The figure above shows median expected delays in the morning for each quarter hour, based on the departure time from the first stop. Before going further, let me explain how to read this and the next graph.

For each quarter hour you can see three stacked bars: dark green, red, and dirty yellow. The dark green bar shows the delay at the first stop, either Stavanger or Randaberg. In some cases buses arrive before the scheduled time, so the green bar can extend down into the negative area. I do not analyze the difference between arrival and departure time here; that is a topic for another analysis. The red bar shows the extra delay the bus picks up by the time it reaches the halfway stop at the University of Stavanger. It starts where the green bar ends, and its end marks the total delay at that stop. Finally, the dirty yellow bar shows the delay at Forus; for practical reasons I chose Tvedtsenter as the reference point there. The yellow bar starts where the red bar ends, and its end represents the final delay on the route. The bars are slightly shifted relative to one another to improve readability in case of overlap. Sometimes one of the bars might not be visible if the change in delay over the respective stretch is very small. In addition, the black lines going up and down from the end of each bar show the variability around the median: one standard deviation each way.
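The way the bars are built can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the trip records are made up, the helper names `summarize` and `stacked_segments` are hypothetical, and I assume the error bars are the sample standard deviation of the delays in each slot. The notebook on GitHub may well do this differently.

```python
from collections import defaultdict
from statistics import median, stdev

# Hypothetical records: (quarter-hour slot, delay at first stop,
# delay at UiS, delay at Forus), all in minutes.
# The numbers are made up for illustration only.
trips = [
    ("07:45", -2.0, 4.0, 8.0),
    ("07:45", -3.5, 2.0, 6.5),
    ("07:45", 0.0, 6.0, 12.0),
    ("08:00", 0.0, 5.0, 7.0),
    ("08:00", 0.5, 4.0, 9.0),
    ("08:00", -0.5, 6.0, 8.0),
]

def summarize(trips):
    """Group trips by slot; return median and sample std dev per stop."""
    by_slot = defaultdict(list)
    for slot, *delays in trips:
        by_slot[slot].append(delays)
    summary = {}
    for slot, rows in by_slot.items():
        per_stop = list(zip(*rows))  # one tuple of delays per stop
        summary[slot] = {
            "median": [median(stop) for stop in per_stop],
            "std": [stdev(stop) for stop in per_stop],
        }
    return summary

def stacked_segments(medians):
    """Turn total median delays per stop into (bottom, height) bar pairs."""
    segments = []
    bottom = 0.0
    for total in medians:
        # Each bar starts where the previous one ended.
        segments.append((bottom, total - bottom))
        bottom = total
    return segments

summary = summarize(trips)
for slot in sorted(summary):
    print(slot, stacked_segments(summary[slot]["median"]))
```

Each `(bottom, height)` pair corresponds to one bar: a negative height for the first stop means the bus arrived ahead of schedule, and the end of the last bar is the total delay at Forus.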

Buses starting at 7:30, 8:00, and 8:15 arrive at the origin stop on schedule. Buses starting at 7:45 in fact arrive 2 minutes ahead of schedule and show relatively large variation; arriving almost 4 minutes ahead of schedule is not uncommon. Regardless of the start time, the stretch to UiS contributes most to the overall delay, from 3 to 6 minutes, but the longest delays are partially offset by the early arrival at the origin. Again, the variation is significant and can reach 6 minutes. The stretch from UiS to Forus increases the delay for buses starting at 7:45 and 8:00. The total delay at the peak of rush hour is usually around 8 minutes, but can exceed 12 minutes. If you take a bus earlier or later, you can in fact expect to arrive at Forus on time, despite the delay halfway at UiS.

Let's now look at the delays in the opposite direction — from Forus — in the afternoon.

Afternoon delays in Forus, UiS, and Stavanger/Sunde per quarter hour

The figure shows far worse delays than what we saw in the morning. Delays usually reach at least 9 minutes, but delays of 20 to 30 minutes are not uncommon, especially around 15:15. One interesting observation is that a significant part of the delay is already incurred at Forus: buses arrive late at Tvedtsenter. Variations of over 12 minutes further worsen the passenger experience.

Detailed look (histograms)

In the following six figures I present more detailed delay data in the form of histograms. This is mostly for readers who would like to better understand the data behind the two figures above but do not have time to look into the raw data. Those of you who are not interested in these details can scroll down directly to the conclusions.

Histogram of afternoon delays in Forus
Histogram of afternoon delays in UiS
Histogram of afternoon delays in Stavanger/Sunde
Histogram of morning delays in Stavanger/Randaberg
Histogram of morning delays in UiS
Histogram of morning delays in Forus
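For readers who want to rebuild such histograms from the sample data, here is a minimal sketch of the binning step. The 2-minute bin width, the function name `histogram`, and the sample delays are my assumptions for illustration; they are not the settings or data behind the figures above.

```python
from collections import Counter
from math import floor

def histogram(delays, bin_width=2.0):
    """Count delays (in minutes) per bin; keys are the lower bin edges."""
    counts = Counter(floor(d / bin_width) * bin_width for d in delays)
    return dict(sorted(counts.items()))

# Made-up delays in minutes, for illustration only.
sample = [-1.5, 0.2, 0.9, 3.1, 3.8, 9.0, 12.4, 21.0]

# Print a simple text histogram, one row per non-empty bin.
for edge, count in histogram(sample).items():
    print(f"{edge:6.1f} to {edge + 2.0:6.1f} min: {'#' * count}")
```

Using `floor` rather than integer truncation keeps early arrivals (negative delays) in the correct bin.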

Conclusions

Do you remember the question we started with: "Will I be late for work?" The simple answer is yes. In particular, if you leave around 7:15, you can expect to be about 8 minutes late on average, and the delay will vary significantly depending both on where you are going and on the particular day you travel.

You can expect even longer delays on your way back; 20 to 30 minutes is not uncommon. Variability is also notable and reaches 12 minutes. Delays on the return from work might be particularly problematic for people using public transportation to pick up their kids from kindergarten. In this context, it is understandable why parents might want to use a car instead of a bus.

The time at which you take the bus is important. If you can travel right before or after rush hours, delays are almost negligible. A troublesome aspect of the delays, already mentioned several times, is their variability, which makes adjusting bus schedules a difficult task.

Future work

This post is just a first look. It opens up possibilities for further exploration in several directions. The first would be to analyze each bus line in detail, stop by stop, in the hope of discovering problem areas. Such discoveries could lead to interventions and, hopefully, improvements for passengers. We have already seen examples of such interventions, e.g. near Tvedtsenter. Comparing this area before and after the interventions will be another aspect of further work.

The current analysis focuses on two express lines. It would be interesting to see whether major regular lines, such as 1, 4, and 6, behave in a similar way. The final element is to differentiate between arrival time and departure time. At first sight the difference should be minor, but some buses seem to be significantly delayed by the time spent selling tickets paid for in cash. Kolumbus recently introduced changes to this system, which opens up the possibility of a comparative analysis.