We’re well past a month into the COVID-19 shutdown, and I noticed that fewer and fewer trains were running on Regional Rail each day. I knew that SEPTA had decreased their service due to less riders, but I wondered just how strict the service cuts were. I also wondered if more or perhaps less trains were running on time.
I started off by firing up Splunk Lab then went for a walk while it took 15 minutes or so to load the data up. I came back, and decided to see where we were:
That’s well north of 60 million data points. While I could crunch that data as-is, it would take longer for me to run my subsequent queries as well as look for trends in the data. I ended up writing a couple of scripts to summarize that data on a daily basis, so that I can get a bucketed breakdown of late trains for the entire day. We’ll get back to that bucketing later, because I first want to talk about train volumes.
I figured that with the perceived dropoff in service, I could look at how many distinct trains (identified by train number) each day. Sure enough, there was a drop off in train service levels:
Things started to get serious the week of Monday, March 16th. In fact, that was the last day of “normal” service on Regional Rail. Starting on Tuesday the 17th, the number of trains per weekday went from nearly 500 to about 362, with the weekends unaffected.
Anyway, once you have enabled Untrusted Shortcuts, you can click on any of these links to add shortcuts to Siri that will query the website’s API and report back on the status of Regional Rail, Busses, or both:
With the release of SEPTA’s new app, I’ve suddenly been flooded with questions about their API. People wanted to know how stable it was.
Well, I don’t work for SEPTA, which means I don’t have insight into their operations, but I can perform some analytics based on what I have, which is approximately 18 months of Regional Rail train data, read every minute by SEPTA Stats.
This is all of the data that I have in Septa Stats currently:
Events Since Inception: 26,924,887 events
First Event: Mar 1, 2016 12:00:01 AM
Last Event: Nov 16, 2017 10:33:53 PM
That’s way more events than minutes in that timeframe, and the reason for that is each API query is split into a separate event for each train. So if an API call returns status for 20 trains, that gets split into 20 different events. This is done because Splunk has a much easier time working with JSON that isn’t a giant array. 🙂
I’m pleased to announce that dashboards are now available on train views.
Prior to the introduction of the dashboard, it was difficult to tell if a train was running, what route it was on, and what station it was at. This new dashboard makes use of the “latest” API endpoint described in the previous post to provide a snapshot of the current status of the train.