Parsing CIF files to get train schedule data in PHP

I’ve had some fun looking at what data you can get via the National Rail Open Data scheme, and was really impressed by the ActiveMQ implementation they’ve got for Real Time Train Movement Messages!

The messages National Rail send allow you to plot trains at stations, or even along a route, but each message only contains station IDs or train IDs – which is kinda boring. I like to visualise the data, on a map for example.

To get the station name and location for a particular train message one has to access an entirely different database, from another provider. Along comes ATOC with their CIF files, which look very scary compared to ActiveMQ.

To cut a long story as short as I can, the CIF files contain a lot of information (400MB+ files) in a fixed-width format: new lines separate the records, and the fields within a record are split up by position and string length rather than by a delimiter. You can sign up to download the files on the ATOC website, and the specification for the files is available here.
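To give a feel for what "string length and new lines" means in practice, here is a minimal sketch of reading one fixed-width record in PHP. It is not the parser promoted in this post. The function names, the CIF_LAYOUTS table and the field offsets are my own assumptions for illustration, so check the real record types and offsets against the specification before relying on them.

```php
<?php
// Hypothetical layout table: record type => [field name => [offset, length]].
// These offsets are assumptions for illustration; take the real ones from the spec.
const CIF_LAYOUTS = [
    'TI' => [
        'tiploc'      => [2, 7],
        'description' => [18, 26],
    ],
];

// Parse a single line into an associative array, or null for unhandled record types.
function parseCifLine(string $line, array $layouts = CIF_LAYOUTS): ?array
{
    $line = rtrim($line, "\r\n");
    $type = substr($line, 0, 2); // assumed: the first two characters identify the record

    if (!isset($layouts[$type])) {
        return null;
    }

    $record = ['type' => $type];
    foreach ($layouts[$type] as $field => [$offset, $length]) {
        // Fields are padded with spaces, so trim the padding off.
        $record[$field] = trim(substr($line, $offset, $length));
    }

    return $record;
}

// Stream a file line by line so a 400MB+ file never has to fit in memory.
function readCifFile(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        while (($line = fgets($handle)) !== false) {
            $record = parseCifLine($line);
            if ($record !== null) {
                yield $record;
            }
        }
    } finally {
        fclose($handle);
    }
}
```

Looping over readCifFile() with a foreach then hands you one record at a time, ready to insert into whatever database you like. Using a generator rather than file() is a deliberate choice here, since loading the whole file into an array would be painful at this size.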

Why the blog post? Well, I needed a way to parse these files to populate a Mongo database and wanted to promote the PHP CIF parser I’ve started work on: