Any help greatly appreciated!! Darwin Push Port data extraction


apcar...@gmail.com

Mar 29, 2015, 8:22:04 PM
to openrail...@googlegroups.com
Anyone out there who is willing to give me a bit of time to explain how I can extract this new push port data that is being released will be an absolute hero. Here is where I stand at the moment

1. I require the Darwin Push Port (DPP) data for a piece of research I am doing. I will only require specific stations when the DPP allows you to select stations and not just the whole network but I know currently DPP is not allowing this option.

2. I have a little experience and understanding of ruby coding (having spent many hours googling) and have downloaded ruby interactive via http://rubyinstaller.org/

3. I know from reading the wikipages that I need to download a stomp client but from there I am lost as to how everything links together.
Therefore:
a. I am told to download a stomp client from the list, which one should I use? http://nrodwiki.rockshore.net/index.php/Darwin:Push_Port#How_do_I_get_the_data.3F?
b. Where do I type my code? Do I set up my ruby code in my command prompt on my PC or on my ruby interactive that I have downloaded (see 2.)
c. How does this ruby code link to the stomp client?
d. I have read about ruby gems being important, how do these work?
e. If I ever manage to get my code working and all the stomp client etc linked and then I am extracting the Darwin push port data, where does this data end up? I know I will be sent in zipped update messages, do I set in the code that I would like them sent to my c drive for example?

If anyone out there can even answer 1 of the questions or give me any help I would be so grateful. I have spent the last few days making very little progress :(

p.s. I have an example ruby code from the National Rail data feeds website so am hoping to make use of this (after some tweaking). I can post it if needs be.

Peter Hicks

Mar 30, 2015, 4:12:37 AM
to apcar...@gmail.com, openrail...@googlegroups.com
Hello

On 30 Mar 2015, at 01:22, apcar...@gmail.com wrote:
1. I require the Darwin Push Port (DPP) data for a piece of research I am doing. I will only require specific stations when the DPP allows you to select stations and not just the whole network but I know currently DPP is not allowing this option.

As it stands at the moment, there’s no option to filter the data - this will be coming in the next few weeks.

2. I have a little experience and understanding of ruby coding (having spent many hours googling) and have downloaded ruby interactive via http://rubyinstaller.org/

I’d strongly recommend you don’t use Windows as your development platform when using Ruby.  If you don’t have much experience of Ruby, learn Java instead, as you can use Eclipse or IDEA as an IDE.

3. I know from reading the wikipages that I need to download a stomp client but from there I am lost as to how everything links together.

You need to write a Stomp client - there’s example code on the wiki as to how to do this, but since you also need a fair bit of programming expertise to be able to work with the data, the “how” isn’t offered up on a plate since it’s merely the start.
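To make "writing a Stomp client" a little more concrete: STOMP is a simple text protocol in which each frame is a command line, some header lines, a blank line, a body and a terminating NUL byte. Below is a minimal Python sketch of building and parsing such frames. It is only an illustration under assumptions: the login, passcode and queue name are placeholders, and a real client would normally lean on an existing STOMP library rather than hand-rolled frames.

```python
def build_frame(command, headers=None, body=b""):
    """Serialise a STOMP frame: command, headers, blank line, body, NUL."""
    lines = [command]
    for key, value in (headers or {}).items():
        lines.append(f"{key}:{value}")
    head = "\n".join(lines) + "\n\n"
    return head.encode("utf-8") + body + b"\x00"


def parse_frame(raw):
    """Split one complete raw STOMP frame into (command, headers, body)."""
    head, _, body = raw.partition(b"\n\n")
    command, *header_lines = head.decode("utf-8").split("\n")
    headers = dict(line.split(":", 1) for line in header_lines if line)
    # Drop only the single frame terminator, not NUL bytes inside the body.
    if body.endswith(b"\x00"):
        body = body[:-1]
    return command, headers, body


# Placeholder credentials and queue name, not real values.
connect = build_frame("CONNECT", {"login": "YOUR_USERNAME", "passcode": "YOUR_PASSWORD"})
subscribe = build_frame("SUBSCRIBE", {"destination": "/queue/YOUR_QUEUE_NAME", "ack": "auto"})
```

The client would send `connect`, wait for a CONNECTED frame, send `subscribe`, and then read MESSAGE frames from the socket in a loop.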

If you just want data from the Push Port, one option available to you is to FTP in to datafeeds.nationalrail.co.uk with the username/password displayed on the ‘My Feeds’ screen and download the log files instead.  You can then work on parsing them in Ruby or your language of choice without having to tackle using a Stomp client straight off.

If you have little coding experience, this is probably going to be the start of a fairly steep learning curve…!


Peter



Andrew Carson

Mar 30, 2015, 5:25:32 AM
to openrail...@googlegroups.com, apcar...@gmail.com
Thanks for your response Peter.

The bottom line is that I require the timetabled running for certain trains and then the actual running for these trains. I then need to observe which trains are delayed and then why they are delayed. From my knowledge I thought that the DPP data released on the 31st March (I know it is already out) is the only one giving information as to the nature of delays (if known).

Consequently would you recommend I use the newly released DPP data or the older network rail data feeds?

I have realised that this is going to be a very steep learning curve however I am dedicating this whole week to it so have plenty of hours hopefully. I am going to have to get to grips with this otherwise I am going to have to sit and watch trains on the live feed all day for a while!
I also appreciate that to process the data I am going to need some more knowledge, my plan was to get 2 months data and during this time to set up the code to analyse the data.

I had also heard that the new DPP data was going to be easier to work with! 

What would you suggest my next steps are? Download a Stomp client and attempt to write some code (do I write this code in the Stomp client, or on a Java platform which then links the two)? Which Stomp client is the best for what I want, or are they all equally hard/usable?

Peter Hicks

Mar 30, 2015, 6:05:19 AM
to Andrew Carson, openrail...@googlegroups.com
Hi Andrew

On 30 Mar 2015, at 10:25, Andrew Carson <apcar...@gmail.com> wrote:

The bottom line is that I require the timetabled running for certain trains and then the actual running for these trains. I then need to observe which trains are delayed and then why they are delayed. From my knowledge I thought that the DPP data released on the 31st March (I know it is already out) is the only one giving information as to the nature of delays (if known).  Consequently would you recommend I use the newly released DPP data or the older network rail data feeds?

Use the Push Port for this data.  When a train is delayed, providing a reason for the delay is entered in to Darwin, you’ll get a reason code sent out for that train.
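Spotting delayed trains comes down to comparing the timetabled time with the actual time for each service and keeping any reason code alongside. A rough Python sketch of that comparison follows. The dictionary keys, service names and reason code values are made up for illustration and are not the real Darwin field names.

```python
from datetime import datetime, timedelta


def delay_minutes(timetabled, actual):
    """Return how many minutes late (negative = early) a train ran.

    Both arguments are "HH:MM" strings. A difference of more than
    12 hours is assumed to mean the times straddle midnight.
    """
    fmt = "%H:%M"
    diff = datetime.strptime(actual, fmt) - datetime.strptime(timetabled, fmt)
    if diff < timedelta(hours=-12):
        diff += timedelta(days=1)
    elif diff > timedelta(hours=12):
        diff -= timedelta(days=1)
    return int(diff.total_seconds() // 60)


def delayed_services(observations, threshold=1):
    """Keep the observations that ran at least `threshold` minutes late."""
    return [
        obs for obs in observations
        if delay_minutes(obs["timetabled"], obs["actual"]) >= threshold
    ]


# Hypothetical observations; keys and values are illustrative only.
observations = [
    {"service": "example-1", "timetabled": "08:15", "actual": "08:22", "reason_code": "XX"},
    {"service": "example-2", "timetabled": "09:00", "actual": "09:00", "reason_code": None},
]
late = delayed_services(observations)
```

Here `late` holds only the first observation, which ran seven minutes late.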

The TRUST feed from Network Rail won’t have any reason data entered, because that’s held in a separate system called TRUST DA (Delay Attribution), the data from which isn’t entered in real-time, nor is it public.

I have realised that this is going to be a very steep learning curve however I am dedicating this whole week to it so have plenty of hours hopefully. I am going to have to get to grips with this otherwise I am going to have to sit and watch trains on the live feed all day for a while!
I also appreciate that to process the data I am going to need some more knowledge, my plan was to get 2 months data and during this time to set up the code to analyse the data.

I had also heard that the new DPP data was going to be easier to work with!

It is - the schedules and real-time data are much easier to parse than the data from Network Rail, albeit they’re not in as much detail.

What would you suggest my next steps are? Download a Stomp client and attempt to write some code (do I write this code in the Stomp client, or on a Java platform which then links the two)? Which Stomp client is the best for what I want, or are they all equally hard/usable?

I’d strongly suggest you grab some of the archived data from today from the FTP site and work with that so that you can get started quickly.  Once you’re able to load in, say, a day’s worth of data and get the outputs you want, then you could go on to working with a Stomp client to grab the data in real-time.

When you come to use a Stomp client, if you’re running it on Ruby, I’d use the ‘stomp’ gem (see https://rubygems.org/gems/stomp).  On Java, you’ve got a wider choice - I’ve used the Apache Camel component for Stomp, but there are others.

A typical design pattern, no matter what platform you’re using, is to have your message processing code as a class (or more likely, a set of classes), and your Stomp client calling these classes to process messages.
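As a rough illustration of that pattern, here is a Python sketch in which the processing class knows nothing about Stomp and a small listener class only passes messages along. The class and method names are invented for this example; a real Stomp library will have its own callback interface, so treat `on_message` as a stand-in.

```python
class MessageProcessor:
    """Knows what to do with a message; knows nothing about Stomp."""

    def __init__(self):
        self.seen = []

    def process(self, body):
        # Real code would parse the message here, then store or analyse it.
        self.seen.append(body)


class FeedListener:
    """Stand-in for the Stomp side: hands each message body to a processor."""

    def __init__(self, processor):
        self.processor = processor

    def on_message(self, headers, body):
        self.processor.process(body)


# In real use the Stomp client would call on_message; here we call it by hand.
processor = MessageProcessor()
listener = FeedListener(processor)
listener.on_message({"destination": "/queue/EXAMPLE"}, "<example/>")
```

Keeping the two apart means the same `MessageProcessor` can be fed from archived log files first and from a live connection later.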


Peter


Andrew Carson

Mar 30, 2015, 6:22:27 AM
to openrail...@googlegroups.com, apcar...@gmail.com
Hi Peter,
Really appreciating you taking the time to answer my questions, sorry for being quite slow with this, am trying my best. 
I didn't know that any data was archived from the https://datafeeds.nationalrail.co.uk/, I only thought I could subscribe to a live feed? Where do I get the archived data and what programme should I play round with it in?
I will deal with that first and then get on to the Stomp client later today, thinking that small steps might be the way forwards!

And with your last point you would advise setting up the processing and collection at the same time to save future faffing around?
ie so it is only calling the data I want?

Regards

Peter Hicks

Mar 30, 2015, 7:19:43 AM
to Andrew Carson, openrail...@googlegroups.com
Hi Andrew

On 30 Mar 2015, at 11:22, Andrew Carson <apcar...@gmail.com> wrote:

I didn't know that any data was archived from the https://datafeeds.nationalrail.co.uk/, I only thought I could subscribe to a live feed? Where do I get the archived data and what programme should I play round with it in?

On the ‘My Feeds’ page, there’s details for an FTP server.  Log in to there and in one of the directories, you’ll find a load of pPort log files from the current day.  Download these - they’re in XML, one record per line.
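Since each line of a log file is one XML record, a first pass can simply parse line by line and count what kinds of element turn up. The Python sketch below does that without assuming anything about the Darwin schema. The sample records and the `urn:example` namespace are invented for illustration, and unparseable lines are counted rather than treated as fatal.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def count_message_types(lines):
    """Count child element names across XML records, one record per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            root = ET.fromstring(line)
        except ET.ParseError:
            counts["<unparseable>"] += 1
            continue
        for child in root:
            counts[local_name(child.tag)] += 1
    return counts


# Made-up records standing in for real log lines.
sample = [
    '<record xmlns="urn:example"><update id="1"/></record>',
    '<record xmlns="urn:example"><update id="2"/><note/></record>',
    "not xml",
]
counts = count_message_types(sample)
```

For a downloaded file, passing an open file object works the same way, because iterating over a file yields one line at a time.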

And with your last point you would advise setting up the processing and collection at the same time to save future faffing around?
ie so it is only calling the data I want?

You could do it either way - collect and process the messages in real-time, or you could collect the messages during the course of the day and process them after the event if you didn’t need to run real-time.
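For the "collect now, process later" route, the collecting side only has to write each message somewhere on disk. A possible Python sketch is below. It assumes that the "zipped" messages mentioned earlier in the thread are gzip-compressed, and the `pport-YYYY-MM-DD.log` file name is just a convention chosen for this example.

```python
import gzip
from datetime import date
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def decode_body(body):
    """Return message text, gunzipping first if the body looks gzip-compressed."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return body.decode("utf-8")


def append_to_daily_log(body, directory, day=None):
    """Append one message as a single line to <directory>/pport-YYYY-MM-DD.log."""
    day = day or date.today()
    path = Path(directory) / f"pport-{day.isoformat()}.log"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Flatten embedded newlines so each message stays on one line.
    text = decode_body(body).replace("\n", " ")
    with path.open("a", encoding="utf-8") as log:
        log.write(text + "\n")
    return path
```

The directory can be anywhere you like, for example a folder on the C: drive. The result is one file per day with one record per line, the same shape as the archived log files.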


Peter



Andrew Carson

Mar 30, 2015, 7:42:42 AM
to openrail...@googlegroups.com, apcar...@gmail.com
Peter,
Thanks, just the small problem of an error message saying "An Error Occurred opening that folder on the FTP server.  Make Sure you have permission to access that folder.  Details: The Connection with the server was reset." is thwarting me at the moment when I type in ftp://datafeeds.nationalrail.co.uk into windows explorer and then enter my username and password. I will have a look on google to solve the issue then have a look at the files.
Cheers
Andrew 

Peter Hicks

Mar 30, 2015, 7:46:21 AM
to Andrew Carson, openrail...@googlegroups.com

On 30 Mar 2015, at 12:42, Andrew Carson <apcar...@gmail.com> wrote:

Thanks, just the small problem of an error message saying "An Error Occurred opening that folder on the FTP server.  Make Sure you have permission to access that folder.  Details: The Connection with the server was reset." is thwarting me at the moment when I type in ftp://datafeeds.nationalrail.co.uk into windows explorer and then enter my username and password. I will have a look on google to solve the issue then have a look at the files.

Hmm - it works for me using Chrome, and also using FTP on the command line.

If your connection is being reset, I guess you may have outbound FTP access blocked from wherever you’re connecting from, or maybe you’re behind a NAT (address translation) device which doesn’t handle FTP connections properly and may require you to set passive-mode FTP, which I don’t think you can do in IE.
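If a browser gives no control over passive mode, a few lines of script do. The Python sketch below uses the standard `ftplib` module to log in over plain FTP, switch passive mode on explicitly and list the top-level directory. The username and password are whatever the 'My Feeds' page shows; none are filled in here, and nothing is assumed about the directory layout on the server.

```python
from ftplib import FTP


def list_remote_files(ftp, username, password, passive=True):
    """Log in on an ftplib.FTP-like object and return the directory listing."""
    ftp.login(username, password)
    ftp.set_pasv(passive)  # passive mode: the client opens the data connection
    return ftp.nlst()


def fetch_listing(host, username, password):
    """Connect to `host` over plain FTP and list the top-level directory."""
    with FTP(host, timeout=30) as ftp:
        return list_remote_files(ftp, username, password)
```

Calling `fetch_listing("datafeeds.nationalrail.co.uk", username, password)` from a network that allows outbound FTP should return a list of names. A reset connection at this point suggests the block is on the network rather than in the client.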


Peter



Kevin Fullerton

Mar 30, 2015, 7:52:37 AM
to openrail...@googlegroups.com
If you're using Windows, I'd recommend Filezilla Client for doing FTP - much easier to work out and simpler than IE (or any web browser).

Cheers
Kevin


Nigel Mundy

Mar 30, 2015, 7:56:22 AM
to Andrew Carson, openrail...@googlegroups.com

If using IE to ftp (yuk), rather than an ftp client like filezilla, the url format is

ftp://username:pass...@datafeeds.nationalrail.co.uk/

HTH

Nigel.

Andrew Carson

Mar 30, 2015, 11:39:00 AM
to openrail...@googlegroups.com, apcar...@gmail.com
Unfortunately none of the above methods work. On filezilla I get the following:
Status: Resolving address of datafeeds.nationalrail.co.uk
Status: Connecting to 54.77.164.226:21...
Status: Connection established, waiting for welcome message...
Status: Insecure server, it does not support FTP over TLS.
Status: Connected
Status: Retrieving directory listing...
Command: PWD
Response: 257 "/"
Command: TYPE I
Response: 200 Switching to Binary mode.
Command: PASV
Error: Disconnected from server: ECONNABORTED - Connection aborted
Error: Failed to retrieve directory listing

And I get the same error in chrome and with the different URL that Nigel suggested. Must be something in my firewall stopping the connection, will have to investigate further

On Monday, March 30, 2015 at 12:56:22 PM UTC+1, Nigel Mundy wrote:

If using IE to ftp (yuk), rather than an ftp client like filezilla, the url format is

Andrew Carson

Mar 31, 2015, 8:21:05 AM
to openrail...@googlegroups.com, apcar...@gmail.com
I have had absolutely no luck getting the FTP stuff to work; I am trawling through the FileZilla forums to try and find an answer. I think it may have something to do with the fact that I am on University wifi but can't be sure. I've tried lots of things and it doesn't seem to be the firewall. I still have the same error as I did yesterday.
If I can't get this to work does this mean it is unlikely I will be able to connect to a stomp client? 

Mike Flynn

Mar 31, 2015, 9:21:47 AM
to openrail...@googlegroups.com
Hi Andrew

The clue's in the error message;
> Status: Insecure server, it does not support FTP over TLS.

Use only plain FTP (insecure)

Mike Flynn

Mar 31, 2015, 9:29:58 AM
to openrail...@googlegroups.com
> The clue's in the error message;
I meant to say the clue's hidden in the 'status' message :)

Andrew Carson

Mar 31, 2015, 9:50:59 AM
to openrail...@googlegroups.com
I've tried that in the Site Manager but for some reason it didn't work either. I don't think it's the firewall; I think it must be something to do with the router/range of ports on my uni wifi! When I tested it (with plain FTP) on https://ftptest.net/ I got the following messages:

Error: Server returned unroutable private IP address in PASV reply

  • Make sure the server is configured to allow passive mode connections.
  • If the server is behind a NAT router, make sure the server knows its external IP address.
  • The range of ports used for passive mode must be opened in all involved firewalls.
  • The range of ports used for passive mode must be forwarded by all involved NAT routers.
  • Try uninstalling all firewalls and plug your computer directly into your modem, thus bypassing the router

Mike Flynn

Mar 31, 2015, 10:03:38 AM
to openrail...@googlegroups.com
I can replicate your error (using FileZilla) when it is set to FTP over TLS.  With Encryption set to "Use only plain FTP (insecure)" in Site Manager, it connects.

Have you tried command line?
ftp datafeeds.nationalrail.co.uk
If it still doesn't work, what is the message?

Andrew Jolly

Mar 31, 2015, 10:28:04 AM
to Andrew Carson, openrail...@googlegroups.com
It works in FileZilla for me too, but ftptest.net threw the same error for me, so try FileZilla using passive mode and plain FTP.


Andrew Carson

Mar 31, 2015, 10:57:45 AM
to openrail...@googlegroups.com, apcar...@gmail.com, and...@jollysolutions.co.uk
Quote:
Who is my server? (You can see I am new to this!)
Is it the national rail data feeds or the University who provide my wifi or something else?
The former. In Passive mode they send a 10.x.y.z IPv4 which is a private, unroutable IPv4 that does not work over public connections. Your client is doing the same in Active mode. Since you both are behind NAT operation is not possible without configuration.

As you seem to be behind campus NAT (without any control over) you can only use Passive mode. Servers not working in Passive mode can't be reached, no way around in this case (unless the server and yourself are both in the same LAN).


The above is from the filezilla forums that I have also been trying to get an answer from, looks like I may not be able to access the FTP files unless I go home and use my wifi/router there, grrrrrrh!  

Mike Flynn

Mar 31, 2015, 11:30:29 AM
to openrail...@googlegroups.com
> ..on uni private network, behind firewall...have to wait!
Good answer!

Just to connect with Stomp now, LOL.  My advice is to try a few approaches.  I tried connecting with PHP, and with Java, from Windows, and with a Linux VPS.  Getting Stomp installed I think is three-quarters of the battle.  I found it impossible to install Stomp on PHP using Windows.  Installing the Stomp extension for PHP on Linux is very much dependent on what your Linux setup is.  I found a relatively painless way was with CentOS and a WHM/cPanel installation.  Some good VPS providers I've found: vultr.com, digitalocean.com.  With Java I found that the Gozirra implementation of Stomp worked for me eventually.  Did you say you've set aside a week for this?  IME, never mind the actual data, you'll need that to get that damn STOMP thing working!

Peter Hicks

Mar 31, 2015, 11:40:14 AM
to Andrew Carson, openrail...@googlegroups.com, and...@jollysolutions.co.uk
Hi Andrew

I have FileZilla 3.10.2 installed on my Mac here, and see:

Status:       Server sent passive reply with unroutable address. Using server address instead.

Do you see that on your side too?  If not, can you check the following configuration option:


The FTP server is on Amazon, who use NAT - I’m not sure there’s an application-layer gateway (ALG) available, but I’ll check.
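The "unroutable address, using server address instead" behaviour can be sketched in a few lines of Python. A passive-mode reply carries an IP address and port as six numbers. If that address is private, the client ignores it and connects to the host it is already talking to. The example reply in the usage note is invented; only the public address 54.77.164.226 comes from the FileZilla log earlier in the thread.

```python
import ipaddress
import re

PASV_REPLY = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply):
    """Extract (ip, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = PASV_REPLY.search(reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(part) for part in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def data_address(reply, control_host):
    """Pick the address to open the data connection to.

    If the server advertises a private (unroutable) address, fall back to
    the host the control connection already reached.
    """
    ip, port = parse_pasv_reply(reply)
    if ipaddress.ip_address(ip).is_private:
        return control_host, port
    return ip, port
```

For instance, `data_address("227 Entering Passive Mode (10,0,0,5,195,80)", "54.77.164.226")` gives `("54.77.164.226", 50000)`. A client without this fallback would try to reach 10.0.0.5 and fail.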



Peter


Peter Hicks

Mar 31, 2015, 11:46:40 AM
to Mike Flynn, openrail...@googlegroups.com
Hi Mike

On 31 Mar 2015, at 16:30, Mike Flynn <mi...@a1publishing.com> wrote:

> Just to connect with Stomp now, LOL. My advice is to try a few approaches. I tried connecting with PHP, and with Java, from Windows, and with a Linux VPS. Getting Stomp installed I think is three-quarters of the battle. I found it impossible to install Stomp on PHP using Windows. Installing the Stomp extension for PHP on Linux is very much dependent on what your Linux setup is. I found a relatively painless way was with CentOS and a WHM/cPanel installation. Some good VPS providers I've found: vultr.com, digitalocean.com. With Java I found that the Gozirra implementation of Stomp worked for me eventually. Did you say you've set aside a week for this? IME, never mind the actual data, you'll need that to get that damn STOMP thing working!

The combination of Windows and PHP is troublesome, as is obfuscating a Linux server by using cPanel. You’re making life difficult for yourself by trying to make life easy, ironically!

Whenever you compile something from source on Linux, you need development libraries - these are trickier to install on CentOS (in my Ubuntu/Debian-biased view) than other distributions. When installing PHP extensions on Windows, I dread to think what kind of magic you’ll require.

The only problems I’ve had connecting using Stomp on either Ruby or Java have been in writing the client, not with installing dependencies. Python will probably be similar.


Peter


Andrew Carson

Mar 31, 2015, 11:57:11 AM
to openrail...@googlegroups.com, mi...@a1publishing.com
Mike, 
I had a week but my supervisor has given me until the darwin pushport lets you specify by TIPLOC to get it sorted (As we want to keep the amount of data down, don't want the whole network really!!) so I have a few weeks (although I have 10,000 other words to write, the joy of a masters!!). So I have windows 8 on my computer and I am prepared to try anything that I can get working! 

So from the advice from the 2 of you (which is very kind of you to try and help) it would probably be best to write some code for Stomp in Java or Ruby? Also, do I need a VPS to process the data? (I know they cost, but not too much.)

And Peter I have got my settings as you do above, no idea what the problem is, tried everything with the firewall, turned it off, opened all ports etc. Tried plain ftp, active mode, passive mode etc  so may have to give up with that/try it on my home wifi at the weekend. 

Peter Hicks

Mar 31, 2015, 12:03:17 PM
to Andrew Carson, openraildata-talk
Hi Andrew

On 31 Mar 2015, at 16:57, Andrew Carson <apcar...@gmail.com> wrote:

I had a week but my supervisor has given me until the darwin pushport lets you specify by TIPLOC to get it sorted (As we want to keep the amount of data down, don't want the whole network really!!) so I have a few weeks (although I have 10,000 other words to write, the joy of a masters!!). So I have windows 8 on my computer and I am prepared to try anything that I can get working!

If all you’re after is some historical data at this stage, I can put that on Google Drive or Dropbox and make it available for you to download.

So from the advice from the 2 of you (which is very kind of you to try and help) it would probably be best to write some code for Stomp in Java or Ruby? Also, do I need a VPS to process the data? (I know they cost, but not too much.)

You can process the data on any platform you want - it depends what you’re most experienced in.  I develop on my Mac and run services on Ubuntu or CentOS.

Chances are a VPS will give you better results as it’ll be up 24x7 (or thereabouts, depending on how little you pay), which may be important for you.

And Peter I have got my settings as you do above, no idea what the problem is, tried everything with the firewall, turned it off, opened all ports etc. Tried plain ftp, active mode, passive mode etc  so may have to give up with that/try it on my home wifi at the weekend. 

At a guess, it may be something to do with the device on your Uni network which is doing NAT.  FTP has always been a difficult protocol to work with, however in this case - where we have lots of files to download - it’s a lot easier to work with than, say, HTTP.


Peter


Mark Goodge

Mar 31, 2015, 12:04:11 PM
to openrail...@googlegroups.com
On 31/03/2015 16:46, Peter Hicks wrote:
> Hi Mike
>
> On 31 Mar 2015, at 16:30, Mike Flynn <mi...@a1publishing.com> wrote:
>
>> Just to connect with Stomp now, LOL. My advice is to try a few
>> approaches. I tried connecting with PHP, and with Java, from
>> Windows, and with a Linux VPS. Getting Stomp installed I think is
>> three-quarters the battle. I found it impossible to install Stomp
>> on PHP using Windows. Installing Stomp extension for PHP in Linux
>> is very much dependent on what your Linux setup is. I found a
>> relatively painless way was with Centos and WHM/cPanel
>> installation. Some good VPS providers I've found vultr.com,
>> digitalocean.com. With Java I found that the Gozirra
>> implementation of stomp worked for me eventually. Did you say
>> you've set aside a week for this? IME, never mind the actual data,
>> you'll need that to get that damn STOMP thing working!
>
> The combination of Windows and PHP is troublesome, as is obfuscating
> a Linux server by using cPanel. You’re making life difficult for
> yourself by trying to make life easy, ironically!

Very true!

I found getting Stomp to work in PHP very difficult. And when I did get
it working, it was a huge resource hog. So I tried a completely
different approach, and used Ruby instead, having been on a brief Ruby
on Rails course a couple of years previously :-)

That turned out to be considerably easier. The only difficulty is that
I'm not an experienced Ruby programmer (this was, in fact, my first ever
piece of production code in Ruby!), so I had to work with the Ruby
manual open in one window as I typed into the other one. But once
written, it worked superbly. It's only 70 lines of Ruby, excluding the
libraries that need to be installed separately, and probably half of
that is typical beginner's bloat.

Mark
--
http://www.markgoodge.uk

Mike Flynn

Mar 31, 2015, 12:17:38 PM
to openrail...@googlegroups.com
Peter:

Oh no!  Here we go down another route again.  Just joking :P

>> The combination of Windows and PHP is troublesome, as is obfuscating a Linux server by using cPanel.  You’re making life difficult for yourself by trying to make life easy, ironically!

I know what you mean.  But it's like the VI vs GUI discussion.  It's horses for courses really.  I mean why even use Windows if it's easier on the command line?   But yes, with a control panel it can make life harder.  I've a rip it up and start again test server as well as live and backup servers running WHM/cPanel. 

>> I dread to think what kind of magic you’ll require.

You'd need to write the STOMP client/server from scratch since, at the last count, I found none freely available to be in existence.


>> The only problems I’ve had connecting using Stomp on either Ruby or Java have been in writing the client, not with installing dependencies.  Python will probably be similar.

Yes, I'm sure, but I bet it took you a week plus.  Though I've never tried Ruby nor Python before.  Took me ages to get Java working.  And with PHP.

>> (in my Ubuntu/Debian-biased view)
Exactly!  Horses for courses.   

Andrew:

You can use VPS from your uni and will be outside 'their' firewall.  You probably would want one on your VPS though.  

I just spent ages researching VPS.  As Peter says, 'a blank canvas' can be an easier/good option.  But then you've got the learning curve of all the other stuff: ftp, cron, mail, apache, (shell), dns, firewall, etc, etc.  Or use a control panel which installs everything you need.  WHM/cPanel is the most popular: $200 per year / $15 p.m. from cpanel.net (14 days free trial).  A vultr.com/digitalocean.com VPS is $10+ per month.  Or get a VPS with cPanel ready installed (troublesome in my opinion - you don't know exactly what you're getting).  Centos-web-panel is free but a bit rough and ready.  There are also plenty of other free control panels, for both Red Hat and Debian flavours of Linux.

Mark:

Ruby?  Sounds like worth having a look at.

Andrew Carson

Mar 31, 2015, 12:48:02 PM
to openrail...@googlegroups.com, apcar...@gmail.com
Peter,
I require the timetabled running of trains and then the actual train times as they happened in real life (for specific services, I am in the process of deciding which but they are all centred on journeys ending at certain location). I am also looking to obtain the code for the delay (if there is one) as well. I didn't think that national rail could give me any of the data I want in archived form. Am I wrong? 
Hence I wanted to capture the data I need from DPP (as I thought this was the only source) for certain TIPLOCs, once it can be filtered by TIPLOC in a couple of weeks.
Also, I had this advice on the FileZilla forum: "I don't know that server, but if the ftptest.net site (representing a 100% non-firewalled client) tells you there is a problem with the server (or server firewall), then there is."

But as you can all use it I will just try it when I get home on Saturday.

Andrew Carson

Mar 31, 2015, 12:54:02 PM
to openrail...@googlegroups.com
Mike, 
With my dissertation I wasn't expecting to have to put out any cost, but if I want a good and easier VPS it looks like I may have to? I have never used a VPS before so maybe the one with the free trial will be good (if I can't do it I may just give up and use the realtraintimes website and have to log it manually, which would probably be more effort than this!!).
Looks like I will have to write some ruby code (how difficult was it Mark/what help did you use) or some Java.

I still don't quite understand how the data goes from a stomp client to the VPS, maybe my brain is just frazzled!

Mike Flynn

Mar 31, 2015, 5:43:38 PM
to openrail...@googlegroups.com
Andrew

> With my dissertation I wasn't expecting to have to put out any cost..

Another free option: if you've a PC (haven't we all) run Linux on Windows or dual-boot with Linux.  Once again though, if this is new to you, it's another learning curve.  That said, it's not that difficult getting Linux up and running.  There are many different guides out there.  And getting STOMP to connect to the Network Rail/National Rail servers from Linux is, on balance, I'd say, easier than from Windows.

Getting the connection (of course you want to know you can connect first, before you dip into some stale data) can be very frustrating.  Like a message in a bottle, did they get it or not?!  And with this STOMP and the Network Rail/National Rail servers, there are so many ways to go, where do you start?  I'd say (but see the caveat about horses for courses above) the easiest way to get connected would be the Linux/PHP route.  For your purposes, i.e. as part of a dissertation, PHP will serve you fine.  And there's not much to learn.  Here's your plan of action;

1/ Get access to Linux 
2/ Install PHP
3/ Install the PHP STOMP extension (you'll likely be thwarted many times doing this or you might be 'lucky' and get it working straight out).  You'll need to first install PHP-PEAR and then run 'pecl install stomp'.
4/ When you think you've installed stomp run the php example in open rail wiki using your user name and password.  Try connecting to the Network Rail site as well (I think the example code is for Network Rail).  It's easier connecting to National Rail DPP though as all the info you need is on your My Feeds page; username, password, channel is that queue name.
5/ You'll know whether you have STOMP working or not as you'll either receive a data stream OR a STOMP error message. If you get this far you're nearly there.  Check ports, 61613 and/or 61618 are open if you're running a firewall. 
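Before blaming the Stomp install in step 5/, it can help to check that the port is reachable at all from where you are sitting. A small Python sketch is below. `YOUR_STOMP_HOST` in the usage note is a placeholder for whatever hostname the 'My Feeds' page gives; only the port numbers come from the step above.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

Something like `port_is_open("YOUR_STOMP_HOST", 61613)` returning False from a university network, but True from home, would point at the network rather than at your code.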

With PHP, it's 3/ that is the difficult part, IMO. And it's only difficult because no-one in their right mind would ever use this as a means of streaming data backwards and forwards unless you're a complete computer expert who likes running obscure and difficult to use software LOL.

As you know by now, there's many other ways to connect.  With Java there's quite a bit experience required; knowledge of classpath, jars, API's, and more.  Far more to learn/know than with PHP.  The other languages I don't know.   Good luck and I hope you get connected :)

Chris Bailiss

Mar 31, 2015, 5:57:55 PM
to openrail...@googlegroups.com
>> There are many different guides out there.  And getting STOMP to connect
>> to the Network Rail/National Rail servers from Linux is, on balance, I'd say, easier than from Windows.

I haven't followed this whole thread, but...

I think the relative difficulty of each platform will vary person to person on their experience and what type of application is being created.

I use the Apache Stomp libraries, which are relatively simple to use in Windows (i.e. reference the library and then add only a few lines of code to start receiving messages).  See my sample app for details.  If you have Visual Studio installed (e.g. the free Visual Studio 2013 Community Edition) then browse the source code and possibly use it as a template to start from.

I don't doubt that using a PHP library could be more complex, especially in Windows (I have done some work there in the past), but the Apache NMS Stomp libraries are relatively straightforward to use.

NB:  To be clear, I have no strong views on promoting one technology over another.  Different technologies make sense to different people for different use cases.  I just want to mention in this email there are potentially other options!

Regards

Chris

Mike Flynn

Mar 31, 2015, 6:10:16 PM
to openrail...@googlegroups.com
>>   And it's only difficult because no-one in their right mind would ever use this as a means of streaming data backwards and forwards unless you're a complete computer expert who likes running obscure and difficult to use software LOL. 

On reflection, there are probably some good reasons why it's been set up this way. I believe actually that STOMP isn't the be-all and end-all of it anyway. It's a way of connecting to ActiveMQ. And after all, there are all these myriad options available. I'd also say once you have STOMP installed and working it IS quite straightforward. It is poorly supported out there though, and googling for answers can lead you down all sorts of roads.

Andrew Carson

Mar 31, 2015, 6:12:00 PM
to openrail...@googlegroups.com
Mike,
Thanks for your great and in-depth response, and you too Chris. I have seen the thread that you have just started and will have a look at your website too. Tomorrow I will have a go at both of your methods and see if I can get either to work. I really don't mind what software I use, I just need to get something up and running!
Once again, thanks to you both for your continued advice. I will have a further look tomorrow and no doubt be back with a question or 2. As you say Mike, there are so many different bits of advice on Google, but you lot are the experts! (Sorry for clogging up your forum, which looks as if it is normally only used by intelligent IT people!!)
Andrew 

Peter Hicks

Mar 31, 2015, 6:25:58 PM
to Mike Flynn, openrail...@googlegroups.com
Hi Mike

> On 31 Mar 2015, at 23:10, Mike Flynn <mi...@a1publishing.com> wrote:
>
> >> And it's only difficult because no-one in their right mind would ever use this as a means of streaming data backwards and forwards unless you're a complete computer expert who likes running obscure and difficult to use software LOL.
>
> On reflection, there are probably some good reasons why it's been set up this way. I believe actually that STOMP isn't the be-all and end-all of it anyway. It's a way of connecting to ActiveMQ. And after all, there are all these myriad options available. I'd also say once you have STOMP installed and working it IS quite straightforward. It is poorly supported out there though, and googling for answers can lead you down all sorts of roads.


I’ve read your message several times and I can’t work out what you’re trying to say here.

What technology would you rather see used in place of Stomp?


Peter



Tom Cairns

Mar 31, 2015, 7:27:26 PM
to Mike Flynn, openrail...@googlegroups.com
On 31/03/2015 22:10, "Mike Flynn" <mi...@a1publishing.com> wrote:

> On reflection, there are probably some good reasons why it's been set up this way. I believe actually that STOMP isn't the be-all and end-all of it anyway. It's a way of connecting to ActiveMQ. And after all, there are all these myriad options available. I'd also say once you have STOMP installed and working it IS quite straightforward. It is poorly supported out there though, and googling for answers can lead you down all sorts of roads.

I have to say that I’ve never had a single issue using or installing Stomp libraries in PHP, Python or Java. And with PHP I’ve installed it quite a few times on different Debian-based servers. If I remember rightly, all I installed were PHP, PHP FPM, Pear and the standard build libraries (build-essential on Debian, I think), and then I could just run pecl install stomp without an issue. After that, just enable the extension in the PHP config and the job’s a good’un.

Tom

Mike Flynn

Apr 1, 2015, 4:45:40 AM
to openrail...@googlegroups.com


>>>   And it's only difficult because no-one in their right mind would ever use this as a ..

>> I’ve read your message several times and I can’t work out what you’re trying to say here...
>>  What technology would you rather see used in place of Stomp?

>> I have to say that I’ve never had a single issue using or installing Stomp libraries in PHP, Python or Java..


Okay, I knew after my first comment I'd made a mistake. I thought my backtracking might suffice. I'm no communications expert and I couldn't say what technology I'd rather see used in place of Stomp. Having further googled the subject I'd say it probably is the most appropriate protocol available. In my defence, my frustration was born of having approached the subject blind and first trying to connect using PHP on Windows, only to discover after an age that there is no dll extension available after all. It doesn't exist. I then tried PHP on Linux and discovered that adding an extension in this medium can throw up all sorts of curveballs. What version of PHP are you using? Is there more than one PHP installation on your system? Are your PHP and library extension versions compatible? (Is my username and password active at the other end? That's just the paranoia bit :).

And after I had begun to get a hold on this I realised that PHP is not great for the job anyway, at least if you want to do some heavy processing with the data. I knew I had a bit of a task ahead of me to even get Java up and running, having last had experience with the language years ago, and, indeed, so it was. It was a while before I was even at the stage of finding a good stomp implementation for the language. At this point I found there to be quite a few implementations, though none I felt were well documented. Again, which way to go? Of course they all, I'm sure, would and could work, but how? I managed it, as I say, eventually, with Gozirra.

Yes, stomp probably is the right methodology here. But because not every Tom, Dick and Harry on the block is using it, and there isn't the usual plethora of guides and tutorials that exists for more widely used technologies, I personally found it difficult to get going. Some find it easier than others for whatever reason. As I say I was mostly working blind (other than this insensible forum I would add).

And glad to see there are always people out there ready to put you right!

Andrew:
One other mistake I made in this thread, before you ask. I said at the top I thought getting connected was three-quarters of the battle. Well, I must admit, I've only ever got to the three-quarters stage. The last quarter might well be a long one too. How difficult the NR/NRE data crunching actually is, I admittedly couldn't really say, other than from what I know of the various other feeds and static rail data I have used. There is documentation out there but I'd say it's similar in nature to that for stomp in many ways. It's not spelt out for you, it's hard to come by, and there are lots of dots you need to join yourself. So long as you like a challenge it's great fun. But it's not for the faint-hearted, nor for a quick and easy mark in a dissertation, I'd say.

Paul Kelly

Apr 1, 2015, 5:02:09 AM
to openrail...@googlegroups.com
On 01/04/15 09:45, Mike Flynn wrote:
>
> Yes, stomp, probably is the right methodology here. But because not
> every Tom, Dick and Harry on the block is using it, and there isn't the
> usual plethora of guides and tutorials for more widely used
> technologies, I personally found it difficult to get going. Some find
> it easier than others for whatever reason. As I say I was mostly
> working blind (other than this insensible forum I would add).

Hi Mike,
I think the STOMP protocol itself is really simple, clear and
well-documented. When the Network Rail datafeeds were first set up I
managed to write my STOMP client in C in a day or two. In my experience
the endless headaches were more from the Network Rail STOMP server not
following the specification or behaving weirdly in various ways:
* the documentation said that compression was enabled but it turned
out not to be; I spent endless hours trying different parameters to try
and get it and sent a detailed e-mail to the support address with no answer
* the fact that message buffering (in the case of temporarily dropped
connections) only worked when you sent a client-id as part of your
connection was completely unexpected and not documented anywhere in the
Network Rail documentation or indeed the STOMP standard; again I spent many
hours or days trying various different ways of disconnecting by dropping
the connection, trying to ignore packets etc. to get this to work before
Peter Hicks discovered it and posted it to this list
* the biggest issue of all, the way every now and then, sometimes
after more than a month of uptime, the STOMP server would just stop
sending data but hold the connection open so that everything appeared to
be normal. This one really did take many months of debugging to find out
about.

In summary, I think it's unfair to blame STOMP. STOMP is a very nice
protocol. In my experience the problems were in the implementation on
the Network Rail datafeeds side.

Paul

Chris Bailiss

Apr 1, 2015, 5:16:44 AM
to openrail...@googlegroups.com
>> How difficult the NR/NRE data crunching actually is, I admittedly couldn't really say.
>> There is documentation out there but I'd say it's similar in nature to that for stomp in 
>> many ways.  It's not spelt out for you, it's hard to come by, and there's lots of dots you need to join yourself

The NRE Push Port data is, relatively speaking, easier to work with than the NR Open Data.
(That is not a complaint about the NR Open Data Platform at all - the NRE data is targeted more directly at a particular scenario, whereas the NROD data is more general and more detailed.)

Personally, I have always found the wiki (thanks largely to Peter + a few others!) + analysis of actual data sufficient, e.g. for the push port:
On the rare occasions that I have further questions this forum is usually helpful in providing the answer!

I have also found Stomp relatively straightforward to use and managed to get going within a few hours initially.  It is supported across multiple platforms and is certainly relatively lightweight compared to other more heavyweight complex messaging tools I have used in the past...

On the discussion about Stomp on Windows... For development on Windows I would strongly suggest using the Apache NMS Stomp .NET assemblies.  This of course requires using Visual Studio / the .NET toolset but this is the default development platform for Windows and so it is the simplest approach for development on Windows.  Of course other options may be possible but will very likely be more difficult. 

Chris 

Peter Hicks

Apr 1, 2015, 5:37:30 AM
to Paul Kelly, openrail...@googlegroups.com
Hi Paul

> On 1 Apr 2015, at 10:02, Paul Kelly <pa...@pdkelly.de> wrote:
>
> * the documentation said that compression was enabled but it turned out not to be; I spent endless hours trying different parameters to try and get it and sent a detailed e-mail to the support address with no answer

I got to the bottom of the compression issue - it was specified as “messages must be compressed” and from a development point of view, they are… but only when they’re being queued on the ActiveMQ server. The idea was that the message bodies were compressed, which - on top of batching - reduces the total number of messages and their size and the overheads of running small messages in large volumes over Stomp. Think of it like a load of chads being sent in one envelope with one stamp, versus each one being sent in a separate envelope.

As soon as I could, I got the Developer Pack to ‘go away’, because it quickly turned out that - unless you have a set of instructions that don’t change very often, or you know all the questions that’ll arise - a static document isn’t the right way to document the platform.

This is why the Push Port doesn’t have a Developer Pack with instructions and points everyone to the Wiki.

> * the fact that message buffering (in the case of temporarily dropped connections) only worked when you sent a client-id as part of your connection was completely unexpected and not documented anywhere in the Network Rail documentation or indeed the STOMP standard; again I spent many hours or days trying various different ways of disconnecting by dropping the connection, trying to ignore packets etc. to get this to work before Peter Hicks discovered it and posted it to this list

Durable subscriptions aren’t part of Stomp, but they are part of ActiveMQ. I did an awful lot of ActiveMQ learning and experimentation to find out ‘how’...
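
For reference, the ActiveMQ STOMP documentation describes durable subscriptions as needing a client-id header on CONNECT plus an activemq.subscriptionName header on SUBSCRIBE. A minimal sketch of building those two frames is below. The stomp_frame helper, the login, the client-id value and the subscription name are all made-up placeholders; only the topic name comes from this thread.

```python
def stomp_frame(command, headers, body=""):
    """Serialise one STOMP frame: command, header lines, blank line, body, NUL byte."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")

# Placeholder credentials and names, for illustration only.
connect = stomp_frame("CONNECT", {
    "login": "user@example.com",
    "passcode": "mypassword",
    "client-id": "user@example.com-myapp",  # identifies this client across reconnects
})

subscribe = stomp_frame("SUBSCRIBE", {
    "destination": "/topic/TRAIN_MVT_ALL_TOC",
    "ack": "client",
    "activemq.subscriptionName": "myapp-movements",  # names the durable subscription
})
```

The same client-id and subscription name have to be sent on every reconnect, otherwise the broker has no way of matching the new session to the messages it buffered for the old one.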

> * the biggest issue of all, the way every now and then, sometimes after more than a month of uptime, the STOMP server would just stop sending data but hold the connection open so that everything appeared to be normal. This one really did take many months of debugging to find out about.

How did you solve it in the end?


Peter


Paul Kelly

Apr 1, 2015, 6:30:38 AM
to openrail...@googlegroups.com
Peter Hicks wrote:
> Hi Paul
>
>> On 1 Apr 2015, at 10:02, Paul Kelly <pa...@pdkelly.de> wrote:
>>
>> * the documentation said that compression was enabled but it turned out not to be; I spent endless hours trying different parameters to try and get it and sent a detailed e-mail to the support address with no answer
>
> I got to the bottom of the compression issue - it was specified as “messages must be compressed” and from a development point of view, they are… but only when they’re being queued on the ActiveMQ server. The idea was that the message bodies were compressed, which - on top of batching - reduces the total number of messages and their size and the overheads of running small messages in large volumes over Stomp. Think of it like a load of chads being sent in one envelope with one stamp, versus each one being sent in a separate envelope.

FWIW, here's what I tried to get it working (not far off 3 years ago now):

I notice in the ActiveMQ documentation that some way of setting a
parameter "useCompression" to true seems to be the way to do it, but I'm
not sure how to do this within a STOMP session, nor indeed whether it
should be part of the CONNECT or the SUBSCRIBE command. Here is what a
typical STOMP session looks like for me (i.e. the commands I send to the
server):

CONNECT
login:pa...@stjohnspoint.co.uk
passcode:mypassword

^@

SUBSCRIBE
destination:/topic/TRAIN_MVT_ALL_TOC
ack:client

^@

This works nicely, but the JSON always comes back uncompressed. I have
tried adding a header, using different combinations of
useCompression:true
usecompression:true
use-compression:true
to both the CONNECT and SUBSCRIBE commands, but that didn't work.

I have also tried appending it like in a URL, to the end of the
subscription destination, i.e. like
destination:/topic/TRAIN_MVT_ALL_TOC?useCompression=true

>> * the biggest issue of all, the way every now and then, sometimes after more than a month of uptime, the STOMP server would just stop sending data but hold the connection open so that everything appeared to be normal. This one really did take many months of debugging to find out about.
>
> How did you solve it in the end?

I enabled non-blocking mode on the connection socket so that it wouldn't
get stuck waiting to receive data in such cases, and then used the Unix
select() function (which waits for a socket to be ready to read from),
with a timeout. If no data appears after a couple of minutes (I'm
subscribed to the movement and VSTP feeds, so no data for this length of
time would be very unusual), select() times out and I close the
connection, wait a while, and try again.
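
The approach described above can be sketched in Python roughly as follows. This is a sketch only, not the original C implementation: the function names are invented, open_feed stands for a hypothetical helper that returns a connected and subscribed socket, and the socket is left in blocking mode because select.select with a timeout is enough to avoid getting stuck.

```python
import select
import time

def read_or_timeout(sock, timeout_seconds=120.0, bufsize=65536):
    """Wait until sock is readable, then return the bytes received.

    Returns None if nothing arrives within timeout_seconds (a stalled feed)
    and b"" if the server closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], timeout_seconds)
    if not readable:
        return None
    return sock.recv(bufsize)

def run_forever(open_feed, handle, timeout_seconds=120.0, retry_delay=30.0):
    """Read from the feed; on a stall, close or socket error, wait and reconnect."""
    while True:
        sock = open_feed()  # hypothetical helper: connect, log in and subscribe
        try:
            while True:
                data = read_or_timeout(sock, timeout_seconds)
                if not data:  # None (stalled) or b"" (closed by the server)
                    break
                handle(data)
        except OSError:
            pass  # treat a socket error like a dropped connection
        finally:
            sock.close()
        time.sleep(retry_delay)
```

The two-minute default mirrors the "couple of minutes" mentioned above and only makes sense for busy feeds. A quieter subscription would need a longer timeout to avoid needless reconnects.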

Paul

Ying Wang

Jun 25, 2015, 6:26:47 AM
to openrail...@googlegroups.com
Dear Peter and Andrew, I found your conversation quite useful.

I am new to these online data feeds and also have no experience of online programming. I am studying railway rescheduling algorithms under disturbances. To build my rescheduling model, I will need some empirical analysis of delays as my starting basis.

Following Peter's instruction, I am able to download the pPort log files from the current day. Can you please tell me how to get the historical files of the previous months from the data feeds? 

Many thanks! 

Ying

On Monday, 30 March 2015 at 12:19:43 UTC+1, Peter Hicks wrote:
Hi Andrew

On 30 Mar 2015, at 11:22, Andrew Carson <apcar...@gmail.com> wrote:

I didn't know that any data was archived from the https://datafeeds.nationalrail.co.uk/, I only thought I could subscribe to a live feed? Where do I get the archived data and what programme should I play round with it in?

On the ‘My Feeds’ page, there are details for an FTP server.  Log in there and in one of the directories, you’ll find a load of pPort log files from the current day.  Download these - they’re in XML, one record per line.

And with your last point you would advise setting up the processing and collection at the same time to save future faffing around?
ie so it is only calling the data I want?

You could do it either way - collect and process the messages in real-time, or you could collect the messages during the course of the day and process them after the event if you didn’t need to run real-time.
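
For the "process them after the event" route, a downloaded log can be read one line at a time. The sketch below assumes each line holds one complete XML document and nothing else, which should be checked against a real file first. The function names are invented for illustration and no particular element names are assumed.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def iter_records(lines):
    """Yield one parsed XML element per non-empty line, skipping unparseable lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield ET.fromstring(line)
        except ET.ParseError:
            continue  # e.g. a truncated last line in a file still being written

def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def count_child_tags(path):
    """Count the element types found directly under each record's root in one log file."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for record in iter_records(handle):
            counts.update(local_name(child.tag) for child in record)
    return counts
```

Running count_child_tags on one downloaded file gives a quick picture of which message types it contains, which helps when deciding what to keep for specific stations.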


Peter


Peter Hicks

Jun 25, 2015, 8:00:40 AM
to Ying Wang, openrail...@googlegroups.com
Hi Ying

On 25 Jun 2015, at 11:26, Ying Wang <hiwan...@gmail.com> wrote:

> I am new to these online data feeds and also have no experience of online programming. I am studying railway rescheduling algorithms under disturbances. To build my rescheduling model, I will need some empirical analysis of delays as my starting basis.
>
> Following Peter's instruction, I am able to download the pPort log files from the current day. Can you please tell me how to get the historical files of the previous months from the data feeds?

Data for the last ~24 hours is stored on the FTP server, but it’s not available for previous months. If you need historical data, you’re probably best downloading the data from the FTP server around 0100 each day.

A publicly accessible store of historical messages isn’t provided at the moment.
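
A daily download along those lines could be sketched with Python's standard ftplib. Everything specific here is an assumption: the host, login and remote directory must come from your own ‘My Feeds’ page, and the function names and the per-day folder layout are invented for illustration.

```python
import ftplib
import os
from datetime import date

def archive_dir(base_dir, day=None):
    """Return (and create if needed) a per-day folder such as base_dir/2015-06-25."""
    day = day or date.today()
    path = os.path.join(base_dir, day.isoformat())
    os.makedirs(path, exist_ok=True)
    return path

def download_all(host, user, password, remote_dir, local_dir):
    """Download every file listed in remote_dir on the FTP server into local_dir."""
    saved = []
    with ftplib.FTP(host, timeout=60) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            target = os.path.join(local_dir, os.path.basename(name))
            try:
                with open(target, "wb") as out:
                    ftp.retrbinary("RETR " + name, out.write)
            except ftplib.error_perm:
                os.remove(target)  # entry was not a retrievable file (e.g. a directory)
                continue
            saved.append(target)
    return saved
```

Scheduling a script that calls download_all(..., archive_dir("pport-archive")) shortly after 0100 each day (cron on Linux, Task Scheduler on Windows) builds up a local history, one folder per day.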

Kind regards,


Peter


Ying Wang

Jun 25, 2015, 11:25:43 AM
to openrail...@googlegroups.com, hiwan...@gmail.com
Hi Peter, thanks for your prompt reply. Is this the same for the real-time updated data of Network Rail, such as 'Movement'? 

On Thursday, 25 June 2015 at 13:00:40 UTC+1, Peter Hicks wrote:

Peter Hicks

Jun 25, 2015, 11:30:25 AM
to Ying Wang, openrail...@googlegroups.com
There's no FTP server for the Network Rail service - so you have to capture the data as it's sent out, and there's no option to download it in bulk later.
