Reading a continuous TCP stream of JSON


Chris McCann

Feb 28, 2017, 5:15:08 PM
to SD Ruby
I'm looking into processing a stream of aircraft position data provided here:

http://pub-vrs.adsbexchange.com:32005

This effectively returns a JSON array of data structures with aircraft-reported information including latitude and longitude positioning.  There are typically about 6000-7000 "records" provided every 5 seconds.

How does one parse a stream like this?  You can curl that endpoint and see the data; I just can't sort out how to process each record into a JSON object so I can do other things with it.

I've tried Ruby, Python, and Elixir but haven't come up with a solution yet.  Anyone out there have experience parsing a stream like this?

Cheers,

Chris

Ian Young

Feb 28, 2017, 5:22:27 PM
to sdr...@googlegroups.com
Yikes, is that one giant unterminated array? I feel like you're not going to be able to parse that with most JSON tools because until the end arrives, it's not really valid JSON.

A lot of streaming libraries let you set a callback that fires with a "chunk" of input once a certain number of bytes have been read from the request, but that's tough to work with when there's (likely) no guarantee the chunks line up with record boundaries. Maybe you could keep a buffer of what you've received so far, pull as many complete hash objects off the front of the string as you can, and prepend whatever is left to the next chunk.
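
A rough sketch of that buffering idea, assuming the records are plain {...} objects (the class and method names here are made up for illustration, not from any library):

require 'json'

# Illustrative sketch, not from any library: accumulate raw chunks and
# peel complete top-level JSON objects off the front of the buffer by
# tracking brace depth, skipping braces that appear inside strings.
class RecordBuffer
  def initialize
    @buffer = +""
  end

  # Append a raw chunk, then yield every complete record now available.
  def feed(chunk)
    @buffer << chunk
    while (record = next_object)
      yield JSON.parse(record)
    end
  end

  private

  # Return the next complete {...} object, or nil if more data is needed.
  def next_object
    start = @buffer.index("{") or return nil
    depth = 0
    in_string = false
    escaped = false

    (start...@buffer.length).each do |i|
      ch = @buffer[i]
      if in_string
        if escaped       then escaped = false
        elsif ch == "\\" then escaped = true
        elsif ch == '"'  then in_string = false
        end
      else
        case ch
        when '"' then in_string = true
        when "{" then depth += 1
        when "}"
          depth -= 1
          if depth.zero?
            record = @buffer[start..i]
            @buffer = @buffer[(i + 1)..] || +""
            return record
          end
        end
      end
    end

    nil # object is still incomplete; wait for the next chunk
  end
end

Whatever HTTP or socket library you end up with, the per-chunk callback would just call something like buffer.feed(chunk) { |record| do_stuff(record) }.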

Good luck! 🙂

Chris McCann

Feb 28, 2017, 5:28:53 PM
to sdr...@googlegroups.com
It actually does terminate after all 6,000-7,000 records come through.  As you pointed out, it's hard to process something that big the conventional way; reading in byte chunks seems like an approach that might work, but reassembling records across chunk boundaries is the tricky part.


Peter Fitzgibbons

Feb 28, 2017, 5:29:35 PM
to sdr...@googlegroups.com
I've dealt with the likes of that before. I'll be at a keyboard this evening and take a look.
--
Peter Fitzgibbons

Bill Vieux

Feb 28, 2017, 5:30:45 PM
to sdr...@googlegroups.com
Perhaps something async like https://github.com/philbooth/bfj


Dan Simpson

Feb 28, 2017, 10:51:44 PM
to sdr...@googlegroups.com
The curl hooked me for some reason...  Anyway, I dug in a bit, and it seems the format is nil-delimited objects, each of which has an acList field.  I wrote a quick class that does what you want, and assuming they don't change the quoting on the actual JSON records, it should continue to work.  No promises.

If you want a more continuous stream of position objects, you can add some checks for the byte header of the payload.
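
(Dan's class itself isn't included in the thread. A minimal sketch of the approach he describes, assuming each payload on the socket is a NUL-terminated JSON object carrying an acList array, might look like the following; the host, port, and the Icao/Lat/Long field names are guesses taken from this thread, not verified against the feed.)

require 'socket'
require 'json'

# Not Dan's actual class, just a sketch of the idea described above:
# read the raw TCP stream, split it on NUL bytes, parse each delimited
# payload, and yield the entries of its "acList" array one at a time.
# Host, port, and field names are assumptions taken from this thread.
class AircraftFeed
  def initialize(host = "pub-vrs.adsbexchange.com", port = 32005)
    @host = host
    @port = port
  end

  # Yields one aircraft hash at a time as payloads arrive.
  def each_aircraft
    socket = TCPSocket.new(@host, @port)
    buffer = "".b

    loop do
      buffer << socket.readpartial(16 * 1024)

      # Pull off every complete (NUL-terminated) payload in the buffer.
      while (idx = buffer.index("\x00"))
        payload = buffer.slice!(0..idx).chomp("\x00").force_encoding("UTF-8")
        begin
          data = JSON.parse(payload)
        rescue JSON::ParserError
          next # ignore anything that isn't a valid JSON payload
        end
        Array(data["acList"]).each { |aircraft| yield aircraft }
      end
    end
  rescue EOFError
    # server closed the connection
  ensure
    socket&.close
  end
end

# Example usage (field names assumed, not confirmed):
AircraftFeed.new.each_aircraft do |ac|
  puts [ac["Icao"], ac["Lat"], ac["Long"]].inspect
end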


Chris McCann

Feb 28, 2017, 11:03:37 PM
to SD Ruby
Dan,

Really great work!  Thanks so much for the pointer on processing the stream byte by byte.

It always surprises me what I don't know that others seem to know so well.  

Cheers,

Chris

