Atlas single source EPG


Adam Sutton

Jan 31, 2012, 6:01:06 AM
to Atlas
Sorry couldn't think of a better title!

Basically this is to expand on some points I've made in an off-list
discussion and to get some more general feedback.

Having played around with the atlas system over the last few days I'm
beginning to get a feel for its benefits over the existing XMLTV RT
feed, but I'm also seeing some of its limitations going forward.

The way shows are sorted into brands, series and episodes is very
useful. It helps to structure things and makes it easier to implement
things like series linking, my initial reason for getting interested.

My requirements from Atlas are quite simple. I'm looking to build an
EPG to feed into my PVR and therefore need a clean, easy to interpret
source of scheduling info. For each entry in the schedule I probably
only need the following:

- brand (optional)
- series (optional)
- episode number (optional)
- title/subtitle (some of this comes from above)
- start time
- stop time
- summary
- genres

That's pretty much it (I may have missed something); beyond that it's
all niceties that might be used for extra info: cast, images,
thumbnails, clips, etc.
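
As a rough illustration of the field list above, a client-side record might look like the following. This is only a sketch: the class name, field names, and the broadcast times in the example are made up for illustration and are not part of the Atlas API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ScheduleEntry:
    """One EPG slot holding only the fields listed above (hypothetical names)."""
    title: str
    start: datetime
    stop: datetime
    summary: str = ""
    genres: List[str] = field(default_factory=list)
    subtitle: Optional[str] = None
    brand: Optional[str] = None           # optional brand identifier
    series: Optional[str] = None          # optional series identifier
    episode_number: Optional[int] = None  # optional

    @property
    def duration(self) -> timedelta:
        """Length of the broadcast slot."""
        return self.stop - self.start


# Example values are invented; only the brand URI comes from the thread.
entry = ScheduleEntry(
    title="Merlin",
    start=datetime(2012, 1, 31, 19, 45),
    stop=datetime(2012, 1, 31, 20, 30),
    genres=["drama"],
    brand="http://www.bbc.co.uk/programmes/b00vw027",
)
```

Keeping brand, series and episode number optional mirrors the list: a one-off film would simply leave them as `None`.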

One area where atlas falls down, in my opinion, is its handling of
publishers and the presentation of the various identifiers. Currently
I'm only working with PA and BBC data, mostly because I have access
and it's pretty good data for an initial starting point.

However, my application needs to inherently know "which" publisher to
request for a given channel; if I provide both I get multiple returns
and potentially need to handle any overlaying myself. Personally I
don't care where my data comes from, only that I'm provided with the
best set of available data based on certain criteria.

So for example I might be asking for channel X, which is currently
provided by PA. But then the owners of channel X decide to open up
their original source data (which is hopefully more accurate than
PA's). In an ideal world I shouldn't care about this; atlas should
simply have a knowledge of the "best" providers for a given channel
and automatically provide this to me (within reason, i.e. I might need
to request/purchase it on my API key). There can still be an API for
requesting particular publishers if a user requires that, but I don't
want to care; I just want the best available.
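
To make the "best available" idea concrete, here is a minimal sketch of the selection a client currently has to do for itself. The precedence list and the publisher keys are illustrative assumptions, not values confirmed by Atlas.

```python
from typing import List, Optional

# Hypothetical precedence: earlier entries are preferred over later ones.
PRECEDENCE = ["bbc.co.uk", "pressassociation.com"]


def best_publisher(available: List[str],
                   precedence: List[str] = PRECEDENCE) -> Optional[str]:
    """Return the most preferred publisher that has data for a channel."""
    for publisher in precedence:
        if publisher in available:
            return publisher
    # Fall back to whatever is on offer if nothing in the list matches.
    return available[0] if available else None
```

With this in place the rest of the client never names a publisher directly, which is roughly the behaviour being asked of the service itself.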

Also, data is currently often returned with publisher-specific URIs,
although many of these map to the same underlying info. E.g. if I get
sched data and it includes a show from the Merlin series (BBC), I get
the brand URI http://pressassociation.com/brands/103768. However if I
look this up I get the original BBC URI http://www.bbc.co.uk/programmes/b00vw027.
This mapping does at least mean that if the preferred publisher
switches I can handle some mappings myself (by keeping track), but
it's awkward and requires extra caching and lookups on the client
side.

This is particularly troublesome for scheduling info, which typically
isn't worth caching (at least not in this way). I might already have a
particular show recorded and might be relying on the URI as a unique
identifier to detect this. If the publisher suddenly changes (due to
better data as above) then these URIs will change and suddenly I'm
re-recording lots of stuff I've already got (if it gets repeated).
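
The client-side bookkeeping described above could be sketched like this: a small log that remembers which publisher-specific URIs are known to be equivalent, so that a recording made under one URI is still recognised under another. The class and method names are hypothetical; the two URIs are the Merlin examples from earlier in this message.

```python
from typing import Dict, Set


class RecordingLog:
    """Tracks recorded items across equivalent publisher-specific URIs."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}  # alias URI -> preferred URI
        self._recorded: Set[str] = set()   # preferred URIs already recorded

    def _root(self, uri: str) -> str:
        # Follow alias links until reaching a URI with no preferred form.
        while self._parent.get(uri, uri) != uri:
            uri = self._parent[uri]
        return uri

    def link(self, preferred: str, alias: str) -> None:
        """Declare that two URIs identify the same thing."""
        root_a, root_b = self._root(preferred), self._root(alias)
        if root_a == root_b:
            return
        self._parent[root_b] = root_a
        # Carry over the recorded flag so history survives the merge.
        if root_b in self._recorded:
            self._recorded.discard(root_b)
            self._recorded.add(root_a)

    def mark_recorded(self, uri: str) -> None:
        self._recorded.add(self._root(uri))

    def is_recorded(self, uri: str) -> bool:
        return self._root(uri) in self._recorded


PA_URI = "http://pressassociation.com/brands/103768"
BBC_URI = "http://www.bbc.co.uk/programmes/b00vw027"

log = RecordingLog()
log.mark_recorded(PA_URI)
seen_before_link = log.is_recorded(BBC_URI)  # the two URIs are not yet linked
log.link(BBC_URI, PA_URI)
seen_after_link = log.is_recorded(BBC_URI)
```

This works, but it is exactly the extra caching and lookup burden that stable, publisher-independent identifiers would remove.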

So I would propose that the atlas system provides a completely
publisher-agnostic view of the world. It maintains its own internal
URIs for resources, to which it maps the original source data.
Therefore I'm completely insulated from the source of the data.

Anyway these are just my suggestions on what would make Atlas a really
great system for me; others will no doubt have differing
views/requirements.

Regards
Adam


Jonathan Tweed

Feb 2, 2012, 10:06:32 AM
to atla...@googlegroups.com
On Tuesday, 31 January 2012 at 11:01, Adam Sutton wrote:
My requirements from Atlas are quite simple. I'm looking to build an
EPG to feed into my PVR and therefore need a clean, easy to interpret
source of scheduling info. For each entry in the schedule I probably
only need the following:

- brand (optional)
- series (optional)
- episode number (optional)
- title/subtitle (some of this comes from above)
- start time
- stop time
- summary
- genres

That's pretty much it (I may have missed something); beyond that it's
all niceties that might be used for extra info: cast, images,
thumbnails, clips, etc.

Hi Adam

Atlas does now support annotations, and these work on the schedule endpoint. Unfortunately they are not documented anywhere as yet, but we do have an action to improve our API documentation in the near future. Specifying any annotation turns on annotations and generally any field that is a list is also an annotation of the same name. There is also a description and extended_description. So you can say:


But I have also taken note of your list of fields above to see if we can make annotations on the schedule even more useful.
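
For readers following along, a request of the kind described above might be assembled as in the sketch below. The host, path, and every parameter name (`channel`, `publisher`, `annotations`, `apiKey`) are assumptions made for illustration, since annotations are not yet documented; only the annotation names `description` and `extended_description` come from this message.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder host and path: NOT the real Atlas endpoint.
BASE_URL = "https://atlas.example/schedule.json"


def schedule_url(channel: str, publisher: str,
                 annotations: List[str], api_key: str) -> str:
    """Build a schedule request URL with a comma-separated annotation list."""
    params = {
        "channel": channel,
        "publisher": publisher,
        "annotations": ",".join(annotations),  # assumed parameter name
        "apiKey": api_key,                     # assumed parameter name
    }
    return BASE_URL + "?" + urlencode(params)


url = schedule_url("channel-x", "bbc.co.uk",
                   ["description", "extended_description"], "MY_KEY")
# Parse the query string back to check what would be sent.
query: Dict[str, List[str]] = parse_qs(urlsplit(url).query)
```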

One area where atlas falls down, in my opinion, is its handling of
publishers and the presentation of the various identifiers. Currently
I'm only working with PA and BBC data, mostly because I have access
and it's pretty good data for an initial starting point.

Atlas is designed for aggregating and matching data from multiple publishers, but you are correct we have some work to do around how we present multiple ids. That's on the list, but we've not got to it yet.

However, my application needs to inherently know "which" publisher to
request for a given channel; if I provide both I get multiple returns
and potentially need to handle any overlaying myself. Personally I
don't care where my data comes from, only that I'm provided with the
best set of available data based on certain criteria.

So for example I might be asking for channel X, which is currently
provided by PA. But then the owners of channel X decide to open up
their original source data (which is hopefully more accurate than
PA's). In an ideal world I shouldn't care about this; atlas should
simply have a knowledge of the "best" providers for a given channel
and automatically provide this to me (within reason, i.e. I might need
to request/purchase it on my API key). There can still be an API for
requesting particular publishers if a user requires that, but I don't
want to care; I just want the best available.

Atlas equivalence should take care of this for you. If not, we should improve our merging to make it more useful. A quick outline:

Atlas automatically matches programmes from different publishers. It can then return all of the programmes to you separately or, on a per-API-key basis, we can turn on and configure equivalence precedence. This instructs Atlas to merge equivalent programmes into one combined programme that includes the best bits of each item in the set. The precedence order (configurable per API key) defines the order in which items are combined. By putting the broadcaster before PA in the list, it should take a higher priority.
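
The merge described above can be pictured with a small sketch. This is not Atlas's actual merging code, only one plausible reading of "combine in precedence order": walk the equivalent items from highest to lowest precedence and take each field from the first item that has a non-empty value. The dictionary keys and publisher names are hypothetical.

```python
from typing import Any, Dict, List


def _is_empty(value: Any) -> bool:
    """Treat None, empty strings and empty lists as missing."""
    return value is None or value == "" or value == []


def merge_equivalents(items: List[Dict[str, Any]],
                      precedence: List[str]) -> Dict[str, Any]:
    """Combine equivalent programmes, preferring earlier publishers."""
    rank = {publisher: i for i, publisher in enumerate(precedence)}
    # Publishers missing from the precedence list sort last.
    ordered = sorted(items,
                     key=lambda item: rank.get(item["publisher"], len(rank)))
    merged: Dict[str, Any] = {}
    for item in ordered:
        for key, value in item.items():
            if key not in merged and not _is_empty(value):
                merged[key] = value
    return merged


# Invented sample records for two equivalent programmes.
pa_item = {"publisher": "pressassociation.com", "title": "Merlin (PA)",
           "description": "Fantasy drama.", "genres": ["drama"]}
broadcaster_item = {"publisher": "bbc.co.uk", "title": "Merlin",
                    "description": "", "genres": []}

merged = merge_equivalents([pa_item, broadcaster_item],
                           ["bbc.co.uk", "pressassociation.com"])
```

Here the broadcaster's title wins because it is listed first, while the description and genres fall through to PA because the broadcaster's copies are empty.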

Specific examples of where you spot the merging could be improved would be very useful.
 
Also, data is currently often returned with publisher-specific URIs,
although many of these map to the same underlying info. E.g. if I get
sched data and it includes a show from the Merlin series (BBC), I get
the brand URI http://pressassociation.com/brands/103768. However if I
look this up I get the original BBC URI http://www.bbc.co.uk/programmes/b00vw027.
This mapping does at least mean that if the preferred publisher
switches I can handle some mappings myself (by keeping track), but
it's awkward and requires extra caching and lookups on the client
side.

This is particularly troublesome for scheduling info, which typically
isn't worth caching (at least not in this way). I might already have a
particular show recorded and might be relying on the URI as a unique
identifier to detect this. If the publisher suddenly changes (due to
better data as above) then these URIs will change and suddenly I'm
re-recording lots of stuff I've already got (if it gets repeated).

So I would propose that the atlas system provides a completely
publisher-agnostic view of the world. It maintains its own internal
URIs for resources, to which it maps the original source data.
Therefore I'm completely insulated from the source of the data.

This is on its way too! We agree that Atlas-generated IDs for the content in Atlas are important and will be turning this on soon. We're already generating them, but they're not yet in the API. They'll be the same as the IDs currently used on RadioTimes. We're doing the migration in a couple of stages and we're almost there; it involves moving the generation of these IDs from our Voila product that sits on top of Atlas into Atlas itself so they're available for everyone.
 
Anyway these are just my suggestions on what would make Atlas a really
great system for me; others will no doubt have differing
views/requirements.

Thanks, much appreciated and keep them coming. There's a lot of great comment on the list that I'm working through and documenting now that I'm back. It's good to get so much great feedback and requirements from everyone.

Cheers
Jonathan

Adam Sutton

Feb 2, 2012, 3:48:38 PM
to atla...@googlegroups.com
On 2 February 2012 15:06, Jonathan Tweed <jona...@metabroadcast.com> wrote:
On Tuesday, 31 January 2012 at 11:01, Adam Sutton wrote:
My requirements from Atlas are quite simple. I'm looking to build an
EPG to feed into my PVR and therefore need a clean, easy to interpret
source of scheduling info. For each entry in the schedule I probably
only need the following:

- brand (optional)
- series (optional)
- episode number (optional)
- title/subtitle (some of this comes from above)
- start time
- stop time
- summary
- genres

That's pretty much it (I may have missed something); beyond that it's
all niceties that might be used for extra info: cast, images,
thumbnails, clips, etc.

Hi Adam

Atlas does now support annotations, and these work on the schedule endpoint. Unfortunately they are not documented anywhere as yet, but we do have an action to improve our API documentation in the near future. Specifying any annotation turns on annotations and generally any field that is a list is also an annotation of the same name. There is also a description and extended_description. So you can say:


But I have also taken note of your list of fields above to see if we can make annotations on the schedule even more useful.

This sounds interesting and I will try to take a look, although I don't see it as a big problem at this stage that the info is separate, as much of it is unlikely to change and can therefore be easily cached.
 

One area where atlas falls down, in my opinion, is its handling of
publishers and the presentation of the various identifiers. Currently
I'm only working with PA and BBC data, mostly because I have access
and it's pretty good data for an initial starting point.

Atlas is designed for aggregating and matching data from multiple publishers, but you are correct we have some work to do around how we present multiple ids. That's on the list, but we've not got to it yet.

However, my application needs to inherently know "which" publisher to
request for a given channel; if I provide both I get multiple returns
and potentially need to handle any overlaying myself. Personally I
don't care where my data comes from, only that I'm provided with the
best set of available data based on certain criteria.

So for example I might be asking for channel X, which is currently
provided by PA. But then the owners of channel X decide to open up
their original source data (which is hopefully more accurate than
PA's). In an ideal world I shouldn't care about this; atlas should
simply have a knowledge of the "best" providers for a given channel
and automatically provide this to me (within reason, i.e. I might need
to request/purchase it on my API key). There can still be an API for
requesting particular publishers if a user requires that, but I don't
want to care; I just want the best available.

Atlas equivalence should take care of this for you. If not, we should improve our merging to make it more useful. A quick outline:

Atlas automatically matches programmes from different publishers. It can then return all of the programmes to you separately or, on a per-API-key basis, we can turn on and configure equivalence precedence. This instructs Atlas to merge equivalent programmes into one combined programme that includes the best bits of each item in the set. The precedence order (configurable per API key) defines the order in which items are combined. By putting the broadcaster before PA in the list, it should take a higher priority.

Specific examples of where you spot the merging could be improved would be very useful.

I might be interested in trying this on my key at some point, just to see how it works. I'll be in contact if I want to give it a go.

Also, data is currently often returned with publisher-specific URIs,
although many of these map to the same underlying info. E.g. if I get
sched data and it includes a show from the Merlin series (BBC), I get
the brand URI http://pressassociation.com/brands/103768. However if I
look this up I get the original BBC URI http://www.bbc.co.uk/programmes/b00vw027.
This mapping does at least mean that if the preferred publisher
switches I can handle some mappings myself (by keeping track), but
it's awkward and requires extra caching and lookups on the client
side.

This is particularly troublesome for scheduling info, which typically
isn't worth caching (at least not in this way). I might already have a
particular show recorded and might be relying on the URI as a unique
identifier to detect this. If the publisher suddenly changes (due to
better data as above) then these URIs will change and suddenly I'm
re-recording lots of stuff I've already got (if it gets repeated).

So I would propose that the atlas system provides a completely
publisher-agnostic view of the world. It maintains its own internal
URIs for resources, to which it maps the original source data.
Therefore I'm completely insulated from the source of the data.

This is on its way too! We agree that Atlas-generated IDs for the content in Atlas are important and will be turning this on soon. We're already generating them, but they're not yet in the API. They'll be the same as the IDs currently used on RadioTimes. We're doing the migration in a couple of stages and we're almost there; it involves moving the generation of these IDs from our Voila product that sits on top of Atlas into Atlas itself so they're available for everyone.

This would certainly be an improvement in my opinion, and would make using multiple sources in the future much simpler and more seamless.
 
Anyway these are just my suggestions on what would make Atlas a really
great system for me; others will no doubt have differing
views/requirements.

Thanks, much appreciated and keep them coming. There's a lot of great comment on the list that I'm working through and documenting now that I'm back. It's good to get so much great feedback and requirements from everyone.

I'll keep playing and feed back whatever I find that might be of use.

Regards
Adam