XMLTV native Atlas grabber


Adam Sutton

Jan 25, 2012, 4:34:58 AM
to Atlas
Just to follow up on some private conversations I've had with Chris
regarding this. Originally started in another thread.

My main interest in Atlas/XMLTV is that I am hoping to add
series linking support to some PVR software I use (XBMC and
tvheadend). Doing this ideally requires more robust series/episode
information, and more robust information generally.

I've already written my own implementation of the (XMLTV)
tv_grab_uk_rt script, to overcome its inefficiencies, and now I'm
looking to extend/rewrite it to do a better job for my purposes.

To this end I'm looking at writing something that will access the
richer source of data provided directly by Atlas. My initial thoughts
are that the output will still be XMLTV to make it compatible with the
existing parsers (including the one in tvheadend). However Chris has
pointed out that XMLTV possibly has some shortcomings in the area I'm
looking at, mainly the fact that shows/series are linked merely by a
text field (title).

This was something I had already noted was a potential problem with
the existing XMLTV output (most likely a result of lack of info at
source). Whether this can be overcome within the existing framework or
whether it will need format changes I'm not yet sure. However I think
that extra unique indexing for things like shows, series etc. is
likely required to make this work in a robust manner.

I know there are already discussions elsewhere about changing the
format of the RT data stream and updating the XMLTV parsers etc. So
I don't want to step on people's toes.

Thoughts?
Adam

Chris Jackson

Jan 25, 2012, 2:42:42 PM
to atla...@googlegroups.com
On 25 January 2012 09:34, Adam Sutton <a...@adamsutton.me.uk> wrote:
> To this end I'm looking at writing something that will access the
> richer source of data provided directly by Atlas. My initial thoughts
> are that the output will still be XMLTV to make it compatible with the
> existing parsers (including the one in tvheadend). However Chris has
> pointed at that XMLTV possibly has some shortcomings in the area I'm
> looking at, mainly the fact that shows/series are linked merely by a
> text field (title).
>
> This was something I had already noted was a potential problem with
> the existing XMLTV output (most likely a result of lack of info at
> source). Whether this can be overcome within the existing framework or
> whether it will need format changes I'm not yet sure. However I think
> that extra unique indexing for things like shows, series etc.. is
> likely required to make this work in a robust manner.

Would be really interested if anyone could shed some light on this
point. In our experience, the brand/series/episode hierarchy is really
important for lots of applications, and it would be good to understand
the extent to which this is supported by XMLTV implementations. Could
be important in our decision on whether to start outputting XMLTV
feeds.

--
ch...@metabroadcast.com -- +44 7967 756705

Adam Sutton

Jan 25, 2012, 7:10:10 PM
to atla...@googlegroups.com
Having looked at the DTD for XMLTV, I think there are definitely shortcomings here that would need to be addressed. There doesn't appear to be any unique concept of show/series identification. It's basically a string title, which without processing will not always be unique for a given show. One thing that nearly caught me out is shows with omnibus episodes (EastEnders). I'm not sure how such things would be handled.

I think that maybe the XMLTV format needs some reworking to add missing parameters. Although to be honest there is nothing that specifically ties me to XMLTV. For my own work I could actually write my atlas processor to work directly within my software.

However I think XMLTV has quite a bit of traction in this area and serves as a useful medium between a variety of possible sources and the end applications. I'll probably have to try and see what the XMLTV guys think.

Karl Dietz

Jan 25, 2012, 6:12:11 PM
to atla...@googlegroups.com
Hi,

I think a plain series link via just another attribute doesn't cut it.
To do all the cool things you'll need a content resolver, a more
flexible data model and a shared meta database. As a workaround you can
give every series a unique title (like they do at thetvdb), e.g. map
your series-link-id to a unique title and enjoy that it works with most
consumers without additional work. Add the special content category
"series" to set up recording rules for "everything with the same title"
and you will get guide data for about a dozen countries that works with
your application's one-click series recording.
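
A minimal Python sketch of the unique-title workaround, assuming each series is known by an id, a title and a first-broadcast year. The function name, the input shape and the "(year)" suffix are illustrative assumptions, not something any grabber is known to do exactly this way.

```python
def unique_titles(series):
    """Map series id -> a title that is unique across the guide.

    `series` maps id -> (title, year). Titles shared by more than one
    id gain a "(year)" suffix; titles that are already unique are kept.
    """
    ids_by_title = {}
    for sid, (title, _year) in series.items():
        ids_by_title.setdefault(title, []).append(sid)

    result = {}
    for sid, (title, year) in series.items():
        clash = len(ids_by_title[title]) > 1
        result[sid] = f"{title} ({year})" if clash else title
    return result


# Hypothetical ids; two different series share one title.
titles = unique_titles({
    "series-a": ("The Killing", 2007),
    "series-b": ("The Killing", 2011),
    "series-c": ("Merlin", 2008),
})
```

Two series with the same title and the same year would still clash, so a real mapping would need a further tie-breaker.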

If you are just interested in a summary you can stop reading here.

On 25.01.2012 20:42, Chris Jackson wrote:
> On 25 January 2012 09:34, Adam Sutton<a...@adamsutton.me.uk> wrote:
>> To this end I'm looking at writing something that will access the
>> richer source of data provided directly by Atlas. My initial thoughts
>> are that the output will still be XMLTV to make it compatible with the
>> existing parsers (including the one in tvheadend). However Chris has
>> pointed at that XMLTV possibly has some shortcomings in the area I'm
>> looking at, mainly the fact that shows/series are linked merely by a
>> text field (title).
>>
>> This was something I had already noted was a potential problem with
>> the existing XMLTV output (most likely a result of lack of info at
>> source). Whether this can be overcome within the existing framework or
>> whether it will need format changes I'm not yet sure. However I think
>> that extra unique indexing for things like shows, series etc.. is
>> likely required to make this work in a robust manner.

Without a shared meta database of programmes (and persons) it's always
going to be a band-aid in one way or another.

Is crid://bbc/the_killing and crid://channel4/the_killing to be
considered the same or a different series? The same question arises
when you change guide sources from EIT to XMLTV (any switch to another
id space will do)
Do you want to add all your recording rules again?
What about your list of programmes you have already seen? (some of the
MythTV users carry around their list of already seen programmes from
the last 5+ years)

A shared database that maps the various ids onto each other also lets
you do advanced stuff like match the list of "must-have-seen" movies
your buddy keeps bugging you about against the guide.
Or suggest upcoming programs which feature actors that the viewer is a
facebook fan of. (and add a "buy this movie on bluray from amazon"
button)
There is musicbrainz doing that for music, see e.g. the album review of
Hands at http://www.bbc.co.uk/music/reviews/wqmq which references the
same album on discogs and wikipedia. (No 1-click buy for amazon there,
but it could be easily added as the metadata is available)
But I don't know of anything similar in the tv/movie domain.

> Would be really interested if anyone could shed some light on this
> point. In our experience, the brand/series/episode hierarchy is really
> important for lots of applications, and it would be good to understand
> the extent to which this is supported by XMLTV implementations. Could
> be important in our decision on whether to start outputting XMLTV
> feeds.

XMLTV does not support anything beyond "program has the same title".
You can add the season and episode number to help with duplicate
checking, and there are common categories to signal "it's an episode" and
"it's a movie", but that's about it. (It's been good enough for >10 years,
though.)
There have been suggestions to add programid, seriesid and/or crid
elements to the schema, but they never took off. Any programme id can be
modelled as episode-num with system="<yoursystem>", but each system will
need support from all applications. (E.g. MythTV knows dd_progid and
will use that as primary key for programs instead of making one up from
the title and episode-num in system xmltv_ns.)
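
To make the fallback concrete, here is a rough Python sketch of how a consumer might derive a programme key: use any dedicated id system it finds, otherwise build one from the title and the xmltv_ns numbers. The function names and the key layout are assumptions for illustration; MythTV's real logic differs in detail.

```python
import xml.etree.ElementTree as ET


def parse_xmltv_ns(text):
    """Parse an xmltv_ns value such as '1.4.0/1' into zero-based
    (season, episode, part). Missing fields become None."""
    fields = [f.strip() for f in text.split(".")]
    fields += [""] * (3 - len(fields))
    numbers = []
    for field in fields[:3]:
        index = field.split("/")[0].strip()  # drop the optional '/total'
        numbers.append(int(index) if index else None)
    return tuple(numbers)


def programme_key(programme):
    """Prefer an id from a dedicated system; else title + numbers."""
    title = programme.findtext("title", default="")
    fallback = None
    for ep in programme.findall("episode-num"):
        system = ep.get("system", "onscreen")
        value = (ep.text or "").strip()
        if system == "xmltv_ns":
            fallback = (title,) + parse_xmltv_ns(value)
        elif system != "onscreen":  # onscreen is display text, not an id
            return (system, value)
    return fallback or (title,)


example = ET.fromstring(
    '<programme><title>Merlin</title>'
    '<episode-num system="xmltv_ns">3 . 9 . 0/1</episode-num></programme>'
)
key = programme_key(example)
```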


If you want to model the hierarchy you need to follow something like
the TV-Anytime data model where programs can be in all kinds of sets.

There is flat series to episode mapping for series like dailies which
carry only a date as episode identifier. (or at most a running number,
but don't do seasons)

Or deep series -> variant -> season -> episode for series with multiple
sets of episodes. e.g. The X Factor and The Xtra Factor

You can even model one episode having two different episode
numbers in the same series. (e.g. the program with the title
Sauschwänzlebahn is episode 1 from 1991 and episode 150 from 1995 of
series Eisenbahn-Romantik)

You could also represent movie series like the James Bond or the Star
Trek movies.
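
The set-based hierarchy described above could be sketched in Python roughly as follows. This is only an illustration of the idea, not the TV-Anytime schema: the class names, the "kind" labels and the "Season 1" group are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    title: str
    kind: str                       # e.g. "series", "variant", "season"
    parent: "Group | None" = None   # None for a top-level group


@dataclass
class Membership:
    group: Group
    index: "int | None" = None      # position within the group, if any


@dataclass
class Programme:
    title: str
    memberships: list = field(default_factory=list)


def ancestry(group):
    """Walk from a group up to its root, e.g. season -> variant -> series."""
    chain = []
    while group is not None:
        chain.append(group.title)
        group = group.parent
    return chain


# One programme holding two episode numbers in the same flat series.
romantik = Group("Eisenbahn-Romantik", "series")
prog = Programme("Sauschwänzlebahn",
                 [Membership(romantik, 1), Membership(romantik, 150)])

# A deep chain: series -> variant -> season (season title is made up).
x_factor = Group("The X Factor", "series")
xtra = Group("The Xtra Factor", "variant", x_factor)
season = Group("Season 1", "season", xtra)
```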

Regards,
Karl

Chris Jackson

Jan 26, 2012, 4:52:02 PM
to atla...@googlegroups.com
Hi Karl,

On 25 January 2012 23:12, Karl Dietz <dek...@spaetfruehstuecken.org> wrote:
> On 25.01.2012 20:42, Chris Jackson wrote:
>
> Without a shared meta database of programmes (and persons) its always
> going to be a band aid in one way or another.
>
> Is crid://bbc/the_killing and crid://channel4/the_killing to be
> considered the same or a different series? The same question arises
> when you change guide sources from EIT to XMLTV (any switch to another
> id space will do)
> Do you want to add all your recording rules again?
> What about your list of programmes you have already seen? (some of the
> MythTV users carry around their list of already seen programmes from
> the last 5+ years)
>
> A shared database that maps the various ids onto each other also lets
> you do advanced stuff like match the list of "must-have-seen" movies
> your buddy keeps bugging you about against the guide.
> Or suggest upcoming programs which feature actors that the viewer is a
> facebook fan of. (and add a "buy this movie on bluray from amazon"
> button)
> There is musicbrainz doing that for music, see e.g. the album review of
> Hands at http://www.bbc.co.uk/music/reviews/wqmq which references the
> same album on discogs and wikipedia. (No 1-click buy for amazon there,
> but it could be easily added as the metadata is available)
> But I don't know of anything similar in the tv/movie domain.

This is exactly what Atlas aims to do!

For example, in the UK we're matching listings data (from Press
Association) to detailed data and on-demand locations from broadcasters
and VoD providers. This is how they get on-demand links and beautiful,
big images in RadioTimes these days.

We use a combination of exact ID matching and "fuzzy" matching based
on a wide range of signals. The fuzzy matching works at broadcast,
episode and series/brand levels, which enables us to lock on to
equivalent content by traversing the hierarchy. It's very reliable for
our mature data sources.
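
As a toy illustration only (this is not the Atlas matcher, and the signals and weights are invented), combining an exact ID check with a fuzzy score might look like this in Python:

```python
from difflib import SequenceMatcher


def equivalence_score(a, b):
    """Score two items (dicts) in [0, 1].

    Identical non-empty ids short-circuit to 1.0. Otherwise the score
    mixes title similarity with a "same channel, start within five
    minutes" signal. The 0.6/0.4 weights are arbitrary assumptions.
    """
    if a.get("id") and a.get("id") == b.get("id"):
        return 1.0
    title = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    same_slot = (a["channel"] == b["channel"]
                 and abs(a["start"] - b["start"]) <= 300)
    return 0.6 * title + 0.4 * (1.0 if same_slot else 0.0)


# Hypothetical listings record vs broadcaster record (start = epoch seconds).
pa = {"id": None, "title": "EastEnders", "channel": "bbcone", "start": 1327694400}
bbc = {"id": None, "title": "Eastenders", "channel": "bbcone", "start": 1327694460}
score = equivalence_score(pa, bbc)
```

A real matcher would use many more signals and apply them at broadcast, episode and brand level, as described above.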

We have a pretty flexible data model, and a very big database of
currently ~10M items. There's plenty still to do, though, and help is
always welcome. At the moment it's still mostly people in our company
committing, although many in the UK broadcast industry have paid for
code that was released open source.

>> Would be really interested if anyone could shed some light on this
>> point. In our experience, the brand/series/episode hierarchy is really
>> important for lots of applications, and it would be good to understand
>> the extent to which this is supported by XMLTV implementations. Could
>> be important in our decision on whether to start outputting XMLTV
>> feeds.
>
>
> Xmltv does not support anything beyond "program has the same title".
> You can add the season and episode number to help with duplicate
> checking and there are common categories to signal "its an episode" and
> "its a movie" but thats about it. (its been good enough for >10 years,
> though.)
> There have been suggestions to add programid, seriesid and/or crid
> elements to the schema, but it never took off. Any programme id can be
> modeled as episode-num with system="<yoursystem>" but each system will
> need support from all applications. (e.g MythTV knows dd_progid and
> will use that as primary key for programs instead of making one up from
> the title and episode-num in system xmltv_ns)

Interesting - I suspected it was possible to include the data, but not
all clients would understand. Still, in theory we could include IDs we
have, and they may be useful to knowledgeable clients?

> If you want to model the hierarchy you need to follow something like
> the TV-Anytime data model where programs can be in all kinds of sets.

TV-Anytime was the starting point for the Atlas model, so you will
find a lot of similarities. But we have a slightly more rigid model
(a profile in TVA language) which makes it practical to implement. All
TVA implementations follow a profile, implicit or explicit. We've
moved away from using TVA terms to describe ours, because they are
pretty unfriendly, and not natural to a typical developer from a web,
rather than broadcast, background. We know a fair number of the
original guys on the TVA committees. Many of them are quite frank
about the difficulties in agreeing an elegant standard in a

Essentially in our model all episodes (but not 1-offs or films) exist
within a brand container, with an optional series attribute. We also
have more general collections and attributes that can group by wider
franchises, people, topics. We are working on some clearer
documentation for this.

Adam Sutton

Jan 26, 2012, 5:06:31 PM
to atla...@googlegroups.com
Definitely sounds like people have given this a lot of thought, and I certainly get where you're all coming from. All this really cool linking across different shows that effectively link to the same series, or film series (or mixed), etc. would be really cool in the long run.

However in the short term I'm merely trying to recreate the fairly simplistic series modelling provided by something like a Sky+ box. Basically, if I say record EastEnders on BBC HD, it will record each episode on that channel (and no others) in sequence ad infinitum. If I say the same for Merlin, it will do the same and finish when the series finishes (the next step would be to pick up the next series when that starts, but I'm not aware that my Sky box does that now).
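
That rule could be sketched in Python roughly as below, assuming each broadcast carries a brand id, a channel and (optionally) a series id. The field names and id values are hypothetical, not Atlas or tvheadend field names.

```python
def series_link_matches(rule, broadcast):
    """Return True if a broadcast should be recorded under a rule.

    rule: {'brand_id', 'channel'} plus an optional 'series_id'.
    Without 'series_id' the rule runs indefinitely (a daily soap);
    with it, the rule stops matching once that series has finished.
    """
    if broadcast["brand_id"] != rule["brand_id"]:
        return False
    if broadcast["channel"] != rule["channel"]:
        return False
    wanted = rule.get("series_id")
    return wanted is None or broadcast.get("series_id") == wanted


soap_rule = {"brand_id": "brand-1", "channel": "bbchd"}
drama_rule = {"brand_id": "brand-2", "channel": "bbchd", "series_id": "series-4"}
```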

I believe the Atlas data is more than sufficient to do this, although possibly some extra fields or use of an alternative episode numbering scheme might be required.

All the other cool stuff can be added later, even if it means major reworking. I'm not trying to brush it all under the carpet (or maybe I am) or trivialise it.

It's just that I would like to replace my existing Sky+ box (at end of contract) with a PVR system using my media centre (XBMC) as the front end. But before I can do that I need to have as close a feature set to that which I currently have or it will not pass the Wife Approval Test!

But I welcome any suggestions and advice :)

Chris Jackson

Jan 26, 2012, 5:13:18 PM
to atla...@googlegroups.com
On 26 January 2012 22:06, Adam Sutton <a...@adamsutton.me.uk> wrote:
> It's just that I would like to replace my existing Sky+ box (at end of
> contract) with a PVR system using my media centre (XBMC) as the front end.
> But before I can do that I need to have as close a feature set to that which
> I currently have or it will not pass the Wife Approval Test!
>
> But I welcome any suggestions and advice :)

Based on the discussion so far, I think you can just put the
brand/series IDs from the Atlas API in some non-standard (or at least
not widely supported) XMLTV attribute.

Hopefully shortly we can get to the point where we provide that data
without an API key/licence, so others can benefit from your good work.
One note: we're about to add more concise IDs, which might make for
the most elegant implementation, but that's just a nice to have, I
think.

Adam Sutton

Jan 27, 2012, 11:26:31 AM
to Atlas
Well, at first glance it looks like the only minor benefit of the PA
data is that we get the brand id, which makes grouping series easier.
However the actual series/episode data is still not much good. Many
bits of info are missing, and lots of it is plain wrong. All of this
would likely kill a simple implementation of series link, though I
guess you could simply attempt to record ALL episodes for a given
brand that appear. However this is itself tricky: do you just record
those shows for which this is the first airing? What if something got
missed? Do you simply record everything? I think my hard drive would
have something to say about that!
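
One possible middle ground, sketched in Python under the assumption that each broadcast has a reliable episode id (which the data may not actually provide): remember what has been recorded and take the earliest showing of anything new. A missed episode is then picked up at its next repeat.

```python
def plan_recordings(broadcasts, already_recorded):
    """Pick the earliest showing of each episode not yet recorded.

    broadcasts: dicts with 'episode_id' and 'start' (sortable).
    already_recorded: set of episode ids recorded on earlier runs.
    """
    chosen = {}
    for b in sorted(broadcasts, key=lambda b: b["start"]):
        ep = b["episode_id"]
        if ep in already_recorded or ep in chosen:
            continue  # seen before, or an earlier showing is already planned
        chosen[ep] = b
    return list(chosen.values())


# Hypothetical schedule: ep2 is shown twice, ep1 was recorded last week.
schedule = [
    {"episode_id": "ep2", "start": 200},
    {"episode_id": "ep1", "start": 100},
    {"episode_id": "ep2", "start": 300},
    {"episode_id": "ep3", "start": 400},
]
plan = plan_recordings(schedule, already_recorded={"ep1"})
```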

Though I'm still trying to get my head around the whole structure and
where to get the data linked from the schedule, etc. But I can
certainly see that there are title/subtitle errors, so I can see many
of the fixups used by the XMLTV grabber would still be required to get
a consistent on-screen output (though this worries me less than the
series info as it's purely aesthetic, unless you have to rely on it as
a fallback).

I did do a little scan around the web and it seems that EPG errors are
one of the most common complaints about PVR systems. The only
exception seems to be Sky; presumably they take a feed from
somewhere and either pay through the nose to make sure it's right or do
lots of in-house sanitizing. So I guess this is something that's
likely to persist in any free EPG data for some time. I know the
community can feed back mistakes, but unless someone is actually doing
automated checking this is unlikely to happen until after the event
(i.e. something fails to record).

I'll keep playing though.

Colin Moorcraft

Jan 27, 2012, 1:30:51 PM
to atla...@googlegroups.com, atla...@googlegroups.com
To get on the Sky platform a broadcaster is contractually obliged, and technically coaxed, to provide user-friendly EPG data.

This is not the case for Freeview. It is odd, to put it mildly, that broadcasters that benefit from our licensing money are not obliged to provide rich, standardised freely sharable and reusable metadata so that as many license payers as possible can discover their programmes. Why they should need to be obliged is another mystery.

- Colin

Sent from my iPad

The Gareth

Jan 27, 2012, 1:34:56 PM
to atla...@googlegroups.com

On 27 Jan 2012, at 18:30, Colin Moorcraft wrote:
> This is not the case for Freeview. It is odd, to put it mildly, that broadcasters that benefit from our licensing money are not obliged to provide rich, standardised freely sharable and reusable metadata so that as many license payers as possible can discover their programmes. Why they should need to be obliged is another mystery.


The only license fee funded broadcaster (BBC, if we ignore S4C which is in transition) has provided detailed metadata via Backstage, and to the likes of Atlas, for many years. I don't see who you're talking about.

Anyway, yeah, the big Freeview players submit series and episode CRIDs, but smaller players don't.

G

Karl Dietz

Jan 27, 2012, 3:26:33 AM
to atla...@googlegroups.com
On 26.01.2012 23:13, Chris Jackson wrote:
> On 26 January 2012 22:06, Adam Sutton<a...@adamsutton.me.uk> wrote:
>> But before I can do that I need to have as close a feature set to that which
>> I currently have or it will not pass the Wife Approval Test!
...

> Based on the discussion so far, I think you can just put the
> brand/series IDs from the Atlas API in some non-standard (or at least
> not widely supported) XMLTV attribute.

After consulting my pillow I agree that this should be the way to go in
the short term.


Maybe as <programme-group-key system="crid">the id</...

Can there be multiple programme-group-keys? I do think so, but they
should be sorted according to importance. So a consumer that only
supports one key per programme will do something meaningful by just
using the first id. (similar to star-rating)


If I understood it correctly that would match well with emitting the
episode id as <episode-num system="crid">the id</... to get a feed that
matches what is broadcast via DVB-EIT so one can switch back and forth.

There can be additional <episode-num system="xmltv_ns">number</...
elements for backwards compatibility for e.g. MythTV to make up ids from
the program_type + title + episode_num.
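
A short Python sketch of what a grabber emitting this might look like. Note that programme-group-key is only the element proposed in this message and is not part of the XMLTV DTD, so a validating parser would reject it; the channel id and CRID values below are placeholders.

```python
import xml.etree.ElementTree as ET


def build_programme(channel, start, title, group_keys, episode_crid, xmltv_ns):
    """Build one <programme> element carrying group keys and episode ids.

    group_keys: list of (system, value), most important first, so a
    consumer that supports only one key can simply take the first.
    """
    prog = ET.Element("programme", {"start": start, "channel": channel})
    ET.SubElement(prog, "title").text = title
    for system, value in group_keys:
        # Proposed, non-standard element
        key = ET.SubElement(prog, "programme-group-key", {"system": system})
        key.text = value
    ET.SubElement(prog, "episode-num", {"system": "crid"}).text = episode_crid
    # Kept for consumers that only understand xmltv_ns
    ET.SubElement(prog, "episode-num", {"system": "xmltv_ns"}).text = xmltv_ns
    return prog


prog = build_programme(
    "bbcone.example", "20120127200000 +0000", "Merlin",
    [("crid", "crid://example.invalid/series/1")],
    "crid://example.invalid/episode/9", "3.9.",
)
xml_bytes = ET.tostring(prog)
```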

Regards,
Karl

PS: here's how MythTV makes up the episode key
https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythfilldatabase/xmltvparser.cpp#L520

Colin Moorcraft

Jan 27, 2012, 2:05:16 PM
to atla...@googlegroups.com, atla...@googlegroups.com
The non-BBC data in Freeview EPGs is far from brilliant. A Freeview-wide Backstage would be a big step forward. This is not a technical issue - it is a governance issue.

- Colin

Sent from my iPad

Chris Jackson

Jan 27, 2012, 2:18:44 PM
to atla...@googlegroups.com
On 27 January 2012 16:26, Adam Sutton <a...@adamsutton.me.uk> wrote:
> Well at first glance it looks like the only minor benefit to the PA
> data is that we get the brand id which makes grouping series easier.
> However the actual series/episode data is still not much good.

I'm surprised to keep hearing of these issues on the group, since this
data is used by RadioTimes across their site. We know that minor shows
and minor channels have bad metadata. But at least the major shows on
major channels should be in good shape. If that isn't the case we
would really appreciate some sample bug reports at
http://issues.atlas.metabroadcast.com/. I know PA and others are keen
to help.

Chris Jackson

Jan 27, 2012, 2:28:13 PM
to atla...@googlegroups.com
On 27 January 2012 19:05, Colin Moorcraft <colin.m...@gmail.com> wrote:
> The non-BBC data in Freeview EPGs is far from brilliant. A Freeview-wide Backstage would be a big step forward. This is not a technical issue - it is a governance issue.

Agreed. But the right technology can help to smooth governance
problems. This is our aim with Atlas.

Having spent years fighting governance battles as staff members within
broadcasters, we're not interested in discussing those here, and we
have stopped expecting a rapid change. If the industry multilaterally
agrees to improve/open up metadata we'd be over the moon. But
meanwhile we do what we can, on a unilateral basis, to work within the
status-quo.

Most centrally compiled listings databases (such as PA and RedBee)
suffer from a lack of resource to get all data perfect. It's just too
much work to curate perfect metadata across 500+ channels.

In our experience, the best metadata is within the databases of the
broadcasters. We've got access to almost all of them within Atlas. If
people want access we can often help to arrange it. It is our aim to
make access available by default to people working on personal /
non-commercial projects.

Let us know how we can help. If the PA data is not suitable for a
particular application, it might be that the broadcaster data is a
better option, at least for major channels.

Colin Moorcraft

Jan 27, 2012, 2:36:01 PM
to atla...@googlegroups.com, atla...@googlegroups.com
I sympathise with your earlier governance struggles, and note your polite hint not to bring them up in this forum. In future my comments will be strictly technical ...

- Colin

Sent from my iPad

Adam Sutton

Jan 27, 2012, 3:06:13 PM
to atla...@googlegroups.com
Whoa, guess I might have stirred up a hornet's nest :)

I think I might have been a little hasty in my previous statements. The presence of the brand identifier to help group episodes likely to be part of a show does at least simplify the search space. However the specific logic of how to record the next show in the series (starting from a user-defined position, i.e. click on EPG entry and say series link) is still not trivial.

One of the main issues is not so much the lack of data, it's more the randomness of it. The PA data certainly has many shows with some very strange data (I'm really only looking at BBC One at the moment, to limit the amount to trawl through, though it's probably not the best place to start as it's likely to be better than others). For example: series x with episodes 1, 2, X, 4, 5. It's easy enough for a human to know that X (missing) is meant to be 3. But that's a simple error; some examples are far stranger, with seemingly random series/episode numbering that bears no relation to reality.
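
The simple case can at least be patched mechanically. A minimal Python sketch, assuming missing numbers arrive as None and the episodes are already in broadcast order (the function name is made up):

```python
def fill_single_gaps(numbers):
    """Fill a lone missing episode number when its neighbours are n and n+2.

    Anything more ambiguous (two gaps in a row, numbers going
    backwards) is left untouched rather than guessed at.
    """
    fixed = list(numbers)
    for i in range(1, len(fixed) - 1):
        prev, cur, nxt = fixed[i - 1], fixed[i], fixed[i + 1]
        if cur is None and prev is not None and nxt is not None \
                and nxt - prev == 2:
            fixed[i] = prev + 1
    return fixed


repaired = fill_single_gaps([1, 2, None, 4, 5])
```

This does nothing for the stranger cases of seemingly random numbering, which is where another strategy would be needed.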

I could list specifics but there are too many to mention at this point. I think that if I'm going to use the PA data then another strategy other than using the series/episode info is likely to be required. What that is I'm not sure.

How many of the original broadcaster feeds are available outside of the big 4 free to air channels?

Chris Jackson

Jan 27, 2012, 3:28:30 PM
to atla...@googlegroups.com
On 27 January 2012 19:36, Colin Moorcraft <colin.m...@gmail.com> wrote:
> I sympathise with your earlier governance struggles, and note your polite hint not to bring them up in this forum. In future my comments will be strictly technical ...

Thanks, Colin. We appreciate your contribution.

Anyone interested in the wider debate is very welcome to join us in
the pub/similar at some point (see calendar on
http://metabroadcast.com). We're really trying to keep this list to
topics of general interest around how we can provide an excellent,
free data service for all.

Chris Jackson

Jan 27, 2012, 3:42:41 PM
to atla...@googlegroups.com
On 27 January 2012 20:06, Adam Sutton <a...@adamsutton.me.uk> wrote:
> Whoa, guess I might have stirred up a hornets nest :)

No worries, Adam. Good to have a discussion - we're just keen to keep
it to practical things we can do to make a difference. We think there
are quite a few of these.

> I could list specifics but there are too many to mention at this point. I
> think that if I'm going to use the PA data then another strategy other than
> using the series/episode info is likely to be required. What that is I'm not
> sure.

It would be helpful to have a few specifics listed in the issue
tracker, then we can investigate if the class of problem can be fixed
at source.

However, I think you're right that a different approach is needed to
do series linking reliably across more than just the main channels.
Maybe others can suggest?

> How many of the original broadcaster feeds are available outside of the big
> 4 free to air channels?

We currently have all data from the horse's mouth (so to speak) on all
channels from the big four terrestrial broadcasters. That covers the
vast majority of viewing on Freeview. We are constantly adding other
sources. It might be worth you doing a small examination of the BBC
data, as an alternative to PA.

Adam Sutton

Jan 27, 2012, 4:07:59 PM
to atla...@googlegroups.com
I'm trying to compare the differences now. I assume for this I just specify publisher=bbc.co.uk?

Also, I think you mentioned the overlaying of data from multiple publishers in one of our earlier discussions; I thought this was something that was happening within your system? But if I specify both PA and BBC (publisher=pressassociation.com,bbc.co.uk) I get two records per show and no obvious way to combine them, other than based on aired time?

Also I asked in another thread about how to get the channel list (so I can do more checking later), I couldn't seem to find a list either published or an API call. I think I saw something in the Atlas source, but it wouldn't be ideal to have to keep my own hardcoded (self published) list of channels. I'm sure I must be missing something.

I'll try and publish some specific examples when I've got some better scripts for doing some comparisons.

Finally, is there a more complete API definition somewhere? The website is very limited and the info I get back, especially for bbc.co.uk, appears to need augmenting with additional info (presumably based on the contents that are returned). For example, the show title doesn't appear in the data if I specify bbc.co.uk; the title field appears to contain a sub-title, date or episode number.

Chris Jackson

Jan 27, 2012, 4:29:18 PM
to atla...@googlegroups.com
On 27 January 2012 21:07, Adam Sutton <a...@adamsutton.me.uk> wrote:
> I'm trying to compare the differences now. I assume for this I just specify
> publisher=bbc.co.uk?

Yes

> Also I think you mentioned in one of our earlier discussions about
> overlaying of data from multiple publishers, I thought this was something
> that was happening within your system? But if I specify both pa and bbc
> (publisher=pressassociation.com,bbc.co.uk) I get two records per show and no
> obvious way to combine, other than based on aired time?

We don't combine in the schedule API, since most users want to show a
schedule from a single supplier (who presumably makes sure it is
complete and non-overlapping). However, we will be adding an ID in the
next ~month, which would allow you to match equivalent items.

Data is overlaid from the content endpoint. This is an option we can
turn on for your key. Essentially it selects a precedence order for
publishers (i.e. sources). If two or more sources are available, the
lower priority sources just augment the data from the higher priority
sources. We actually have an admin interface for this in the codebase,
but it is being tested at the moment, and has not hit production.
Meanwhile, let us know if you want this set.
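
For anyone wanting to approximate this client-side in the meantime, the precedence idea can be sketched in a few lines of Python. This is an illustration of the principle only, not the Atlas implementation; the record fields are invented.

```python
def overlay(records, precedence):
    """Merge per-publisher records, best source first.

    records: publisher -> dict of fields for the same item.
    precedence: list of publishers, highest priority first.
    Lower-priority sources only fill fields that are still missing.
    """
    merged = {}
    for publisher in precedence:
        for key, value in records.get(publisher, {}).items():
            if value is not None and key not in merged:
                merged[key] = value
    return merged


item = overlay(
    {
        "bbc.co.uk": {"title": "Episode 9", "image": None},
        "pressassociation.com": {"title": "Ep 9", "image": "pa.jpg"},
    },
    precedence=["bbc.co.uk", "pressassociation.com"],
)
```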

> Also I asked in another thread about how to get the channel list (so I can
> do more checking later), I couldn't seem to find a list either published or
> an API call. I think I saw something in the Atlas source, but it wouldn't be
> ideal to have to keep my own hardcoded (self published) list of channels.
> I'm sure I must be missing something.

We will answer on the other thread shortly.

> I'll try and publish some specific examples when I've got some better
> scripts for doing some comparisons.
>
> Finally is there a more complete API definition somewhere? The website is
> very limited and the info I get back, especially for bbc.co.uk, appears to
> need augmenting with additional info (presumably based on the contents that
> are returned). I'm thinking show title for example, doesn't appear in data
> if I specify bbc.co.uk, the title field appears to contain a sub-title, date
> or episode number.

We have code (again, not on production yet) to annotate the core data
with all sorts of extra info, including the brand (or show) level info
into the call. For now, if you need brand info, you need to make a
content call for the brand, eg:

http://atlas.metabroadcast.com/3.0/content.json?uri=http%3A%2F%2Fwww.bbc.co.uk%2Fprogrammes%2Fb006mvhd
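
Building that URL from a brand URI only needs standard percent-encoding. A small Python sketch (the helper name is made up, and it only constructs the URL, it does not call the service):

```python
from urllib.parse import urlencode

ATLAS_CONTENT = "http://atlas.metabroadcast.com/3.0/content.json"


def content_url(uri):
    """Return the content call URL for a brand/item URI."""
    # urlencode percent-encodes ':' and '/' in the value
    return ATLAS_CONTENT + "?" + urlencode({"uri": uri})


url = content_url("http://www.bbc.co.uk/programmes/b006mvhd")
```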

The website is the documentation. We have tried to make the API
somewhat self-explanatory, especially by using the API explorer and
the links to it throughout the website. We know we need to improve
both documentation and the explorer. Meanwhile, ask away here. It
really helps us because we learn how to structure the documents to
answer common questions. Thanks for your patience!

Adam Sutton

Jan 27, 2012, 5:44:58 PM
to atla...@googlegroups.com
Ta Chris,

I'll take a look at content.json; I had somehow overlooked that. I definitely think that ultimately collating the full set of info into a single query might be helpful. However, I can see the benefit of keeping things sparse, since I would be able to cache the brand info, as I guess the basic info isn't likely to change very often.

I'll assume I need to stick to a single provider at the moment. Though that leads me on to another question: is there anything in your system that provides the concept of a preferred supplier per channel? For example, I'm going to be better off using bbc.co.uk for BBC content (series info looks better on the face of it), but presumably this isn't much good for Channel 4. So I'd need to keep some mapping of which provider is best for each. Maybe something to include in the channels API?

One thing that was mentioned somewhere in one of the many threads we've now had was the concept of detecting updates. This would be very beneficial in the API: some way to check the last time the data associated with a particular query (brand info, channels list, schedule info, etc.) changed.

One of the main issues with the existing XMLTV RT grabber is that it's very monolithic in its processing. I.e. get the current 14 days of data for all required channels and then process as required. Even though the likelihood is very little has changed in the first 13 days (that you've presumably already processed), or certainly in the first 2-3 where presumably the schedule is already pretty sorted. Thus wasting a lot of processing. This is particularly problematic with the existing XMLTV RT grabber due to its poor performance (a grab on my PVR machine takes 1 hour for 64 channels) but this is mainly poor coding.
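
Until the API offers some form of change detection, a client can at least avoid reprocessing unchanged data by fingerprinting what it downloads. A minimal sketch follows; the per-channel, per-day key is an assumption about how a grabber might slice its requests, and this saves processing only, not the download itself.

```python
import hashlib
from typing import Dict


def fingerprint(payload: bytes) -> str:
    """Digest of a raw response body."""
    return hashlib.sha256(payload).hexdigest()


def needs_processing(seen: Dict[str, str], key: str, payload: bytes) -> bool:
    """Return True (and remember the digest) if the payload differs from last time.

    `key` is a hypothetical label such as "bbcone/2012-01-28".
    """
    digest = fingerprint(payload)
    if seen.get(key) == digest:
        return False   # identical to the last grab, skip parsing
    seen[key] = digest
    return True
```

Persisting `seen` between runs (e.g. as JSON) would let a daily grab skip every day of schedule that has not moved.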

However that being said not everyone has fast broadband accounts and so reducing network load is also a benefit of a more modular approach.

Anyway consider it an informal feature request :) 

Dave Saville

Jan 28, 2012, 7:24:14 AM
to atla...@googlegroups.com
On Fri, 27 Jan 2012 22:44:58 +0000 Adam Sutton wrote:
<snip>

>schedule is already pretty sorted. Thus wasting a lot of processing. This
>is particular problematic with the existing XMLTV RT grabber due to its
>poor performance (a grab on my PVR machine takes 1 hour for 64 channels)
>but this is mainly poor coding.

Hi Adam

How are you fetching the RT data? I had a similar problem in that it was
taking about the same length of time as you. Turned out that I was misusing
HTTP 1.1 to do the GET. Investigation showed it would download about 180K
of the data and then sit around for over five minutes before the rest would
appear.

It's a difference between HTTP 1.0 and 1.1 Where the latter defaults to
keep the connection open. What I needed to do was either fall back to HTTP
1.0 or add "Connection: close\r\n" to the HTTP 1.1 GET. Effectively I was
waiting for the server to time out every fetch, hence the long process
time. It now does 20 odd channels in seconds. Thanks due to a guy on the
perl list for pointing this out.
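
The same fix sketched in Python with a raw socket, for illustration only; a higher-level HTTP library would normally handle this for you. The function names are made up for the example. The point is the "Connection: close" header, which makes the server close the connection after the response so the read loop ends promptly.

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Build an HTTP/1.1 GET that asks the server to close the connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",   # without this, HTTP/1.1 keeps the socket open
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 30.0) -> bytes:
    """Send the request and read until the server closes the socket."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(65536)
            if not chunk:      # empty read means the server has closed
                break
            chunks.append(chunk)
    return b"".join(chunks)    # raw response, status line and headers included
```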

HTH

--
Kind regards

Dave Saville

Adam Sutton

Jan 28, 2012, 8:25:17 AM
to atla...@googlegroups.com
Hi Dave,

It's nothing to do with network speed; it's a fundamental problem with the code of the grabber, and I've pointed it out to the developers. It was CPU bound (100%) for 1 hour on an 800MHz machine, so it didn't take me long to realise that something was wrong. Network fetching time is negligible (seconds).

As a comparison, my own version of the script (written in python) takes about 150s to complete the same operation. If you're interested in the specifics, they're detailed on the xmltv mailing list.

But that's all a bit off topic :)

Regards
Adam

Chris Jackson

Jan 28, 2012, 2:00:19 PM
to atla...@googlegroups.com
On 28 January 2012 12:24, Dave Saville <da...@deezee.org> wrote:
> It's a difference between HTTP 1.0 and 1.1 Where the latter defaults to keep
> the connection open. What I needed to do was either fall back to HTTP 1.0 or
> add "Connection: close\r\n" to the HTTP 1.1 GET. Effectively I was waiting
> for the server to time out every fetch, hence the long process time. It now
> does 20 odd channels in seconds. Thanks due to a guy on the perl list for
> pointing this out.

Appreciate this isn't relevant to Adam's issue, but if anyone else is
still having issues downloading data it might be useful to know that
the RT feed is served directly from an Amazon S3 bucket. There's quite
a bit of discussion out there on the ins and outs of downloading
efficiently from S3 with various HTTP clients.

Adam Sutton

Jan 28, 2012, 2:08:11 PM
to atla...@googlegroups.com
I did a little bit of playing with the BBC data last night and you were definitely right Chris, the series info is much better and I think it would be feasible to use this for a fairly reliable series link. Might still need some fall back modes.

One area where it's definitely a bit "iffy" is cbeebies (probably gets watched more than any other in our house these days :) ). Though I think that is because they aren't really bothering to schedule complete series, kids don't care! So you can sometimes get a fairly random set. Interestingly this doesn't seem to upset the Sky+ box, so I'm guessing it's merely doing brand + fuzzy time. Might make an interesting test case to see how my Sky box series links some of these series, might learn something :)

One of the interesting cases is how it might handle "missed" episodes. I know some stuff (mythtv) has some excellent recording features which mean it will pick up missed shows on +1 channels, repeats, other channels etc... Not something I'm bothered about at the moment but might give some interesting insights for future.

Did you have any feedback on how to obtain the preferred publisher for given channels?

Adam

Chris Jackson

Jan 28, 2012, 2:27:35 PM
to atla...@googlegroups.com
On 28 January 2012 19:08, Adam Sutton <a...@adamsutton.me.uk> wrote:

> One area where its definitely a bit "iffy" is cbeebies (probably gets
> watched more than any other in our house these days :) ). Though I think
> that is because they aren't really bothering to schedule complete series,
> kids don't care! So you can sometimes get a fairly random set.

Which bit of the data is iffy? My guess is that you're right it's just
a random set of episodes, so the data is good and the scheduling is
abnormal by the standards of non-kids channels.

> Interestingly
> this doesn't seem to upset the Sky+ box, so I'm guessing its merely doing
> brand + fuzzy time. Might make an interesting test case to see how my Sky
> box series links some of these series, might learn something :)

Let us know what you find out. I would be quite surprised if the data
Sky has from the BBC is fundamentally better than ours, so we all
might all learn something!

> Did you have any feedback on how to obtain the preferred publisher for given
> channels?

My initial assumption here was that this is a per application/user
choice, so should probably come from a source other than Atlas. I
suppose we could try to follow the same rules as for precedence on the
content endpoint, but on a per-channel basis. But users will still
differ in their requirements (consistency from a single source for all
channels, vs best source on a per-channel basis). Also, we have to
think carefully that there are no gotchas - 1st priority always needs
to be delivering a contiguous schedule.

Adam Sutton

Jan 28, 2012, 2:35:43 PM
to atla...@googlegroups.com
On 28 January 2012 19:27, Chris Jackson <ch...@metabroadcast.com> wrote:
> On 28 January 2012 19:08, Adam Sutton <a...@adamsutton.me.uk> wrote:
>
>> One area where its definitely a bit "iffy" is cbeebies (probably gets
>> watched more than any other in our house these days :) ). Though I think
>> that is because they aren't really bothering to schedule complete series,
>> kids don't care! So you can sometimes get a fairly random set.
>
> Which bit of the data is iffy? My guess is that you're right it's just
> a random set of episodes, so the data is good and the scheduling is
> abnormal by the standards of non-kids channels.

That's all I meant by iffy: not necessarily a fault of the data, just strange scheduling. Just from the point of view of series link, it throws up the possibility that some channels/series may need alternative approaches based on simply recording brand+time?


>> Interestingly
>> this doesn't seem to upset the Sky+ box, so I'm guessing its merely doing
>> brand + fuzzy time. Might make an interesting test case to see how my Sky
>> box series links some of these series, might learn something :)
>
> Let us know what you find out. I would be quite surprised if the data
> Sky has from the BBC is fundamentally better than ours, so we all
> might all learn something!

Yeah I agree, I imagine the data is identical. It will just be interesting to learn how a Sky box treats this non-linear set of episodes and whether it's simply recording a show from the same brand at around the same time for these channels. Difficult to get a full understanding from reverse engineering without finding some anomalous data in the first place.
 

>> Did you have any feedback on how to obtain the preferred publisher for given
>> channels?
>
> My initial assumption here was that this is a per application/user
> choice, so should probably come from a source other than Atlas. I
> suppose we could try to follow the same rules as for precedence on the
> content endpoint, but on a per-channel basis. But users will still
> differ in their requirements (consistency from a single source for all
> channels, vs best source on a per-channel basis). Also, we have to
> think carefully that there are no gotchas - 1st priority always needs
> to be delivering a contiguous schedule.

Fair enough. I'll assume that for now I need to map these myself. Presumably BBC, ITV, Channel4 and Five will be best for their channel sets and PA as a fallback for the other channels?
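
A client-side mapping along those lines might look like the sketch below. Only bbc.co.uk appears as a publisher key earlier in this thread; every other publisher key and all the channel names here are placeholders that would need checking against what Atlas actually returns.

```python
from typing import Dict, Optional

# Placeholder key for the PA fallback; not confirmed against the API.
FALLBACK_PUBLISHER = "pressassociation.com"

# Hypothetical channel -> preferred publisher table, maintained by the client.
PREFERRED_PUBLISHER: Dict[str, str] = {
    "bbcone": "bbc.co.uk",
    "bbctwo": "bbc.co.uk",
    "cbeebies": "bbc.co.uk",
    "itv1": "itv.com",
    "channel4": "channel4.com",
    "five": "five.tv",
}


def publisher_for(channel: str, single_source: Optional[str] = None) -> str:
    """Pick the publisher to query for a channel.

    Passing `single_source` forces one publisher for every channel, for
    users who prefer consistency over per-channel quality.
    """
    if single_source is not None:
        return single_source
    return PREFERRED_PUBLISHER.get(channel, FALLBACK_PUBLISHER)
```

Keeping the table in the client matches the point that the choice is per application/user.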

Do any of these publishers (PA excluded) require licensing to access them? I know ITV are a bit of an arse for this?

mike

Jan 29, 2012, 3:30:14 PM
to Atlas
Hi,

On Jan 27, 7:18 pm, Chris Jackson <ch...@metabroadcast.com> wrote:
> I'm surprised to keep hearing of these issues on the group, since this
> data is used by RadioTimes across their site. We know that minor shows
> and minor channels have bad metadata. But at least the major shows on
> major channels should be in good shape. If that isn't the case we
> would really appreciate some sample bug reports at
> http://issues.atlas.metabroadcast.com/. I know PA and others are keen
> to help.

Ok, as for major channels with all my licence fee money, I'll single
out overnight BBC Parliament, and CBeebies listings as being
particularly bad (in the sense that long-standing errors persist).
It's my turn to be surprised - that the PA isn't checking, otherwise
they would have noticed the problems :-)

I will start adding reports, and many thanks for providing the
opportunity and mechanism to create them.

Karl Dietz

Jan 30, 2012, 3:44:22 AM
to atla...@googlegroups.com
>>> One area where its definitely a bit "iffy" is cbeebies (probably gets
>>> watched more than any other in our house these days :) ). Though I think
>>> that is because they aren't really bothering to schedule complete series,
>>> kids don't care! So you can sometimes get a fairly random set.
>
>> Which bit of the data is iffy? My guess is that you're right it's just
>> a random set of episodes, so the data is good and the scheduling is
>> abnormal by the standards of non-kids channels.
>
> That's all I meant be iffy, not necessarily a fault of the data just
> strange scheduling. Just from the point of view of series link it throws
> up the possibility that some channels/series may need alternative
> approaches based on simply record brand+time?

I think you are mixing up series link with duplicate matching. You
could look at MythTV where the xmltv grabber's fixups plus duplicate
matching do a good job.
After all, it's two separate concepts to find all programmes that belong
to a series and to filter out upcoming episodes that you have already
viewed/recorded or are planning to view/record at another airing.

I don't think a random order of repeats is uncommon. Just think of all
the documentary series. I see all kinds of repeats that match topics
that are relevant at the moment regardless of the original airing order.
(one prominent example for such an event would be Christmas :)

Btw, we watch some foreign series in more or less random order over
here, so it's not restricted to documentaries and kids stuff (e.g.
Midsomer Murders, of which a random set of "good" episodes got dubbed and
once it had proven to be a good fit for our market all the other
episodes from the beginning got filled in as "new in Germany". We are
just through season 11 regularly and the backfill has caught up to
season 7, but MythTV does a good job on eliminating duplicates)

> Yeah I agree I imagine the data is identical, it will just be
> interesting to learn how a sky box treats this non-linear set of
> episodes and whether its simply recording a show from the same brand at
> around the same time for these channels. Difficult to get a full
> understanding from reverse engineering without finding some anomalous
> data in the first place.

You could avoid finding anomalous data and simply look at how MythTV
handles it :-)

>>> Did you have any feedback on how to obtain the preferred publisher for given
>>> channels?
>
>> My initial assumption here was that this is a per application/user
>> choice, so should probably come from a source other than Atlas. I
>> suppose we could try to follow the same rules as for precedence on the
>> content endpoint, but on a per-channel basis. But users will still
>> differ in their requirements (consistency from a single source for all
>> channels, vs best source on a per-channel basis). Also, we have to
>> think carefully that there are no gotchas - 1st priority always needs
>> to be delivering a contiguous schedule.
>
> Fair enough. I'll assume that for now I need to map these myself.
> Presumably BBC, ITV, Channel4 and Five will be best for their channel
> sets and PA as a fallback for the other channels?

I agree that it would be good to have one edited channel list that can
be used by consumers. At my site I try to match the broadcasters
schedule to shared movie/series metadata to get accuracy and
consistency.

> Do any of these publishers (PA excluded) require licensing to access
> them? I know ITV are a bit of an arse for this?

Oh the joy of licensing... We have that over here, too. Do you keep
track of tainted schedule and tainted extended metadata separately?

Over here
* some channels don't want their schedule anywhere near a recording
machine while
* others simply want to charge for the pictures and texts, yet
* others only restrict your server location to relevant jurisdictions
for their pictures and texts.
* All do agree that the extended data is only to be used to promote
their upcoming schedule, which makes it hard to return one result set
for a series that is airing on multiple channels at the same time.

That's another reason why I'm looking forward to a shared meta database.
To simplify the various restrictions by simply not using the additional
metadata if I can avoid it.

Regards,
Karl

PS: Sorry for beating the shared meta database horse so much, I just
think it deserves it :)

Adam Sutton

Jan 31, 2012, 4:50:53 AM
to atla...@googlegroups.com
Hi Karl,

Thanks for the input, I've certainly heard good things about mythtv as a backend, especially in relation to its series linking (and associated recording) support.

However I believe that the first port of call is in looking at the source data to see if that can make things much easier to begin with.

Having looked at the atlas data in more detail, I'm beginning to get a better feel for things and realising that it does at least give some really useful clues as to "what" should likely be recorded.

However I still think I'd like to understand more how this maps to what the user "expects" to be recorded. These can be two very different things.

Specific comments below :)

Duplicate elimination is a trivial issue if the original source data is good. Though I accept this is not always going to be the case and therefore duplicate elimination is a useful thing to have. I've not looked in detail at how this is achieved on either tvheadend OR mythtv at the moment. I know tvheadend certainly tries to not record duplicates, but I don't know whether this works across channels. Something I believe mythtv copes with.

Just to clarify my first sentence. The data provided by atlas, at least with the data I've been using, includes a unique identifier for each episode. Therefore duplicate removal is simply a case of not recording two copies of the same episode id.
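
As a sketch of what that looks like in practice, assuming each broadcast carries an episode id, a channel and a start time (the field names here are illustrative, not Atlas's):

```python
from typing import Iterable, List, NamedTuple, Set


class Broadcast(NamedTuple):
    episode_id: str   # unique per episode, shared by all its airings
    channel: str
    start: str        # ISO 8601 in one timezone, so string order is time order


def plan_recordings(broadcasts: Iterable[Broadcast],
                    already_recorded: Set[str]) -> List[Broadcast]:
    """Keep the earliest airing of each episode that isn't already recorded."""
    seen = set(already_recorded)
    planned = []
    for b in sorted(broadcasts, key=lambda b: b.start):
        if b.episode_id in seen:
            continue          # repeat, +1 airing, or something we already have
        seen.add(b.episode_id)
        planned.append(b)
    return planned
```

Because the id rather than the channel is the key, this works across channels as long as all airings share the same id.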

It's interesting to hear that random airing is common elsewhere. However, it does mean that user input may be required on how to record a series.

If I start series linking a show halfway through the series, so that I don't miss it, I don't particularly want it to fill in "missing" shows that came before the point I started the series link. It's wasted space and use of the signal feed. So that basically means my requirements may be different to yours, or my next door neighbours, etc... So this might be something the PVR frontend/backend need to be aware of, but that's off topic for here :)
 
>> Yeah I agree I imagine the data is identical, it will just be
>> interesting to learn how a sky box treats this non-linear set of
>> episodes and whether its simply recording a show from the same brand at
>> around the same time for these channels. Difficult to get a full
>> understanding from reverse engineering without finding some anomalous
>> data in the first place.
>
> You could avoid finding anomalous data and simply look at how MythTV
> handles it :-)

Indeed. But I'm also interested in how something like a sky+ box handles things. I might hate paying for the sky subscription, especially since I rarely watch anything but free to air channels, but there is one thing I cannot fault and that's the Sky PVR system. And this is echoed by many comments online. It always records what I expect it to (or maybe I've come to expect what it will record ;) ). So I'm very interested to see how it handles these things.

From what I can see atlas breaks things into a 3-tiered setup:

brand
series
episode

The first two can be optional, i.e. films don't typically belong to a brand, and daily shows (soaps, news) don't have a series.

So if I series link something, do I mean tape anything in the associated series (or without a series if it has none). Or do I mean tape everything in the brand?
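
To make the two options concrete, here is a small sketch of the three tiers with both upper levels optional, and a matcher that takes the link scope as a parameter. The class, field and scope names are my own for illustration and not the Atlas schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    episode_id: str
    brand_id: Optional[str] = None    # films typically have none
    series_id: Optional[str] = None   # soaps and news typically have none


def matches_link(item: Item, linked: Item, scope: str = "series") -> bool:
    """Would `item` be recorded by a link created from `linked`?

    scope="series": same brand and same series (or both without a series).
    scope="brand":  anything in the same brand.
    """
    if linked.brand_id is None and linked.series_id is None:
        # A one-off such as a film: only the item itself matches.
        return item.episode_id == linked.episode_id
    if item.brand_id != linked.brand_id:
        return False
    if scope == "brand" and linked.brand_id is not None:
        return True
    return item.series_id == linked.series_id
```

Which scope is the default is exactly the user-expectation question; the data supports either.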

Though this still leaves open the earlier questions of what to do about "missing" episodes I don't really want as they've already been aired (or I've removed them to save space?). How long does the series link persist? Does it depend on the type of series I'm linking?
 


>>>> Did you have any feedback on how to obtain the preferred publisher for given
>>>> channels?
>>>
>>> My initial assumption here was that this is a per application/user
>>> choice, so should probably come from a source other than Atlas. I
>>> suppose we could try to follow the same rules as for precedence on the
>>> content endpoint, but on a per-channel basis. But users will still
>>> differ in their requirements (consistency from a single source for all
>>> channels, vs best source on a per-channel basis). Also, we have to
>>> think carefully that there are no gotchas - 1st priority always needs
>>> to be delivering a contiguous schedule.
>>
>> Fair enough. I'll assume that for now I need to map these myself.
>> Presumably BBC, ITV, Channel4 and Five will be best for their channel
>> sets and PA as a fallback for the other channels?
>
> I agree that it would be good to have one edited channel list that can
> be used by consumers. At my site I try to match the broadcasters
> schedule to shared movie/series metadata to get accuracy and
> consistency.

Personally I think this is an area in which the atlas system can add real value. Aggregating and matching the various input feeds and providing a consistent output feed. Though what is output may need to be limited on a key based access (as it currently is).

To some extent this is already happening, but it's a little too "in your face", requiring the client to still do some of the overlaying and matching.
 

>> Do any of these publishers (PA excluded) require licensing to access
>> them? I know ITV are a bit of an arse for this?
>
> Oh the joy of licensing... We have that over here, too. Do you keep
> track of tainted schedule and tainted extended metadata separately?
>
> Over here
> * some channels don't want their schedule anywhere near a recording
>  machine while
> * others simply want to charge for the pictures and texts, yet
> * others only restrict your server location to relevant jurisdictions
>  for their pictures and texts.
> * All do agree that the extended data is only to be used to promote
>  their upcoming schedule, which makes it hard to return one result set
>  for a series that is airing on multiple channels at the same time.
>
> That's another reason why I'm looking forward to a shared meta database.
> To simplify the various restrictions by simply not using the additional
> metadata if I can avoid it.

Yeah, it's generally a pain in the ass. Why they feel they need to protect this data so much is beyond me; it's not like they're handing over the crown jewels! Hell, all I really care about is getting accurate scheduling information with very basic metadata. Pictures, clips, etc... are nice to have, but far from essential.

One day we'll live in a nice utopian environment where all metadata is free to access in clear and concise formats :) at which point all of our systems will crash due to data overload :)

Adam

P.S.
I'll try and take a look at the mythtv series linking code at some point.

mike

Jan 31, 2012, 3:02:48 PM
to Atlas
On Jan 31, 9:50 am, Adam Sutton <a...@adamsutton.me.uk> wrote:
[...]
> From what I can see atlas breaks things into a 3-tiered setup:
>
> brand
> series
> episode
>
> The first two can be optional, i.e. films don't typically belong to a
> brand, and daily shows (soaps, news) don't have a series.

Soaps really really should be a series, for consistency. From where
I'm sitting with our current data, episode 4367 of Eastenders is being
shown tonight on BBC1. It's just that some series don't have seasons
within them.

A simpler model has only 4 types of broadcast
(a) movie. Careful! Many movies have duplicate titles, sometimes
within the same year. However in these cases, it's never the same
director, so there's your hash right there.
(b) one-off special. Live event, recording of a concert, etc. Need to
be uniquely identified so that your PVR doesn't get duplicate
recordings of each.
(c) Episode of a series, with data so we know which one it is. Title
and/or season+episode numbers will do the job. Unique Id for each
episode is even better.
(d) Episode of a series, with no data so we're reluctantly forced to
just use a Generic synopsis. For a sub-set of these, it genuinely
doesn't matter (eg News). For the vast majority, the data was too poor
and should've been better.
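
For case (a), a sketch of the title-plus-director hash. It relies on the claim above that clashing titles never share a director; the normalisation rules (case-folding, collapsing whitespace) are an assumption about what a feed might vary.

```python
import hashlib


def movie_key(title: str, director: str) -> str:
    """Stable identity for a film built from its title and director."""
    # Lower-case and collapse runs of whitespace so trivial feed
    # differences do not produce different keys.
    parts = (" ".join(p.lower().split()) for p in (title, director))
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()
```

A PVR could store this key alongside each recording and refuse to schedule a film whose key it already holds.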

For some things you have to massage the data to link them more loosely
into a series, such as "formula 1" coverage where there were 3 or 4
"episodes" shown over a race weekend, and the series as a whole is
"formula 1 racing" or somesuch, where the broadcaster won't
necessarily help out by providing a consistent series title.

For some things which genuinely are series, and always used to be
perfectly fine, there's a sub-category where the broadcaster thinks it's
clever to keep shuffling the title around to break PVRs - best example
is "The Money Programme" on BBC2; also see "Panorama" and "Horizon"
which often creep into start/end of the series title for no good
reason.

> So if I series link something, do I mean tape anything in the associated
> series (or without a series if it has none). Or do I mean tape everything
> in the brand?

I sincerely hope you're not "taping" anything in this day and age!
Surely you mean recording? :-)

If you set a series link, you should expect to get 1 copy (with no
duplicates) of every episode of that series from that moment on,
forever.

> Though this still leaves open the earlier questions of what to do about
> "missing" episodes I don't really want as they've already been aired (or
> I've removed them to save space?). How long does the series link persist?

Forever. Any system that loses them between seasons is badly broken.
This includes Sky's Sky+ which in the past didn't even used to be able
to record monthly programmes like The Sky At Night, or Crimewatch
properly. The whole point of a PVR is that you are telling your PVR
you like that programme and you want to see every episode, and you
don't want to have to read the schedules to work out when it's on.
Otherwise you'd have a VCR, a newspaper and a pen.

> Does it depend on the type of series I'm linking?

There are many use cases. Some series need to get renamed. Eg "Film
2011" becomes "Film 2012" in January. It should *not* be up to the PVR
user to spot this and manually intervene and fix the problem by
creating another new series link (Again, see VCR/pen scenario). You re-
use the series and change the title. (on topic - the PA data screwed
this up. Film 2011 was "season 2" - because they'd carried on with the
"Film 2010" relaunched series, and renamed it. But this year, they've
started another one called Film 2012 which is "season 1" again :-( )
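
Where the source data offers no stable id across such a rename, one crude client-side workaround is to key the link on the title with a trailing year stripped. This is only a heuristic sketch and will wrongly merge programmes whose year is a genuine part of the title.

```python
import re

# A trailing four-digit year (19xx or 20xx) preceded by whitespace.
_TRAILING_YEAR = re.compile(r"\s+(?:19|20)\d{2}$")


def link_title(title: str) -> str:
    """Normalise a title for series-link matching by dropping a trailing year."""
    return _TRAILING_YEAR.sub("", title.strip())
```

With this, "Film 2011" and "Film 2012" both reduce to "Film", while a title with no trailing year is left alone.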

Mike

Jonathan Tweed

Feb 7, 2012, 1:40:18 PM
to atla...@googlegroups.com
On Monday, 30 January 2012 at 08:44, Karl Dietz wrote:
> PS: Sorry for beating the shared meta database horse so much, I just
> think it deserves it :)


You don't apologise for that on here ;)

We're more than happy to include data from any broadcaster anywhere in the world in Atlas - in fact we'd love to have it. We'd love to see Atlas become more and more like a MusicBrainz for tv - a place that has everything tv/radio/film related.

Here's me beating my horse a bit now:

Atlas is open source, so anyone who has data is welcome and encouraged to write an adapter to ingest that data into Atlas. We'll provide any support you need to get it up and running for development.

Cheers
Jonathan

Jonathan Tweed

Feb 7, 2012, 1:59:51 PM
to atla...@googlegroups.com
On Tuesday, 31 January 2012 at 20:02, mike wrote:
> On Jan 31, 9:50 am, Adam Sutton <a...@adamsutton.me.uk> wrote:
> [...]
> > From what I can see atlas breaks things into a 3-tiered setup:
> >
> > brand
> > series
> > episode
> >
> > The first two can be optional, i.e. films don't typically belong to a
> > brand, and daily shows (soaps, news) don't have a series.
>
>
>
> Soaps really really should be a series, for consistency. From where
> I'm sitting with our current data, episode 4367 of Eastenders is being
> shown tonight on BBC1. It's just that some series don't have seasons
> within them.


Careful with language: EastEnders is as you want it to be ;)

UK:US
brand:series/show
series:season

Seasons are something else entirely in the UK… (they're used for themed sets of broadcasts, e.g. the recent BBC Four American season).

So there is an EastEnders brand in Atlas (well, two - one from PA and one from the BBC).

> A simpler model has only 4 types of broadcast
> (a) movie. Careful! Many movies have duplicate titles, sometimes
> within the same year. However in these cases, it's never the same
> director, so there's your hash right there.
> (b) one-off special. Live event, recording of a concert, etc. Need to
> be uniquely identified so that your PVR doesn't get duplicate
> recordings of each.
> (c) Episode of a series, with data so we know which one it is. Title
> and/or season+episode numbers will do the job. Unique Id for each
> episode is even better.
> (d) Episode of a series, with no data so we're reluctantly forced to
> just use a Generic synopsis. For a sub-set of these, it genuinely
> doesn't matter (eg News). For the vast majority, the data was too poor
> and should've been better.


s/broadcast/item/ and you're mostly describing exactly what we've got.

> For some things you have massage the data to link them more loosely
> into a series, such as "formula 1" coverage where there were 3 or 4
> "episodes" shown over a race weekend, and the series as a whole is
> "formula 1 racing" or somesuch, where the broadcaster won't
> necessarily help out by providing a consistent series title.
>
> For some things which genuinely are series, and always used to be
> perfectly fine, there's a sub-category where the broadcast thinks it's
> clever to keep shuffling the title around to break PVRs - best example
> is "The Money Programme" on BBC2; also see "Panorama" and "Horizon"
> which often creep into start/end of the series title for no good
> reason.


Ah yes, the thing about domain models is that we technical types come up with them and then tv types seem to spend their days coming up with ever more creative ways of saying their programmes must be described ;)

There are legitimate (well, deliberate) examples in the BBC Programmes that have:

brand
  series
    sub-series
      sub-series
        episode

Which I think is probably a bit over the top for most people. Atlas does flatten the BBC data slightly when it goes this far. We do fully support all other possible hierarchies though.
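
Purely as an illustration of what flattening a deep hierarchy can mean for a client, here is one possible rule: keep the brand, the top-level series and the episode, and drop the sub-series in between. This is an assumption for the sake of example, not a description of what Atlas actually does.

```python
from typing import List


def flatten(path: List[str]) -> List[str]:
    """Reduce a brand > series > sub-series... > episode path to three levels.

    `path` runs from the outermost container down to the episode. Paths that
    already have three levels or fewer are returned unchanged.
    """
    if len(path) <= 3:
        return list(path)
    return [path[0], path[1], path[-1]]
```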



> > Does it depend on the type of series I'm linking?
>
>
> There are many use cases. Some series need to get renamed. Eg "Film
> 2011" becomes "Film 2012" in January. It should *not* be up to the PVR
> user to spot this and manually intervene and fix the problem by
> creating another new series link (Again, see VCR/pen scenario). You re-
> use the series and change the title. (on topic - the PA data screwed
> this up. Film 2011 was "season 2" - because they'd carried on with the
> "Film 2010" relaunched series, and renamed it. But this year, they've
> started another one called Film 2012 which is "season 1" again :-( )


This can indeed be a problem. The other thing the BBC does is start with:

series
  episode

If they then recommission that programme, they add a brand above the first series and create a second one:

brand
  series
    episode

This is a real problem when you've shared the URL for what is now series 1 everywhere as the URL for the programme…

Sorry, that's all a bit of an aside, but I hope it's useful. In short, the Atlas model is based on a slightly more pragmatic version of the BBC Programmes Ontology, which is in turn based on a data model that was derived from TV-Anytime. So we're good :)

Cheers
Jonathan
