Initial Thoughts on oer.txt

Nathan Yergler

Nov 23, 2010, 6:42:19 PM
to oertxt-wor...@googlegroups.com
I saw oer.txt on Twitter yesterday, and it was brought up later on the
Learning Registry list. I shared my thoughts there, and wanted to
follow up here in case others had responses. Overall oer.txt is an
interesting idea, and I can understand the desire to have a single
place to look for information. That said, I have a few concerns about
it as a general proposal.

Using a well known location for information can be useful as a
bootstrapping mechanism, but is a poor design principle for new
services. It implies that every time you want to describe resources
for a new information domain, you need a new well known location,
potentially duplicating the information in others. Sitemaps overcome
this to a certain extent by "hooking" robots.txt, but even that seems
like a partial solution.
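
For readers unfamiliar with the robots.txt "hook": the Sitemaps
protocol lets a site add a "Sitemap: <url>" line to robots.txt, so a
consumer only has to know one well-known location. A minimal Python
sketch of reading such lines follows. The function name, its "name"
parameter, and the example.com URLs are illustrative assumptions, not
part of any proposal in this thread.

```python
def find_directives(robots_txt, name="Sitemap"):
    """Return the values of every '<name>: <value>' line in a robots.txt body.

    Simplified on purpose: anything after '#' is treated as a comment,
    and field names are matched case-insensitively.
    """
    values = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = line.split(":", 1)  # split only on the first colon
        if field.strip().lower() == name.lower():
            values.append(value.strip())
    return values


# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
"""

print(find_directives(EXAMPLE_ROBOTS))
```

Because the directive name is a parameter, the same loop would work
for any other "Name: URL" line a community agreed to place in
robots.txt.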

oer.txt as currently proposed suffers from a lack of flexibility: the
current proposal doesn't say anything about why the resources linked
are relevant, and for what purpose (i.e., for a particular subject? a
particular education level? Ideally this is published with the
resource, but you can imagine situations where other curators have
differing views). It also doesn't provide any self-description of
who's asserting that they're OER. You could conceivably look at the
domain registration for information about this, but that's further
muddled by the fact that you're [potentially] pointing at other
domains: which registration do you retrieve, and what's the
relationship of one to the other?

Finally, the use of a well known location at the domain level
effectively restricts who can publish these maps/assertions to people
who control domains. In a world of linked data, that seems like an
unnecessary limitation.

It seems like a better model for enabling discovery is using a machine
readable document linked from the resource or root page of a
collection. This could provide simple pointers to services, similar to
oer.txt, but could also provide richer information about the resources
on a site. Ideally this would be in a general format (I'm biased to
RDF, but you can imagine using POWDER or something else) that would
allow you to provide additional information about resources in
addition to labeling them as "educational".

NRY

----
Nathan R. Yergler
Chief Technology Officer
Creative Commons

http://wiki.creativecommons.org/User:Nathan_Yergler

Pierre Far

Nov 24, 2010, 7:29:08 AM
to oertxt-wor...@googlegroups.com
Hi Nathan,

Thank you very much for this. Some of my thoughts are below, including
perhaps the seed of a better proposal.

Before this thread expands too much, I want to be clear on something:
I'm not married to the oer.txt solution and I would consider this
proposal a success if the community talked and reached any solution,
regardless of what that solution is. I get the sense we all agree we
need something like this but that oer.txt as proposed has weaknesses.
Fine, let's do better and kill it :)

By the way, when I say Sitemaps (capitalised) I'm referring to the
Sitemaps protocol and when I say sitemaps (lowercase) it's the sitemap
file.

> Using a well known location for information can be useful as a
> bootstrapping mechanism, but is a poor design principle for new
> services. It implies that every time you want to describe resources
> for a new information domain, you need a new well known location,
> potentially duplicating the information in others. Sitemaps overcome
> this to a certain extent by "hooking" robots.txt, but even that seems
> like a partial solution.

When I was thinking about this, I came up with exactly these two solutions:

1. The "general" one where we can hook into robots.txt exactly like
Sitemaps did. I discounted this idea for two reasons:

a. The Sitemaps protocol had the weight of the top 3 search engines
behind it from the get go. It created instant real demand overnight.
Our small industry doesn't have such heavyweights to create similar
demand.

b. Sitemaps specified an XML format that had to be strictly adhered to
or else it failed. The XML is not really "extensible" as far as I
know: you either strictly follow the Sitemaps schema or it's an
invalid sitemap. I'm pretty sure you can't extend it by introducing
new namespaces. For our purposes, my proposal would have been that we
need yet another Sitemaps-like XML format that we all agree to and it
serves the needs of all current OER publishers. Who's up for leading
that proposal? I'm not! And what are its chances of success given
point (a)? Minimal at best.

2. The "specialised" one: a domain-specific robots.txt-like solution
which is oer.txt. Given my reservations about the general approach,
this won. oer.txt is not about specifying a new format but is a way to
advertise what already exists. It's a business card to learn how to
get in touch.

However, I would love it if we can figure out a good general solution.

> oer.txt as currently proposed suffers from a lack a flexibility: the
> current proposal doesn't say anything about why the resources linked
> are relevant, and for what purpose (ie, for a particular subject? a
> particular education level? Ideally this is published with the
> resource, but you can imagine situations where other curators have
> differing views). It also doesn't provide any self-description of
> who's asserting that they're OER. You could conceivably look at the
> domain registration for information about this, but that's further
> muddled by the fact that you're [potentially] pointing at other
> domains: which registration do you retrieve, and what's the
> relationship of one to the other?

This type of flexibility is intentionally outside the scope of
oer.txt. The way I see oer.txt is that it's a guide. Imagine reaching
a road junction and you see a sign that says "Cambridge" pointing left,
one sign saying "London" pointing right, and one saying "Airport" up
ahead. The signs don't care what Cambridge or London or Airport mean,
but it matters where they are. oer.txt is exactly like that: an OER
road sign that says "there is RSS at this URL and there is an OAI
endpoint at that URL". If you want Cambridge, you turn left and if you
want RSS you go to that URL.

Describing the content is up to the URLs oer.txt points to. For
example, MIT has two main RSS feeds (one for video & audio and one for
text), Stanford has an RSS feed for each course, and you can get
YouTube Edu videos using RSS and the API.

As to who's asserting it's OER, it's the content producer as trusted
by the consumer, which is a bit of a weak point in this proposal.
Content consumers need to trust the website when it says it has OER.
Intuitively, I would trust someuniversity.edu saying it has OER more
than I would trust (say) cnn.com. So there is some responsibility put
on the consumers.

I'd also point out that "lying" about having OER is a form of spam and
if it gets too big then it will be a problem. For a large content
consumer like Google, they have a dedicated web spam team (headed by
Matt Cutts who you might have heard of).

> Finally, the use of a well known location at the domain level
> effectively restricts who can publish these maps/assertions to people
> who control domains. In a world of linked data, that seems like an
> unnecessary limitation.

The restriction to those who run domains is a very valid concern. I
can only point out that I've yet to see an OER website that does not
have a robots.txt file. MIT for example: http://ocw.mit.edu/robots.txt
and Connexions: http://cnx.org/robots.txt. I can't think of one that is
hosted as a subdirectory.

However, you triggered an idea for a solution: how favicons are
handled by browsers. By convention, browsers expect to find a
website's favicon as a file called favicon.ico at the domain's root,
but webmasters can specify an alternate location in the HTML. Perhaps
this is a good analogy to learn from? Suppose that we say oer.txt
should live at the website's root by default, but in the HTML you can
state a different location if you want. This opens up the proposal to
every single OER producer. See http://en.wikipedia.org/wiki/Favicon
for reference.
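
A minimal Python sketch of that favicon-style lookup, using only the
standard library: check the page for a <link> element first, and fall
back to /oer.txt at the site root. The rel value "oer" and the
example.edu URLs are assumptions made for illustration; no such link
relation has been agreed anywhere in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkRelFinder(HTMLParser):
    """Remember the href of the first <link> whose rel contains a given token."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if self.rel in rel_tokens and attr.get("href"):
            self.href = attr["href"]


def locate_oer_txt(page_url, html, rel="oer"):
    """Return the oer.txt URL for a page: explicit <link> first, root default second."""
    finder = LinkRelFinder(rel)  # "oer" is a hypothetical rel value
    finder.feed(html)
    if finder.href:
        return urljoin(page_url, finder.href)
    return urljoin(page_url, "/oer.txt")  # well-known default at the site root


page = "http://example.edu/~prof/course/index.html"
with_link = '<html><head><link rel="oer" href="oer.txt"></head></html>'
print(locate_oer_txt(page, with_link))
print(locate_oer_txt(page, "<html><head></head></html>"))
```

The first call resolves the link relative to the page, so a publisher
who only controls a subdirectory can still advertise a file; the
second falls back to the domain root.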

> It seems like a better model for enabling discovery is using a machine
> readable document linked from the resource or root page of a
> collection. This could provide simple pointers to services, similar to
> oer.txt, but could also provide richer information about the resources
> on a site. Ideally this would be in a general format (I'm biased to
> RDF, but you can imagine using POWDER or something else) that would
> allow you to provide additional information about resources in
> additional to labeling it as "educational".

How would you know the URLs of the roots of collections? We could make
this work if we agree how to automatically identify a collection's
root - perhaps a special sitemap? But we'd still need a way to specify
the URLs of tools like OAI harvesting and search.

This line of thinking forces us to recognize that there are two types
of URLs oer.txt can advertise:
1. A collection-specific URL, like an RSS feed of the course lectures
2. A cross-collection URL, like a search engine or a metadata harvesting API

Should we handle these differently? Perhaps oer.txt is better suited
for (2) but autodiscovery from the root home pages is best for (1).
Thoughts?

Thanks again for the comments!

Pierre

--
Pierre Far, PhD

About me: http://www.pierrefar.com/

NEW! OpenCourseWare Search: http://www.ocwsearch.com/
Webmaster and SEO resources: http://ekstreme.com
Blog of Science: http://blogsci.com

Nathan Yergler

Nov 29, 2010, 2:53:45 PM
to oertxt-wor...@googlegroups.com
Hi Pierre,

Thanks for the detailed reply; comments inline below.

I'm not entirely sure that I see the distinction between your general
and specialized approaches: even though oer.txt does not define a new
format for the resources themselves, it does have a specific format
for describing where to look. It also seems like the absence of a
heavyweight applies equally in any of these scenarios: publishers are
going to wonder what's in it for them, regardless of whether we're
asking them to edit robots.txt, oer.txt, or something else. I don't
have a good answer to the "heavyweight" question, although I do think
that many in the community are willing to experiment and publish
pointers to existing services, if asked.

I think the answer to general vs. specific is describing things using
a general format with domain specific semantics. To draw an analogy to
some of our past work, we pushed development of RDFa at the W3C as a
general solution so we could use it to address domain specific
problems (labeling licenses). This has allowed us to continue to build
on that technology in a very flexible manner.

See below for more concrete suggestions in this vein.

>
>> oer.txt as currently proposed suffers from a lack a flexibility: the
>> current proposal doesn't say anything about why the resources linked
>> are relevant, and for what purpose (ie, for a particular subject? a
>> particular education level? Ideally this is published with the
>> resource, but you can imagine situations where other curators have
>> differing views). It also doesn't provide any self-description of
>> who's asserting that they're OER. You could conceivably look at the
>> domain registration for information about this, but that's further
>> muddled by the fact that you're [potentially] pointing at other
>> domains: which registration do you retrieve, and what's the
>> relationship of one to the other?
>
> This type of flexibility is intentionally outside the scope of
> oer.txt. The way I see oer.txt is that it's a guide. Imagine reaching
> a road junction and you see sign that says "Cambridge" pointing left,
> one sign saying "London" pointing right, and one saying "Airport" up
> ahead. The signs don't care what Cambridge or London or Airport mean,
> but it matters where they are. oer.txt is exactly like that: an OER
> road sign that says "there is RSS at this URL and there is an OAI
> endpoint at that URL". If you want Cambridge, you turn left and if you
> want RSS you go to that URL.

That's useful; I think some explicit setting of scope would be helpful
for any proposal in this space. I also think that a solution that
allows you to [optionally] add information regarding why it's OER,
who's making the statement, etc should be preferred.

>
> Describing the content is up to the URLs oer.txt points to. For
> example, MIT has two RSS main feeds (one for video & audio and for
> text), Stanford has an RSS feed for each course, and you can get
> YouTube Edu videos using RSS and the API.
>
> As to who's asserting it's OER, it's the content producer as trusted
> by the consumer, which is a bit of a weak point in this proposal.
> Content consumers need to trust the website when it says it has OER.
> Intuitively, I would trust someuniversity.edu saying it has OER more
> than I would trust (say) cnn.com. So there is some responsibility put
> on the consumers.

Completely concur that "normal" trust mechanisms apply. I think that's
true regardless of how much information a publisher/curator provides
as background.

>
> I'd also point out that "lying" about having OER is a form of spam and
> if it gets too big then it will be a problem. For a large content
> consumer like Google, they have a dedicated web spam team (headed by
> Matt Cutts who you might have heard of).
>
>> Finally, the use of a well known location at the domain level
>> effectively restricts who can publish these maps/assertions to people
>> who control domains. In a world of linked data, that seems like an
>> unnecessary limitation.
>
> The restrictions to those who run domains is a very valid concern. I
> can only point out that I've yet to see an OER website that does not
> have a robots.txt file. MIT for example: http://ocw.mit.edu/robots.txt
> and Connexions: http://cnx.org/robots.txt. All the ones that I can't
> think of one hosted as a subdirectory.

Right. Not saying that people won't publish to robots.txt, or oer.txt,
just that it'd be nice to enable individuals, etc (at the
"sub-directory level", so to speak) to also publish pointers (I'll
admit this may again be an issue of me not understanding scope
entirely).

>
> However, you triggered an idea for a solution: how favicons are
> handled by browsers. By convention, browsers expect to find a
> website's favicon as a file called favicon.ico at the domain's root,
> but webmasters can specify an alternate location in the HTML. Perhaps
> this is a good analogy to learn from? Suppose that we say oer.txt
> should live at the website's root but if you want, in the HTML you
> state the actual location if you want. This opens up the proposal to
> every single OER producer. See http://en.wikipedia.org/wiki/Favicon
> for reference.

Yes; see below.

>
>> It seems like a better model for enabling discovery is using a machine
>> readable document linked from the resource or root page of a
>> collection. This could provide simple pointers to services, similar to
>> oer.txt, but could also provide richer information about the resources
>> on a site. Ideally this would be in a general format (I'm biased to
>> RDF, but you can imagine using POWDER or something else) that would
>> allow you to provide additional  information about resources in
>> additional to labeling it as "educational".
>
> How would you know the URLs of the roots of collections? We could make
> this work if we agree how to automatically identify a collection's
> root - perhaps a special sitemap? But we'd still need a way to specify
> the URLs of tools like OAI harvesting and search.

I guess "root of collection" isn't quite clear; I meant that instead
of retrieving oer.txt from ocw.mit.edu (for example), you retrieve
http://ocw.mit.edu/ and look for information there (RDFa, POWDER, etc)
that points to specific services (ie, an RSS feed, OAI-PMH endpoint,
etc), or a separate resource that describes a set (preferably in a
similar language). The SIOC (http://sioc-project.org/) services module
defines one way this could be described using has_service and
service_protocol; we use this on CC Network with RDFa in the document
header:

<link about="/" rel="sioc_service:has_service" href="/r/lookup" />
<link about="/r/lookup" rel="sioc_service:service_protocol"
href="http://wiki.creativecommons.org/work-lookup" />

This is similar to the favicon approach, but instead of saying
"default to retrieving /oer.txt", you say "retrieve / and we'll tell
you what to do from there."
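
A rough Python sketch of how a consumer might read those two <link>
elements. This is not a real RDFa processor: it matches the literal
"sioc_service:" prefix instead of resolving namespace declarations,
and the function and class names, plus the example.org base URL, are
assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RdfaLinkCollector(HTMLParser):
    """Collect (about, rel, href) triples from <link> elements."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("about") and attr.get("rel") and attr.get("href"):
            self.triples.append((attr["about"], attr["rel"], attr["href"]))


def discover_services(page_url, html):
    """Map each advertised service URL to its protocol URL (None if not stated)."""
    collector = RdfaLinkCollector()
    collector.feed(html)
    services = {}
    # First pass: every has_service link names a service endpoint.
    for about, rel, href in collector.triples:
        if rel == "sioc_service:has_service":
            services.setdefault(urljoin(page_url, href), None)
    # Second pass: attach the protocol to endpoints found above.
    for about, rel, href in collector.triples:
        if rel == "sioc_service:service_protocol":
            subject = urljoin(page_url, about)
            if subject in services:
                services[subject] = urljoin(page_url, href)
    return services


HEADER = """
<link about="/" rel="sioc_service:has_service" href="/r/lookup" />
<link about="/r/lookup" rel="sioc_service:service_protocol"
      href="http://wiki.creativecommons.org/work-lookup" />
"""

print(discover_services("https://example.org/", HEADER))
```

A production consumer would use a proper RDFa parser so that prefixes
and relative subjects are resolved according to the specification.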

>
> This line of thinking forces us to recognize that there are two types
> of URLs oer.txt can advertise:
> 1. A collection-specific URL, like an RSS feed of the course lectures
> 2. A cross-collection URL, like a search engine or a meta data harvesting API
>
> Should we handle these differently? Perhaps oer.txt is better suited
> for (2) but autodiscovery from the root home pages is best for (1).
> Thoughts?

I'm not convinced the two cases are all that different, and I think
that looking for a solution that accommodates both will help guide the
general v. specialized path.

Thanks for taking the time to put together your draft and helping
everyone think about these issues. I hope the comments above are
helpful and clarify my thoughts.

Best,

Nathan

Steve Midgley

Nov 30, 2010, 5:59:12 PM
to oer.txt Working Group
Hi Nathan and Pierre,

I'll pile on here, hopefully with something useful.

The oer.txt file has the benefit of simplicity, which recommends it. It
seems like you could solve everything that it does with POWDER/RDFa,
but at the expense of complexity.

If I follow Nathan's argument, he's asking for "extensible
simplicity." Simple/entry use-cases get a simple implementation.
Complex requirements get complex implementations, and maybe oer.txt is
missing the ability to get more complex as the problem scope
increases?

I have a big question about this relating to discovery, which Nathan
knows I'm a bit obsessed with right now. With robots.txt, there are
basically three orgs (Y, G and MS) who access them at scale. Who would
access the oer.txt files besides these three? And of those others, who
wouldn't already know what they need to know about the site (meaning
they already harvest stuff from your site)?

Providing some service descriptors via SIOC makes some sense to help
people who already know about your site get oriented to where to get
stuff from you, programmatically.

But oer.txt seems oriented towards crawlers who happen across your
site. My question is, who/how do folks get to that oer.txt file to
begin with? If the answer is the same as with robots.txt (and semantic
linked data) then it's hard for me to see what the actual problem is
that's being solved by oer.txt. Maybe I'm not really understanding the
problem case (we all seem to be saying that on this thread!). Is it
one of these?

1) A sitemap of OER resources on the site (like Sitemaps)
2) A link to service end-points for those resources (like to a
repository)
3) A link to a list of OER resources on the site (like an RSS feed)

Or something else?

Best,
Steve

Kathi Fletcher

Nov 30, 2010, 6:16:33 PM
to oertxt-wor...@googlegroups.com

> I have a big question about this relating to discovery, which Nathan
> knows I'm a bit obsessed with right now. With robots.txt, there are
> basically three orgs (Y, G and MS) who access them at scale. Who would
> access the oer.txt files besides these three? And of those others, who
> wouldn't already know what they need to know about the site (meaning
> they already harvest stuff from your site)?

 
From Connexions' perspective, we would like a simple way to advertise feeds and protocols for harvesting to new OER services and tools that come online. Currently you have to hunt around or talk to one of the project personnel to find out about all the ways you can get feeds from the site. So I see this as more than an advertisement for the big search engines.

Kathi 
 
--
Katherine Fletcher, Technical Director and Project Manager, Connexions, MS 375 (713) 348-3662
Web: http://cnx.org Email: k...@rice.edu, kathi.f...@gmail.com
Connexions Community: http://conference.cnx.org, http://blog.cnx.org , http://twitter.com/cnxorg, http://www.facebook.com/cnx.org, http://cnxconsortium.org/, http://devblog.cnx.org

Midgley, Steve

Dec 1, 2010, 1:26:39 PM
to oertxt-wor...@googlegroups.com

Thanks. Seems like there are a few objectives floating around?

1) Service/end-point discovery. Provide a way for sites to tell folks where their feeds are located and in what formats they can be retrieved. E.g., “You can find an RSS of OER at [/xyz]. You can find an OAI-PMH search interface to OER at [/abc].”

2) OER iteration. This is more like a traditional site map, that lets all comers iterate over a structured list of all resources on the site. E.g., “Here’s a file where you can find info on all the resources on our site in [abc] format.”

3) Declare that there exist OER on the site and what license they are under. Provide some namespace hints as to where they are. E.g., “This site has OER licensed CC BY 3.0. You’ll find them under [/xyz].”

#2 could be subsumed into #1 but I’d guess that oer.txt is proposed for #2 b/c #1 is a little cumbersome/complicated for very simple instances? #3 is mostly analogous to robots.txt but for OER.

Not sure if I’ve captured all the elements here, but hopefully this is helpful.

Best,

Steve

Nathan Yergler

Dec 8, 2010, 2:02:05 PM
to oertxt-wor...@googlegroups.com
On Tue, Nov 30, 2010 at 3:16 PM, Kathi Fletcher <k...@rice.edu> wrote:
>>
>> I have a big question about this relating to discovery, which Nathan
>> knows I'm a bit obsessed with right now. With robots.txt, there are
>> basically three orgs (Y, G and MS) who access them at scale. Who would
>> access the oer.txt files besides these three? And of those others, who
>> wouldn't already know what they need to know about the site (meaning
>> they already harvest stuff from your site)?
>>
>
> From Connexions perspective, we would like a simple way to advertise feeds
> and protocols for harvesting to new OER services and tools that come online.
> Currently you have to hunt around or talk to one of the project personnel to
> find out about all the ways you can get feeds from the site. So I see this
> as more than an advertisement for the big search engines.

Is there reason to believe big search engines will use this
information if they can discover it? (Actually curious -- are there
conversations people have had that indicate this is a blocker?).

NRY

Nathan Yergler

Dec 8, 2010, 2:03:09 PM
to oertxt-wor...@googlegroups.com
Those seem like the obvious use cases/objectives to me. Pierre, are
those the issues you were hoping to address? It seems like the
existing proposal focuses on 1 and 2 in Steve's list.

NRY

Scott Wilson

Dec 10, 2010, 9:48:15 AM
to oer.txt Working Group
There is already a format that meets all the use cases (UCs), and that
is OPML.

It's not a wonderful or beautiful or semantically well thought-out
spec. But it's here today, implemented on a very large scale, and used
for exactly the UCs discussed for "oer.txt".

I've used it for harvesting OER podcasts already. It's trivial for
producers to implement, and it's trivial to write a parser and
harvester for.

Even Outlook supports it!

So no need to invent a new spec and fragment the web.

Midgley, Steve

Dec 10, 2010, 11:27:30 AM
to oertxt-wor...@googlegroups.com
Interesting - I hadn't run across OPML before. Looking online, it seems like it's kind of stale though? No updates on the home page since 2006, with a referenced version 2 suggested but not finished?

Holding that aside for this discussion, I do like their design goals for sure. Related to this, we have been looking a little at POWDER which seems to provide similar capabilities? OPML seems nicer to me b/c it's easier to understand (and presumably implement). Do you have any opinions on the difference between the two?

Moving a little further, it seems like OPML could meet all the criteria for oer.txt but only if you accept that you want to present all the OER for your site in a hierarchical list? I could imagine wanting to share info differently (RDF-ish maybe), though the OPML method does seem like it would be simpler and easier (both to write and read).

With that line of thinking in mind, what are the limitations of RSS itself for this purpose? Why not just publish an RSS feed of OER for a site? What does OPML get you that is valuable over RSS?

And more broadly what does this group want from oer.txt that either RSS or OPML can't do (if anything)?

Best,
Steve

Scott Wilson

Dec 10, 2010, 11:48:05 AM
to oertxt-wor...@googlegroups.com
> Interesting - I hadn't run across OPML before. Looking online, it seems like it's kind of stale though? No updates on the home page since 2006, with a referenced version 2 suggested but not finished?

It's stale because it has been very widely adopted and hasn't needed to be changed. It's more a set of conventions than a specification - in many ways it's a "bad" specification, but is very successful despite that!

Here are some of the applications that currently support OPML:

Google Reader
iGoogle
Bloglines
Netvibes
Yahoo
Windows Live
Blogger
Wordpress
Outlook
Drupal
Internet Explorer (!)
Sony PSP (!)
Firefox
iTunes

Anywhere on the web you see a term like "subscriptions", "subscription list" or "blogroll", it's referring to OPML. In any application that says something like "import list of feeds" it usually means "in OPML".

On 10 Dec 2010, at 16:27, Midgley, Steve wrote:

>
> Holding that aside for this discussion, I do like their design goals for sure. Related to this, we have been looking a little at POWDER which seems to provide similar capabilities? OPML seems nicer to me b/c it's easier to understand (and presumably implement). Do you have any opinions on the difference between the two?
>
> Moving a little further, it seems like OPML could meet all the criteria for oer.txt but only if you accept that you want to present all your OER for your in a hierarchical list? I could imagine wanting to share info differently (RDF-ish maybe), though the OMPL method does seem like it would be simpler and easier (both to write and read).
>
> With that line of thinking in mind, what are the limitations of RSS itself for this purpose? Why not just publish an RSS feed of OER for a site? What does OPML get you that is valuable over RSS?

Not a lot. It's not really about the format, more the levels of adoption. For example, if OERs are exported as an OPML file (as a collection of feeds), you can immediately import them into one of the above without modification.

RSS has very similar capabilities, it just isn't used for the same purpose in existing software, so clicking the "import subscriptions" button won't usually give you an RSS option.

I think the use cases are:

- if OERs are exposed as a flat list of individual resources, use RSS
- if OERs are exposed as a list of RSS feeds each of which is a list of resources (e.g. albums or courses), use OPML
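
To make the second case concrete, here is a small Python sketch that
reads an OPML file and lists the feeds it points to. The text, type
and xmlUrl attributes come from the OPML format; the sample document,
the example.com URLs and the function name are made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML document: one nested course feed and one flat feed.
EXAMPLE_OPML = """\
<opml version="2.0">
  <head><title>Example OER feeds</title></head>
  <body>
    <outline text="Physics">
      <outline text="Mechanics lectures" type="rss"
               xmlUrl="http://www.example.com/physics/mechanics.rss"/>
    </outline>
    <outline text="All new resources" type="rss"
             xmlUrl="http://www.example.com/all.rss"/>
  </body>
</opml>
"""


def list_feeds(opml_text):
    """Return (title, feed URL) pairs for every outline that carries an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):  # walks nested outlines in document order
        url = outline.get("xmlUrl")
        if url:  # grouping outlines such as "Physics" have no xmlUrl
            feeds.append((outline.get("text", ""), url))
    return feeds


for title, url in list_feeds(EXAMPLE_OPML):
    print(title, url)
```

A harvester would then fetch each listed URL as an ordinary RSS or
Atom feed to get the individual resources.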

Midgley, Steve

Dec 10, 2010, 11:56:14 AM
to oertxt-wor...@googlegroups.com
I'm getting it. Regarding Kathi's specific point about wanting to share existing feeds, OPML does just that. I was focused on the hierarchical aspect, but it's the pointers-to-feeds part that is important for this conversation.

Thanks!

Steve

David F. Flanders

Dec 12, 2010, 5:49:23 AM
to oertxt-wor...@googlegroups.com

Just to back up Scott and Nathan here: SWORD for scholarly research objects (which uses Atom) is already used worldwide, and it is what we would want to see built upon. There are far too many tools and libraries for the RSS/Atom/OPML family to go and build YATP (yet another transport protocol) for oer.txt. That said, I agree that exploring RDF shoehorning would be worthwhile (previous work includes ORE to Atom), and SWORD 2 is experimenting with predicates in Atom. Speak with Richard Jones and Stuart Lewis. /dff

On 10 Dec 2010 16:48, "Scott Wilson" <scott.brad...@gmail.com> wrote:

Pierre Far

Dec 12, 2010, 6:42:24 AM
to oertxt-wor...@googlegroups.com
Hello all,

Thanks everyone. I'm really enjoying the ideas coming out of the
discussion, hence my silence.

A couple of points:

1. The OPML suggestion has been slowly growing on me but I think it's missing two things we should think about:

a. What mechanism do we have to tag an OPML file as OER? I'm not
talking about autodiscovery here but about singling out an OPML file
on a website as OER. This is because I can imagine a website having
many OPML files for different purposes. Of course we can agree a
convention that simultaneously enables autodiscovery and communicating
as OER. Like the Sitemaps protocol, we can have in robots.txt a line
that says:

OER: http://www.example.com/oer.opml

This solves both problems and sidesteps earlier concerns raised about having a separate oer.txt.
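
As a rough sketch of how a consumer might read such a line, assuming the `OER:` field is handled like the Sitemaps `Sitemap:` field (case-insensitive name, `#` starting a comment), something like the following Python would do. The function name and the sample robots.txt are hypothetical.

```python
def find_oer_locations(robots_txt):
    """Return the URLs given in 'OER:' lines of a robots.txt body."""
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")   # split on the first colon only
        if sep and field.strip().lower() == "oer" and value.strip():
            urls.append(value.strip())
    return urls


sample = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
OER: http://www.example.com/oer.opml
"""
print(find_oer_locations(sample))  # ['http://www.example.com/oer.opml']
```

Existing robots.txt parsers should simply skip a field they do not recognise, so adding the line would not be expected to disturb crawlers that know nothing about OER.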

b. The <outline> tag has a requirement for the text attribute and
allows additional attributes. We have an opportunity here to define
these additional attributes to make OPML more useful for OER. For
example, we can define attributes to communicate educational
level, the type of resource (course home page, module, etc.),
format, an ID (which multiple formats share to enable easy
disambiguation), etc.

The fact that OPML has hierarchy built in would make these attributes
really powerful. For example, this OPML snippet communicates quite
a bit:

<outline text="An OCW Course" id="1234" type="course-home"
license="http://creativecommons.org/licenses/by-nc-sa/3.0/us/">

<outline text="Course Videos" type="modules">
<outline text="..." type="module" format="video/mpeg" url="..." id="..."/>
<outline text="..." type="module" format="video/mpeg" url="..." id="..."/>
</outline>

<outline text="Course Text" type="modules">
<outline text="..." type="module" format="text/html" url="..." id="..."/>
<outline text="..." type="module" format="text/html" url="..." id="..."/>
</outline>

</outline>

<outline text="OAI-PMH Feed" type="oai-pmh" url="..."/>

With a bit of thinking, we can make such a system really powerful and
cover all the use cases we're talking about here. But we should not
make it so loose that each content producer has their own variant
implementation.
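
To give a feel for the consumer side, here is a minimal Python sketch that walks a file shaped like the snippet above and lists each module. The attribute names are the ones proposed above; the sample titles, URLs and IDs are invented, and having a module inherit the license of an enclosing outline is my assumption rather than anything agreed.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML using the proposed attributes; titles, URLs and IDs are made up.
SAMPLE = """\
<opml version="2.0">
  <head><title>Example OER</title></head>
  <body>
    <outline text="An OCW Course" id="1234" type="course-home"
             license="http://creativecommons.org/licenses/by-nc-sa/3.0/us/">
      <outline text="Course Videos" type="modules">
        <outline text="Lecture 1" type="module" format="video/mpeg"
                 url="http://www.example.com/c/1234/l1.mpg" id="l1"/>
      </outline>
      <outline text="Course Text" type="modules">
        <outline text="Lecture 1 notes" type="module" format="text/html"
                 url="http://www.example.com/c/1234/l1.html" id="l1"/>
      </outline>
    </outline>
  </body>
</opml>
"""


def iter_modules(node, inherited_license=None, path=()):
    """Yield one dict per type="module" outline found below node.

    Assumption: a module without its own license attribute takes the
    license of the nearest enclosing outline that declares one.
    """
    for child in node.findall("outline"):
        child_license = child.get("license", inherited_license)
        child_path = path + (child.get("text"),)
        if child.get("type") == "module":
            yield {
                "path": " / ".join(child_path),
                "format": child.get("format"),
                "url": child.get("url"),
                "id": child.get("id"),
                "license": child_license,
            }
        # Recurse so nesting of any depth is handled.
        yield from iter_modules(child, child_license, child_path)


body = ET.fromstring(SAMPLE).find("body")
modules = list(iter_modules(body))
for m in modules:
    print(m["path"], m["format"], m["license"])
```

The two modules share an id, so a consumer could group them as two formats of the same resource, which is the disambiguation case mentioned in point b.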

2. I do not want us to lose sight of the trade-off we're implicitly
making by accepting OPML or any other system. The whole point of
oer.txt was to be a simple way to advertise what already exists, and
the richer the communication we propose, the more simplicity we lose.

That's not necessarily a bad thing if both content producers and
consumers accept something "simple enough" but "useful". OPML might
just be the right balance in this instance.

Thanks,
Pierre

David F. Flanders

Dec 17, 2010, 4:02:45 AM
to oertxt-wor...@googlegroups.com
Options re naming conventions and strict/loose: a) deal with the sprawl of people doing their own thing via human means (meetings, discussion, etc.), which needs a lot of money and politics to get people face to face; or b) shoehorn linked data predicates into OPML. That is, don't force people to use a naming convention for the fields (some systems might want to spit out a [UUID].opml file; besides, "OER" means nothing to the rest of the world, so why use a non-cool URI?). Instead, agree on a convention for shoehorning predicates into the transport protocol (e.g. OPML), then let people use the predicates they want and map them later, e.g. http://purl.org/dc/education/2010-12-10/oer

Scott Wilson

Dec 17, 2010, 6:02:41 AM
to oertxt-wor...@googlegroups.com
Or just use OPML exactly as it's used already in feed reader subscription lists, and put any extra stuff within the Atom/RSS files that you link from it?

See, e.g. my CC license parsing "algorithm" for feeds assuming that everyone will do it slightly differently:


S