Subject Area Taxonomy

joe hobson

May 12, 2014, 7:45:21 PM
to learning-regis...@googlegroups.com, learnin...@googlegroups.com
At the beginning of this year we started actively supporting the EasyPublish tool, for publishers that wanted to put a set of resources into the Learning Registry without investing too much time or energy into the more technical ways of connecting their sites or tools to LR APIs. We also started working on a search widget to replace the one you now see on FREE.ed.gov and the Learning Registry website. Currently, all 3 of these share the same subject area directory, which we immediately found to be incomplete. For instance, Countries & Continents has sub-topics for 1) Africa, 2) Artic [sic], Antarctica, and 3) Other Countries and Continents.... and that's it.

With that said, we first set out to find a widely accepted subject area taxonomy to adopt for use in EasyPublish and the LR search tool. We were surprised when we couldn't find one, at least not one that we found to be detailed enough for tools like these. So we started a new list, using CCSS, NGSS, C3, and other standards frameworks as an authoritative reference point, realizing that the point was to create an easily browseable, categorized list of subject areas for educators to use when looking for learning resources (or when classifying existing ones). 

Now that we have a list together, we'd like additional feedback to make sure we didn't miss anything. You can review the list as a Google spreadsheet, separated into different sheets by top-level subject.

Public commenting is enabled, so feel free to comment in-line on specific cells if you think topics need to be added, modified, removed, or split up. For instance, in HSS, we have a topic of Civics, which will cascade to 3 sub-topics as shown in the spreadsheet. If you feel that a big, top-level area of Civil Disobedience and Dissent is not a sub-sub-topic of one of those 3 and should be on an equal level to those as a 4th area, let us know. You can also comment on this group directly if that's easier. These are not meant to be micro-granular and comprehensive, so we are not looking at adding small, precise items like, say, "synonyms and homonyms" or "comma usage", but as I said, we are trying to at least offer a macro listing of the major areas within a discipline.

We'd really like to get more feedback soon so the list is ready when we release the LR search widget in the next few weeks. The subjects used here will have implications for how publishers choose to categorize their resources so they show up quickly in FREE.ed.gov and the LR site, and the list could ultimately be adopted by other similar projects in the future. Thanks for your feedback. ... .joe

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
joe hobson, director of technology & innovation
   Navigation North Learning

Jerome Grimmer

May 13, 2014, 3:34:40 PM
to learning-regis...@googlegroups.com, learnin...@googlegroups.com

I cannot comment on the completeness or incompleteness of the list, as I am not a domain expert.  I’m very happy to see Career Clusters in there, and also think (at the moment) that keeping English Language Development (I think of it more as English as a Second Language) and English Language Arts separate was a good idea.  I had a comment or two on the Science sheet, and one on the Math sheet.

 

Speaking as a developer, you’re going to have to strike a balance between comprehensiveness, granularity, and ease of use.  Our experience in Illinois has been that this is a difficult balance to find.  Some in the working group wanted fine granularity, which was great until people tried to use it and were simply overwhelmed by the number of choices.  In other areas, we found we needed to expand the list as it didn’t cover everything we needed it to (much as you all found out with your original subject list).  How the list is presented to the user is key as well, so they don’t feel overwhelmed, yet can easily find what they’re looking for.

 

Good luck, I’ll follow the discussion and see how things develop.  Thanks for all your work in putting this list together.

 

Jerome Grimmer

Applications Analyst,

Southern Illinois University Carbondale

"If you think you're too small to make a difference, you've never spent a night with a mosquito." - An African Proverb

--
--
You received this message because you are subscribed to the Google
Groups "Learning Registry: Collaborate" group.
 
To post: learning-regis...@googlegroups.com
To unsubscribe:learning-registry-co...@googlegroups.com
 
For more options, visit this group at
http://groups.google.com/group/learning-registry-collaborate?hl=en?hl=en


Jim G

May 14, 2014, 1:08:58 PM
to learning-regis...@googlegroups.com, learnin...@googlegroups.com
One thing to consider here is how Subject Area will be used for discovery of resources, i.e. filter/sort by subject area.  If the use case is that the user selects from a list of high level subject areas to search within, then the subjects should be just at that high level rather than multiple levels of granularity.  If, however, the use case is to filter on a detailed topic, or at more than one level in a taxonomy, then using the association to an item within an educationalFramework ("This Resource Assesses/Teaches/Requires") may be the best solution.

If at the high level, there are some good options.  At the highest level (column A) if the scope is US K12 education then something like SCED Subject Codes might cover it. 

One step below that would be the level at which courses are defined (SCED course codes, CIP codes, etc.) or the strand within an educational framework, such as the CCSS.ELA-Literacy.  In your spreadsheet the columns B/C look like strands in CCSS.ELA-Literacy.RI/RL/RF/W/SL/L... just organized a little differently. 

Columns D and E seem to be apples and oranges, i.e. skill categories that vary in nature by the strand, and types of literature.  My suggestion would be to handle these as separate alignments to an educationalFramework, one for literature types and another for skills...for skills, align to an existing framework.  The complexity of the framework could be hidden behind the tool.

I respect what you are doing, trying to make it easier for the tagger by grouping skills that cross grade levels, compared to the complexity of tagging specific skills in an educational framework.  In some cases resources need to be very specifically aligned to a specific skill/standard/learning-objective, such as formative assessment items; in other cases a less granular alignment works.  (Having the granular alignment is best but we can't expect everyone to do that.)  From an LR metadata perspective it seems to make sense to have those learning progression skill areas as threads that span grade levels. Skill areas like "spelling" and "reading fluency" categorize sets of competencies (learning standards) that span grade levels.  However, any attempt to define such categories as metadata standards should be done with the subject matter and learning sciences experts at the table.  And these categories could/should be mapped to the predominant learning standards frameworks, e.g. the set of all CCSS.ELA-Literacy standards for "Spelling" skills.

So, in my opinion the Subject Area should be kept at a high level, with alignmentObject used for the more granular tagging...at least for the behind-the-scenes tags that would go into the LR.

-jg

Jon Fronza

May 14, 2014, 1:23:30 PM
to learnin...@googlegroups.com, learning-regis...@googlegroups.com
Is the search widget you are working on the one found here? https://github.com/navnorth/lr-search-widget/tree/widget-redesign

Is there a target release date that you are shooting for?

Thanks!

joe hobson

May 14, 2014, 3:28:56 PM
to learnin...@googlegroups.com, learning-regis...@googlegroups.com
Yes, that is the search widget. We don't currently have a release date set, but would like to get beta testers to install the widget on their websites in the next few weeks. If you or someone you know runs a site where this could be useful, please send me an email. Thanks. ... .joe

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
joe hobson, director of technology & innovation
   Navigation North Learning




--
You received this message because you are subscribed to the Google Groups "Learning Registry Developers List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to learningreg-d...@googlegroups.com.

Joshua Marks

May 31, 2014, 8:34:38 PM
to learning-regis...@googlegroups.com, learnin...@googlegroups.com

Joe,

 

Welcome to a minefield. Your list seems like a good one, but there is no correct list. This is a big part of the problem: each taxonomy is crafted for some specific use or methodology. They have similar structures but different levels of granularity and ways to approach segmentation. Nonetheless, having something everyone can point to, use, and then extend is a really good idea. Perhaps mining all the subjects already published in the LR and seeing how common they are (or are not) might be a good idea.

 

-Joshua Marks

Curriki

PCG

 


Jim Goodell

Jun 2, 2014, 6:05:56 PM
to learnin...@googlegroups.com, learning-regis...@googlegroups.com
Tagging of Subject Area can be simplified, standardized, and comprehensive if using educationalAlignment and referencing an external controlled vocabulary.  On the consumer side and for users of the easyPublish tool the complexity can be hidden.  (I imagine a UI with a dropdown to select the taxonomy then a dropdown to select the subject area, although it could be done other ways.)
As Joshua noted in an earlier post, "each taxonomy is crafted for some specific use or methodology", so the first selection helps match the relevant use.

 

On Monday, June 2, 2014 1:25:35 PM UTC-4, Nathan Argo wrote:
This is indeed a tricky matter. Attempting to categorize all human knowledge is not going to be easy or universally agreeable.  I think the best approach is to leave this as a free-text field, let the user enter whatever they think is best, and treat it as one or more keywords/phrases.  Otherwise you'll end up with a list that, at best, resembles a hierarchy of learning standards for every subject...and at worst, a site map of Wikipedia. 

If you don't want to go free-text, then I'd say keep it simple--come up with a very broad, concise list of subjects that learning is generally divided up into, with maybe one layer of depth below that (e.g., algebra, calculus, etc., under Math), and you will have a good set of browsing-friendly topics that people aren't going to be overwhelmed by.  I work with Jerome, and in our experience, as he indicated, having too many features (and scaring users off from using the tool at all) is often worse than having too few (and getting weaker granularity). 

Steve Midgley

Jun 2, 2014, 6:13:19 PM
to <learning-registry-collaborate@googlegroups.com>, learnin...@googlegroups.com
If I understand correctly, what Joe is doing is basically selecting a "default" taxonomy per Jim's concept, and showing that in EZ Publish.

I believe that one could use totally free text tags as an alternative (per Nathan's input). Possibly it could be expanded to include alternative taxonomies.

The main point is that Joe (I think) deals with a lot of groups who want to upload stuff, but don't have strong subject taxonomies. Giving them one basic default taxonomy won't help everyone, but it will normalize a ton of groups around some basic, easy-to-understand tags.

I don't think he's suggesting this is a definitive taxonomy of subjects. But since we all seem to want more normalization of inputs into LR, this is a step in that direction. Alternative normalizing efforts would always be welcome. Hopefully we can settle on just a few that cover 80% of the common tagging needs, which would still be a big improvement, right?

Steve



Jim Goodell

Sep 10, 2014, 9:33:21 AM
to learnin...@googlegroups.com, learning-regis...@googlegroups.com
One of the more challenging subjects the educational metadata community has discussed is "Subject" or "Subject Area" classifications.  I'm bringing it up again because it is a problem worth solving.  Different organizations have adopted different subject categories that work for their own context, localization, education level (K-12, postsecondary, etc.), academic vs. non-academic, etc.

Someone looking at separate lists of terms might be able to infer equivalents, e.g. math (US) and maths (UK); however, other lists may use the same term to mean something different in different contexts, e.g. "technology".

So, I'd like to propose a shorter and longer term solution.  First, when tagging is done using a tool such as EZPublish, always provide a listbox/dropdown instead of an open text field.  Put the selection from the local controlled vocabulary into the tag for Subject (schema.org:about) instead of free form text.

Second, have the tool insert two additional tags (or have the system add the tags before sharing to a wider context, e.g. via the Learning Registry):

 Learning Resource Subject Code
(ceds: https://ceds.ed.gov/CEDSElementDetails.aspx?TermxTopicId=20373)

Learning Resource Subject Code System
(ceds: https://ceds.ed.gov/CEDSElementDetails.aspx?TermxTopicId=20374)

(I envision tools like EZPublish used for multiple contexts to have a dropdown to select the Subject Code System and then a second dropdown dynamically populated with the Subject terms for that system.)
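
A minimal sketch of the tagging step described above, assuming a Python-based tool. The vocabulary contents, the subject codes, and the key names `subjectCode` and `subjectCodeSystem` are placeholders made up for illustration; only `about` and the two CEDS element names come from the post.

```python
# Hypothetical local controlled vocabularies, keyed by subject code system.
# The codes below are invented placeholders, not real ISLE or SCED codes.
VOCABULARIES = {
    "ISLE": {"Technology": "ISLE-TECH", "Mathematics": "ISLE-MATH"},
    "SCED": {"Engineering and Technology": "SCED-ET"},
}


def subject_terms(code_system):
    """Return the terms that would populate the second dropdown."""
    return sorted(VOCABULARIES[code_system])


def build_subject_tags(code_system, term):
    """Build the subject tag plus the two additional tags for one selection."""
    codes = VOCABULARIES[code_system]
    if term not in codes:
        raise ValueError(f"{term!r} is not in the {code_system} vocabulary")
    return {
        "about": term,                     # schema.org "about" (Subject)
        "subjectCode": codes[term],        # CEDS: Learning Resource Subject Code
        "subjectCodeSystem": code_system,  # CEDS: Learning Resource Subject Code System
    }


tags = build_subject_tags("ISLE", "Technology")
print(tags)
```

Because the term always comes from a dropdown, a value outside the vocabulary is treated as an error rather than stored as free text.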

Identifying the "Subject Code System" allows the consuming system to interpret the context of the subject/about tag.  For example, if ISLE pushes metadata to the Learning Registry for a resource about "Technology", the consuming system could use a crosswalk to apply the correct logic when associating other subjects that use the term "technology" and others that don't, e.g. SCED "Engineering and Technology" isPartOf the ISLE Subject "Technology" AND SCED Subject Area "Computer and Information Sciences" isPartOf ISLE "Technology".
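
One way a consuming system might represent such a crosswalk, sketched in Python. The data structure and function names are assumptions for illustration; the two isPartOf relationships are the ones named in the example above, and the "UK" system in the usage note is hypothetical.

```python
# Crosswalk entries: (system, subject) isPartOf a set of (system, subject) pairs.
# Only the two relationships named in the post are included.
IS_PART_OF = {
    ("SCED", "Engineering and Technology"): {("ISLE", "Technology")},
    ("SCED", "Computer and Information Sciences"): {("ISLE", "Technology")},
}


def narrower_subjects(system, subject):
    """Return every (system, subject) pair recorded as part of the given one."""
    target = (system, subject)
    return sorted(child for child, parents in IS_PART_OF.items() if target in parents)


def matches(query, resource_tag):
    """True if a resource tag equals the query subject or is part of it."""
    return resource_tag == query or query in IS_PART_OF.get(resource_tag, set())


query = ("ISLE", "Technology")
print(narrower_subjects(*query))
print(matches(query, ("SCED", "Engineering and Technology")))
```

Because every subject is keyed by its code system, a tag such as `("UK", "Technology")` from an unmapped system does not match the ISLE subject just because the label text is the same.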

I think the longer-term solution is to put the URL into the subject/about tag or use the alignmentObject to reference a unique targetUrl for one or more applicable subject areas.  (Phil Barker posted a good example of "about" with both a URL and label recently on the LRMI thread.)  I think for this to be most effective the targetUrl must resolve to a page that is tagged with the (1) name of the subject code system / subject taxonomy / framework, (2) the subject label / node name, (3) a definition statement describing the subject.  Search tools need both a unique identifier (url) for the subject within the context of a framework AND the human readable labels of the framework and the subject/topic within that framework.
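
A rough sketch of what that could look like, written as Python dictionaries that serialize to JSON. The URL, the framework name "ISLE Subjects", and the definition text are placeholders; `educationalAlignment`, `alignmentType`, `educationalFramework`, `targetName`, and `targetUrl` are schema.org/LRMI property names, and `educationalSubject` as the alignment type is an assumption about which value fits best here.

```python
import json

# Placeholder URL; a real one would be minted by whoever publishes the framework.
SUBJECT_URL = "http://example.org/frameworks/f1/subjects/s42"

# What the targetUrl would need to resolve to, per the three points above.
subject_page = {
    "framework": "ISLE Subjects",  # (1) name of the subject code system (hypothetical)
    "label": "Technology",         # (2) subject label / node name
    "definition": "Placeholder definition statement for this subject.",  # (3)
}

# Resource metadata that points at the subject through an AlignmentObject.
resource = {
    "@type": "CreativeWork",
    "name": "Example resource",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "educationalSubject",
        "educationalFramework": subject_page["framework"],
        "targetName": subject_page["label"],
        "targetUrl": SUBJECT_URL,
    },
}

print(json.dumps(resource, indent=2))
```

The human-readable framework and subject labels travel with the record, while the URL stays the unique identifier that search tools compare.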



On Wednesday, August 20, 2014 11:16:28 AM UTC-4, Nathan Argo wrote:
Apologies for the delay in posting this--We've been pretty swamped here.

During last week's dev call, I was asked to post the categories and tags the Illinois team is using for our metadata.  We have that posted here:
http://ilsharedlearning.org/DevDoc/SitePages/OERMetadata.aspx

Note that not all of this is fully implemented yet--our project is still a work in progress.  We're also considering dropping fields like Group Type, so this list is by no means final.  But there's the information for anyone that would like it.

Stuart Sutton

Sep 14, 2014, 6:25:29 PM
to learning-regis...@googlegroups.com, learnin...@googlegroups.com
Jim Goodell wrote:
I think the longer-term solution is to put the URL into the subject/about tag or use the alignmentObject to reference a unique targetUrl for one or more applicable subject areas.  (Phil Barker posted a good example of "about" with both a URL and label recently on the LRMI thread.)  I think for this to be most effective the targetUrl must resolve to a page that is tagged with the (1) name of the subject code system / subject taxonomy / framework, (2) the subject label / node name, (3) a definition statement describing the subject.  Search tools need both a unique identifier (url) for the subject within the context of a framework AND the human readable labels of the framework and the subject/topic within that framework.

My take on this is that the "long-term solution" starts unfolding at some time and that time might as well be right now.  What are we waiting for?

I went looking for Phil's post without any luck but wasn't exactly sure what I was looking for.  

Both the conceptual and technical root of this problem is the continuing reliance on a "term-based" approach to vocabulary development that has dominated the world of knowledge organization systems until recently --i.e., taxonomies, thesauri, simple controlled lists etc. In term-based approaches, the first-order element in identifying meaning in controlled vocabularies and authority files is text--i.e., like "technology" in your example, Jim.   You are right that your suggestion to use the intersection of one or more text-based values to determine unique identity, and thus disambiguate otherwise ambiguous text, is one way to do it, but it doesn't get us far in reaching that "long-term" solution. Getting there actually means dumping term-based approaches in the vocabularies we want to share (globally?) and embracing instead a concept-based approach that identifies concepts by URI and supports different materializations of those concepts in human language(s) (i.e., lexicalizations).  "Technology" is in fact a concept to which we English-speaking people attach (although inconsistently) the English text label "technology". BUT, what about "تكنولوجيا", "Technologie", "Τεχνολογία", "Technologies", "Tecnología", "Technologie", "प्रौद्योगिकी", "Teknologi",  "テクノロジ", "기술", "Hangarau", "Teknoloji", "Công nghệ", "技术"? What matters is that all of these denote the same concept--the concept is first-order and the labels are second-order.
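
To make the concept-first idea concrete, here is a small Python sketch. The URI is a placeholder, the function names are made up, and the language tags are my reading of three of the labels listed above; it is an illustration, not how any particular vocabulary service stores its data.

```python
# One concept, identified by a URI, with labels in several languages.
# The URI is a placeholder; the language tags are assumptions.
concept = {
    "uri": "http://example.org/concepts/c-0001",
    "labels": {"en": "Technology", "es": "Tecnología", "zh": "技术"},
}


def label_for(concept, lang, fallback="en"):
    """Return the label for a language, falling back to a default language."""
    labels = concept["labels"]
    return labels.get(lang, labels[fallback])


def same_concept(a, b):
    """Identity is decided by URI, never by comparing label text."""
    return a["uri"] == b["uri"]


print(label_for(concept, "es"))
print(label_for(concept, "fr"))  # no French label recorded, so the fallback is used
```

Two records with entirely different label sets still count as the same concept as long as their URIs agree, which is the point of treating labels as second-order.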


The W3C's concept-based Simple Knowledge Organization System (SKOS) has been mentioned on these lists on various occasions. The uptake in the use of SKOS to define value spaces and to map legacy value spaces is very high.  SKOS is becoming the order of the day among major players in the Web-linked world who have controlled vocabularies of a magnitude that makes anything we might discuss here seem trivial, including the United Nations Food and Agriculture Organization, the Library of Congress, the Getty (Art & Architecture Thesaurus (AAT)), the British and French national libraries, and, starting this month, Getty's Thesaurus of Geographic Names (TGN).  Some are sharing name authority files across nations through VIAF.

Check out an example concept by just throwing this Library of Congress URI into a browser: http://id.loc.gov/authorities/subjects/sh85133147. Since you are a human, it will land you on a pretty HTML page through the browser's content negotiation. But, if you are an application of some sort, it will oblige and hand you usable RDF data...beautiful data you can use locally in many, many ways!
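
For anyone who wants to try the "application" side of that, here is a minimal Python sketch using only the standard library. It assumes the server honours an `Accept: application/rdf+xml` header; the exact media types on offer are not stated in this thread, and the function names are my own.

```python
import urllib.request

LOC_URI = "http://id.loc.gov/authorities/subjects/sh85133147"


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build a GET request that asks for RDF instead of HTML."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_rdf_request(uri)) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_rdf(LOC_URI) would.
request = build_rdf_request(LOC_URI)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

The same URI serves both audiences; only the `Accept` header differs between a browser and an application.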

Using your example for "Technology", Jim, I pulled up a concept in Australia's SKOS-based Schools Online Thesaurus (SCoT) and got the following URI: http://vocabulary.curriculum.edu.au/scot/358. Toss it into a browser and content negotiation will hand you a pretty HTML page, or RDF data if you are a machine. I have attached an N3 version of the SCoT RDF data underneath the pretty technology concept page...take a moment and check out the lexicalization of labels near the bottom of the file denoting the concept in 19 languages (some above).

URIs like these are so, so unfriendly. They give me absolutely no clue as to what they "mean"...that is such bad practice, right? Bad practice, perhaps, if these were URLs that people want/should/can try to remember but need some mnemonic or other kind of help.  Point blank, applications that deal with URI don't need mnemonic help!  While people fight over this all the time, I stick with Sir Tim Berners-Lee's assertion that URI should be opaque, i.e., that no URI should try to telegraph meaning and no application should ever try to "interpret" a URI by dissecting it.  We don't say "how unfriendly" that is when we have the misfortune to look at a UUID.  Why don't we?  Because, in practice, humans don't ever have to deal with URI any more than they have to deal directly with UUIDs or the URLs under every link they click on a page.

Semanticizing  http://id.loc.gov/authorities/subjects/sh85133147 into http://id.loc.gov/authories/subjects/technology may feel like such a meaningful thing to do until you consider that it's (perhaps) meaningful for English-speaking peoples and likely as opaque to a Russian (human) "user" as "85133147".  No winning that one with broadly shared vocabularies.

Stuart
Technology-SCoT.txt

Steve Midgley

Oct 16, 2014, 10:43:24 AM
to learnin...@googlegroups.com, <learning-registry-collaborate@googlegroups.com>
Hi,

The LR easy publish tool does provide a pre-defined set of terms for tagging which is something. (learningregistry.org/easypublish). But it's not a formal vocab and alternative term sets are not yet provided for in the UI. 

That is something I'd like to see in future versions for sure - same thing for the standards selector - right now it just supports Common Core, but it would be great if it could support other competency and curricular frameworks as well, and the user could select their preferred one from a list somehow.

Steve


On Mon, Oct 13, 2014 at 5:01 AM, Renato <rmcort...@gmail.com> wrote:

Dear all,

I couldn't agree more on the vision expressed by Stuart.


Yet, for a number of understandable reasons, there is plenty of metadata already available (and more added every day) based on the traditional “term-based” approach, rather than the “concept-based” (or semantic, or linked data) approach.


Metadata enrichment is a potential solution to bridge from “term-based” (unstructured) metadata to “concept-based” (structured) metadata, basically (automatically / semi automatically) associating concepts (formal meaning) to terms (free text).


This technique could be used to enhance legacy metadata.


Possibly, it could also be used to simplify the tagging of new resources, by mapping terms to concepts in the tagging tool. Or, in this specific case, the tagging tool could just let the user select a formal vocabulary, navigate through its structure and identify the (human readable) target concept?
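
As a rough illustration of the enrichment idea, the Python sketch below maps free-text terms to concept URIs through a lookup table. The index, the URIs, and the function names are all hypothetical; a real enrichment service would need a far larger index and smarter matching than exact lookup after normalization.

```python
# Hypothetical index from normalized free-text terms to (opaque) concept URIs.
TERM_INDEX = {
    "technology": "http://example.org/concepts/c1",
    "math": "http://example.org/concepts/c2",
    "maths": "http://example.org/concepts/c2",
}


def normalize(term):
    """Lower-case a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def enrich(terms, index=TERM_INDEX):
    """Split free-text terms into matched concept URIs and unmatched terms."""
    concepts, unmatched = [], []
    for term in terms:
        uri = index.get(normalize(term))
        if uri is None:
            unmatched.append(term)  # left for manual (semi-automatic) review
        elif uri not in concepts:
            concepts.append(uri)    # several terms may map to one concept
    return concepts, unmatched


print(enrich(["Maths", "  math ", "Technology", "Civics"]))
```

Terms that cannot be matched are returned rather than dropped, so legacy metadata keeps its original tags while gaining concept links where a match exists.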


Any comment/experience on that?


Thank you,

Renato
