Creating Taxonomies


m.gor...@gmail.com

Jan 17, 2019, 6:03:28 PM
to AtoM Users
My brain is MARC-oriented, so I am confused about the rules (if any) for what constitutes a legit broader term/narrower term relationship.  I'd want to build our taxonomies with controlled vocabularies such as LCSH and TGM, but I'm a little confused about how these are supposed to be constructed.  As I see it, in the SKOS approach, I wouldn't want to mix taxonomies.  For example, unlike MARC, where I could have a geographic subfield (narrower term) under a topical subject term, in AtoM I would not want to mix geographic and subject terms.  Is this understanding correct?

Would History be a legit narrow term for Southern Illinois University Carbondale?  Or would History need to be a 'main entry' in its own right, and any narrower term beneath SIUC need to be some academic department or unit of the university?

What if I had a collection about the history of higher education?  Using the LCSH term "Education, Higher" would History be a legit narrower term?  Are narrower terms supposed to only be used for 'type of' relationships like Trees (BT) - Oak Trees (NT)?

I'm just afraid of doing it wrong.

Thanks,
Matt G
SIU Carbondale

Creighton Barrett

Jan 18, 2019, 10:48:34 AM
to ica-ato...@googlegroups.com
Hi Matt,

Yesterday, I responded to a similar question of yours over at the CMT Section forum. Did that help? You can add broader and narrower terms to subjects, but you're right that you would only want to use topical subject terms, not geographic places or actor names. If you are viewing a subject in AtoM, it will show you all results with that term and from there you can use the facets to narrow results by geographic place or genre.

Cheers,
Creighton

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/af2e6ad4-ec0b-4e8e-844d-9402b4d4a569%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dan Gillean

Jan 18, 2019, 11:24:08 AM
to ICA-AtoM Users
Hi Matt, 

If you are building a locally controlled vocabulary, there's not really going to be a "wrong way" if it works for you. That said, I understand that you want to model your local terms on existing controlled vocabularies. 

I will point out that if you are curious, many vocabularies such as LCSH are available as SKOS RDF/XML downloads, so you can look at how they are organized - it's a 16.GB file when unzipped, but you can grab the LCSH terms at the bottom of this page: 
In the LCSH SKOS file, subdivisions are just hardcoded as single terms - an example: 

    <ns0:prefLabel xmlns:ns0="http://www.w3.org/2004/02/skos/core#" xml:lang="en">Nursing home patients--Abuse of--Massachusetts</ns0:prefLabel>

There are still some broader/narrower term relations in the LCSH SKOS file, so without going into a deep analysis of how it's organized, it does seem like a mixed strategy is used. 
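To make the "mixed strategy" concrete, here is a minimal Python sketch that reads a tiny, hand-written SKOS RDF/XML sample and reports, for each concept, the parts of its preferred label and any broader-term links. The sample data and the example.org URIs are made up for illustration (they are not real LCSH identifiers), and the real LCSH file may be organized differently.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hand-written sample in the general shape of a SKOS RDF/XML file.
# ASSUMPTION: these concepts and example.org URIs are illustrative only.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/subjects/fruit">
    <skos:prefLabel xml:lang="en">Fruit</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/subjects/apple">
    <skos:prefLabel xml:lang="en">Apple</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/subjects/fruit"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/subjects/nhp-abuse-ma">
    <skos:prefLabel xml:lang="en">Nursing home patients--Abuse of--Massachusetts</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def summarize_concepts(rdf_xml):
    """Return one dict per skos:Concept: its URI, label parts, and broader URIs."""
    root = ET.fromstring(rdf_xml)
    concepts = []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        label = concept.findtext(f"{{{SKOS}}}prefLabel", default="")
        concepts.append({
            "uri": concept.get(f"{{{RDF}}}about"),
            # A hardcoded (pre-coordinated) heading is a single string;
            # "--" separates its subdivisions.
            "parts": label.split("--"),
            # Explicit hierarchy, where present, is a skos:broader link.
            "broader": [b.get(f"{{{RDF}}}resource")
                        for b in concept.findall(f"{{{SKOS}}}broader")],
        })
    return concepts


for c in summarize_concepts(SAMPLE):
    print(c["parts"], "broader:", c["broader"])
```

In this sample, "Apple" carries an explicit broader link while the subdivided heading is just one label with no hierarchy of its own, which is the same mix of approaches described above.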

As was mentioned in a previous thread, you could certainly choose to implement your terms this way if desired. Once again I was curious to see how ArchivesSpace (which allows you to create subdivisions) handles things, and as far as I can tell (though I could be wrong!), there is no way in ASpace to organize subject terms hierarchically. If the hierarchical possibilities of AtoM's taxonomy term management are just adding confusion, there's no reason you can't qualify and subdivide and hardcode your terms and keep your vocabulary flat if that works best for you. 

One alternative for the first example you provided would be to create an authority record (rather than a subject term) for Southern Illinois University Carbondale, and add that as a name (subject) access point to the record, with a separate subject access point for History. The authority records allow for fairly complex relations to be created, so you could technically create authorities for each department and then add qualified relationships to them - see: 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


Matt Gorzalski

Jan 18, 2019, 4:01:37 PM
to ica-ato...@googlegroups.com
If I understand correctly, I could download and integrate the LCSH SKOS file into AtoM so the terms would be in the taxonomies, and would auto-generate when linking terms to collections?

It seems to me that the LCSH SKOS file generates the same results as though I typed out the string manually, or we imported the entire string from Archon like was mentioned previously.  If we migrate into AtoM, I think I prefer to not use subject strings and go with the BT/NT relationships and keep the taxonomies separate. 

I see how SIUC and History would be in separate taxonomies.  I remain unclear on the BT/NT relationships between topical subjects.  I only see it making sense if you are doing "type of" relationships like Fruit (BT) - Apple (NT), as opposed to something like College sports (BT) - History (NT).  Does having a narrower term assigned to different broader terms affect search/browsing negatively?  Example: Professional sports (BT) - Basketball (NT) vs. College sports (BT) - Basketball (NT)?  

Matt


Dan Gillean

Jan 18, 2019, 5:43:26 PM
to ICA-AtoM Users
Hi Matt, 

> If I understand correctly, I could download and integrate the LCSH SKOS file into AtoM so the terms would be in the taxonomies, and would auto-generate when linking terms to collections?

We did this for a client once, and while it is technically possible... I have to say, I strongly recommend against importing the entire LCSH vocabulary into your AtoM instance. The sheer size of it makes it unwieldy from a data load perspective, a performance perspective, and also an end-user usability perspective. 

In terms of loading it, I wasn't directly involved with the project, but I believe it took a bit of hackery, and throwing a lot of system resources (memory in particular) at it to be able to get it to import at all. 

Performance-wise, once you start having hundreds of thousands of terms in a single taxonomy, it can become difficult to manage. Most of AtoM's functionality is still performed synchronously - that is, via the web browser in real time once an action is submitted. Browsers tend to have a built-in timeout limit of about 1 minute, to prevent long-running tasks from running endlessly and/or consuming all available client resources. Suppose you edit a term deep within a hierarchy, with thousands of children and siblings and potentially many levels of terms above it. If you change the preferred form of name, for example, odds are high that the web browser will time out before AtoM can make all the necessary updates to the related terms.

From a usability perspective, it became a nightmare for both archivists and researchers. The institution in question found that it rendered browsing by term virtually useless for themselves and the public - the vast majority of terms were unused and the LCSH SKOS file rarely if ever includes scope notes, so you end up with thousands of pages of results with no related descriptions showing up. Similarly, finding the right term to link in the autocomplete dropdowns on descriptions (i.e. subject access point field) became very difficult. 

There are likely other gotchas as well, but as I mentioned, I wasn't directly involved with this project. In the end however, we are currently working on a project with the client to remove LCSH and replace it with a locally modified version of FAST topical terms instead, which is much much leaner. 

I personally think it much wiser to use LCSH or other large controlled vocabularies as guidance, and to create a curated subset of terms locally for use. You can still use the sourceNote field to link them back to the source vocabulary, but this way you are not overwhelming your staff and researchers with a huge and largely unused list of terms.  
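As a rough sketch of the curated-subset idea, the Python below keeps only the terms that are actually applied to descriptions and records where each one came from. Everything here is hypothetical: the vocabulary slice, the example.org URIs, the list of terms in use, and the `source_note` key (which only stands in for the sourceNote field mentioned above). It is not tied to any AtoM import format.

```python
# ASSUMPTION: a small slice of a large source vocabulary, label -> source URI.
# The example.org URIs are placeholders, not real identifiers.
SOURCE_VOCAB = {
    "Education, Higher": "http://example.org/source/1",
    "College sports": "http://example.org/source/2",
    "Basketball": "http://example.org/source/3",
    "Coffee": "http://example.org/source/4",
}

# ASSUMPTION: subject access points already applied to local descriptions.
USED_ON_DESCRIPTIONS = [
    "College sports",
    "Basketball",
    "Basketball",
    "Campus protests",
]


def curate(used, source):
    """Split the terms in use into source-backed terms and purely local ones."""
    curated, local_only = [], []
    for label in sorted(set(used)):
        if label in source:
            # Keep a pointer back to the source vocabulary for this term.
            curated.append({
                "name": label,
                "source_note": f"Source vocabulary: {source[label]}",
            })
        else:
            # No match in the source vocabulary: a locally created term.
            local_only.append(label)
    return curated, local_only


curated, local_only = curate(USED_ON_DESCRIPTIONS, SOURCE_VOCAB)
print([t["name"] for t in curated])
print(local_only)
```

Unused source terms such as "Coffee" never enter the local list, so term browse pages and autocomplete dropdowns only show terms that lead somewhere.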

Just some food for thought. 



> It seems to me that the LCSH SKOS file generates the same results as though I typed out the string manually, or we imported the entire string from Archon like was mentioned previously.

Yes, that was my conclusion as well. 



> I see how SIUC and History would be in separate taxonomies.  I remain unclear on the BT/NT relationships between topical subjects.  I only see it making sense if you are doing "type of" relationships like Fruit (BT) - Apple (NT), as opposed to something like College sports (BT) - History (NT).

I see no reason why you couldn't use a mix of the two strategies - use broader/narrower relations where it makes sense (such as "type of" relationships), and hardcode qualified/pre-coordinated terms when needed. You could choose to make your coordinated terms either siblings or children of the non-coordinated versions of them - e.g. either:
  • Coffee
  • Coffee -- 18th Century
  • Coffee -- 18th Century -- Manufacturing
or: 
  • Coffee
    • Coffee -- 18th Century
      • Coffee -- 18th Century -- Manufacturing

The most important thing, in my opinion, will be to make decisions about your local conventions in advance, establish clear policies and workflows around them, and stick to that for consistency and easier management over time. 
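The two layouts for the Coffee terms above can be sketched in a few lines of Python. This is only an illustration of a local convention, assuming " -- " as the subdivision separator and a plain label-to-parent mapping; the function names are made up and nothing here is AtoM code.

```python
SEPARATOR = " -- "  # ASSUMPTION: the local convention for subdivisions


def parent_label(label):
    """Return the label with its last subdivision removed, or None if it has none."""
    head, sep, _tail = label.rpartition(SEPARATOR)
    return head if sep else None


def flat_layout(labels):
    """Sibling layout: every term sits at the top level (no parent)."""
    return {label: None for label in labels}


def nested_layout(labels):
    """Child layout: each coordinated term sits under its shorter form, if that exists."""
    known = set(labels)
    layout = {}
    for label in labels:
        parent = parent_label(label)
        # Only nest under a parent that is itself a term in the list.
        layout[label] = parent if parent in known else None
    return layout


terms = [
    "Coffee",
    "Coffee -- 18th Century",
    "Coffee -- 18th Century -- Manufacturing",
]

for label, parent in nested_layout(terms).items():
    print(f"{label!r} -> parent: {parent!r}")
```

Whichever layout you choose, deriving it from the labels by rule rather than by hand is one way to keep the convention consistent over time.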

 
> Does having a narrower term assigned to different broader terms affect search/browsing negatively?  Example: Professional sports (BT) - Basketball (NT) vs. College sports (BT) - Basketball (NT)?

On the descriptions search/browse page, faceting and filtering is done on single terms, independent of parentage. So it doesn't really affect anything there. 

On the term browse pages (for example, if you go to Browse > Subjects, and then click through to a subject term), the related descriptions shown will include descriptions where the term is inherited, but users can limit the results to only those where the term has been directly applied. 

For example, if you have a hierarchy like this: 
  • Canada
    • Ontario
      • Sudbury, District of
        • Sudbury, Town of
Then viewing the term "Ontario" will include results that were tagged with "Sudbury, Town of" and "Sudbury, District of" by default (because the relationship is inherited) - however, users can also limit the results to only descriptions where the term in question has been applied directly: 

[Attached image: term-browse-inheritance.png]
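The inherited versus direct behaviour described above can be modelled with a short Python sketch. It is a toy model only: the description names and the one-term-per-description mapping are hypothetical, and this is not how AtoM implements the term browse page.

```python
# The place hierarchy from the example above, as child -> parent.
PARENT = {
    "Canada": None,
    "Ontario": "Canada",
    "Sudbury, District of": "Ontario",
    "Sudbury, Town of": "Sudbury, District of",
}

# ASSUMPTION: hypothetical descriptions and the term applied directly to each.
TAGGED = {
    "Fonds A": "Ontario",
    "Fonds B": "Sudbury, District of",
    "Fonds C": "Sudbury, Town of",
    "Fonds D": "Canada",
}


def is_self_or_descendant(term, ancestor):
    """Walk up from term; True if ancestor is reached (or term is the ancestor)."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT[term]
    return False


def related_descriptions(term, direct_only=False):
    """List descriptions for a term, with or without inherited matches."""
    if direct_only:
        # Only descriptions where this exact term was applied.
        return sorted(d for d, t in TAGGED.items() if t == term)
    # Default: also include descriptions tagged with any narrower term.
    return sorted(d for d, t in TAGGED.items() if is_self_or_descendant(t, term))


print(related_descriptions("Ontario"))
print(related_descriptions("Ontario", direct_only=True))
```

With this data, the default view of "Ontario" lists Fonds A, B and C, while the direct-only view lists just Fonds A. "Fonds D" is tagged with the broader term "Canada", so it appears in neither.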

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
