UAT Excel file


Katie Frey

Sep 10, 2013, 3:43:29 PM
to uat-...@googlegroups.com
Hello everyone,

If you have been looking at and commenting on the Excel version of the UAT, please download the attached file and use it instead.  Michael Roberts pointed out to me that the previous version was missing some terms and was only about half as deep as the actual thesaurus.  The attached file, which he created, includes all of the missing terms and gives a much more accurate picture of the UAT.

Thanks, everyone, for your comments thus far; we really appreciate hearing your keen observations!

Katie

--
Katie E. Frey
John G. Wolbach Library
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS-56, Cambridge, MA 02138
kf...@cfa.harvard.edu
617-496-7675


http://astrothesaurus.org

http://www.cfa.harvard.edu/lib/
UAT Hierarchy beta 002.xls

Katie Frey

Sep 11, 2013, 6:00:09 PM
to uat-...@googlegroups.com
In response to an inquiry by Heinz, I analyzed the difference between the old Excel file and the new one.

Something important to keep in mind is that each row of the Excel file is best thought of as a unique path to a given term.  A row does NOT in itself represent a unique term.  The UAT has a polyhierarchy structure, meaning that some terms can be reached by following more than one path.

For example, these are all different ways to follow the hierarchy down to the term "Binary stars":

Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars
Astronomical objects > Stars > Multiple star systems > Multiple stars > Binary stars
Astronomical objects > Stars > Multiple stars > Binary stars
Astronomical objects > Stars > Binary stars
Astronomical objects > Binary systems > Binary stars

Even though there is only one term "Binary stars" in the thesaurus, the Excel file has it listed at the end of at least five different paths.
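
As a rough illustration of the row-versus-term distinction, here is a minimal Python sketch. It assumes each row has been exported as a single string with " > " between levels (the real file may well use one column per level), and the variable names are made up for this example:

```python
# The five rows from the example above, one string per row.
# Assumption: levels are separated by " > "; the actual Excel layout may differ.
rows = [
    "Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars",
    "Astronomical objects > Stars > Multiple star systems > Multiple stars > Binary stars",
    "Astronomical objects > Stars > Multiple stars > Binary stars",
    "Astronomical objects > Stars > Binary stars",
    "Astronomical objects > Binary systems > Binary stars",
]

# Split every row into a tuple of terms, from the top level down to the last term.
paths = [tuple(term.strip() for term in row.split(">")) for row in rows]

# Group the paths by the term they end with.
paths_by_term = {}
for path in paths:
    paths_by_term.setdefault(path[-1], []).append(path)

# Count rows (paths) versus distinct final terms.
row_count = len(paths)
term_count = len(paths_by_term)
print(row_count, "rows end in", term_count, "distinct term(s)")
```

Counting rows therefore overstates the number of terms; counting the distinct final terms is closer to the size of the thesaurus itself.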

The new Excel file accurately reflects the true depth of the thesaurus, which is 12 levels deep.  The older file was only 7 levels deep.  As a result, many paths were cut short at level 7; once truncated, they looked like duplicates of existing paths and were deleted.
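
To show how truncation can make rows vanish, here is a small Python sketch. The term names (T1, T2, ...) and the helper function are hypothetical, and this only illustrates the effect described above, not how the old file was actually produced:

```python
def truncate_and_dedupe(paths, max_depth):
    """Cut every path to max_depth levels and drop exact duplicates."""
    seen = set()
    kept = []
    for path in paths:
        short = tuple(path[:max_depth])
        if short not in seen:
            seen.add(short)
            kept.append(short)
    return kept

# Hypothetical paths: one is exactly 7 levels deep, three continue below it.
top = ("T1", "T2", "T3", "T4", "T5", "T6", "T7")
deep_paths = [
    top,
    top + ("T8a",),
    top + ("T8a", "T9"),
    top + ("T8b",),
]

kept = truncate_and_dedupe(deep_paths, 7)

# Terms that appear in the full paths but in none of the surviving rows.
all_terms = {term for path in deep_paths for term in path}
kept_terms = {term for path in kept for term in path}
lost_terms = all_terms - kept_terms

print(len(deep_paths), "paths reduced to", len(kept))
print("terms no longer reachable:", sorted(lost_terms))
```

In this toy case four rows collapse into one, and every term below level 7 disappears unless it also happens to sit on some shorter path elsewhere.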

The new Excel file includes 218 new paths (or rows) that were previously missing, most of which end with terms that existed elsewhere in the file.  I could only find 3 terms that were completely missing from the old Excel file.  These terms are three named Plutinos:
Huya
Ixion
Orcus

These terms were always included in the SKOS version of the UAT, as well as in the browseable versions found on the website.  The error was only in the Excel version.  I hope this helps to make sense of the difference between the Excel versions.


Katie

Bas Braams

Sep 12, 2013, 10:53:48 AM
to uat-...@googlegroups.com
Hi All,

Do I see, in the example just posted, a violation of Section 14.3.g of the International Standard ISO 25964-1:2011(E), which governs thesauri for information retrieval? Let me raise this in case it is of concern.

The example, with identifiers [1]-[5] added, is:

[1] Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars
[2] Astronomical objects > Stars > Multiple star systems > Multiple stars > Binary stars
[3] Astronomical objects > Stars > Multiple stars > Binary stars
[4] Astronomical objects > Stars > Binary stars
[5] Astronomical objects > Binary systems > Binary stars

It is understood that multiple paths are allowed. A concept can have more than one narrower term (NT) and it can also have more than one broader term (BT). Nothing in the standard prohibits the simultaneous occurrence of [1], any single one of [2]-[4], and [5]. But I believe that no two of [2], [3] and [4] are allowed simultaneously.

<< 14.3.g: Validation checks should prevent entry of inadmissible relationship combinations, as follows: [...] If concept A has BT Concept B, none of the concepts in the BT hierarchy above Concept B should be admissible as BT, NT or RT of Concept A. >>

Let's instantiate that with "Binary stars" for Concept A and "Multiple stars" for Concept B; lines [1]-[3] show that Concept A has BT Concept B. Then according to [2] or [3], "Stars" is a concept in the BT hierarchy above Concept B, and according to [4] it is also a BT of Concept A. We have a conflict.
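
The check can be sketched in a few lines of Python. This is only my reading of 14.3.g applied to the five paths above: it looks at BT relationships only (the clause also covers NT and RT), it assumes each adjacent pair in a path is a direct BT/NT link, and the function names are my own:

```python
from collections import defaultdict

# Paths [1]-[5] from the example, keyed by their identifiers.
PATHS = {
    1: "Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars",
    2: "Astronomical objects > Stars > Multiple star systems > Multiple stars > Binary stars",
    3: "Astronomical objects > Stars > Multiple stars > Binary stars",
    4: "Astronomical objects > Stars > Binary stars",
    5: "Astronomical objects > Binary systems > Binary stars",
}

def broader_terms(rows):
    """Map each term to the set of its direct broader terms (BT)."""
    bt = defaultdict(set)
    for row in rows:
        terms = [term.strip() for term in row.split(">")]
        for broader, narrower in zip(terms, terms[1:]):
            bt[narrower].add(broader)
    return dict(bt)

def ancestors(term, bt):
    """All concepts in the BT hierarchy above the given term."""
    seen = set()
    stack = list(bt.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(bt.get(current, ()))
    return seen

def find_conflicts(rows):
    """Return (A, B, C) where A has BT B, C lies above B, and C is also a BT of A."""
    bt = broader_terms(rows)
    conflicts = set()
    for a, direct in bt.items():
        for b in direct:
            for c in direct & ancestors(b, bt):
                conflicts.add((a, b, c))
    return conflicts

all_conflicts = find_conflicts(PATHS.values())
for a, b, c in sorted(all_conflicts):
    print(f"{a!r} has BT {b!r}, but {c!r} above it is also a BT of {a!r}")
```

Run over all five paths, this flags the "Binary stars" case spelled out above, plus the same pattern one level up for "Multiple stars" (from [2] together with [3]). Restricting the input to [1], [3] and [5] gives no conflicts.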

Bas Braams