re: validation check of inadmissible relationships in the UAT

Katie Frey
Oct 30, 2013, 4:09:59 PM
to uat-...@googlegroups.com
Bas Braams,

I sent your question about the inadmissible relationships section of ISO 25964 on to Jack Bruce, one of the thesaurus experts at Access Innovations.  He came back with the comments below, and he also suggests a course of action for clearing up some of the terms if that is desired.

----

Section 14.3 of ISO 25964-1 states that "If two terms or concepts already have one of the basic relationships, no other basic relationship between the same terms or concepts is admissible". The validation check must detect whether Term A and Term B contain a proper relationship between one another.

If Concept A (Binary stars) has Concept B (Multiple stars) as a broader term, then Concept B cannot have Concept A as a broader term simultaneously. This would create an infinite recursion within the hierarchy, which is the main concern for the validation check. A recursion would extend endlessly throughout the vocabulary.

(ex.  Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars > Multiple stars > Binary stars > Multiple stars > Binary stars > ... )
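A minimal sketch of such a recursion check in Python, assuming the BT relation is held as a dictionary that maps each concept to the set of its direct broader terms. The function name `find_cycle` and the data layout are illustrative assumptions, not taken from the standard or from any thesaurus software.

```python
def find_cycle(bt):
    """Return one cycle in the BT relation as a list of concepts, or None.

    `bt` maps each concept to the set of its direct broader terms.
    Depth-first search; a concept met again while still on the current
    path closes a cycle.
    """
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {}

    def visit(concept, path):
        state[concept] = ON_PATH
        path.append(concept)
        for broader in bt.get(concept, ()):
            s = state.get(broader, UNSEEN)
            if s == ON_PATH:
                # Slice out the loop and repeat its first concept at the end.
                return path[path.index(broader):] + [broader]
            if s == UNSEEN:
                found = visit(broader, path)
                if found:
                    return found
        path.pop()
        state[concept] = DONE
        return None

    for concept in list(bt):
        if state.get(concept, UNSEEN) == UNSEEN:
            cycle = visit(concept, [])
            if cycle:
                return cycle
    return None


# Hypothetical data: the inadmissible case described above.
recursive = {
    "Binary stars": {"Multiple stars"},
    "Multiple stars": {"Binary stars", "Multiple star systems"},
}
# The same terms without the reciprocal BT.
acyclic = {
    "Binary stars": {"Multiple stars"},
    "Multiple stars": {"Multiple star systems"},
}
print(find_cycle(recursive))
print(find_cycle(acyclic))
```

The sketch is recursive, so a very deep hierarchy would need an iterative version; for a vocabulary a handful of levels deep that should not matter.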

Additionally, if one term subsumes the other, then an associative relationship (RT) would not be appropriate between the two.

The example given in 14.3 appears to be much more strict in terms of functionality and checking for valid relationships. The line "If Concept A has BT Concept B, none of the concepts in the BT hierarchy above Concept B should be admissible as BT, NT or RT of Concept A" suggests that terms in the hierarchy above Multiple stars must not contain Binary stars as a narrower term or related term. It may be impossible to detect whether or not the RTs would be appropriate.

In the UAT, examples 2, 3, and 4 do not create a recursion. Binary stars does not have Multiple stars, Stars, or Binary systems as a Narrower Term. However, for a reviewer of the thesaurus, it may or may not be appropriate for the terms Binary stars and Multiple stars to be nested under Stars. Removing Stars as a Broader Term for Binary stars and Multiple stars would resolve the issue this reviewer is facing. For polyhierarchy, it may be difficult to determine whether or not these terms should be "aunts and uncles" of themselves, but if it is jarring to a reviewer, then I would suggest removing them. If their relationship causes a recursion, then they must be removed immediately.

There are exceptions (as stated by 14.1 General section of ISO 25964-1), but I believe that the important factor for 14.3(g) is to prevent an infinite recursion within the hierarchy. The relationships may be invalid based on the concepts themselves (such as a misplaced term), but I don't think these polyhierarchical terms are of critical importance. It may be easier to make the suggestion stated above and move on to the other sections of the thesaurus.

----

Any thoughts on his response or what action should be taken, if any, from here?

Best regards,
Katie

--
Katie E. Frey
John G. Wolbach Library
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS-56, Cambridge, MA 02138
kf...@cfa.harvard.edu
617-496-7579

http://astrothesaurus.org
http://www.cfa.harvard.edu/lib/
http://www.adsabs.harvard.edu/

"Surprising what you can dig out of books if you read long enough, isn’t it?"
- Rand al'Thor (in Robert Jordan's The Shadow Rising, Book Four of the Wheel of Time)


On Thu, Sep 12, 2013 at 10:53 AM, Bas Braams <bjbr...@gmail.com> wrote:
Hi All,

Do I see in the just posted example a violation of Section 14.3.g of the International Standard ISO 25964-1:2011(E) that governs Thesauri for information retrieval? Let me raise this in case it should be of concern.

The example, with identifiers [1]-[5] added, is:

[1] Astronomical objects > Star systems > Multiple star systems > Multiple stars > Binary stars
[2] Astronomical objects > Stars > Multiple star systems > Multiple stars > Binary stars
[3] Astronomical objects > Stars > Multiple stars > Binary stars
[4] Astronomical objects > Stars > Binary stars
[5] Astronomical objects > Binary systems > Binary stars

It is understood that multiple paths are allowed. A concept can have more than one narrower term (NT) and it can also have more than one broader term (BT). Nothing in the standard prohibits the simultaneous occurrence of [1], any single one of [2]-[4], and [5]. But I believe that no two of [2], [3] and [4] are allowed simultaneously.

<< 14.3.g: Validation checks should prevent entry of inadmissible relationship combinations, as follows: [...] If concept A has BT Concept B, none of the concepts in the BT hierarchy above Concept B should be admissible as BT, NT or RT of Concept A. >>

Let's instantiate that with "Binary Stars" for Concept A and "Multiple Stars" for Concept B; lines [1]-[3] show that Concept A has BT Concept B. Then according to [2] or [3] "Stars" is a concept in the BT hierarchy above concept B and according to [4] it is a BT of Concept A. We have a conflict.
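The BT part of this check can be sketched in Python. This is only an illustration under stated assumptions: the five paths are read as chains of direct BT links, and the helper names `bt_edges`, `ancestors` and `shortcut_conflicts` are hypothetical. The NT and RT parts of 14.3.g are not covered.

```python
from collections import defaultdict

# The five hierarchy paths from the example, broadest concept first.
PATHS = [
    ["Astronomical objects", "Star systems", "Multiple star systems",
     "Multiple stars", "Binary stars"],
    ["Astronomical objects", "Stars", "Multiple star systems",
     "Multiple stars", "Binary stars"],
    ["Astronomical objects", "Stars", "Multiple stars", "Binary stars"],
    ["Astronomical objects", "Stars", "Binary stars"],
    ["Astronomical objects", "Binary systems", "Binary stars"],
]


def bt_edges(paths):
    """Map each concept to the set of its direct broader terms (BT)."""
    bt = defaultdict(set)
    for path in paths:
        for broader, narrower in zip(path, path[1:]):
            bt[narrower].add(broader)
    return bt


def ancestors(bt, concept):
    """All concepts strictly above `concept` in the BT hierarchy."""
    seen, stack = set(), list(bt.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(bt.get(node, ()))
    return seen


def shortcut_conflicts(bt):
    """Triples (A, B, C): A has BT B, C lies above B, and C is also a BT of A."""
    conflicts = set()
    for a, direct in bt.items():
        for b in direct:
            for c in ancestors(bt, b) & direct:
                conflicts.add((a, b, c))
    return conflicts


for a, b, c in sorted(shortcut_conflicts(bt_edges(PATHS))):
    print(f"{a!r} has BT {b!r}, but {c!r} above it is also a BT of {a!r}")
```

On these five paths the sketch reports the conflict described above (Binary stars, Multiple stars, Stars) and a second one of the same kind: Multiple stars has BT Multiple star systems, and Stars, which lies above Multiple star systems, is also a BT of Multiple stars.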

Bas Braams

Bas Braams
Oct 30, 2013, 4:54:27 PM
to uat-...@googlegroups.com
Dear Katie et al.,

I think that the authors of ISO 25964-1 could have been more clear about the rules that govern the occurrence of NT, BT and RT relations, but the rules (including the one in Section 14.3) make sense and are workable.

Let me provide a mathematical view. There is in mathematics the concept of a directed acyclic graph and there is the concept of an order relation. Every directed acyclic graph gives rise to an order relation by the process of "transitive closure". Conversely, for finite sets, every order relation can be represented in a unique way by a minimal directed acyclic graph (which may be called the transitive reduction); the transitive closure of this directed acyclic graph is the original order relation, but if any edge is removed then that property is lost.
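To make the two constructions concrete, here is a small Python sketch. It assumes the graph is given as a set of (u, v) pairs, read here as "u is narrower than v"; the function names are illustrative, and the reduction shown is only valid for acyclic input.

```python
def transitive_closure(edges):
    """Smallest transitive relation containing `edges` (a set of (u, v) pairs)."""
    closure = set(edges)
    while True:
        # Join every pair (a, b), (b, d) into (a, d) until nothing new appears.
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        new -= closure
        if not new:
            return closure
        closure |= new


def transitive_reduction(edges):
    """For an acyclic graph: drop each edge (u, v) implied by a longer path."""
    closure = transitive_closure(edges)
    nodes = {n for edge in edges for n in edge}
    return {
        (u, v)
        for (u, v) in edges
        if not any((u, w) in closure and (w, v) in closure for w in nodes)
    }


# Hypothetical three-term example with one redundant direct edge.
edges = {
    ("Binary stars", "Multiple stars"),
    ("Multiple stars", "Stars"),
    ("Binary stars", "Stars"),
}
reduced = transitive_reduction(edges)
print(sorted(reduced))
print(transitive_closure(reduced) == transitive_closure(edges))
```

In this example the reduction keeps the two edges of the chain and drops the direct edge from Binary stars to Stars, while the closure, and hence the order relation, is unchanged.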

The authors of ISO 25964-1 would have done us all a service if they had stated clearly that the NT relation represents the directed edges in a directed graph with two rules: (1) no cycles, and (2) no shortcuts. The BT relation then consists of the same edges with the direction of the arrow reversed. The "no cycles" rule is the obvious one that we would all expect to be present, but the "no shortcuts" rule might be unexpected. It is the rule stated in Section 14.3.

Of course the language "narrower term" and "broader term" suggests that we have an order relation. If Alice is narrower than Bob and Bob is narrower than Charlie then Alice is narrower than Charlie. So it needs to be hammered into our heads that NT and BT of ISO 25964-1 are not order relations, they are only edges in a directed graph; or, just as well, they represent a transitively reduced order relation.

The transitive closure concept is occasionally used in ISO 25964-1 and related documents. Associated with the NT (or, just as well, BT) relation there is a proper order relation, namely the transitive closure of NT (or of BT). In that transitive closure, concept A is narrower than concept Z if there is a path A, B, C, ..., Z (all distinct) such that A is an NT of B and B is an NT of C, and so on through Z. However, such a path (if it involves at least one intermediate term) does not make A an NT of Z, and the "shortcut" that says that A is an NT of Z is prohibited if there is a longer path.

Once again, there are two rules. The "no cycles" rule is very natural. The "no shortcuts" rule is a matter of choice; I think that the authors of ISO 25964-1 could have chosen to allow shortcuts. However, it is perfectly reasonable to prohibit the shortcuts: they are not needed to define the order relation, they might create more of a mess in applications if one is doing backtracking, and in any case it is easy enough to avoid them.

My recommendation is to adhere to the standard. Respect Section 14.3, no shortcuts. I believe that standard thesaurus management software checks for adherence to this rule. In any case, if it claims adherence to the ISO standard then it should check this rule as well.

Bas