Re: [uat-users] comments on UAT Beta v2


Katie Frey

Dec 5, 2013, 5:45:46 PM
to uat-...@googlegroups.com
Hi again Christian,

Thank you for your comments.  The UAT strives to follow thesaurus standards with regard to formatting conventions.

You have a good point regarding the use of capitalization.  The UAT that we originally received as the result of a merger between the IoP and AIP vocabularies came with the first letter of every term capitalized, and, personally, I haven't given that much thought.  Reviewing the standard briefly just now, it does say that terms should be predominantly lower case, with capitals reserved for proper names, etc.  This is something we should address moving forward.

As for plural vs. singular forms, the standards indicate that most terms should be plural, but there are, of course, cases where the singular makes the most sense (such as when referring to the planet Earth).  In fact, the standards also outline examples in which the singular and plural versions of a term both exist within a thesaurus and have different meanings!  I'm not sure how often we will come across this issue, but overall the UAT will likely keep most terms in the plural (using "galaxies" instead of "galaxy").

Lastly, I am only a little familiar with the use of masculine, feminine, and neuter nouns in other languages.  I'm not sure we want to make an outright decision on this now; I would rather leave it to those editors who are able to translate once we get to that point.  If/when we establish a multilingual UAT, I think we would use preferred terms that follow the conventions of their language in the most natural way.

Overall, making the UAT machine-readable has always been a goal for our project.  Towards that end, we've been maintaining our files in standard machine-readable formats such as SKOS and RDF, and several institutions, including IoP, AIP, and the SAO/NASA ADS, have expressed a strong interest in using the UAT for indexing articles.
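
For readers who have not seen SKOS before, here is a minimal sketch of how one concept might be written out in Turtle, one of the RDF syntaxes. The `skos:` properties (prefLabel, altLabel, broader) come from the W3C SKOS vocabulary; the base URI, the concept identifiers, and the labels are placeholders chosen for illustration, not actual UAT data or tooling.

```python
# Hypothetical sketch: BASE and the concept IDs are placeholders, not real UAT identifiers.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/uat/"  # assumed placeholder namespace


def concept_to_turtle(concept_id, pref_label, alt_labels=(), broader=None, lang="en"):
    """Serialize one thesaurus concept as a SKOS Turtle snippet.

    Labels are inserted as-is, so this sketch assumes they contain no
    double quotes or backslashes that would need escaping.
    """
    lines = [f"uat:{concept_id} a skos:Concept ;"]
    lines.append(f'    skos:prefLabel "{pref_label}"@{lang} ;')
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@{lang} ;')
    if broader is not None:
        lines.append(f"    skos:broader uat:{broader} ;")
    # A Turtle statement ends with a period, so swap the final semicolon.
    lines[-1] = lines[-1][:-1] + "."
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix uat: <{BASE}> .\n\n"
    return header + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(concept_to_turtle("galaxies", "galaxies",
                            alt_labels=["galaxy"],
                            broader="astronomical_objects"))
```

In this sketch the plural form is the preferred label and the singular is recorded as an alternative label, so an indexing tool could match either.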

Is the synthetic writing tool related to the R language dataframe project you mentioned in a previous email?  I would be interested to hear more about both tools!

Best regards,
Katie

--
Katie E. Frey
John G. Wolbach Library
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS-56, Cambridge, MA 02138
kf...@cfa.harvard.edu
617-496-7579

http://astrothesaurus.org
http://www.cfa.harvard.edu/lib/
http://www.adsabs.harvard.edu/

"Surprising what you can dig out of books if you read long enough, isn’t it?”
- Rand al'Thor (in Robert Jordan's The Shadow Rising, Book Four of the Wheel of Time)


On Wed, Dec 4, 2013 at 11:02 PM, Christian Tzurcanu <christian...@gmail.com> wrote:
Hi all,

I'd second that with:

the following computational advice: where possible, the terms should be:
- lowercase, unless the term is an eponym
- singular in number
- masculine in gender (English does not have this problem, but translations will)

Why? Because you want to make this classification machine-readable. It takes minimal manipulation to make this form human-readable, whereas the reverse is not true.
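
A minimal sketch of that manipulation, assuming a small hand-made set of eponyms and a naive English pluralizer (both are illustrative assumptions, not part of the UAT or of any existing tool):

```python
# Hypothetical illustration: the eponym set and the plural rules are assumptions.
EPONYMS = {"hubble", "doppler", "kuiper"}  # assumed sample of proper names


def pluralize(word):
    """Naive English pluralizer; a real vocabulary would need an exception list."""
    if len(word) > 1 and word.endswith("y") and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    return word + "s"


def display_label(term, plural=True):
    """Turn a lowercase, singular term into a human-readable label."""
    # Restore capitals on known eponyms first.
    words = [w.capitalize() if w in EPONYMS else w for w in term.split()]
    if not words:
        return ""
    if plural:
        words[-1] = pluralize(words[-1])
    # Capitalize the first letter of the label, leaving the rest untouched.
    words[0] = words[0][:1].upper() + words[0][1:]
    return " ".join(words)


if __name__ == "__main__":
    for term in ("galaxy", "radio source", "doppler shift"):
        print(term, "->", display_label(term))
```

Going the other way is harder: given only a capitalized display label, a program cannot tell whether the leading capital marks a proper name or is just formatting.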

Why would you want this to be machine-readable? So that we can use tools like the synthetic writing tool at:

also good for indexing research papers :D

Thanks,
Christian Tzurcanu, volunteer


On Thursday, 5 September 2013 05:15:00 UTC+3, Heinz Andernach wrote:
Hi Katie,

your email with the comments by Bas Braams reminded me that I had scribbled
comments on my printout of the UAT_Beta_v2, which I had downloaded on July 8, 2013,
but never got around to writing them down in an email.  Here they are.

I'm afraid it appears to me that V2 of the UAT has never been
attentively reviewed by a researcher.  I can't claim that I've looked
at all the details; I concentrated on my area (galaxies, radio sources,
and large-scale structure) in addition to a few obvious general topics.
This version of the UAT needs a great deal of overhaul, homogenization,
and straightening out of logical errors.  Homogenization would, e.g.,
avoid words being used sometimes in the singular and sometimes in the plural...
I think the index pages for the major (large-volume) astronomical
textbooks could be of great help in this task.  Looking forward to further
feedback.

Regards,

Heinz Andernach
Depto. de Astronomia, Univ. Guanajuato  tel: +52-473-732-9548 or 732-9607 (ext. 2505)
Apartado Postal 144                     FAX: +52-473-732-0253
Guanajuato, C.P. 36000, GTO, Mexico     Email: he...@astro.ugto.mx

Christian Tzurcanu

Dec 6, 2013, 6:53:32 PM
to uat-...@googlegroups.com
Hi Katie,

I am sorry, I have only just been introduced to SKOS. Until now I was building my own ontology tools. Imagine: I have reached almost the same conclusions as W3C and ISO without looking through their research :) I will have to do the homework of reading their docs sometime.
 
Now I will answer each point:


On Friday, December 6, 2013 12:45:46 AM UTC+2, Katie Frey wrote:
Hi again Christian,

Thank you for your comments.  The UAT strives to follow thesaurus standards with regard to formatting conventions.

You have a good point regarding the use of capitalization.  The UAT that we originally received as the result of a merger between the IoP and AIP vocabularies came with the first letter of every term capitalized, and, personally, I haven't given that much thought.  Reviewing the standard briefly just now, it does say that terms should be predominantly lower case, with capitals reserved for proper names, etc.  This is something we should address moving forward.

As for plural vs. singular forms, the standards indicate that most terms should be plural, but there are, of course, cases where the singular makes the most sense (such as when referring to the planet Earth).  In fact, the standards also outline examples in which the singular and plural versions of a term both exist within a thesaurus and have different meanings!  I'm not sure how often we will come across this issue, but overall the UAT will likely keep most terms in the plural (using "galaxies" instead of "galaxy").
 
Then I will have to revise my general ontology too. What is the source link for this standard?
 
Lastly, I am only a little familiar with the use of masculine, feminine, and neuter nouns in other languages.  I'm not sure we want to make an outright decision on this now; I would rather leave it to those editors who are able to translate once we get to that point.  If/when we establish a multilingual UAT, I think we would use preferred terms that follow the conventions of their language in the most natural way.

Overall, making the UAT machine-readable has always been a goal for our project.  Towards that end, we've been maintaining our files in standard machine-readable formats such as SKOS and RDF, and several institutions, including IoP, AIP, and the SAO/NASA ADS, have expressed a strong interest in using the UAT for indexing articles.

Is the synthetic writing tool related to the R language dataframe project you mentioned in a previous email?  I would be interested to hear more about both tools!

The data that I demoed in the tool is not the latest version. It does not even respect the standards that I have proposed for the UAT. Please forgive me.

Now the answer:
Yes. The same data will be available in R and in the tool. The data will also be replicable as open-source knowledge at the click of a button. If one trustworthy source modifies the R dataframe for a term, all research papers that use the term and were composed with the synthetic tool will be updated. Good idea? :D

We have a number of even better ideas for research in general. But we, the project, lack connections. How can we put these ideas forward for peer review in a form that is more accessible to a wider audience of researchers?


Best regards,
Katie
 
Thank you for your interest,
Christian Tzurcanu