UAT in R as a dataframe

Christian Tzurcanu

Dec 4, 2013, 9:08:45 PM
to uat-...@googlegroups.com
Dear UAT users,

My name is Christian Tzurcanu and I volunteer for an open source project that intends to compile a tree of all fields of science and make it available as an R language dataframe. We would like to include the UAT in this data.
Up to this point we have about 10,000 terms from other fields such as math, computer science, and medicine. Each record/observation has a pointer to its source (e.g. many math fields point to http://www.ams.org/mathscinet/msc/msc2010.html ), and the data will be released under a Creative Commons Attribution-ShareAlike 3.0 license.

When will your data be ready to be shared with the world? (We are eager to include it early.)

Thank you for your time,
Christian Tzurcanu

Katie Frey

Dec 5, 2013, 5:00:03 PM
to uat-...@googlegroups.com
Hi Christian,

Your R language dataframe project sounds very interesting, and we would be happy to be included!  Is it online somewhere that we can look at it?

The UAT is currently available for download in several formats; you can find information on that here:
http://astrothesaurus.org/thesaurus/

It is, however, still in "beta", as we are currently searching for the right management tools and finding volunteers to edit the UAT. I do not have a time frame for when "version 1" might be released, but we will certainly keep everyone on this list informed.

I might be able to provide the UAT in some other formats if you need something specific, but it would depend on what you are looking for.

Best regards,
Katie


--
Katie E. Frey
John G. Wolbach Library
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS-56, Cambridge, MA 02138
kf...@cfa.harvard.edu
617-496-7579

http://astrothesaurus.org
http://www.cfa.harvard.edu/lib/
http://www.adsabs.harvard.edu/

"Surprising what you can dig out of books if you read long enough, isn't it?"
- Rand al'Thor (in Robert Jordan's The Shadow Rising, Book Four of the Wheel of Time)


Christian Tzurcanu

Dec 6, 2013, 6:34:40 PM
to uat-...@googlegroups.com
Hi Katie,

Not online yet. I am currently working on harmonizing other thesauri. They will then be exported together to R: about 22,000 terms, not including the UAT. Everyone in this group will be invited as soon as I have a first version :)
One example of our work, on Terminologia Anatomica (not yet in R, but just so you know the direction):
As you can see: 7 languages, 1 geometry, each level served on demand.

In R there will be a library for managing updates of terms, calculating exact translations, etc.

Christian Tzurcanu

Dec 6, 2013, 9:25:18 PM
to uat-...@googlegroups.com
Hello again,

Since I have seen interest in our R project, I would like to offer a general view of what we want to deliver and why:

The R language is already very good at managing and calculating with quantitative data. Our goal is to make it able to handle qualitative data as well, in the form of controlled vocabularies and common terminology. We want it to support the following functions:

On the whole dataframe:
-export/import to the cloud
-verify integrity
-upgrade to the last accepted definition
(in our case the cloud is a constellation, in a manner of speaking :), of database servers)
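
To make the "verify integrity" item above concrete, here is a minimal sketch. It is written in Python rather than R purely for illustration, and the field names (`term_id`, `label`, `parent_id`, `source`), the sample rows, and the `example.org` URL are all assumptions, not the project's actual schema or data. It checks three things: identifiers are unique, every parent exists, and every record carries a pointer to its source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    term_id: str              # hypothetical unique identifier
    label: str                # preferred label of the term
    parent_id: Optional[str]  # None for a root of the tree
    source: str               # pointer to the vocabulary the term came from


def verify_integrity(terms: list[Term]) -> list[str]:
    """Return a list of problems found; an empty list means no problems."""
    problems = []
    ids = [t.term_id for t in terms]
    known = set(ids)
    if len(known) != len(ids):
        problems.append("duplicate term_id values")
    for t in terms:
        if not t.source:
            problems.append(f"{t.term_id}: missing source pointer")
        if t.parent_id is not None and t.parent_id not in known:
            problems.append(f"{t.term_id}: unknown parent {t.parent_id}")
    return problems


# Made-up sample rows; the third one is deliberately broken.
terms = [
    Term("sci", "Science", None, "https://example.org/vocab"),
    Term("math", "Mathematics", "sci",
         "http://www.ams.org/mathscinet/msc/msc2010.html"),
    Term("astro", "Astronomy", "phys", ""),
]
print(verify_integrity(terms))
# ['astro: missing source pointer', 'astro: unknown parent phys']
```

Returning a list of problems instead of raising on the first one lets a caller report everything that is wrong with an imported table in a single pass.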

On each observation/record:
-translate to a new language
-update/correct term locally and replicate the update into the cloud
-calculate plural, singular, masculine, feminine, likely synonyms, antonyms
-provide a complete history of updates and their authors since the creation of the record
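
The "complete history of updates and their authors" item could be modelled as an append-only list of revisions per record. The sketch below is in Python for illustration only; the class names `Revision` and `TermRecord`, the author names, and the sample labels are hypothetical and do not come from the project.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    label: str      # the term's wording after this update
    author: str     # who made the update
    timestamp: str  # ISO 8601 string supplied by the caller


@dataclass
class TermRecord:
    term_id: str
    history: list[Revision] = field(default_factory=list)

    def update(self, label: str, author: str, timestamp: str) -> None:
        """Append a new revision; earlier revisions are never modified."""
        self.history.append(Revision(label, author, timestamp))

    @property
    def current(self) -> str:
        """Label from the most recent revision (needs at least one update)."""
        return self.history[-1].label

    def authors(self) -> list[str]:
        """Authors in order of first contribution, without repeats."""
        return list(dict.fromkeys(r.author for r in self.history))


record = TermRecord("t1")
record.update("galaxys", "alice", "2013-12-04T21:08:45")  # deliberate misspelling
record.update("galaxies", "bob", "2013-12-06T18:34:40")   # later correction
print(record.current)    # galaxies
print(record.authors())  # ['alice', 'bob']
```

Because revisions are only ever appended, the current value and the full authorship trail both come from the same list, and a correction never erases what it replaced.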

On a text in a known natural language:
-recognize the terms of a chosen controlled vocabulary and mark them with a pointer to the current version of the terms.
-calculate a map of meanings (and other functions of the Semantic Web)
-perform related statistical analysis of those 2 operations
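
As a rough illustration of the first item in this list, recognizing vocabulary terms in a text and attaching a pointer to each, here is a minimal Python sketch. The function name `recognize_terms`, the sample vocabulary, and the identifiers such as `uat:1` are made up for the example. It assumes labels are plain text, matches whole words case-insensitively, and prefers longer labels so that a multi-word term is not split into its parts.

```python
import re


def recognize_terms(text: str, vocabulary: dict[str, str]) -> list[tuple[int, int, str]]:
    """Return (start, end, term_id) for each vocabulary label found in text.

    vocabulary maps a label to its identifier. Longer labels are tried
    first, so "black hole" wins over "hole" at the same position.
    """
    labels = sorted(vocabulary, key=len, reverse=True)
    if not labels:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    lookup = {label.lower(): term_id for label, term_id in vocabulary.items()}
    return [
        (m.start(), m.end(), lookup[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


# Hypothetical vocabulary; the identifiers are placeholders.
vocab = {"black hole": "uat:1", "hole": "x:2", "galaxy": "uat:3"}
text = "A black hole sits at the centre of the galaxy."
hits = recognize_terms(text, vocab)
print(hits)  # [(2, 12, 'uat:1'), (39, 45, 'uat:3')]
```

The character offsets make it possible to mark up the original text afterwards, and counting the returned identifiers would be one starting point for the statistical analysis mentioned in the last item.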

This is not an exhaustive list of interesting features.

Thank you,
Christian Tzurcanu