query predicate vocabulary

Hilmar Lapp

Oct 2, 2010, 6:15:04 PM

to TreeBASE devel, PhyloWS Google Group, Nico Cellinese

(Sorry for cross-posting - this does concern both treebase and phylows
groups though)

Rutger et al,

I assume that the TreeBASE terms and predicate vocabulary terms
spreadsheet [1] is the basis for the treebase.owl [2] ontology in the
TreeBASE codebase? If so, I was happy to see the tb. prefix removed
from the property labels in the OWL file. If not, what is the relation
between the two? (BTW neither vocabulary artifact is linked from the
TreeBASE API documentation [3] - shouldn't they be, or at least one of
them, possibly the OWL file?)

One of the outcomes of the Phylogenetics Standards working meeting at
the 2010 TDWG conference [4] is that we need to move forward on a
standard PhyloWS query predicate vocabulary. Obviously, the one
created for TreeBASE would be a template for that, and ultimately
TreeBASE could import that vocabulary, and add its own custom
predicates (much as it now imports Dublin Core and then adds its
own predicates). I think this approach (standard ontology that
individual data providers import into theirs and then add to) would
also allow data providers to add annotations indicating which
predicates they actually support.

Are there concerns or considerations that could make this a bad idea?

Assuming for a moment that it's not a bad idea, here are some initial
thoughts I had when looking at the treebase.owl file as the presumed
starting point.
1) The pattern for constructing the label uses a dot to delimit
"words" (e.g., identifier.tree), whereas normally the pattern I've
seen uses CamelCase (which would yield treeIdentifier). For the
standard vocabulary I'd rather stick with common conventions, so what
were the reasons or examples that motivated the dot pattern?
2) None of the properties seems to have a definition. I think in the
standard one we should aim for all properties to have good
definitions. Do others agree that this is worthwhile, or do you think
this wouldn't gain anything? (There are initial definitions in the
spreadsheet, for example.)
3) The spreadsheet has additional information that looks actually
useful, such as the XPath expression for NeXML. Wouldn't that be worth
retaining too?

Nico - I don't know whether you're subscribed to the phylows group -
if not, you may want to if you want to stay in the loop on this.

-hilmar

[1] https://spreadsheets.google.com/pub?key=0Av8UW3JvZsgcckwtLU83cHloUjhGY25uRzUtb2ZBbHc&hl=en&single=true&gid=0&output=html
[2] http://treebase.svn.sourceforge.net/viewvc/treebase/trunk/treebase-core/src/main/resources/treebase.owl
[3] https://sourceforge.net/apps/mediawiki/treebase/index.php?title=API
[4] http://wiki.tdwg.org/twiki/bin/view/Phylogenetics/WorkingMeeting2010#Further_develop_thePhyloreferenc
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================

Rutger Vos

Oct 4, 2010, 6:24:51 AM

to Hilmar Lapp, TreeBASE devel, PhyloWS Google Group, Nico Cellinese

> 1) The pattern for constructing the label uses a dot to delimit "words"
> (e.g., identifier.tree), whereas normally the pattern I've seen uses
> CamelCase (which would yield treeIdentifier). For the standard vocabulary
> I'd rather stick with common conventions, so what were the reasons or
> examples that motivated the dot pattern?

No reason. If CamelCase is the convention, we should stick with that.
I simply did not know that.
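
For anyone migrating the existing labels, the dot-to-CamelCase conversion could be sketched roughly as below. This is only an illustration: the function name is made up, and the rule that segments are reversed (so identifier.tree becomes treeIdentifier) is inferred from the single example quoted above, not from treebase.owl itself.

```python
def dotted_to_camel(label: str) -> str:
    """Convert a dot-delimited label such as 'identifier.tree' to lowerCamelCase.

    Assumption: segments are reversed so the qualifying word comes first
    ('identifier.tree' -> 'treeIdentifier'), matching the one example given.
    """
    # Drop empty segments produced by stray or doubled dots.
    parts = [p for p in label.split(".") if p]
    if not parts:
        return ""
    parts.reverse()
    head, *tail = parts
    # First word stays lowercase; every following word gets an initial capital.
    return head.lower() + "".join(p[:1].upper() + p[1:] for p in tail)


print(dotted_to_camel("identifier.tree"))  # treeIdentifier
```

Labels with more than two segments may need a hand-checked mapping rather than a mechanical rule.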

> 2) None of the properties seems to have a definition. I think in the
> standard one we should aim for all properties to have good definitions. Do
> others agree that this is worthwhile, or do you think this wouldn't gain
> anything? (There are initial definitions in the spreadsheet, for example.)

I think this would be useful.

> 3) The spreadsheet has additional information that looks actually useful,
> such as the XPath expression for NeXML. Wouldn't that be worth retaining too?

Yes.

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com

Arlin Stoltzfus

Oct 4, 2010, 4:32:43 PM

to TreeBASE devel, PhyloWS Google Group

Are all those references to "identifier" (tree identifier, matrix
identifier) going to be anchored in some way by dcterms:identifier?
And likewise with all those references to "title"?

Terms such as "study" appear in various ontologies (search
http://bioportal.bioontology.org).

Arlin

-------
Arlin Stoltzfus (ar...@umd.edu)
Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
IBBR, 9600 Gudelsky Drive, Rockville, MD
tel: 240 314 6208; web: www.molevol.org

Hilmar Lapp

Oct 4, 2010, 5:02:04 PM

to Arlin Stoltzfus, TreeBASE devel, PhyloWS Google Group

On Oct 4, 2010, at 4:32 PM, Arlin Stoltzfus wrote:

> Are all those references to "identifier" (tree identifier, matrix
> identifier) going to be anchored in some way by
> dcterms:identifier? And likewise with all those references to
> "title"?

Yes. They are already declared as sub-properties of dc:identifier and
dc:title, respectively.

-hilmar
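
To illustrate what the sub-property declarations buy a client, here is a minimal sketch in plain Python (no RDF library). The tb: property names and the sample triples are hypothetical placeholders, not copied from treebase.owl; only the idea that the TreeBASE properties are declared as sub-properties of dc:identifier and dc:title comes from the message above.

```python
# rdfs:subPropertyOf declarations. The tb: names are hypothetical
# placeholders for illustration, not taken from treebase.owl.
SUB_PROPERTY_OF = {
    "tb:treeIdentifier": "dc:identifier",
    "tb:matrixIdentifier": "dc:identifier",
    "tb:studyTitle": "dc:title",
}


def is_subproperty_of(prop, ancestor):
    """Follow subPropertyOf links upward (reflexive and transitive)."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == ancestor:
            return True
        seen.add(prop)  # guards against accidental cycles
        prop = SUB_PROPERTY_OF.get(prop)
    return False


def values_for(triples, ancestor):
    """Objects of every triple whose predicate is `ancestor` or a sub-property of it."""
    return [o for (_s, p, o) in triples if is_subproperty_of(p, ancestor)]


# Made-up sample data.
triples = [
    ("ex:tree1", "tb:treeIdentifier", "Tr1234"),
    ("ex:tree1", "tb:studyTitle", "An example study"),
]
print(values_for(triples, "dc:identifier"))  # ['Tr1234']
```

A client that only knows Dublin Core can still find the identifiers and titles this way, which is the point of anchoring the custom predicates.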

William Piel

Oct 8, 2010, 12:41:43 AM

to PhyloWS Google Group, TreeBASE devel

On Oct 4, 2010, at 6:24 AM, Rutger Vos wrote:

>> 1) The pattern for constructing the label uses a dot to delimit "words"
>> (e.g., identifier.tree), whereas normally the pattern I've seen uses
>> CamelCase (which would yield treeIdentifier). For the standard vocabulary
>> I'd rather stick with common conventions, so what were the reasons or
>> examples that motivated the dot pattern?
>
> No reason. If CamelCase is the convention, we should stick with that.
> I simply did not know that.

Speaking of case... I notice that in this recent Am Nat article:

http://dx.doi.org/10.1086/656486

... the PDF has the TreeBASE URI displayed at the top of the page:

http://www.journals.uchicago.edu/doi/pdf/10.1086/656486

... which is nice, but while the URI is correctly written:

http://purl.org/phylo/treebase/phylows/study/TB2:S10645

... the clickable link actually returns this string in lower case:

http://purl.org/phylo/treebase/phylows/study/tb2:s10645

... which causes an "HTTP Status 400 - Bad ID string" error, i.e., our PhyloWS implementation is case-sensitive when it comes to identifiers.

So I was wondering... is case sensitivity (e.g. the CamelCase discussed) an expectation for Linked Data identifiers and URIs? Or should TreeBASE be modified to accept both S10645 and s10645?

bp

Rutger Vos

Oct 8, 2010, 3:40:32 AM

to William Piel, PhyloWS Google Group, TreeBASE devel

> So I was wondering... is case sensitivity (e.g. the CamelCase discussed) an expectation for Linked Data identifiers and URIs? Or should TreeBASE be modified to accept both S10645 and s10645?
>

The W3C says:

"Two RDF URI references are equal if and only if they compare as
equal, character by character, as Unicode strings."
(http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref)

Allowing case-insensitive URIs in PhyloWS is likely to cause problems
when they occur in RDF serializations, so I am no fan of allowing them.
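
As a small illustration of the rule quoted above, the two URIs from the Am Nat example compare as different under character-by-character equality. The helper name is made up for this sketch.

```python
# The two forms of the study URI that appear in this thread.
canonical = "http://purl.org/phylo/treebase/phylows/study/TB2:S10645"
lowercased = "http://purl.org/phylo/treebase/phylows/study/tb2:s10645"


def same_rdf_uri(a: str, b: str) -> bool:
    """RDF URI reference equality: plain character-by-character comparison."""
    return a == b


print(same_rdf_uri(canonical, canonical))   # True
print(same_rdf_uri(canonical, lowercased))  # False

# In an RDF graph the two spellings would therefore be two distinct nodes.
nodes = {canonical, lowercased}
print(len(nodes))  # 2
```

So even if the server accepted both spellings, statements made about one would not automatically attach to the other in RDF.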

Hilmar Lapp

Oct 8, 2010, 9:27:53 AM

to William Piel, PhyloWS Google Group, TreeBASE devel

On Oct 8, 2010, at 12:41 AM, William Piel wrote:

> So I was wondering... is case sensitivity (e.g. the CamelCase
> discussed) an expectation for Linked Data identifiers and URIs?

Yes. Which doesn't mean that some identifier schemes can't decide to
be case-insensitive. But someone linking by identifier should never
assume that the identifier is case-insensitive.

I'm wondering actually what the rule is for DOIs - Ryan, can you
enlighten us on that?

> Or should TreeBASE be modified to accept both S10645 and s10645?

You may wish to decide that independently of the issue at hand. No
matter what the decision, I would first get in touch with Am Nat (or
Chicago Press) to ask whether they can fix that. I was going to pull
out an article that links to a dataset in Dryad, but it occurs to me that
Dryad DOIs are entirely lowercase, and so it wouldn't show anything.
Though you could take that as a hint to make all TB identifiers
lowercase as well, which would avoid this issue.

-hilmar
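
If TreeBASE did decide to accept both spellings, one possible shape for it is to normalize the incoming identifier to a single canonical form before lookup. The sketch below is hypothetical: the function name, the assumed "TB2:S" plus digits pattern (inferred only from TB2:S10645), and the choice of uppercase as canonical are all assumptions, not the actual TreeBASE code.

```python
import re

# Assumed study ID shape, inferred from the single example TB2:S10645.
_STUDY_ID = re.compile(r"tb2:s([0-9]+)", re.IGNORECASE)


def normalize_study_id(raw: str) -> str:
    """Map any-case input such as 'tb2:s10645' to the canonical 'TB2:S10645'."""
    match = _STUDY_ID.fullmatch(raw.strip())
    if match is None:
        # Mirrors the "Bad ID string" condition for input that does not fit.
        raise ValueError(f"Bad ID string: {raw!r}")
    return f"TB2:S{match.group(1)}"


print(normalize_study_id("tb2:s10645"))  # TB2:S10645
print(normalize_study_id("TB2:S10645"))  # TB2:S10645
```

Redirecting the non-canonical spelling to the canonical URI, rather than serving both, would keep a single URI in circulation.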

Rutger Vos

Oct 8, 2010, 9:31:10 AM

to Hilmar Lapp, William Piel, PhyloWS Google Group, TreeBASE devel

> I'm wondering actually what the rule is for DOIs - Ryan, can you enlighten
> us on that?

I was wondering about that too. It turns out they are, to my surprise,
case-insensitive (or so says Wikipedia...)
