NIST Common Platform Enumeration


TK

Mar 1, 2009, 10:56:13 PM
to TopBraid Composer Users
Hello all,
I'm doing some modeling using an emerging standard from MITRE called
CPE (http://cpe.mitre.org/).
In short, they hope to name all applications with a unique URI. I
have a question about using a very complex name in TBC.

The names in CPE are based on URIs, but when placed into RDF, I
believe I am going to have to do a lot of %-escaping. Let me start
with an example.

The CPE name representing the Apache HTTP Server version 1.3.30 is:
cpe:/a:apache:http_server:1.3.30

If I were to create a CPE class in TBC and then go to create an
instance where the name reads 'cpe:/a:apache:http_server:1.3.30', it
would not fly. The only way I can get it into TBC is to escape it like
so:
cpe%3A%2Fa%3Aapache%3Ahttp_server%3A1.3.30

When dealing with complex names, especially RDF subjects, which
must be in the form of an RDF URI Reference, what is the best practice
among TBC users? I'm guessing that there is no other choice but to
%-escape.

Thanks,
--tk



Jeremy Carroll

Mar 1, 2009, 11:36:15 PM
to topbraid-co...@googlegroups.com, ewal...@nist.gov, Jeremy Carroll

I'll let Scott or Holger answer about how best to use this URI in TBC, but I'll comment on the URI itself:


http://cpe.mitre.org/files/cpe-specification_2.1.pdf
[[
A CPE Name is a percent-encoded URI with each name starting with the prefix (the URI
scheme name) “cpe:”. Note that the scheme "cpe:" is not registered as an official URI scheme
with IANA.
]]

Thus cpe: is a scheme name, *not* a namespace prefix.

And thus, when percent-encoding your example, you were over-enthusiastic.
Definitely the first and second escaped characters (the ':' that ends the scheme name and the '/' that follows it) should not be percent-encoded.

cpe:/a%3Aapache%3Ahttp_server%3A1.3.30
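For anyone who wants to reproduce the two encodings, here is a minimal sketch using Python's standard urllib.parse module. The variable names are only illustrative, and treating "cpe:/" as a fixed literal prefix is an assumption based on the example above, not something taken from the CPE specification.

```python
from urllib.parse import quote, unquote

name = "cpe:/a:apache:http_server:1.3.30"

# Escape every reserved character, including the ':' after the scheme
# name and the '/' that follows it (the over-enthusiastic form).
over_encoded = quote(name, safe="")

# Keep the "cpe:/" prefix literal and escape only the remainder.
scheme_prefix = "cpe:/"  # assumption: every name starts with this prefix
partially_encoded = scheme_prefix + quote(name[len(scheme_prefix):], safe="")

print(over_encoded)
print(partially_encoded)
print(unquote(over_encoded) == name)  # decoding recovers the original name
```

The underscore and the dots pass through unchanged because quote() never escapes unreserved characters.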

In fact, pasting the unencoded URI into composer (current developer build), as a URI seems OK, and my reasonably well-honed intuitions about what is and what isn't a legal URI suggest that too.

cpe:/a:apache:http_server:1.3.30

However (and this is feedback that probably should be sent to CPE/NIST, so I'm cc-ing Evan Wallace on the SemWeb side), this particular format, with the version number at the end, is problematic with the semantic web qname convention, since there is no appropriate rightmost NCName. (I haven't checked about the '.'s; I don't think they are NCName start characters, but I'm not totally sure.)

[Evan, feel free to respond on some other list (please CC me); or, assuming you don't have post permission here, I can forward any response to this list.]

This means that you cannot use this URI as a property name in RDF/XML, and using it as a class name may hit bugs in RDF libraries such as Jena.

Thus I strongly recommend that all projects using such URIs exclusively use N3 format.
Given the use of qnames in the TBC UI, Holger may need to advise as to whether that would be sufficient for successful use within TBC.

Further note: while the percent-encoded form does end in an NCName, A1.3.30, the left-hand side,
cpe:/a%3Aapache%3Ahttp_server%3, is not a URI (because of the incomplete % escape at the end) and hence cannot be a namespace name. I expect the namespace split point code in Jena (which I believe TBC is reliant on) gets this example wrong (unless it has been improved since I left the Jena team).
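To make the split point problem concrete, here is a rough sketch of a "longest trailing NCName" split. It is not Jena's actual code: the function names are made up for illustration, and the character classes are an ASCII-only approximation of the XML NCName rules, which also admit many non-ASCII characters.

```python
import re

# ASCII-only approximation of the XML NCName character classes
# (assumption: non-ASCII name characters are ignored here).
NAME_START = re.compile(r"[A-Za-z_]")
NAME_CHAR = re.compile(r"[A-Za-z0-9_.\-]")


def split_point(uri: str) -> int:
    """Index where the longest trailing NCName starts, or len(uri) if none."""
    i = len(uri)
    # Walk left over the trailing run of name characters.
    while i > 0 and NAME_CHAR.fullmatch(uri[i - 1]):
        i -= 1
    # Move right to the first character that is allowed to start an NCName.
    while i < len(uri) and not NAME_START.fullmatch(uri[i]):
        i += 1
    return i


def split_namespace(uri: str) -> tuple[str, str]:
    """Split a URI into (namespace part, local name)."""
    i = split_point(uri)
    return uri[:i], uri[i:]


for uri in (
    "http://example.org/ns#Thing",
    "cpe:/a:apache:http_server:1.3.30",
    "cpe:/a%3Aapache%3Ahttp_server%3A1.3.30",
):
    print(split_namespace(uri))
```

With these rules the unencoded CPE name ends up with an empty local name, and the percent-encoded one splits in the middle of the final %3A escape, which are the two cases described above.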

It is fairly hard to predict how several of the layers of software will react to these URIs.

One workaround might be to always use a blank node for these things, and then a datatype property with range xsd:anyURI to link to the cpe URI. This prevents any suggestion that the namespace split point algorithm should or could be used for these URIs, at the cost of distorting your modeling.
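As a rough illustration of that workaround, the sketch below writes the two triples out as N-Triples using plain string formatting. The http://example.org/cpe# namespace, the Platform class, the cpeName property and the function name are all hypothetical placeholders, and the code assumes the CPE name contains no quotes or backslashes that would need escaping in a literal.

```python
XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/cpe#"  # hypothetical namespace for illustration


def platform_triples(label: str, cpe_name: str) -> list[str]:
    """Describe one platform as a blank node that carries its CPE name
    as an xsd:anyURI literal rather than as the node's own URI."""
    node = f"_:{label}"
    return [
        f"{node} <{RDF_TYPE}> <{EX}Platform> .",
        f'{node} <{EX}cpeName> "{cpe_name}"^^<{XSD_ANYURI}> .',
    ]


for line in platform_triples("b0", "cpe:/a:apache:http_server:1.3.30"):
    print(line)
```

Because the CPE name only ever appears inside a literal, no tool has any reason to look for a qname split point in it.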

Jeremy

tk blast

Mar 2, 2009, 12:35:18 AM
to topbraid-co...@googlegroups.com
Jeremy, thanks for the education. 
If there is a more appropriate forum for this discussion, please direct me.
--tk
--

"The nervous system organizes itself so as to compute a stable reality" - Maturana & Varela

Holger Knublauch

Mar 2, 2009, 1:17:39 PM
to topbraid-co...@googlegroups.com, ewal...@nist.gov

On Mar 1, 2009, at 8:36 PM, Jeremy Carroll wrote:

>
>
> I'll let Scott or Holger answer about how best to use this URI in
> TBC, but I'll comment on the URI itself:

Thanks Jeremy for the technical background. I just noticed that TBC
did not allow entering URIs without a matching prefix in the create
class/property/instance dialogs. I have generalized this for the new
beta, so that arbitrary URIs can be directly entered between <...>
brackets. This just as an aside.

My understanding is that the URIs suggested by TK will not be used for
property names, and as such should be OK for tools like TopBraid's
RDF/XML exporter (which indeed uses Jena's split algorithm). So as long as
you are happy with reading URIs like
<cpe:/a%3Aapache%3Ahttp_server%3A1.3.30> on your screen, it's
technically possible.

The next beta will (finally) have a feature that allows you to switch
the whole display between qnames/URIs and human-readable labels
(rdfs:label and sub-properties thereof). Then, a good practice will be
to have the URIs in whatever (ugly?) form you want, but at least have
human-readable strings such as "Apache HTTP Server version 1.3.30" as
rdfs:labels of these URI resources.

Beta 2 is scheduled for early next week; we are planning to freeze
code for testing tomorrow.

Regards,
Holger
