I'm trying to represent some statistics which are essentially counts.
The SDMX cross domain concept "UNIT_MEASURE" is the right property to
use to denote this, and its value must come from a code list. Yet
the sdmx.org site has no code list for UNIT_MEASURE; it seems to be
listed in the "future work" section.
Is there another source of agreed SDMX code lists besides those on
I assume counts are pretty common in existing SDMX usage. Is there a de
facto common practice for the UNIT_MEASURE code list which includes this
I don't know what the common practice in SDMX is.
I looked into the general question of units of measurement though, to
see if there's a good set of URIs for units that we could use straight
away. I'll share what I found below.
The quick summary: As a stopgap measure, we could use the UN/CEFACT
Rec20 codes listed in this Excel file here, as literals:
Or, if we really really want to use URIs straight away, we could use
where the unit name comes from here:
The code for a unit-less count would be "C62" (UN/CEFACT Rec20) or
<http://data.nasa.gov/qudt/owl/unit#Number> (NASA QUDT).
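To make the two options concrete, here is a minimal Python sketch that writes the unit of a count observation either as a Rec20 literal or as a QUDT URI, in N-Triples syntax. The observation and property URIs under example.org are made-up placeholders for illustration only; nothing in SDMX defines them.

```python
# Namespace quoted in this message; these URIs did not resolve at the time of writing.
QUDT_UNIT_NS = "http://data.nasa.gov/qudt/owl/unit#"

# Hypothetical placeholders, not defined by SDMX or anyone else.
OBSERVATION = "http://example.org/observation/1"
UNIT_PROPERTY = "http://example.org/def#unitMeasure"


def unit_literal_triple(rec20_code: str) -> str:
    """Stopgap option: the UN/CEFACT Rec20 code as a plain literal."""
    return f'<{OBSERVATION}> <{UNIT_PROPERTY}> "{rec20_code}" .'


def unit_uri_triple(qudt_local_name: str) -> str:
    """URI option: the QUDT unit name appended to the unit namespace."""
    return f"<{OBSERVATION}> <{UNIT_PROPERTY}> <{QUDT_UNIT_NS}{qudt_local_name}> ."


print(unit_literal_triple("C62"))
print(unit_uri_triple("Number"))
```

Switching from the literal to the URI later would then only change the object of the triple, which is part of why the literal works as a stopgap.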
More gory details pasted below -- I plan to write this up as a blog
The UN/CEFACT Recommendation 20 list of units of measurement looks
great; it's available as an Excel sheet here:
The code for “one; piece; unit” is “C62”. So the codes are not the
Unfortunately, in the absence of a licensing statement, I don't think it's
legally possible to create derivatives (like a SKOS ConceptScheme,
which is the intended RDF representation of codelists).
As a precedent: These codes, as literal values, are used in
GoodRelations, a popular RDF vocabulary for eCommerce. The maintainer
of GoodRelations told me that he thoroughly investigated the area and
settled on this approach.
There is a code for Pint (PTI), but none for Teaspoon.
Another prominent code list is UCUM:
This looks very well thought out. I didn't dig to find out what the
UCUM Organization is and what the license conditions are.
In UCUM it appears that the unit for count is “1” (the “default
unit”), but it's not clear to me whether that's a “real” unit in the
Teaspoon is [tsp_us], a pint is [pt_br].
SI units are of course as standard as one could possibly wish. They
have standard symbols, but no standard codes. Many of the symbols are
outside of the US-ASCII character set and thus cannot easily be typed
or become part of URIs; the symbol for cubic micrometers (µm³) is one
example. There is no symbol for the dimensionless unit (count).
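A small Python sketch of the character-set problem, using the cubic micrometer symbol. Percent-encoding the UTF-8 bytes is shown only as one conceivable workaround; it is an assumption for illustration, not something SI or any of the code lists prescribes.

```python
from urllib.parse import quote

# Cubic micrometers: micro sign (U+00B5), "m", superscript three (U+00B3).
symbol = "\u00b5m\u00b3"

# The symbol cannot be typed on a plain US-ASCII keyboard layout.
is_ascii = symbol.isascii()

# Percent-encoded form of the UTF-8 bytes, as it would have to appear in a URI.
encoded = quote(symbol)

print(is_ascii)  # False
print(encoded)   # %C2%B5m%C2%B3
```

The encoded form is legal in a URI but hardly readable, which is the practical argument for lists with plain ASCII codes.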
Starting with URI sets. There's the NASA QUDT Unit Ontology:
The namespace is <http://data.nasa.gov/qudt/owl/unit#>, commonly
abbreviated as “unit:”, and for counts you'd use unit:Number. This
looks pretty good, but the data.nasa.gov URIs don't resolve and hence
are not exactly linked data friendly. Someone involved in the project
told me on Twitter that they are working on making them resolvable,
but registering a subdomain takes a while at nasa.gov.
A nice thing is that this contains currencies as well (common in
statistical data). Coverage there is limited to currencies that are
still in use, so it has the Euro but doesn't have the German Mark.
And unit:Teaspoon exists, and so does unit:PintImperial!
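The codes mentioned so far can be put side by side in a small lookup table. This is only a sketch built from the handful of units named in this message. The function names are made up, and the correspondence between lists is approximate (UCUM's teaspoon is the US one, for example, and it is unclear whether UCUM's "1" is a real unit).

```python
from typing import List, Optional

# Codes exactly as mentioned in this message; None means no code was found.
UNIT_CODES = {
    "count":    {"rec20": "C62", "ucum": "1",        "qudt": "Number"},
    "pint":     {"rec20": "PTI", "ucum": "[pt_br]",  "qudt": "PintImperial"},
    "teaspoon": {"rec20": None,  "ucum": "[tsp_us]", "qudt": "Teaspoon"},
}


def lookup(unit: str, scheme: str) -> Optional[str]:
    """Return the code for a unit in the given scheme, or None if there is none."""
    return UNIT_CODES[unit][scheme]


def units_without_code(scheme: str) -> List[str]:
    """List the units in the table that have no code in the given scheme."""
    return sorted(u for u, codes in UNIT_CODES.items() if codes[scheme] is None)


print(lookup("count", "rec20"))     # C62
print(units_without_code("rec20"))  # ['teaspoon']
```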
The Open Geospatial Consortium has registered a URN namespace for
units of measure. The W3C's Semantic Sensor Networks Incubator Group
will be using them. Two sub-namespaces are registered, one for SI
units and one for the UCUM code list. I could not find an
authoritative list of the units recognised by the OGC; they appear to
just defer to the authorities for the sub-namespaces. I could not find
information on how to encode special characters that are common in SI
unit symbols and not allowed in URNs. The relevant web pages:
OGC unit URNs look like this:
I couldn't find out how symbols outside of US-ASCII should be handled
in the SI namespace.
SI and UCUM don't really have a code for counts. I found the following
in the OGC's own URN resolver, but according to the OGC's official URN
policy there is no :OGC: sub-namespace:
Altogether, the experience with the OGC URNs strengthens my dislike of
URNs as identifiers. The namespace is underspecified, documentation is
lacking, and management of the namespace seems to be a bit lax for a
There are a few other options, but they all have various shortcomings
(most significantly, no major organisational backing) and I would
consider them inferior to the options above:
Linked Data Technologist • Linked Data Research Centre
Digital Enterprise Research Institute (DERI), NUI Galway, Ireland
On 11 Mar 2010, at 00:41, Richard Cyganiak <richard....@deri.org>