I would use hierarchical code lists to express this. Given that all
code lists in Data Cube are skos:ConceptSchemes, you can express
hierarchy with SKOS (e.g., skos:narrower, skos:broader). In your case,
you would have:
<UK> skos:narrower <England>, <Scotland>, <Wales>, <NorthernIreland> .
Best,
Jindrich
--
Jindrich Mynarz
@jindrichmynarz
<http://keg.vse.cz/resource/person/jindrich-mynarz>
The simple answer is to use broader skos:Concepts (from the code lists
coding the Data Cube dimensions) for the aggregated observations.
I.e., for your and Bill's combined example, it would be something
like:
<ukPopulation2009Observation> ex:geoArea <UK> .
where ...
<UK> skos:narrower <England>, <Scotland>, <Wales>, <NorthernIreland> .
In this way, the measures are comparable through the code list
(skos:ConceptScheme).
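As a rough illustration of how a consumer could exploit that hierarchy, the sketch below models the skos:narrower links as a plain Python dictionary and checks whether an aggregated observation equals the sum of its narrower areas. The population figures are invented placeholders, and the names NARROWER, observations, sum_of_narrower and is_consistent are hypothetical; none of them come from Data Cube or SKOS.

```python
# skos:narrower links from the code list, modelled as a plain dict.
NARROWER = {
    "UK": ["England", "Scotland", "Wales", "NorthernIreland"],
}

# Observation values keyed by the ex:geoArea concept.
# Placeholder numbers for illustration only, not real statistics.
observations = {
    "UK": 100,
    "England": 70,
    "Scotland": 15,
    "Wales": 10,
    "NorthernIreland": 5,
}


def sum_of_narrower(area, narrower, values):
    """Sum the values observed for the direct narrower concepts of `area`."""
    return sum(values[child] for child in narrower.get(area, []))


def is_consistent(area, narrower, values):
    """True if the aggregate value for `area` equals the sum of its parts."""
    return values[area] == sum_of_narrower(area, narrower, values)


print(is_consistent("UK", NARROWER, observations))
```

Note that this check is only meaningful if the narrower concepts do not overlap and together make up the whole of the broader one; SKOS itself does not guarantee that.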
Best,
Jindrich
2011/12/30 Keith Alexander <k.j.w.a...@gmail.com>:
I think a specific extension vocabulary is needed for this. There are
several requirements:
(1) Need to be able to specify a relationship to use for hierarchical
dimensions other than the skos relationships. In particular we
frequently want to use geospatial data where there is already a containment
relation that should be reused. [2]
(2) Need to be able to specify when such a relationship gives a disjoint
cover so that aggregation would be meaningful.
(3) There may also be a need for defining a specific type of slice to
allow aggregations to be asserted about data which isn't strictly hierarchical.
(4) We need a general way to express relationships between measures.
E.g. it is common to have a measure expressed as a count and then have a
separate measure to indicate the % of that measure against some denominator.
It's possible that the same machinery could/should be used to express
aggregations.
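To make requirement (2) concrete, here is a minimal sketch of what "disjoint cover" could mean operationally. It assumes each area can be described as a set of atomic units (hypothetical labels a..e below), which the thread does not specify; the function name is_disjoint_cover and the extent mapping are likewise made up for illustration.

```python
def is_disjoint_cover(parent, children, extent):
    """Check that the children partition the parent.

    `extent` maps each area to the set of atomic units it contains.
    """
    seen = set()
    for child in children:
        units = extent[child]
        if seen & units:           # overlap: some unit would be counted twice
            return False
        seen |= units
    return seen == extent[parent]  # cover: nothing missing, nothing extra


# Hypothetical atomic units (e.g. small districts), labelled a..e.
extent = {
    "UK": {"a", "b", "c", "d", "e"},
    "England": {"a", "b"},
    "Scotland": {"c"},
    "Wales": {"d"},
    "NorthernIreland": {"e"},
    "GreatBritain": {"a", "b", "c", "d"},
}

parts = ["England", "Scotland", "Wales", "NorthernIreland"]
print(is_disjoint_cover("UK", parts, extent))                        # True
print(is_disjoint_cover("UK", ["GreatBritain", "England"], extent))  # False
```

The second call fails because England lies inside Great Britain, so summing over those two areas would double count. A vocabulary term asserting the disjoint-cover property would let publishers state this up front rather than have consumers derive it.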
Dave
[1] Though since it is currently unfunded "spare time" effort the
timescales aren't guaranteed :)
[2] Yes, I know that geospatial areas *can* be treated as SKOS concepts
but the skos:narrower/broader relations aren't really appropriate
and in any case the requirement is to reuse the existing relationships
which have been specified and asserted by third parties.
This is a subject which has also been much-discussed within the SDMX
community as a whole. There has been (to my knowledge) at least one
implementation of a system designed to capture these relationships, and I
think more.
What has been done in the past is to attach an attribute to a slice (or
entire data set) containing an equation expressing the relationship between
the various parts.
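A minimal sketch of that idea follows, assuming a sum-only equation stored as a string attribute on the slice. The equation syntax, the attribute name "equation" and the function check_sum_equation are invented here for illustration and are not SDMX notation; the values are placeholders, not real statistics.

```python
def check_sum_equation(equation, values):
    """Check a sum-only equation such as 'T = A + B' against observed values."""
    left, right = equation.split("=")
    total = values[left.strip()]
    parts = [values[name.strip()] for name in right.split("+")]
    return total == sum(parts)


# A slice carrying its aggregation rule as an attribute (hypothetical layout).
population_slice = {
    "equation": "UK = England + Scotland + Wales + NorthernIreland",
    "values": {
        "UK": 100,
        "England": 70,
        "Scotland": 15,
        "Wales": 10,
        "NorthernIreland": 5,
    },
}

print(check_sum_equation(population_slice["equation"], population_slice["values"]))
```

Restricting the notation to plain sums keeps the parser trivial; anything richer (weights, ratios, conditions) quickly turns into a small expression language.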
While the SDMX information model contains a model for how statistical data
is processed, this was never given a standard notation of any sort. I know
that at least one organization has developed a "syntax-neutral" way of
expressing the processing being performed, based on the SDMX model -
essentially a programming language with the capabilities of addressing
specific observations as variables within the equation.
This will probably not help with your geography example - SDMX has never had
a concept of geographical hierarchies, although this is included in the DDI
model, based on the ISO 19115 (etc.) family of standards. I would comment
that we want to have as exact a notation of specific aggregation
relationships as possible - we are dealing with statistics, and they tend to
have quite precise and well-known relationships.
I guess my main concern is that whatever Data Cube does in this area is
aligned with how SDMX is used more broadly, so that it is always possible to
produce valid Data Cube RDF from any SDMX data set. If it would be useful,
as Data Cube moves forward through W3C, I can put you in touch with the guy
who is heading up the SDMX Technical Working Group, so that input could be
more formally collected from within the SDMX community on this subject.
Cheers - and happy New Year!
Arofan
All good input, thanks.
The notion of a complete embedded programming language would be beyond
what I had in mind - *definitely* wouldn't want to create yet another
rule language in W3C!
I agree we want to keep Data Cube as aligned with SDMX as we can. Though
I do want to make sure this doesn't turn into too ambitious an exercise
otherwise it won't get done.
Cheers,
Dave