Message from discussion Expressing "part of" relationships between observations
From: "A. Gregory" <arofan.greg...@earthlink.net>
Date: Sat, 31 Dec 2011 07:35:45 -0700
Local: Sat, Dec 31 2011 9:35 am
Subject: RE: [publishing-statistical-data] Expressing "part of" relationships between observations
Dave:

This is a subject which has also been much-discussed within the SDMX
community as a whole. There has been (to my knowledge) at least one
implementation of a system designed to capture these relationships, and I
think more.

What has been done in the past is to attach an attribute to a slice (or
entire data set) containing an equation expressing the relationship between
the various parts.

While the SDMX information model contains a model for how statistical data
is processed, this was never given a standard notation of any sort. I know
that at least one organization has developed a "syntax-neutral" way of
expressing the processing being performed, based on the SDMX model -
essentially a programming language with the capabilities of addressing
specific observations as variables within the equation.

This will probably not help with your geography example - SDMX has never had
a concept of geographical hierarchies, although this is included in the DDI
model, based on the ISO 19115 (etc.) family of standards. I would comment
that we want to have as exact a notation of specific aggregation
relationships as possible - we are dealing with statistics, and they tend to
have quite precise and well-known relationships.

I guess my main concern is that whatever Data Cube does in this area is
aligned with how SDMX is used more broadly, so that it is always possible to
produce valid Data Cube RDF from any SDMX data set. If it would be useful,
as Data Cube moves forward through W3C, I can put you in touch with the guy
who is heading up the SDMX Technical Working Group, so that input could be
more formally collected from within the SDMX community on this subject.

Cheers - and happy New Year!

Arofan

-----Original Message-----
From: publishing-statistical-data@googlegroups.com
[mailto:publishing-statistical-data@googlegroups.com] On Behalf Of Dave Reynolds
Sent: Saturday, December 31, 2011 6:23 AM
To: publishing-statistical-data@googlegroups.com
Subject: Re: [publishing-statistical-data] Expressing "part of"
relationships between observations

The general area of expressing aggregation relationships in Data Cube is
something I put on the proposed work programme for the next phase when last
discussing it with Richard. Hopefully we can get this done under the W3C GLD
task group [1].

I think a specific extension vocabulary is needed for this. There are several
requirements:

(1) Need to be able to specify a relationship to use for hierarchical
dimensions other than the skos relationships. In particular we frequently
want to use geospatial data where there is already a containment relation that
should be reused. [2]

(2) Need to be able to specify when such a relationship gives a disjoint
cover so that aggregation would be meaningful.

(3) There may also be a need for defining a specific type of slice to allow
aggregations to be asserted about data which isn't strictly hierarchical.

(4) We need a general way to express relationships between measures.
E.g. it is common to have a measure expressed as a count and then have a
separate measure to indicate % of that measure against some denominator.
It's possible that the same machinery could/should be used to express
aggregations.
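
To make requirement (2) concrete, here is a minimal Python sketch of what
"disjoint cover" means. It is an illustration only, not part of Data Cube or
SDMX: the function name is_disjoint_cover and the leaf unit codes are
hypothetical, and each area is assumed to be described by the set of
lowest-level units it contains.

```python
from itertools import combinations


def is_disjoint_cover(whole, parts):
    """Return True if the parts are pairwise disjoint and their union is the whole.

    whole: set of leaf units making up the parent area.
    parts: dict mapping each child area name to its set of leaf units.
    """
    # No two child areas may share a leaf unit.
    for a, b in combinations(parts.values(), 2):
        if a & b:
            return False
    # Together the child areas must account for every leaf unit of the parent.
    covered = set().union(*parts.values()) if parts else set()
    return covered == set(whole)


# Hypothetical leaf units, invented for illustration only.
uk_units = {"E1", "E2", "S1", "W1", "N1"}
parts = {
    "England": {"E1", "E2"},
    "Scotland": {"S1"},
    "Wales": {"W1"},
    "NorthernIreland": {"N1"},
}

print(is_disjoint_cover(uk_units, parts))  # True
```

Only when a check like this holds is it meaningful to add the child values
up to the parent value.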

Dave

[1] Though since it is currently unfunded "spare time" effort the timescales
aren't guaranteed :)

[2] Yes, I know that geospatial areas *can* be treated as SKOS concepts but
the skos:narrower/broader relations aren't really appropriate and in any
case the requirement is to reuse the existing relationships which have been
specified and asserted by third parties.

On 27/12/2011 17:42, BillRoberts wrote:
> It often comes up in statistical data that you have some kind of
> 'overall' figure, which is then broken down into parts. Suppose I
> have a set of population observations, expressed with the Data Cube
> vocabulary - something like (in pseudo-turtle)

> ex:obs1
>    sdmx:refArea<UK>;
>    sdmx:refPeriod "2011";
>    ex:population "60" .

> ex:obs2
>    sdmx:refArea<England>;
>    sdmx:refPeriod "2011";
>    ex:population "50" .

> ex:obs3
>    sdmx:refArea<Scotland>;
>    sdmx:refPeriod "2011";
>    ex:population "5" .

> ex:obs4
>    sdmx:refArea<Wales>;
>    sdmx:refPeriod "2011";
>    ex:population "3" .

> ex:obs5
>    sdmx:refArea<NorthernIreland>;
>    sdmx:refPeriod "2011";
>    ex:population "2" .

> What is the best way (in the context of the RDF/Data Cube/SDMX
> approach) to express that the values for the England/Scotland/Wales/
> Northern Ireland ought to add up to the value for the UK and
> constitute a more detailed breakdown of the overall UK figure?

> I might also have population figures for France, Germany, EU27,
> etc...so it's not as simple as just taking a qb:Slice where you fix
> the time period and the measure.

> Suggestions welcome!

> Bill
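
For reference, the check Bill describes can be sketched in plain Python,
outside of RDF. This is an illustration under stated assumptions, not a Data
Cube mechanism: the PART_OF mapping and the function parts_sum_to_whole are
hypothetical, and the values are the ones from the pseudo-turtle above.

```python
# Observations keyed by (refArea, refPeriod), values taken from the example above.
observations = {
    ("UK", "2011"): 60,
    ("England", "2011"): 50,
    ("Scotland", "2011"): 5,
    ("Wales", "2011"): 3,
    ("NorthernIreland", "2011"): 2,
}

# Hypothetical "part of" relation; in practice this would be reused from an
# existing containment relation asserted elsewhere.
PART_OF = {
    "England": "UK",
    "Scotland": "UK",
    "Wales": "UK",
    "NorthernIreland": "UK",
}


def parts_sum_to_whole(observations, part_of, whole, period):
    """Return True if the observations for the parts of `whole` add up to its value."""
    parts_total = sum(
        value
        for (area, obs_period), value in observations.items()
        if obs_period == period and part_of.get(area) == whole
    )
    return parts_total == observations[(whole, period)]


print(parts_sum_to_whole(observations, PART_OF, "UK", "2011"))  # True
```

Observations for areas with no entry in PART_OF (France, Germany, EU27, ...)
are simply ignored by the check, which is the behaviour a fixed qb:Slice over
time period and measure cannot give on its own.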

