sdmx-measure:obsValue Vs qb:MeasureProperty


Irene P.

Dec 13, 2013, 3:56:13 PM
to publishing-st...@googlegroups.com

Dear all,

I am currently working on a project to convert census data to LOD using the Data Cube Vocabulary. Wherever applicable, SDMX dimensions, attributes, etc. are reused, e.g. sdmx-dimension:sex and sdmx-dimension:age.

My question concerns measures. Across the datasets there is a large number of measures to define, e.g. number of households, resident population, registered population, number of children, various indices, etc.

To capture the measure value, I am not sure whether:

(a) to use sdmx-measure:obsValue along with a custom-defined dimension, e.g. indicatorDimension (a qb:DimensionProperty), and a list of all the measures from which the dimension draws its values (a simple ConceptScheme), or

(b) to define a separate qb:MeasureProperty for each one of these measures.

With (a), qb:MeasureProperty seems to become ‘redundant’, with no real use in this implementation. However, one DSD can apply to various datasets, since there is no need to change the attached measure.

Option (b) looks more ‘complicated’, since a new qb:MeasureProperty needs to be defined for every new measure, rather than simply adding a new member to the list of measure types. Also, a new DSD needs to be defined when the only difference among datasets is the measure type.
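
To make the two options concrete, here is a minimal sketch in Python that renders one observation as Turtle under each design. The ex: prefix and the names ex:refArea, ex:indicator and ex:residentPopulation are hypothetical, and the value 1200 is made up; only qb:Observation and sdmx-measure:obsValue come from the vocabularies discussed here.

```python
# Sketch only: the ex: names and the value 1200 are illustrative assumptions.


def observation_option_a(obs_id: str, area: str, indicator: str, value: int) -> str:
    """Option (a): one generic measure plus an indicator dimension."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    ex:refArea ex:{area} ;\n"
        f"    ex:indicator ex:{indicator} ;\n"
        f"    sdmx-measure:obsValue {value} .\n"
    )


def observation_option_b(obs_id: str, area: str, measure: str, value: int) -> str:
    """Option (b): a dedicated measure property carries the value."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    ex:refArea ex:{area} ;\n"
        f"    ex:{measure} {value} .\n"
    )


if __name__ == "__main__":
    print(observation_option_a("obs1", "area1", "residentPopulation", 1200))
    print(observation_option_b("obs1", "area1", "residentPopulation", 1200))
```

Under (a) a new measure only adds a member to the indicator list; under (b) it adds a new property, and so a change to the DSD.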

Are there any advantages of using qb:MeasureProperty in this case or is it simply a design decision?

Does anyone have any suggestions, or has anyone faced a similar issue?

Thank you.

Irene

Richard Cyganiak

Dec 13, 2013, 5:03:41 PM
to Publishing Statistical Data Group
Irene,

It’s a design decision. I’ve used both (in different projects).

Using an indicator dimension leads to a simpler big-picture structure, but the individual observations look more complicated and are perhaps harder to understand for the user.

Using different measure properties makes the individual observations easier to understand, but makes building applications harder because they need to be designed so that they select the right measure property. It allows declaring a different range on each property, in case your measured values have different datatypes (e.g., some integers and some decimals).
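
As a rough illustration of that trade-off, here is a minimal sketch in Python that models observations as plain dictionaries. All ex: names and the values are hypothetical; the point is only where the application has to make its choice.

```python
# Sketch only: observations are modelled as dicts; ex: names are assumptions.
OBS_VALUE = "sdmx-measure:obsValue"

# Indicator-dimension style: every observation uses the same measure property.
obs_a = {
    "ex:refArea": "ex:area1",
    "ex:indicator": "ex:residentPopulation",
    OBS_VALUE: 1200,
}

# Multiple-measure style: each measure has its own property, so each one
# can carry a different datatype (here an integer and a decimal).
obs_b = {
    "ex:refArea": "ex:area1",
    "ex:residentPopulation": 1200,
    "ex:averageHouseholdSize": 2.7,
}


def read_option_a(obs, indicator):
    """Return the value if the observation is about `indicator`, else None."""
    if obs.get("ex:indicator") == indicator:
        return obs[OBS_VALUE]
    return None


def read_option_b(obs, measure_property):
    """Return the value stored under `measure_property`, or None if absent."""
    return obs.get(measure_property)
```

In the first style the application always reads the same property and filters on a dimension value; in the second it has to be told which property to read.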

I can’t say which option I ultimately prefer. It may not matter all that much.

By the way, my team has helped Ireland’s CSO to publish their 2011 census as linked data:
http://data.cso.ie
http://data.cso.ie/census-2011/page/dataset/households-internet/CTY/C02;broadband

We used multiple measures in this case. We more or less tossed a coin to make the decision...

Best,
Richard

Irene P.

Dec 16, 2013, 7:08:53 AM
to publishing-st...@googlegroups.com

Dear Richard,

Thank you for your comment. Yes, you are right: it may not matter that much which one I choose.

However, let’s assume that I have chosen to use multiple measures.

In that case I have a follow-up question about similar measures, e.g. family members, family members aged less than 15 years, family members aged 65 and over.

For these, I was thinking of either using three separate measures, or using one measure ‘family members’ and capturing the age difference with an ‘age’ attribute attached to the general measure ‘family members’.

I have seen a similar property, concerning children, in the data from Ireland’s CSO that you mentioned:

[a] http://data.cso.ie/census-2011/page/property/children

Here an alternative solution is used: a single measure is used for ‘children’, but with various labels (correct me if I am wrong).

A sample observation using this property is:

[b] http://data.cso.ie/census-2011/page/dataset/children-by-size-of-family/ED/E01001;3

Q1: Since two different labels are used, how does the user know what is actually being measured in this case, i.e. whether the measure corresponds to children (of all ages) or children under 15 years old?

Q2: Isn’t it ambiguous to define the property [a] as both a qb:MeasureProperty and a qb:DimensionProperty? You can tell whether it is used as a dimension or a measure through qb:dimension or qb:measure when it is attached to the DSD. But when the property is used as a measure, what happens with the rdfs:range, which is skos:Concept although the values are actually integers?

Best regards,

Irene

Richard Cyganiak

Dec 16, 2013, 1:18:36 PM
to Publishing Statistical Data Group
Irene,

On 16 Dec 2013, at 12:08, Irene P. <petrou...@gmail.com> wrote:
> However, let’s assume that I have chosen to use multiple measures.
>
> In that case I have a follow-up question about similar measures, e.g. family members, family members aged less than 15 years, family members aged 65 and over.
>
> For these, I was thinking of either using three separate measures, or using one measure ‘family members’ and capturing the age difference with an ‘age’ attribute attached to the general measure ‘family members’.
>
> I have seen a similar property, concerning children, in the data from Ireland’s CSO that you mentioned:
>
> [a] http://data.cso.ie/census-2011/page/property/children
>
> Here an alternative solution is used: a single measure is used for ‘children’, but with various labels (correct me if I am wrong).
>
> A sample observation using this property is:
>
> [b] http://data.cso.ie/census-2011/page/dataset/children-by-size-of-family/ED/E01001;3
>
> Q1: Since two different labels are used, how does the user know what is actually being measured in this case, i.e. whether the measure corresponds to children (of all ages) or children under 15 years old?

It’s a bug in our data. It should be two different properties. Some of the properties were manually named, and that resulted in a clash due to carelessness. We’ll fix it.

> Q2: Isn’t it ambiguous to define the property [a] as both a qb:MeasureProperty and a qb:DimensionProperty?

Yes it is, and it needs to be fixed.

(Well done spotting the one property where we’ve screwed up! ;-)
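
A clash of this kind can be caught mechanically before publishing. Below is a minimal sketch in Python, assuming the property declarations are available as (subject, predicate, object) tuples of prefixed names. The function name and the ex: properties are hypothetical stand-ins, not the actual CSO data or tooling.

```python
from collections import defaultdict

# The three component property classes from the Data Cube vocabulary.
COMPONENT_TYPES = {
    "qb:DimensionProperty",
    "qb:MeasureProperty",
    "qb:AttributeProperty",
}


def find_type_clashes(triples):
    """Return {property: sorted types} for properties with more than one component type."""
    kinds = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and obj in COMPONENT_TYPES:
            kinds[subject].add(obj)
    return {prop: sorted(types) for prop, types in kinds.items() if len(types) > 1}


# Hypothetical declarations: ex:children is typed twice, ex:households once.
example_triples = [
    ("ex:children", "rdf:type", "qb:DimensionProperty"),
    ("ex:children", "rdf:type", "qb:MeasureProperty"),
    ("ex:households", "rdf:type", "qb:MeasureProperty"),
]

if __name__ == "__main__":
    print(find_type_clashes(example_triples))
```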

Best,
Richard
> --
> You received this message because you are subscribed to the Google Groups "Publishing Statistical Data" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to publishing-statisti...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Irene P.

Jan 24, 2014, 6:30:18 AM
to publishing-st...@googlegroups.com
Dear Richard,

Now that you mention the 'clash' problem: I found that in some cases you may wish to use the same concept sometimes as a dimension and sometimes as an attribute or measure.
For example, age groups. In some datasets you may have 'age group' as a dimension, and therefore you define a qb:DimensionProperty; in other cases you may want to attach an age group as an attribute at the dataset level (as the dataset concerns only one specific age group).
In this case you need 'age group' as an attribute, hence you define a qb:AttributeProperty.
Let's also assume that you do not wish to keep this information only in the title/label of the dataset, so that a query for that specific age group also returns the whole dataset.
If you classify all properties under the same pattern, taking the data.cso.ie pattern as an example,

http://data.cso.ie/census-2011/property/{propertyName}

how would you deal with this kind of conflict?

Would it be better to categorize each type of property, meaning

http://data.cso.ie/census-2011/property/dimension/{propertyName}
http://data.cso.ie/census-2011/property/measure/{propertyName}
http://data.cso.ie/census-2011/property/attribute/{propertyName}

to avoid future conflicts (as you may not have all the datasets available from the beginning), rather than checking each time whether the name has already been used for another type of property?

Best,

Irene 




Dave Reynolds

Jan 24, 2014, 7:03:15 AM
to publishing-st...@googlegroups.com
On 24/01/14 11:30, Irene P. wrote:
> Dear Richard,
>
> Now that you mention the 'clash' problem: I found that in some cases
> you may wish to use the same concept sometimes as a dimension and
> sometimes as an attribute or measure.
> For example, age groups. In some datasets you may have 'age group' as a
> dimension, and therefore you define a qb:DimensionProperty; in other
> cases you may want to attach an age group as an attribute at the
> dataset level (as the dataset concerns only one specific age group).
> In this case you need 'age group' as an attribute, hence you define a
> qb:AttributeProperty.
> Let's also assume that you do not wish to keep this information only in
> the title/label of the dataset, so that a query for that specific age
> group also returns the whole dataset.
> If you classify all properties under the same pattern, taking the
> data.cso.ie pattern as an example,
>
> http://data.cso.ie/census-2011/property/{propertyName}
>
> how would you deal with this kind of conflict?
>
> Would it be better to categorize each type of property, meaning
>
> http://data.cso.ie/census-2011/property/dimension/{propertyName}
> http://data.cso.ie/census-2011/property/measure/{propertyName}
> http://data.cso.ie/census-2011/property/attribute/{propertyName}
>
> to avoid future conflicts (as you may not have all the datasets
> available from the beginning), rather than checking each time whether
> the name has already been used for another type of property?

Yes, having a URI pattern that helps you keep these separate is a good
approach; that's the type of pattern we used for the SDMX COG translation.

Note that you can use the qb:concept property to link the
dimension/measure/attribute property to the common underlying concept.

For example, in the COG translation we provide a dimension, a measure
and an attribute for currency, all linked to sdmx-concept:currency.
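
The combination of a typed URI pattern and qb:concept can be sketched roughly as follows. This is an illustration under assumptions, not the code used for the COG translation: BASE is a hypothetical namespace, and the helper names are made up. The qb: classes, qb:concept and sdmx-concept:currency are the terms mentioned in this thread.

```python
# Sketch only: BASE is a hypothetical namespace, not one used in this thread.
BASE = "http://example.org/property"

COMPONENT_CLASSES = {
    "dimension": "qb:DimensionProperty",
    "measure": "qb:MeasureProperty",
    "attribute": "qb:AttributeProperty",
}


def property_uri(kind: str, name: str) -> str:
    """Build {BASE}/{kind}/{name}, rejecting unknown component kinds."""
    if kind not in COMPONENT_CLASSES:
        raise ValueError(f"unknown component kind: {kind}")
    return f"{BASE}/{kind}/{name}"


def declare(kind: str, name: str, concept: str) -> str:
    """Return a Turtle declaration that links the property to its concept."""
    return (
        f"<{property_uri(kind, name)}> a {COMPONENT_CLASSES[kind]} ;\n"
        f"    qb:concept {concept} .\n"
    )


if __name__ == "__main__":
    for kind in COMPONENT_CLASSES:
        print(declare(kind, "currency", "sdmx-concept:currency"))
```

The three properties get distinct URIs even though they share a local name, and the shared concept still lets a query find all of them together.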

Dave

Sarven Capadisli

Jan 24, 2014, 8:59:12 AM
to publishing-st...@googlegroups.com
If this is about designing the "safest" URI pattern approach, one should
consider the following as well:

Are the concepts versioned? If so, the labels alone may not be
sufficient to distinguish one concept (and eventually the property) from
another. This is because each DSD may refer to a specific concept with a
version (within a specific concept scheme with a version). Consequently,
a property URI needs to be unique, e.g.,
property/dimension/cog/2009/currency/2009. Yes, that's fugly.

In Linked SDMX's URI Patterns [1], I went with what I've observed in the
DSDs from some of the statistical agencies at *.270a.info, and weighed that
against the complexity of having an absolutely unique property per DSD.
So, I settled on property:{conceptID}. I'm not completely happy with
that, but it appears to be sufficient - famous last words - so far.

[1] http://csarven.ca/linked-sdmx-data#uri-patterns
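
To show the trade-off between the two patterns, here is a minimal sketch in Python. The reading of the path segments (kind, scheme ID, scheme version, concept ID, concept version) is an assumption made for illustration, as are the function names; neither is taken from the Linked SDMX tooling.

```python
# Sketch only: the segment order is an assumed reading of the example path.


def versioned_property_path(kind, scheme_id, scheme_version, concept_id, concept_version):
    """Fully qualified pattern: unique per concept version within a scheme version."""
    return "/".join(
        ["property", kind, scheme_id, scheme_version, concept_id, concept_version]
    )


def simple_property_name(concept_id):
    """Compact pattern: one property per concept ID, versions ignored."""
    return f"property:{concept_id}"


if __name__ == "__main__":
    print(versioned_property_path("dimension", "cog", "2009", "currency", "2009"))
    print(simple_property_name("currency"))
```

With the compact pattern, two versions of the same concept map to the same property name. That is the simplification being accepted here.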

-Sarven
http://csarven.ca/#i
