Slices with dimension fixed to more than one value

Benedikt Kämpgen

Feb 7, 2012, 6:47:44 PM

to publishing-st...@googlegroups.com

Hello,

Typically, slices of datasets may fix a dimension to more than one value,
e.g., a specific slice with observations that were reported for 2004,
2005, or 2006:

eg:refPeriod
<http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y>
;
OR eg:refPeriod
<http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y>
;
OR eg:refPeriod
<http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y>
;

In the specification I read "[Slices are] not intended to represent
arbitrary selections from the observations but uniform slices through the
cube.", however I am wondering whether it is still possible to represent
such slices in QB.

Best,

Benedikt

--
AIFB, Karlsruhe Institute of Technology (KIT)
Phone: +49 721 608-47946
Email: benedikt...@kit.edu
Web: http://www.aifb.kit.edu/web/Hauptseite/en

Dave Reynolds

Feb 8, 2012, 11:27:18 AM

to publishing-st...@googlegroups.com, Benedikt Kämpgen

Hi Benedikt,

On 07/02/12 23:47, Benedikt Kämpgen wrote:
> Hello,
>
> Typically, slices of datasets may fix a dimension to more than one value,
> e.g., a specific slice with observations that were reported for 2004,
> 2005, or 2006:
>
> eg:refPeriod
> <http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y>
> ;
> OR eg:refPeriod
> <http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y>
> ;
> OR eg:refPeriod
> <http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y>
> ;

I would argue that is simply a subset of the data not a slice in QB terms.

> In the specification I read "[Slices are] not intended to represent
> arbitrary selections from the observations but uniform slices through the
> cube.", however I am wondering whether it is still possible to represent
> such slices in QB.

Possibly, depending on what aspect of this you want to capture and only
by "bending" the rules ...

The point of qb:Slice is to organize the data to (a) indicate preferred
presentation and (b) support abbreviated form. It's supposed to directly
mirror the SDMX support for organizing datasets as Time Series or Cross
Sections through the use of GroupKeys.

So there's no support for qb:SliceKeys which use disjunctions or indeed
any filtering patterns beyond fixed values. That's partly because we
felt such things are better done with the standard tools (e.g. SPARQL
queries) and partly because it would be of no use for abbreviation.
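
To make the distinction concrete, here is a minimal Python sketch, offered only as an illustration and not as anything defined by QB. It models observations as plain dictionaries; the dimension names, values, and the helper functions `select_slice` and `select_subset` are all hypothetical. A slice in the QB sense fixes each keyed dimension to exactly one value, whereas the 2004/2005/2006 selection needs a set of allowed values, which is ordinary query-style filtering.

```python
# Hypothetical in-memory observations; names and numbers are invented
# for illustration and do not come from any real dataset.
observations = [
    {"refArea": "A", "refPeriod": 2003, "value": 10},
    {"refArea": "A", "refPeriod": 2004, "value": 11},
    {"refArea": "A", "refPeriod": 2005, "value": 12},
    {"refArea": "B", "refPeriod": 2004, "value": 20},
    {"refArea": "B", "refPeriod": 2006, "value": 22},
    {"refArea": "B", "refPeriod": 2007, "value": 23},
]


def select_slice(obs, fixed):
    """QB-style slice: each keyed dimension is fixed to exactly one value."""
    return [o for o in obs if all(o[d] == v for d, v in fixed.items())]


def select_subset(obs, allowed):
    """Arbitrary subset: each listed dimension may take any of several values."""
    return [o for o in obs if all(o[d] in vs for d, vs in allowed.items())]


# A uniform slice through the cube: refPeriod fixed to a single value.
slice_2004 = select_slice(observations, {"refPeriod": 2004})

# The disjunctive selection from the question: refPeriod in {2004, 2005, 2006}.
subset_2004_2006 = select_subset(observations, {"refPeriod": {2004, 2005, 2006}})
```

The second selection cannot be written as a single fixed-value key, which is why it behaves like a query result rather than a slice.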

That said, in practice we have found uses for computed data subsets
ourselves (e.g. "latest values") and have been guilty of misusing Slices
for this. Essentially we used an under-constrained SliceKey and created
slices with only a specific subset of observations that match the
SliceKey. That way abbreviations would work (though we don't use them in
that case) but the slices are not a complete cover of the data.

I don't claim that's a good approach but it has worked in practice on
some projects.

It does mean that the precise nature of the slice is defined out of band
and isn't in the DSD. This may or may not be fatal but is certainly not
a Good Thing.
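
As a rough sketch of that workaround (again only an illustration under invented names, not the code used on those projects): the key below fixes just `refArea`, while the rule that picks the "latest value" lives in a hypothetical helper `latest_value_slice`, i.e. outside anything the key itself states. Every member of the slice matches the key, but the slice does not cover all observations that match it.

```python
# Hypothetical observations; names and numbers are invented for illustration.
observations = [
    {"refArea": "A", "refPeriod": 2004, "value": 11},
    {"refArea": "A", "refPeriod": 2005, "value": 12},
    {"refArea": "B", "refPeriod": 2005, "value": 21},
    {"refArea": "B", "refPeriod": 2006, "value": 22},
]


def matching(obs, fixed):
    """All observations whose keyed dimensions equal the fixed values."""
    return [o for o in obs if all(o[d] == v for d, v in fixed.items())]


def latest_value_slice(obs, fixed):
    """Members of the computed slice: only the most recent matching period.

    The 'latest' rule is the out-of-band part; the key does not express it.
    """
    candidates = matching(obs, fixed)
    if not candidates:
        return []
    latest = max(o["refPeriod"] for o in candidates)
    return [o for o in candidates if o["refPeriod"] == latest]


def is_complete_cover(members, obs, fixed):
    """True only if the slice holds every observation matching its key."""
    return len(members) == len(matching(obs, fixed))


key_a = {"refArea": "A"}  # under-constrained: refPeriod is left free
slice_a = latest_value_slice(observations, key_a)
covers_all = is_complete_cover(slice_a, observations, key_a)
```

Here `covers_all` is false for area A, which is the "not a complete cover" situation described above.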

The W3C GLD work should probably examine this and decide whether such
computed slices are allowable and if not whether to provide an
alternative grouping mechanism.

Dave

Benedikt Kämpgen

Feb 9, 2012, 5:44:20 AM

to Dave Reynolds, publishing-st...@googlegroups.com

Hello Dave,

Thanks for your answer.

One other reason to take such arbitrary slices into account would be that
they can be the result of operations on Data Cubes. Operations on Data Cubes
result in new Data Cubes, and as such probably should be possible to be
represented with QB.

I will look into this further. Maybe we could have a specific use case
similar to the one at [1] around this?

Best,

Benedikt

[1]
<http://www.w3.org/2011/gld/wiki/Use_Cases_-_Data_Cube_Vocabulary#Expressing_aggregation_relationships_in_Data_Cube>

--
AIFB, Karlsruhe Institute of Technology (KIT)
Phone: +49 721 608-47946
Email: benedikt...@kit.edu
Web: http://www.aifb.kit.edu/web/Hauptseite/en

> -----Original Message-----
> From: Dave Reynolds [mailto:dave.e....@gmail.com]
> Sent: Wednesday, February 08, 2012 4:27 PM
> To: publishing-st...@googlegroups.com
> Cc: Benedikt Kämpgen
> Subject: Re: [publishing-statistical-data] Slices with dimension fixed to more
> than one value
>
> Hi Benedikt,
>

Dave Reynolds

Feb 9, 2012, 7:22:58 AM

to Benedikt Kämpgen, publishing-st...@googlegroups.com

Hi Benedikt,

On 09/02/12 10:44, Benedikt Kämpgen wrote:

> One other reason to take such arbitrary slices into account would be that
> they can be the result of operations on Data Cubes. Operations on Data Cubes
> result in new Data Cubes, and as such probably should be possible to be
> represented with QB.

Maybe.

I had suggested that representing general relationships between cubes
should be one area to consider in updating Data Cube. Arofan pointed out
that there has been considerable activity on that in the SDMX world and
that coordination would be needed there. I haven't had time to study
the SDMX work but that might mean it would be premature to include that in
this round of Data Cube updates.

> I will look into this further. Maybe we could have a specific use case
> similar to the one at [1] around this?

Capturing as a use case seems reasonable.

Though we will need to triage and prioritise those use cases :)

Dave
