aliases in hierarchies


Bob Jolliffe

Mar 8, 2010, 10:09:41 AM
to sdm...@googlegroups.com
Putting this question out there for any sdmx experts - can anyone
explain what is the value of using aliases in hierarchies? The
coderefs in a hierarchy are obliged to refer to aliases of codelists.
That is not difficult to do but I am at a loss as to what value is
added by this level of indirection. I wonder why it wasn't designed
so as to allow coderefs to simply refer to codelists directly.
Obviously there was some good intention here, I just can't see it.

How it arises - in sdmx-hd examples there has been a tendency to
always give codelist CL_XXX an alias of AL_XXX. So for example we
have AL_INDICATOR, AL_TIME_PERIOD etc representing CL_INDICATOR,
CL_TIME_PERIOD ... But of course it doesn't have to be like this. I
recently came across an example where CL_GEOGRAPHIC_PLACE_NAME was
given an alias of AL_GEO. Now that should be ok. In fact it's even
something of a (weak) justification for having aliases. Gives rise to
a question for ourselves:

Should we enforce the naming convention between CL_XXX and AL_XXX or
should we leave it open to call your alias whatever you like? I'm
kind of neutral on this, particularly given that I don't see the
justification for aliases anyway other than the fact that sdmx makes
us use them. Having them use the same naming convention is easier
to validate.

Any preferences?

Regards
Bob

Gary Patchen

Mar 8, 2010, 10:47:09 AM
to sdm...@googlegroups.com
Hi Bob,

This is one of the reasons, although there may be others:

When you specify the alias you map the code-list ID, version and AgencyID to one alias. If you didn't specify an alias in the hierarchy, each hierarchical node would need to specify the code-list ID, version and AgencyID itself.
So specifying an alias greatly reduces the size of the hierarchical XML you would build. At the moment the hierarchies we have in COMMON all have the same AgencyID (SDMX-HD) and the same version (1.0), so the alias does seem a little redundant. However, later, when we have more than one version, its benefit will no doubt be clearer.
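
The indirection described above can be sketched in a few lines of Python. This is only an illustration, not SDMX tooling: the `CodelistSignature` type, the `ALIASES` table and the sample codes (`IND_1`, `2010`) are hypothetical names made up for the example. The point is that the full (ID, agency, version) triple is written once per alias, and every node carries only the short alias.

```python
from typing import NamedTuple


class CodelistSignature(NamedTuple):
    """Full identity of a code list: ID, agency and version."""
    codelist_id: str
    agency_id: str
    version: str


# Hypothetical alias table: each alias is declared once per hierarchy.
ALIASES = {
    "AL_INDICATOR": CodelistSignature("CL_INDICATOR", "SDMX-HD", "1.0"),
    "AL_TIME_PERIOD": CodelistSignature("CL_TIME_PERIOD", "SDMX-HD", "1.0"),
}

# Hierarchy nodes only carry the alias plus the code (sample codes are invented).
nodes = [("AL_INDICATOR", "IND_1"), ("AL_TIME_PERIOD", "2010")]


def resolve(alias: str) -> CodelistSignature:
    """Dereference an alias to the code list it stands for."""
    return ALIASES[alias]


resolved = [(resolve(alias), code) for alias, code in nodes]
```

If a second version of a code list appears later, only the alias table grows; the nodes themselves stay as short as before.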

regards

gary patchen
lead consultant

direct + 41.(0)58.307.7094
mobile +41.(0)79.333.1339
gary.p...@b-i.com
www.b-i.com

blue-infinity headquarters
t +41(0)58.307.7000
f +41(0)58.307.7001
INTERNATIONAL
t +800.307.70.000
f +800.307.70.001

b-i  branding.technology.integration.

The information in this e-mail, and those ensuing, is confidential and may be legally privileged. It is intended solely for the addressee. If you are not the intended recipient, please destroy this message and notify us immediately.
 
Please think of the environment before printing this email

--
You received this message because you are subscribed to the Google Groups "SDMX-HD (Health Domain)" group.
To post to this group, send email to sdm...@googlegroups.com.
To unsubscribe from this group, send email to sdmx_hd+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sdmx_hd?hl=en.

Bob Jolliffe

Mar 8, 2010, 10:56:49 AM
to sdm...@googlegroups.com
On 8 March 2010 15:47, Gary Patchen <gary.p...@b-i.com> wrote:
> Hi Bob,
>
> This is one of the reasons, although there may be others:
>
> When you specify the alias you map the code-list ID, version and AgencyID to one alias. If you didn't specify an alias in the hierarchy, each hierarchical node would need to specify the code-list ID, version and AgencyID itself.
> So specifying an alias greatly reduces the size of the hierarchical XML you would build. At the moment the hierarchies we have in COMMON all have the same AgencyID (SDMX-HD) and the same version (1.0), so the alias does seem a little redundant. However, later, when we have more than one version, its benefit will no doubt be clearer.

Thanks Gary. That makes good sense.

Any thoughts on the AL_GEO thing? It hit me a bit as I was previously
doing a fairly naive test for validity of node types in a hierarchy.
For example (HY_TIME_PERIOD), instead of testing

st:CodelistAliasRef='AL_TIME_PERIOD'

which is what I was doing, I actually should be testing:

key('volatile_aliases',st:CodelistAliasRef)/st:CodelistID='CL_TIME_PERIOD'

It's not important that a node is AL_TIME_PERIOD. It could just as
well be AL_BLUE_FISH. What is important is that when you
"dereference" the alias you get to CL_TIME_PERIOD. I'm changing all
those tests now :-(
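
The difference between the two tests can be sketched in Python. This is a rough illustration only: the `ALIASES` dictionary and both function names are hypothetical, and the dictionary stands in for the alias lookup that the `key()` call performs in the XPath version.

```python
# Hypothetical alias table: alias name -> code list ID it dereferences to.
ALIASES = {
    "AL_TIME_PERIOD": "CL_TIME_PERIOD",
    "AL_BLUE_FISH": "CL_TIME_PERIOD",
    "AL_GEO": "CL_GEOGRAPHIC_PLACE_NAME",
}


def naive_is_time_period(alias_ref: str) -> bool:
    """The naive test: compare the alias *name* against a convention."""
    return alias_ref == "AL_TIME_PERIOD"


def is_time_period(alias_ref: str, aliases: dict) -> bool:
    """The corrected test: dereference the alias, then check the code list."""
    return aliases.get(alias_ref) == "CL_TIME_PERIOD"
```

With this table, the naive test rejects `AL_BLUE_FISH` even though it points at `CL_TIME_PERIOD`, while the dereferencing test accepts it and still rejects `AL_GEO`.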

Bob

Gary Patchen

Mar 8, 2010, 11:16:07 AM
to sdm...@googlegroups.com
Hi Bob,

Unfortunately I don't think we can impose an alias name being the code-list ID with CL_ replaced by AL_ for this reason:

If we have :

CL_LIST v1.0, Agency ABC
CL_LIST v1.1, Agency ABC

and we now wish to create one alias that uses version 1.0 and another that uses 1.1, we must create the aliases with different names:

CL_LIST v1.0, Agency ABC = AL_LIST
CL_LIST v1.1, Agency ABC = AL_LIST_1

The only rule we could state is that the alias uses the name of the code-list (with CL_ replaced by AL_), with a suffix _1, _2, _3, ... if multiple versions exist.
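
A small Python sketch of why the plain CL_ to AL_ convention collides, and of the suffix rule as a workaround. The helper names `conventional_alias` and `assign_aliases` are hypothetical, and the suffix scheme is just one way of reading the rule above, not anything the SDMX specification defines.

```python
def conventional_alias(codelist_id: str) -> str:
    """Derive AL_XXX from CL_XXX (the convention used in most examples)."""
    if not codelist_id.startswith("CL_"):
        raise ValueError(f"unexpected code list ID: {codelist_id}")
    return "AL_" + codelist_id[len("CL_"):]


def assign_aliases(codelists):
    """Give each (id, agency, version) a unique alias, suffixing repeats."""
    aliases = {}
    seen = {}  # base alias name -> how many times it has been used so far
    for signature in codelists:
        base = conventional_alias(signature[0])
        count = seen.get(base, 0)
        name = base if count == 0 else f"{base}_{count}"
        aliases[name] = signature
        seen[base] = count + 1
    return aliases


codelists = [("CL_LIST", "ABC", "1.0"), ("CL_LIST", "ABC", "1.1")]

# Both versions map to the same conventional name, so the names collide.
names = [conventional_alias(cid) for cid, _, _ in codelists]
collision = len(set(names)) < len(names)

aliases = assign_aliases(codelists)
```

Here `aliases` ends up with `AL_LIST` for version 1.0 and `AL_LIST_1` for version 1.1, matching the example above. Any validation that relies on the alias name alone would still need to dereference it to learn which version is meant.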

Bob Jolliffe

Mar 8, 2010, 11:25:26 AM
to sdm...@googlegroups.com
On 8 March 2010 16:16, Gary Patchen <gary.p...@b-i.com> wrote:
> Hi Bob,
>
> Unfortunately I don't think we can impose an alias name being the code-list ID with CL_ replaced by AL_ for this reason:

Yes this is the correct conclusion. The problem comes from trying to
interpret a spec by example where 99% of the examples do one thing -
the 1% sooner or later jumps up and bites you.

I've fixed all my assertions now - it doesn't matter what you call the
alias. As indeed it shouldn't.

Cheers
Bob

Bob Jolliffe

Mar 9, 2010, 6:02:45 AM
to Gary Patchen, sdm...@googlegroups.com
Hi Gary

On 9 March 2010 10:13, Gary Patchen <gary.p...@b-i.com> wrote:
> Hi Bob,
>

> Can you clarify the situation:
>
>        Are you planning to create one Word document with the sch2html output in the document, or are you just intending to publish the sch2html output as an online HTML file?
>

Currently I am just specifying and documenting and producing the html
version automagically. Thinking realistically, the first draft will
be in html form. Once we are substantially agreed on that we can
create a pdf or doc formatted version.

>
>
> Finally, here is some feedback on the Schematron Rules:
>
> 1. "It is an error if a CodelistAliasRef refers to an alias of a codelist which is not used as a Dimension in the KeyFamily"
>
>        This is wrong as you may have metadata structures that do not relate to dimensions.

OK. Loose text. What it should say is "It is an error if a
CodelistAliasRef which appears below the value type node refers to an
alias of a codelist which is not used as a Dimension in the
KeyFamily". I've just committed the fix.

Anyway I'm not sure if it shouldn't be the other way - ie a test on
the dimensions in the keyfamily to ensure that they appear in the
hierarchy.

But you worry me a bit with talk of metadata structure. My
understanding is that the disaggregation hierarchy is used to specify
disaggregation. What other metadata structures might you have?


>
>
> 2. "A DSD may specify a geographical place name dimension which takes its values from ISO/3166-2 codes in the code list with id=CL_GEOGRAPHIC_PLACE_NAME. This hierarchy provides producers with a mechanism to structure the place name codes."
>
>        Geographical place names vary immensely and are not solely based on ISO/3166-2, i.e. I can create a geographical place called "my desk" if I want.

I do agree. But I have taken this from your text describing
CL_GEOGRAPHIC_PLACE_NAME. I'm happy for you to suggest different
words but it must also be consistent. (and don't move your desk :-)


>
> 3. There doesn't seem to be a rule that checks that the aliases used in the hierarchies were actually specified as code-lists. Often we have cases where a code-list is loaded with AgencyID = "SDMX-HD" yet specified as AgencyID="ABC" in the hierarchy, which is an error.

True. I'll add that.

>
> 4. As HY_INDICATOR_DISAGGREGATION is mandatory then CL_PROGRAM,CL_ISET,CL_INDICATOR,CL_DISAGG should also be mandatory code-lists in the DSD.

Good point. I'll specify that. Though I have doubts about CL_DISAGG.
CL_DISAGG is currently an optional hack for you to do "partial
configuration". It would only need to be in the DSD if it is referred
to. In the same context as 2 above we need to specify that all
codelistRefs in all hierarchies must have been declared with the
correct signature (id, agencyid and version).

>
> 5. Each Code-list=>code used in a hierarchy should actually exist in the code-list

Good point. I'll specify that.
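
Point 5 could be checked along these lines. This is a sketch over a hypothetical in-memory model (the `CODE_LISTS` and `ALIASES` tables, the `missing_codes` helper and the sample codes are all invented for illustration); the real rule would be written against the structure message itself.

```python
# Hypothetical model: code list signature -> set of codes it defines.
CODE_LISTS = {
    ("CL_INDICATOR", "SDMX-HD", "1.0"): {"IND_1", "IND_2"},
}

# Hypothetical alias table: alias -> (id, agency, version).
ALIASES = {"AL_INDICATOR": ("CL_INDICATOR", "SDMX-HD", "1.0")}


def missing_codes(nodes, aliases, code_lists):
    """Return the (alias, code) pairs whose code is not in the aliased list."""
    errors = []
    for alias, code in nodes:
        signature = aliases.get(alias)  # None if the alias is undeclared
        codes = code_lists.get(signature, set())
        if code not in codes:
            errors.append((alias, code))
    return errors


nodes = [
    ("AL_INDICATOR", "IND_1"),  # fine
    ("AL_INDICATOR", "IND_9"),  # code not in the code list
    ("AL_UNKNOWN", "X"),        # alias never declared
]
errors = missing_codes(nodes, ALIASES, CODE_LISTS)
```

An undeclared alias is reported the same way as a missing code here; a real rule might want to distinguish the two cases in its messages.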

Thanks for the feedback.

Bob

>
> Will continue later...


>
>
> regards
>
>
> -----Original Message-----

> From: Bob Jolliffe [mailto:bobjo...@gmail.com]
> Sent: mardi 9 mars 2010 09:15
> To: Gary Patchen
> Cc: Xavier Bocken
> Subject: SUSPECT: Re: SUSPECT: Re: SUSPECT: Re: SUSPECT: aliases in hierarchies
>
> You have the latest schema.  if you open it in your browser you will
> see how the normative text will appear.
>
> Regards
> Bob.


>
> On 9 March 2010 07:56, Gary Patchen <gary.p...@b-i.com> wrote:
>> Hi Bob,
>>

>> Can you send me the latest normative document as I would like to have a full and final read before we publish it to everyone.
>> On that very note, are you aiming for the Wednesday meeting (tomorrow)?


>>
>> regards
>>
>> -----Original Message-----

>> From: Bob Jolliffe [mailto:bobjo...@gmail.com]
>> Sent: lundi 8 mars 2010 18:06
>> To: Gary Patchen
>> Subject: SUSPECT: Re: SUSPECT: Re: SUSPECT: aliases in hierarchies
>>
>> Committed to svn - thanks.  You can keep yourself up to date with 'svn
>> co http://svn.sdmx-hd.org/sdmxhd/schematron-sdmx-hd'  and 'svn up' as
>> usual.  Any chance of getting our evolving data dictionary in there?
>> Will help a lot to track changes.
>>
>> Cheers
>> Bob

Bob Jolliffe

Mar 9, 2010, 7:40:21 AM
to sdm...@googlegroups.com

<sch:rule context="msg:HierarchicalCodelists/st:HierarchicalCodelist/st:CodelistRef"
          role="Codelist references">
  <sch:let name="id" value="st:CodelistID"/>
  <sch:let name="agency" value="st:AgencyID"/>
  <sch:let name="version" value="st:Version"/>
  <sch:assert
      test="count(/msg:Structure/msg:CodeLists/st:CodeList[@id=$id and @agencyID=$agency and @version=$version])=1">It is an error if a CodelistRef used in a hierarchy does not refer to an existing code list defined in msg:CodeLists</sch:assert>
</sch:rule>

This rule takes care of 3 and 4 above. Though it is not really
sdmx-hd specific - I guess it is something which should always be true
to be a valid sdmx structure message, but no harm in being explicit.
Having the machine executable test is useful because I am finding that
there seems not to be an error free dsd in existence! It's so easy to
make errors when hand editing these DSDs that a schema like this is
absolutely necessary.
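
For anyone who wants to try the same check outside Schematron, here is a rough Python equivalent using only the standard library. Treat it as a sketch under assumptions: the namespace URIs are assumed to be the SDMX-ML 2.0 message and structure namespaces, the `SAMPLE` document is a cut-down fragment invented for the example (it is not schema-valid), and `unresolved_codelist_refs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed SDMX-ML 2.0 namespace URIs; adjust if your messages differ.
NS = {
    "msg": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message",
    "st": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure",
}

# Invented, simplified fragment: one declared code list, two references.
SAMPLE = """\
<msg:Structure xmlns:msg="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message"
               xmlns:st="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure">
  <msg:CodeLists>
    <st:CodeList id="CL_TIME_PERIOD" agencyID="SDMX-HD" version="1.0"/>
  </msg:CodeLists>
  <msg:HierarchicalCodelists>
    <st:HierarchicalCodelist id="HY_TIME_PERIOD" agencyID="SDMX-HD">
      <st:CodelistRef>
        <st:AgencyID>SDMX-HD</st:AgencyID>
        <st:CodelistID>CL_TIME_PERIOD</st:CodelistID>
        <st:Version>1.0</st:Version>
        <st:Alias>AL_TIME_PERIOD</st:Alias>
      </st:CodelistRef>
      <st:CodelistRef>
        <st:AgencyID>ABC</st:AgencyID>
        <st:CodelistID>CL_TIME_PERIOD</st:CodelistID>
        <st:Version>1.0</st:Version>
        <st:Alias>AL_BAD</st:Alias>
      </st:CodelistRef>
    </st:HierarchicalCodelist>
  </msg:HierarchicalCodelists>
</msg:Structure>
"""


def unresolved_codelist_refs(xml_text):
    """List (id, agency, version) refs that do not match exactly one CodeList."""
    root = ET.fromstring(xml_text)
    declared = [
        (cl.get("id"), cl.get("agencyID"), cl.get("version"))
        for cl in root.findall("msg:CodeLists/st:CodeList", NS)
    ]
    errors = []
    ref_path = "msg:HierarchicalCodelists/st:HierarchicalCodelist/st:CodelistRef"
    for ref in root.findall(ref_path, NS):
        signature = (
            ref.findtext("st:CodelistID", namespaces=NS),
            ref.findtext("st:AgencyID", namespaces=NS),
            ref.findtext("st:Version", namespaces=NS),
        )
        # Same condition as the assert: exactly one matching code list.
        if declared.count(signature) != 1:
            errors.append(signature)
    return errors


errors = unresolved_codelist_refs(SAMPLE)
```

In the sample, the second reference names agency `ABC` while the code list was declared under `SDMX-HD`, so it is the one reported.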

Bob.
