CL_RACE codelist

Bob Jolliffe

Mar 12, 2010, 11:08:24 AM
to sdm...@googlegroups.com
Looking (unusually) at codelist content. Where on earth did we come
up with the codes for CL_RACE?
(http://www.sdmx-hd.org/docs/SDMX/SDMX-HD-v1.0/COMMON/SDMX-HD/v1.0/codelists/CL_RACE+SDMX-HD+1.0.xml).
It looks more like something that was peeled from the back of a US
police station wall :-)

For the most part this looks quite inappropriate for use in an
international-oriented standard. Who needs this? If we really must
have a racial classification codelist (and I would be really reluctant
to try and standardise one myself) is there some pre-existing,
acceptable list that anyone is aware of? This one looks like it may
be US government standard in which case it might exist under
agencyID="USG" or the like, but I don't think we can mandate it as an
SDMX-HD standard list.

Regards
Bob

Gary Patchen

Mar 12, 2010, 11:15:10 AM
to sdm...@googlegroups.com
Hi Bob,

I am pretty sure Patrick requested that we delete this. Checking my emails now....

regards

gary patchen
lead consultant


--
You received this message because you are subscribed to the Google Groups "SDMX-HD (Health Domain)" group.
To post to this group, send email to sdm...@googlegroups.com.
To unsubscribe from this group, send email to sdmx_hd+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sdmx_hd?hl=en.

Gary Patchen

Mar 12, 2010, 11:22:25 AM
to sdm...@googlegroups.com
Hi Bob,

Just to confirm: the RACE concept was deleted, so it looks like the codelist was missed.
I will delete it now.

Bob Jolliffe

Mar 12, 2010, 11:30:15 AM
to sdm...@googlegroups.com
OK. I saw it is still here
http://www.sdmx-hd.org/docs/SDMX/SDMX-HD-v1.0/COMMON/SDMX-HD/v1.0/conceptschemes/COMMON+SDMX-HD+1.0.xml
but I guess these will be replaced shortly by the newly grouped
concept lists.

Regards
Bob

Bob Jolliffe

Mar 12, 2010, 11:31:48 AM
to sdm...@googlegroups.com
On 12 March 2010 16:15, Gary Patchen <gary.p...@b-i.com> wrote:
> Hi Bob,
>
> I am pretty sure Patrick requested that we delete this. Checking my emails now....

Good. Thanks. I will second Patrick on this. Implementors are of
course free to create whatever codelists they like.

Gary Patchen

Mar 12, 2010, 11:32:53 AM
to sdm...@googlegroups.com
Hi Bob,

Remember, to see the new v1.0 codelist you need to look here:

http://www.sdmx-hd.org/plugin_assets/dom/xml/codelists/CL.xml

The address you emailed has the current online version, which will soon be updated.

Regards

Beatriz de Faria Leão

Mar 12, 2010, 11:49:48 AM
to sdm...@googlegroups.com
Although I rarely write to the list, I do follow all the mails.
A brief note to endorse Patrick: race should be defined locally. In Brazil it was a federal mandate. We have national unique identifiers with 170 million people registered, and race is defined as "WHITE / BLACK / YELLOW / RED (INDIGENOUS) / MIXED / OTHER".
We also have one doctoral thesis that shows that, depending on where you are in the country, people categorize race differently.

Thanks,
Beatriz
-----------------------------------
Beatriz de Faria Leao, MD, PhD
Health Architect
Zilics Health Systems
São Paulo, SP Brazil


r.friedman

Mar 12, 2010, 6:47:51 PM
to sdm...@googlegroups.com
List --

It is quite impossible to know what demographic categories a country will
consider important, let alone the codelist they think is appropriate for it.
You can go with the flow or try to change things or put it on the to-do list
or whatever strategy makes sense given you and them. At least countries are
actually using their codelists, the experts are still arguing over flavors
of null.

I am worried that a communication protocol definition should purport to
be the authority on code lists. There is no universal semantics, it is up
to the communicating parties to harmonize their categories or to determine
if a particular data source is sufficiently compatible to be useful.

I think it would be fine to have examples of codelists that are used in
particular countries, or prepared libraries of codes, or prepared mappings
of one set of codes to another, but I don't think SDMX should be the
defining authority or repository of codes. If you take a look at HL7, the
code lists which they define are not often substantive but only those
necessary to support the data structures being defined. It is the
implementation guides and their defining authorities that create the
substantive codes.

I have two suggestions. One would be to have a pre-defined convention
for mapping from PHIN VADS to SDMX. (It would be nice to find a
multi-lingual repository.) Another would be to extend the international
indicator repositories to be codelist repositories.

HTH, Roger

Bob Jolliffe

Mar 14, 2010, 7:09:47 AM
to sdm...@googlegroups.com
Hi Roger

I think you hit a couple of nails on a couple of heads.

On 12 March 2010 23:47, r.friedman <r.fri...@mindspring.com> wrote:
> List --
>   It is quite impossible to know what demographic categories a country will
> consider important, let alone the codelist they think is appropriate for it.
> You can go with the flow or try to change things or put it on the to-do list
> or whatever strategy makes sense given you and them.  At least countries are
> actually using their codelists, the experts are still arguing over flavors
> of null.
>   I am worried that a communication protocol definition should purport to
> be the authority on code lists.

And so you should. This was a point that I and others made about
SDMX-HD at the Geneva connectathon last year, as a result of which
two key decisions were taken:
(i) whereas in the initial conception, all indicators were to be drawn
from a codelist of indicators under the maintenance agency of SDMX-HD,
this was changed. The indicator dimension should use the common
concept (to identify itself as an indicator), but the codelist from
which values are drawn should be an agency specific codelist.
(ii) the specification of the structure of datasets was separated from
the specification of common codes. Part 1 will describe the protocol
definition (at least the structuring/markup of XML data) and part 2
will describe common codelists and concepts. The common codelists and
concepts being those lists under the maintenance agency of SDMX-HD.
How minimal those lists should be and how the governance of them
should be managed is an open question which I think is why these
questions are aired publicly. There has not been a significant input
from stakeholders over the past few months but perhaps the public
review period might provide an opportunity for scrutinizing what is
there.

>There is no universal semantics, it is up
> to the communicating parties to harmonize their categories or to determine
> if a particular data source is sufficiently compatible to be useful.
>   I think it would be fine to have examples of codelists that are used in
> particular countries, or prepared libraries of codes, or prepared mappings
> of one set of codes to another, but I don't think SDMX should be the
> defining authority or repository of codes.  If you take a look at HL7, the
> code lists which they define are not often substantive but only those
> necessary to support the data structures being defined.  It is the
> implementation guides and their defining authorities that create the
> substantive codes.
>   I have two suggestions.  One would be to have a pre-defined convention
> for mapping from PHIN VADS to SDMX.  (It would be nice to find a
> multi-lingual repository.)  Another would be to extend the international
> indicator repositories to be codelist repositories.

Generating SDMX codelists from PHIN VADS codes should be a fairly
trivial process which CDC or others would probably have to do if they
wished to use these codes in statistical reporting with SDMX. It's
really a matter of translating to the appropriate xml form, adding an
agencyID="CDC" (or whatever is appropriate), and possibly linking
additional metadata to the codes. If you want to do this and are
stuck you can contact me off list and I'll help if I can.
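
For what it's worth, a minimal Python sketch of that translation step might look like the following. It assumes the SDMX 2.0 structure namespace URI, and the codelist id CL_TRUE_FALSE and the helper name build_codelist are made up for illustration; the two concept codes are the SNOMED-CT codes from the PHIN VADS true/false value set. It is a sketch only, not anything mandated by SDMX-HD.

```python
import xml.etree.ElementTree as ET

# Assumption: the SDMX 2.0 structure namespace; adjust for your schema version.
STRUCTURE_NS = "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_codelist(codelist_id, agency_id, version, name, concepts):
    """Build a structure:CodeList element from (code, description) pairs."""
    ET.register_namespace("structure", STRUCTURE_NS)
    codelist = ET.Element(
        f"{{{STRUCTURE_NS}}}CodeList",
        {"id": codelist_id, "agencyID": agency_id, "version": version},
    )
    name_el = ET.SubElement(codelist, f"{{{STRUCTURE_NS}}}Name", {XML_LANG: "en"})
    name_el.text = name
    for code, description in concepts:
        code_el = ET.SubElement(codelist, f"{{{STRUCTURE_NS}}}Code", {"value": code})
        desc_el = ET.SubElement(
            code_el, f"{{{STRUCTURE_NS}}}Description", {XML_LANG: "en"}
        )
        desc_el.text = description
    return codelist


# Hypothetical codelist id; concept codes from the PHIN VADS true/false value set.
codelist = build_codelist(
    "CL_TRUE_FALSE",
    "CDC",
    "1",
    "PHVS_TrueFalse_CDC",
    [("64100000", "False"), ("31874001", "True")],
)
xml_text = ET.tostring(codelist, encoding="unicode")
print(xml_text)
```

Linking additional metadata to the codes would be a further step on top of this and is not attempted here.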

Which international indicator repositories are you referring to? The
WHO IMR? In which case I do agree. And I'm guessing it would be easy
enough to enhance the existing functionality with this. (It might
even be useful if someone were to set up an SDMX registry using
something like the eurostat registry software but perhaps this is more
than is required for just making codelists available).

Regards
Bob


r.friedman

Mar 14, 2010, 11:42:32 AM
to sdm...@googlegroups.com
Bob --
I went back and re-read the spec, and while what you say is literally
true, the combination of the declarative nature of the spec and the
reservation of codelist names seems to arrogate the ownership to SDMX. I
think the spec could be improved with more statements of purpose, usage or
rationale; the screenshots of XML are not self-explanatory.
So maybe we could work through one of these on-line. Here is a PHIN
VADS value set for true/false:

Value Set Name: True False (TF)
Value Set Code: PHVS_TrueFalse_CDC
Value Set OID: 2.16.840.1.114222.4.11.928
Value Set Version: 1
Value Set Definition: True/False values that may be used when the source has a boolean (T/F) input field
Value Set Status: N/A
VS Last Updated: 10/22/2008

Concept Code: 64100000
Concept Name: False (qualifier value)
Preferred Concept Name: False
Preferred Alternate Code: G-A354
Code System OID: 2.16.840.1.113883.6.96
Code System Name: SNOMED-CT
Code System Code: PH_SNOMED-CT
HL7 Table 0396 Code: SCT

Concept Code: 31874001
Concept Name: True (qualifier value)
Preferred Concept Name: True
Preferred Alternate Code: G-A355
Code System OID: 2.16.840.1.113883.6.96
Code System Name: SNOMED-CT
Code System Code: PH_SNOMED-CT
HL7 Table 0396 Code: SCT2

So if I want to use PHIN VADS coding in my message, I have to do something
like this:

<structure:CodeList id="CL_LOGICAL" agencyID="CDC-PHIN-VADS" version="1"
    isFinal="true"
    urn="http://phinvads.cdc.gov/vads/ViewValueSet.action?id=8BD34BBC-617F-DD11-B38D-00188B398520">
  <!-- "&" inside the urn attribute values is escaped as "&amp;" to keep the XML well-formed -->
  <structure:Name xml:lang="en">PHVS_TrueFalse_CDC</structure:Name>
  <structure:Description xml:lang="en">True/False values that may be used when the source has a boolean (T/F) input field</structure:Description>
  <structure:Code value="64100000"
      urn="http://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&amp;code=64100000">
    <structure:Description xml:lang="en">False</structure:Description>
  </structure:Code>
  <structure:Code value="31874001"
      urn="http://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&amp;code=31874001">
    <structure:Description xml:lang="en">True</structure:Description>
  </structure:Code>
  <structure:Code value="UNK"
      urn="http://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.5.1008&amp;code=UNK">
    <structure:Description xml:lang="en">unknown</structure:Description>
  </structure:Code>
</structure:CodeList>

Notice I have hijacked CL_LOGICAL, I'm not sure whether this is necessary.
I'm not sure about the urns, whether SDMX is expecting xml or just a
reference. These are just references. Notice the ugly GUID, there ought to
be some way to get to it through OID, which is universally unique and
meaningful, like the code values, but I am no PHIN VADS expert. I have
included only one of the PHIN VADS null flavors (if interested, see
http://phinvads.cdc.gov/vads/ViewValueSet.action?id=A0D34BBC-617F-DD11-B38D-00188B398520).

As I understand it, I should only be listing codes which occur in the
data. Is this optional or required? It seems it would prevent the
preparation of standard codelist files. Also, it would make mapping more
difficult because you could only map those codes which appear in the data.
Wouldn't you be surprised if next month you received data with a "maybe"
code value.

Speaking of mapping, would it make sense to define a mapping XSD?

WRT indicator types, sometimes I have seen indicators which are raw
numbers applicable to a subset of patients (e.g., number of patients with
low BMI among HIV+ patients). The indicator could be reported for
non-disjoint subsets of patients (e.g. TB patients, children, orphans,
alcohol and drug abusers, etc.). How would you expect this to be handled?

Thanks, Roger



Bob Jolliffe

Mar 14, 2010, 2:11:56 PM
to sdm...@googlegroups.com
Hi

On 14 March 2010 15:42, r.friedman <r.fri...@mindspring.com> wrote:
> Bob --
>    I went back and re-read the spec, and while what you say is literally
> true, the combination of the declarative nature of the spec and the
> reservation of codelist names seems to arrogate the ownership to SDMX.  I
> think the spec could be improved with more statements of purpose, usage or
> rationale, the screen shots of XML are not self-explanatory.

Not sure which version you are talking of, but by your reference to
screenshots I presume you are referring to the original word document
dated April 2009. This is not really a standard - more a sort of user
guide. The emerging standard text will be in 3 parts:
1. a definition of the required xml markup of sdmx_hd messages. For
this I am specifying a schema using schematron to supplement the
requirement of being valid SDMX through the SDMX xsd schemata.
2. a description of the agency=SDMX-HD codelists, conceptschemes etc
3. a description of zip file packaging

Part 2 is basically complete. I am planning on completing part 1 this
week. Meanwhile you can follow progress using subversion at
http://svn.sdmx-hd.org/sdmxhd/schematron-sdmx-hd or check the
repository through the web interface at sdmx-hd.org.

The aim is not to be a userguide or tutorial. Gary and team will
provide that in addition. Rather it is to simply specify what is
required by the standard in a formal and measurable way so that
application developers can know what to do in order to be conformant.

Your CL_LOGICAL example looks quite ok to me, except for the urns. URNs
must follow a standard syntax (see eg. rfc2141). Over and above that,
SDMX defines a particular format for SDMX urns. I can find you a
reference but it is Sunday and I've still to cook dinner :-) I am not
fond of urns btw, but it seems we are stuck with them in terms of SDMX.
See below re mapping.
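
To make the urn point concrete, here is a rough Python sketch, with assumptions stated. The regular expression only checks the overall "urn:<NID>:<NSS>" shape from rfc2141, not every character rule. The codelist urn pattern is an assumption modelled on the concept urn format used elsewhere in this thread, and both function names are made up.

```python
import re

# Loose check for the "urn:<NID>:<NSS>" shape described in RFC 2141.
# It validates only the prefix and the namespace identifier (1 to 32 chars),
# not the full set of character rules for the namespace-specific string.
URN_SHAPE = re.compile(r"^urn:[A-Za-z0-9][A-Za-z0-9-]{0,31}:\S+$", re.IGNORECASE)


def looks_like_urn(value):
    """Return True if the value has the basic urn:<NID>:<NSS> shape."""
    return bool(URN_SHAPE.match(value))


def sdmx_codelist_urn(agency_id, codelist_id, version):
    """Assumed SDMX-style codelist urn, modelled on the SDMX concept urn format."""
    return (
        "urn:sdmx:org.sdmx.infomodel.codelist.CodeList="
        f"{agency_id}:{codelist_id}[{version}]"
    )


http_reference = (
    "http://phinvads.cdc.gov/vads/ViewValueSet.action"
    "?id=8BD34BBC-617F-DD11-B38D-00188B398520"
)
candidate = sdmx_codelist_urn("CDC-PHIN-VADS", "CL_LOGICAL", "1")

print(looks_like_urn(http_reference))  # an http URL is a reference, not a urn
print(looks_like_urn(candidate))
```

The http address is still useful as a pointer to the authoritative source; it just cannot go in an attribute that is expected to hold a urn.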

>    As I understand it, I should only be listing codes which occur in the
> data.  Is this optional or required?  It seems it would prevent the
> preparation of standard codelist files.

I agree. I don't think this should be a requirement. I think it is
sufficient requirement when transmitting data to only provide the
codelists which are referred to as accompanying metadata. My sense is
that where you have a codelist with isFinal attribute set as true, you
are not really supposed to be messing with it.

>Also, it would make mapping more
> difficult because you could only map those codes which appear in the data.
> Wouldn't you be surprised if next month you received data with a "maybe"
> code value.
>    Speaking of mapping, would it make sense to define a mapping XSD?

Not sure about a mapping XSD. I am guessing it might be possible to
create a mapping xslt to provide standard transforms from say
PHIN_VADS lists to SDMX codelists but it's not something I've
investigated. The keys to be mapped would probably be the SDMX-HD urn
to the PHIN_VADS authoritative URI in order to be able to navigate to
and fro between the two representations.

>    WRT indicator types, sometimes I have seen indicators which are raw
> numbers applicable to a subset of patients (e.g., number of patients with
> low BMI among HIV+ patients).  The indicator could be reported for
> non-disjoint subsets of patients (e.g. TB patients, children, orphans,
> alcohol and drug abusers, etc.).  How would you expect this to be handled?

Roger this is a tough question which I imagine the likes of Ola would
answer better. In DHIS terms the raw numbers would not be treated as
indicators at all - but rather as dataelements. Indicators are
constructed from dataelements. But that's not really the substance of
your question.

I'm not sure that SDMX solves these sort of problems - it simply
provides a structure. An sdmx-hd data value would have an indicator
key and any number of additional disaggregation keys. How you choose
those keys would determine how you could represent the non-disjoint
subsets. One thing which seems clear (if I understand you correctly)
is that a single "target population" key would not be sufficient as it
would not be able to express the overlaps.
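
A small Python sketch of why separate keys help. The counts and the key names (TB, CHILD, LOW_BMI) are entirely made up for illustration. With one key per attribute, every patient falls into exactly one cell, so overlaps can be read off directly.

```python
# Hypothetical counts of low-BMI patients, disaggregated by two separate keys.
# Each patient falls into exactly one (TB, CHILD) cell, so cells can be summed.
observations = [
    {"INDICATOR": "LOW_BMI", "TB": "Y", "CHILD": "Y", "VALUE": 4},
    {"INDICATOR": "LOW_BMI", "TB": "Y", "CHILD": "N", "VALUE": 10},
    {"INDICATOR": "LOW_BMI", "TB": "N", "CHILD": "Y", "VALUE": 6},
    {"INDICATOR": "LOW_BMI", "TB": "N", "CHILD": "N", "VALUE": 30},
]


def subtotal(obs, **fixed):
    """Sum VALUE over all observations whose keys match the fixed values."""
    return sum(o["VALUE"] for o in obs if all(o[k] == v for k, v in fixed.items()))


tb_total = subtotal(observations, TB="Y")
child_total = subtotal(observations, CHILD="Y")
overlap = subtotal(observations, TB="Y", CHILD="Y")
either = tb_total + child_total - overlap  # patients in at least one subgroup

print(tb_total, child_total, overlap, either)
```

A single "target population" key could only carry the two subgroup totals. The overlap, and therefore the number of distinct patients in either subgroup, would be lost.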

Regards
Bob

Whitaker, John Patrick

Mar 15, 2010, 5:32:04 AM
to sdm...@googlegroups.com
Hi Roger,

We are pursuing both strategies, a VADS link and enhancements to the IMR
to support codelists.

Observation-level metadata, e.g. placename and organization, is an issue
since it is necessary for data exchange. Translation and complete
placenames (to nth admin level) are not well-supported internationally.

Cross-domain codelists are part of the SDMX standard. It seems easier to
agree on statistical concepts. I do like the idea of ISO standards for
codelists, whether HL7, SDMX, or other SDO, for the formal international
vetting process, which encourages convergence among organizations.
Custom codelists in SDMX-HD can be based on these.

Patrick

Patrick Whitaker
Technical Officer
Health Care Informatics Unit
Health Statistics and Informatics Department
World Health Organization

Tel. direct: +41 22 791 1372
E-mail: whit...@who.int


Bob Jolliffe

Mar 15, 2010, 6:00:36 AM
to sdm...@googlegroups.com
Hello Beatriz

Thank you for your input regarding the practice in Brazil. It seems
to me then that there might well be a number of use cases for
retaining a concept of race in the SDMX-HD disaggregation concept
scheme, but to allow agencies/countries to specify the actual code
list which would be used.

Patrick, should we put race back into the conceptscheme? I know it's a
problematic concept as it is understood very differently in different
contexts, but if agencies have a clear requirement for such a concept
we might be better off providing it. My initial objection was to the
codelist rather than the concept. I am pretty clear on the former and
there also seems to be some consensus - I am a bit fuzzier on the
latter :-)

Regards
Bob


Beatriz de Faria Leao

Mar 15, 2010, 7:12:29 AM
to sdm...@googlegroups.com
I also like the idea of using ISO. Should we look at the ISO Health Informatics -TS 22220 - Identification of Subjects of Care? They have several of these attributes.
http://www.iso.org/iso/catalogue_detail.htm?csnumber=40782
Beatriz

Bob Jolliffe

Mar 15, 2010, 9:46:20 AM
to sdm...@googlegroups.com
On 15 March 2010 11:12, Beatriz de Faria Leao <bfl...@gmail.com> wrote:
> I also like the idea of using ISO.

I am lukewarm (well at least conditional) about the use of some ISO
standards. Where they are not freely available they can actually be
deeply problematic for implementation, particularly (though not
exclusively) by open source developers in developing countries. It
seems for example I can purchase a hard copy of TS 22220 standard from
National Standards Ireland for just over Euro 200 - no pdf :-(. It is
not available through the South African Bureau of Standards and
probably most other national bodies who don't have a direct membership
to the relevant SC. Developers in most countries would have to (i)
have access to a credit card and (ii) have access to forex in order to
be able to purchase a pdf from ISO or one of the big national body
"standards shops" around the world. I really do wish the committees
working on these standards made more of a point of making them open.
I suspect it really comes down to SCs insisting on this - like SC34
has done with the DSDL standard (www.dsdl.org). Otherwise ISO runs
the serious risk of making itself redundant, except in some ivory
towers.

Sorry - rant over .... But if anyone on this list is working on ISO
committees then please do heed my plea. We need to create open
standards so that the important work that is put into their creation
is effectively accessible to those who need it. The 1950s ISO
business model has had its day :-)

If anyone can share the contents of TS 22220:2009 on this list I am
sure we can discuss it.

Regards
Bob

r.fri...@mindspring.com

Mar 15, 2010, 10:49:49 AM
to sdm...@googlegroups.com
List --
Most of the racial categories I have worked with internationally are like Beatriz's, perhaps with more politically correct names. The Hispanic category is a US thing. In some countries in Latin America, tribe has also been important among indigenous people; in other places, indigenous coastal people have mixed with black slaves to the extent that there is no distinction between tribal and racial origin. OpenMRS began with tribe as a primary person attribute; now it has become only an installation-specifiable one. Maybe the thing to do is have CL_ETHNICITY as an open codelist.
Am I correct that in SDMX a particular observation cannot take on multiple code values? Can a data element have cardinality greater than 1? If so, can their order be preserved (like primary cause of death, secondary cause of death, underlying cause of death)?
How is multi-lingual capability handled in SDMX? Can there be multiple name and description elements with different xml:lang attributes? Does the XSL which you use to display the DSD allow the selection of a language?
Could you explain again why Indicator is a dimension with IDs as values rather than each data element constituting a series? Where indicators are a ratio, where are the numerator and denominator for roll-ups? How is CL_DISAGG used?
What is the use case for _ALL?
Thanks, Roger

Bob Jolliffe

Mar 15, 2010, 12:35:23 PM
to sdm...@googlegroups.com
On 15 March 2010 14:49, <r.fri...@mindspring.com> wrote:
> List --
>    Most of the racial categories I have worked with internationally are like Beatriz', perhaps with more politically correct names.  The hispanic category is a US thing.  In some countries in Latin America, tribe has also been important among indigenous; in other places, indigenous coastal people have mixed with black slaves to the extent that there is no distinction between tribal and racial origin.  Open MRS began with tribe as a primary person attribute, now it has become only an installation-specifiable one.  Maybe the thing to do is have CL_ETHNICITY as an open codelist.

Yes I could go along with that. Though we don't actually need to
specify open/empty codelists. If there is a common concept
"ETHNICITY" it would anyway allow implementors to create a dimension
of data using this concept - along with the codelist of their
choosing. Something like:

<structure:Dimension conceptRef="ETHNICITY"
conceptSchemeRef="CS_COMMON" conceptVersion="1.0"
conceptSchemeAgency="SDMX-HD"
codelist="CL_ETNIA"
codelistVersion="1.4"
codelistAgency="FED_GOV_BRAZIL"/>

>    Am I correct that in SDMX a particular observation cannot take on multiple code values?  Can a data element have cardinality greater than 1?  If so, can their order be preserved (like primary cause of death, secondary cause of death, underlying cause of death)?

Not sure exactly what you mean. Dimensions will take on code values.
So you might have:

<ns:OBS_VALUE INDICATOR="8254" ETHNICITY="1" VALUE="13" />
<ns:OBS_VALUE INDICATOR="8254" ETHNICITY="2" VALUE="18" />

In which case there is more than one value with indicator 8254.

Or do you mean more like:

<ns:OBS_VALUE INDICATOR="8254" PRIMARY_COD="1" SECONDARY_COD="3" VALUE="12" />
<ns:OBS_VALUE INDICATOR="8254" PRIMARY_COD="2" SECONDARY_COD="3" VALUE="17" />

Where codes for PRIMARY_COD and SECONDARY_COD are different concepts,
but drawn from the same codelist? That should certainly be possible.
The ordering is something else mind you. The sdmx standard does refer
to the importance of ordering of sdmx dimensions and attributes but
the fact is that they all eventually come out in the wash as XML
attributes to an OBS_VALUE. And the XML standard is very clear that
the ordering of attributes is not guaranteed to be preserved by XML
processors. So I don't believe we can or should make any assumptions
about the ordering.
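
A quick Python illustration of that point, using the standard library parser. The ns: prefix is dropped from the element name here only to keep the snippet self-contained. Two observations that differ only in attribute order are the same as far as a consumer is concerned, so any ranking has to live in the concept names themselves.

```python
import xml.etree.ElementTree as ET

# The same observation written with its attributes in two different orders.
first = ET.fromstring(
    '<OBS_VALUE INDICATOR="8254" PRIMARY_COD="1" SECONDARY_COD="3" VALUE="12"/>'
)
second = ET.fromstring(
    '<OBS_VALUE VALUE="12" SECONDARY_COD="3" PRIMARY_COD="1" INDICATOR="8254"/>'
)

# Attributes are exposed as a mapping, and mapping equality ignores order.
same = first.attrib == second.attrib

# The rank (primary, secondary) is recovered by name, never by position.
causes = [second.get("PRIMARY_COD"), second.get("SECONDARY_COD")]

print(same, causes)
```

So PRIMARY_COD and SECONDARY_COD as distinct concepts carry the ordering safely, whereas relying on the position of attributes would not.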

>    How is multi-lingual capability handled in SDMX?  Can there be multiple name and description elements with different xml:lang attributes?

I did check and the schema certainly allows it. So for example:

<structure:Concept id='INDICATOR'
urn='urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=SDMX-HD:CS_COMMON[1.0].INDICATOR'>
<structure:Name xml:lang='en'>The indicator</structure:Name>
<structure:Name xml:lang='no'>Indikatoren</structure:Name>
<structure:Description xml:lang='en'>The indicator</structure:Description>
<structure:Description xml:lang="no">Indikatoren</structure:Description>
<structure:TextFormat textType='String'/>
</structure:Concept>

is valid. Of course it is up to applications how they would interpret
this multilingualism.

>Does the XSL which you use to display the DSD allow the selection of a language?

Not sure which particular XSL this is but I'm guessing not. Though do
bear in mind that I don't think this XSL is part of the standard. But
I guess these stylesheets could easily be parameterized to select
Name[@xml:lang='no'] etc.
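
The same selection is easy to sketch outside XSL. Below is a minimal Python version, assuming the SDMX 2.0 structure namespace URI; the helper name localized_name and the fallback-to-English behaviour are my own choices, not anything the standard specifies.

```python
import xml.etree.ElementTree as ET

# Assumption: the SDMX 2.0 structure namespace; adjust for your schema version.
NS = {"structure": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

CONCEPT_XML = """
<structure:Concept
    xmlns:structure="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure"
    id="INDICATOR">
  <structure:Name xml:lang="en">The indicator</structure:Name>
  <structure:Name xml:lang="no">Indikatoren</structure:Name>
</structure:Concept>
""".strip()


def localized_name(concept, lang, fallback="en"):
    """Return the Name for the requested language, else the fallback language."""
    names = {n.get(XML_LANG): n.text for n in concept.findall("structure:Name", NS)}
    return names.get(lang, names.get(fallback))


concept = ET.fromstring(CONCEPT_XML)
print(localized_name(concept, "no"))
print(localized_name(concept, "fr"))  # no French name, so falls back to English
```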

The one gotcha I see is in the concept ids. The concept id (eg
ETHNICITY, INDICATOR, PRIMARY_COD in the examples above) is used to
form the XML attribute name in the resulting dataset obs_values. So
the XML attribute names formed from common concepts (and the element
names like DataSet, Series etc) will not be readily translatable.
Though the underlying code values, descriptions and metadata can
certainly be internationalized.

There is an excellent little ISO standard called DSRL (edited by
Martin Bryan) which is designed to handle exactly this issue -
http://www.dsdl.org/dsrl-tutorial.pdf. This might be something to
consider, but for the moment I think it is an sdmx issue which is
outside of the scope of sdmx-hd.

>    Could you explain again why Indicator is a dimension with IDs as values rather than each data element constituting a series?  Where indicators are a ratio, where are the numerator and denominator for roll-ups?

I'm going to get confused with nomenclature here. Are you using
dataelement in the dhis sense - ie raw aggregated observations
(usually counts) from which indicators (usually ratios, percentages)
are composed?

Not sure if it helps, but there is an indicator hierarchy where
numerator/denominator components of composite indicators can be
defined.

I have some misgivings about using the term indicator and dataelement
interchangeably but I'm going along with it for now. You will know
from dhis that it doesn't work that way in our datamodel. Is that
where you are coming from? And much of the discussion which happens
about indicators is really referring to what I would call
dataelements. Rolling up of multidimensional values for dataelements
I understand. Rolling up of percentages and ratios is certainly more
of a challenge :-)

> How is CL_DISAGG used?
Something of a hack to allow "partial configuration". There was some
discussion on this list around 24 February about this.

>    What is the use case for _ALL?

The WHO folk came up with these "generic" codes - _ALL, _NA, and
_UNK. I am sure they have use cases in mind for all of them. I have
found '_NA' to be the most useful one for giving a codevalue to a
mandatory dimension which might not be relevant in a particular
context. I can foresee using _UNK. Though I haven't seen a use case
for _ALL either. Gary do you have one?
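
As a sketch of the '_NA' usage described above, in Python: the list of mandatory dimensions (AGE_GROUP, SEX) and the helper name are hypothetical. The only thing taken from SDMX-HD is the generic code value itself.

```python
# Hypothetical set of dimensions that a data structure definition makes mandatory.
MANDATORY_DIMENSIONS = ["INDICATOR", "AGE_GROUP", "SEX"]


def fill_not_applicable(observation, dimensions=MANDATORY_DIMENSIONS):
    """Return a copy of the observation with '_NA' for any missing dimension."""
    filled = dict(observation)
    for dimension in dimensions:
        filled.setdefault(dimension, "_NA")
    return filled


# An indicator for which age group and sex are not relevant.
raw = {"INDICATOR": "8254", "VALUE": "13"}
complete = fill_not_applicable(raw)
print(complete)
```

The original observation is left untouched, so the caller can still tell which dimensions were genuinely reported and which were filled in.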

Regards
Bob
