I have spent the afternoon trying to wrap up the schema for the DSD
but I am a bit stuck on attributes.
1. We have had a number of discussions around frequency and
periodicity. The current concepts and codelists do not reflect where
we are on this. Probably the most important discussion is here:
http://groups.google.com/group/sdmx_hd/browse_thread/thread/be43a95a45349949.
The consequence of this is that we require:
(i) a FREQUENCY concept in the SDMX-HD mandatory dimension concepts.
(ii) CL_PERIODICITY renamed to CL_FREQUENCY. Not really a requirement,
but it makes the naming more uniform.
Also ..
In the CS_ATTRIBUTE concept scheme we have:
  FREQ: Frequency. Frequency refers to the time interval between the
    observations of a time series.
  FREQ_DISS (String): Expected Frequency of Data Dissemination. Expected
    frequency of updates of WHO estimates.
  FREQ_RECOM (String): Recommended Collection. Expected frequency of
    updates of national-level data.
Can I suggest that (i) FREQ_DISS and (ii) FREQ_RECOM are AgencyID=WHO
concepts and so should be removed, and that the definition of FREQ is
misleading? Presumably this concept is meant to be used with the MSD
to describe the expected frequency of collection for a
data element/indicator. It cannot refer to the actual time interval
between observations, because it is metadata about the indicator,
not the dataset. For the actual interval we already have the frequency
dimension on the series.
That then should restore some sanity to the time dimensions/attributes.
2. Regarding the remaining attributes, I am having some trouble
specifying exactly what we actually require. For the sake of
progress I am leaving it open for now, in the sense that implementors
who produce a DSD are free to define whatever attributes they wish.
Where it makes sense, they should make use of the existing concepts in
CS_ATTRIBUTE and the existing SDMX-HD codelists to do so. And consumers
have to consume what the DSD tells them they have to consume :-)
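
To illustrate "consume what the DSD tells them", here is a minimal sketch
of a consumer listing the attributes a DSD declares. It assumes an
SDMX-ML 2.0 style fragment; the OBS_STATUS and UNIT attributes, their
codelists and the helper name declared_attributes are hypothetical and
only there for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the SDMX-ML 2.0 structure schema.
NS = {"structure": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure"}

# Hypothetical DSD fragment: the two attributes below are illustrative only,
# they are not taken from the SDMX-HD concept schemes.
DSD_FRAGMENT = """
<structure:Components
    xmlns:structure="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure">
  <structure:Dimension conceptRef="FREQ" codelist="CL_FREQ"
      isFrequencyDimension="true" />
  <structure:Attribute conceptRef="OBS_STATUS" codelist="CL_OBS_STATUS"
      attachmentLevel="Observation" assignmentStatus="Conditional" />
  <structure:Attribute conceptRef="UNIT" codelist="CL_UNIT"
      attachmentLevel="Series" assignmentStatus="Mandatory" />
</structure:Components>
"""


def declared_attributes(dsd_xml):
    """Return the attributes a DSD fragment declares, in document order."""
    root = ET.fromstring(dsd_xml.strip())
    return [
        {
            "concept": attr.get("conceptRef"),
            "codelist": attr.get("codelist"),
            "level": attr.get("attachmentLevel"),
            "mandatory": attr.get("assignmentStatus") == "Mandatory",
        }
        for attr in root.findall("structure:Attribute", NS)
    ]


for attr in declared_attributes(DSD_FRAGMENT):
    print(attr)
```

The consumer hard-codes nothing about which attributes exist; it only
reads what the DSD publishes.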
Tomorrow is Paddy's Day here in Ireland so I am off work. I'll try
and look at my mail in the evening.
Regards
Bob.
Hi Bob,
OK quick recap is needed:
"While frequency refers to the time interval between the observations of a time series, periodicity refers to the frequency of compilation of the data (e.g., a time series could be available at annual frequency but the underlying data are compiled monthly, thus have a monthly periodicity)."
ID=FREQ = The interval between observations, e.g. YEARLY.
ID=PERIODICITY = The frequency with which the observational data was compiled, e.g. MONTHLY.
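
A small sketch of that distinction, under stated assumptions: the monthly
figures are made up, "A" is the annual code in the cross-domain CL_FREQ,
and the "MONTHLY" periodicity value and the helper name annual_series are
hypothetical.

```python
from collections import defaultdict

# Made-up monthly compiled figures for one indicator: period -> value.
monthly_compiled = {f"2009-{month:02d}": 5 for month in range(1, 13)}


def annual_series(monthly):
    """Sum monthly compiled values into one observation per year."""
    totals = defaultdict(int)
    for period, value in monthly.items():
        year = period.split("-")[0]
        totals[year] += value
    return dict(totals)


series = {
    "FREQ": "A",               # interval between published observations
    "PERIODICITY": "MONTHLY",  # how often the underlying data was compiled
    "observations": annual_series(monthly_compiled),
}
print(series)
```

The series is published with one observation per year even though the
underlying data was compiled every month.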
Actions:
1. I will delete these two concepts as suggested, as they should be agency-specific:
FREQ_DISS = Expected frequency of updates of WHO estimates.
FREQ_RECOM = Expected frequency of updates of national-level data.
2. For me, what you call FREQUENCY is FREQ, which uses CL_FREQ.
In fact, when you look at some of the SDMX samples, they also use FREQ, not FREQUENCY.
As for its metadata use, I have no problem using the same concept “FREQ”: the metadata can state “FREQ of observations should be Quarterly” while the actual observations show that they are “Monthly”.
3. FREQUENCY concept in the SDMX-HD mandatory dimension concepts?
As previously stated, I think this is ID=FREQ, not ID=FREQUENCY.
I initially thought that FREQ should be a mandatory attribute but not a dimension, as it is not used to key the observations.
However, the SDMX samples do have this as a Dimension:
<structure:Dimension conceptRef="FREQ" codelist="CL_FREQ" isFrequencyDimension="true" />
As the attribute name “isFrequencyDimension” implies, FREQ is a dimension there.
Are you therefore suggesting that the FREQ concept needs to be moved to the CS_COMMON ConceptScheme?
4. CL_PERIODICITY renamed CL_FREQUENCY?
As previously stated I think this is CL_FREQ not CL_FREQUENCY.
I fail to see why CL_PERIODICITY should be renamed CL_FREQUENCY.
a) Are you suggesting that by renaming CL_PERIODICITY to CL_FREQUENCY we will end up with CL_FREQUENCY and the existing CL_FREQ?
Confusing!
b) What happens to CL_FREQ if we actually rename CL_PERIODICITY to CL_FREQ?
c) CL_FREQ is published as a cross-domain code list (http://sdmx.org/wp-content/uploads/2009/01/02_sdmx_cog_annex_2_cl_2009.pdf#9 ), so for me we should not change this.
Suggest we leave CL_FREQ and CL_PERIODICITY as is for now – OK?
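
On point 3, a minimal sketch of what it means for FREQ to key the
observations or not. The dimension names INDICATOR and LOCATION, the
sample codes and the helper series_key are hypothetical; only FREQ and
the CL_FREQ codes "A" (annual) and "M" (monthly) come from the
cross-domain list.

```python
def series_key(record, dimensions):
    """Build a series key by joining the record's dimension values."""
    return ".".join(record[d] for d in dimensions)


# Two hypothetical series for the same indicator and location,
# differing only in frequency.
annual = {"FREQ": "A", "INDICATOR": "IND_1", "LOCATION": "LOC_1"}
monthly = {"FREQ": "M", "INDICATOR": "IND_1", "LOCATION": "LOC_1"}

with_freq = ["FREQ", "INDICATOR", "LOCATION"]   # FREQ is a dimension
without_freq = ["INDICATOR", "LOCATION"]        # FREQ is only an attribute

# With FREQ as a dimension the two series get distinct keys.
print(series_key(annual, with_freq), series_key(monthly, with_freq))

# With FREQ as an attribute the keys collide.
print(series_key(annual, without_freq), series_key(monthly, without_freq))
```

If a dataset can carry the same indicator at more than one frequency,
FREQ has to be part of the key; if it never does, an attribute is enough.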
Regards
gary patchen
SDMX defines 3 formats: cross-sectional, compact and generic. The
compact dataset format is the time series one (is that what you mean
by longitudinal?). In routine data reporting there is
no real benefit in using the compact form, as the observations will
typically all be for one time period. In the proof of concept we did
with OpenMRS reporting monthly ART summary data to DHIS using SDMX, we
used the cross-sectional format and that seemed to meet our
requirements well. Ryan might have additional comments to make. It
was a relatively straightforward modification to his OpenMRS module to
allow it to produce a cross-sectional report instead of a compact
dataset one.
Given that we are not yet sitting on a pool of experience of SDMX-HD
deployment for data transfer, I have argued that we should not be
overprescriptive about this. We have some fairly tight rules about the
definition of datasets (the DSD). I strongly feel that if there are
scenarios which are better supported by one data format or the other,
we should let the requirements dictate which format to use. The
schemas for both formats are generated off the DSD. There is nothing
really to be gained by saying you must use a particular dataset
format. Ultimately the publisher of the DSD in question should
dictate which format it would like to see datasets coming
back in.
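
To make the difference concrete, here is a rough sketch of how the same
observations group under the two layouts. It models only the grouping,
not the actual SDMX-ML messages; the data element codes, the values and
the helper group_by are made up for illustration.

```python
from collections import defaultdict

# Hypothetical monthly report: (data element, period, value).
OBSERVATIONS = [
    ("ART_NEW", "2010-02", 14),
    ("ART_CURRENT", "2010-02", 230),
    ("ART_STOPPED", "2010-02", 3),
]


def group_by(observations, index):
    """Group observation tuples by the field at the given position."""
    groups = defaultdict(list)
    for obs in observations:
        groups[obs[index]].append(obs)
    return dict(groups)


# Compact (time series) style: one group per series, each holding
# that series' observations over time.
compact_like = group_by(OBSERVATIONS, 0)

# Cross-sectional style: one group per time period, each holding
# all data elements reported for that period.
cross_sectional_like = group_by(OBSERVATIONS, 1)

print(len(compact_like), "series groups with one observation each")
print(len(cross_sectional_like), "period group holding all observations")
```

For a single-period report the time series grouping produces many
one-observation series, while the cross-sectional grouping produces one
group for the period, which matches what we saw in the proof of concept.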
Regards
Bob