Proposal for results schema changes


Herb Caudill

Mar 3, 2015, 5:47:39 PM
to iati-te...@googlegroups.com
All - I've drafted a proposal outlining three changes to the IATI standard that would make it more useful for reporting results. 

The first proposal is easy, the second slightly more complex, and the third is likely quite involved. I've put up the proposal in an unpublished blog post here: http://blog.devresults.com/iati-results-standard/

Here's the executive summary:
  1. Require unambiguous indicator references. There is currently no way to identify precisely which indicator is being reported on, which makes it impossible to aggregate data coming from disparate sources. The solution here is relatively simple: Support an element referencing an indicator vocabulary and an indicator code within that vocabulary.
  2. Support disaggregation of performance data. A thornier issue is that there is currently no way to disaggregate results data — by gender, by age, by geography, or any other way. IATI already supports a versatile and precise standard for specifying an activity's geographic scope. We propose adding geographic disaggregation to the results standard, as well as adding elements for disaggregating results data into other categories.
  3. Add an indicator schema to the IATI standard. This is a much bigger challenge than the first two. Ultimately, though, if we're going to understand the data reported for a given indicator, we need a precise definition of that indicator. Our proposal is to create a separate top-level indicator standard, parallel to the existing activity and organization standards, so that aid organizations and indicator registries will have a common language for describing what they measure and how.
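To illustrate why proposal 1 matters for aggregation, here is a minimal Python sketch. The record layout, the vocabulary name ("EXAMPLE-VOCAB") and the indicator codes are all hypothetical and not part of IATI; the only point is that a (vocabulary, code) pair gives a stable key for combining data, whereas free-text titles do not.

```python
from collections import defaultdict

# Hypothetical results from three publishers. The vocabulary name and the
# indicator codes are invented for illustration only.
reported = [
    {"publisher": "org-a", "title": "People trained", "vocabulary": "EXAMPLE-VOCAB", "code": "IND-001", "value": 120},
    {"publisher": "org-b", "title": "No. of trainees", "vocabulary": "EXAMPLE-VOCAB", "code": "IND-001", "value": 80},
    {"publisher": "org-c", "title": "People trained", "vocabulary": "EXAMPLE-VOCAB", "code": "IND-002", "value": 40},
]


def total_by_indicator(rows):
    """Sum values keyed on (vocabulary, code), ignoring free-text titles."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["vocabulary"], row["code"])] += row["value"]
    return dict(totals)


totals = total_by_indicator(reported)
```

Note that org-a and org-c use the same title for different indicators, while org-a and org-b use different titles for the same one; matching on title would get both cases wrong.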
I don't have any idea what the appropriate channels are for making these sorts of proposals, so please forgive me if I'm being clumsy here, or if this is the wrong venue for this discussion. 

Otherwise, please let me know what the next steps might be. I'd be happy to put together a pull request that encodes the first two changes (XSD, documentation, tests) if that would be helpful. If there's enthusiasm for the third proposal, I'd be happy to take a first stab at drafting a schema. 

Let me know what you think. 

Cheers,
Herb

Herb Caudill
Founder | Chief Technology Officer
202.294.7901

DevResults: A better way to manage international aid. | www.devresults.com







On Mon, Mar 2, 2015 at 3:45 AM, Bill Anderson <Bill.A...@devinit.org> wrote:

Catherine (and belatedly Herb)

 

One of the main reasons for publishing to the IATI standard is for your data to be comparable with other publishers' data. This applies as much to results as it does to geography, sectors, etc. Without commonly used indicator vocabularies, reporting results through IATI has limited value. The M&E world in general appears not to appreciate the need for global standards which would allow for better comparison of results data being produced from different sources.

 

One good example of what this should look like is the World Health Organisation's Indicator and Measurement Registry which is "a central source of metadata of health-related indicators used by WHO and other organizations. It includes indicator definitions, data sources, methods of estimation and other information that allow users to get better understanding of their indicators of interest. It facilitates complete and well-structured indicator metadata, harmonization and management of indicator definitions and code lists, internet access to indicator definitions, and consistency with other statistical domains."

 

 

From: iati-te...@googlegroups.com [mailto:iati-te...@googlegroups.com] On Behalf Of Catherine Marschner
Sent: 02 March 2015 05:39
To: iati-te...@googlegroups.com
Subject: [IATI Tech] Re: Disaggregated results data

 

All - MCC would also like to look at publishing our results data in a greater disaggregated format, particularly by gender.  Our own systems can handle multiple disaggregation types but we just post the rolled up data to IATI. Would love to have some conversations about ways to do this!

 

 



On Friday, January 16, 2015 at 12:27:41 PM UTC-5, Herb Caudill wrote:

One other question on reporting results data via IATI, this one probably a bit thornier than the issue of indicator identifiers. 

 

Typically, implementers are asked to disaggregate reported results data in various ways. The most common attributes are demographic (gender, age, ethnicity) but there's no end to the possibilities. Just a few examples:

  • Yields, disaggregated by crop type (soybeans, cassava, groundnuts, etc)
  • Training, disaggregated by organization type (NGO, government agency, private firm, etc)
  • Individuals reached by a public information campaign, disaggregated by media type (TV, radio, print, etc)
  • Individuals treated, disaggregated by HIV status (positive, negative, unknown)
  • Individuals treated, disaggregated by treatment regimen (sdNVP, AZT+sdNVP, ART)

It doesn't seem at all realistic to come up with any sort of universal codelist since the possibilities are literally endless; we're often forced to just match on text, with the aid of an internal thesaurus for synonyms (e.g. M = Man = Men = Male = Homme etc.)
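A rough Python sketch of the text-matching approach described above. The table is hypothetical: only the M / Man / Men / Male / Homme synonyms come from this message, and the female entries are an assumption added to make the example two-sided.

```python
# Hypothetical synonym table: every known spelling maps to one canonical label.
# Only the "male" synonyms are taken from the message; the rest are assumed.
THESAURUS = {
    "m": "male", "man": "male", "men": "male", "male": "male", "homme": "male",
    "f": "female", "woman": "female", "women": "female", "female": "female", "femme": "female",
}


def canonical_label(raw):
    """Return the canonical label for a reported category, or None if unknown."""
    return THESAURUS.get(raw.strip().lower())
```

Unknown labels return None rather than a guess, so they can be queued for manual review instead of being silently miscategorised.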

 

In many legacy systems that we've dealt with, disaggregations are treated as separate indicators (e.g. "1.1a Men trained" vs "1.1b Women trained"). This is a very messy approach and I'd hope we wouldn't have to re-introduce it here. 

 

Further complicating things, when data is disaggregated by more than one attribute, you have two possible approaches, both about equally common in the real world:

 

1. Parallel disaggregation 

 

| Male | Female | Total | Child | Adult | Total |
| 20   | 30     | 50    | 10    | 40    | 50    |

 

2. Cross-disaggregation (preferable because it captures more information)

 

| Male          | Female        | Total |
| Child | Adult | Child | Adult |       |
| 10    | 20    | 5     | 15    | 50    |
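A minimal Python sketch of why cross-disaggregation captures more information: the per-sex and per-age totals can always be derived from the cross-disaggregated cells, but the cells cannot be recovered from the totals alone. The numbers are the ones in the cross-disaggregation example above; the dictionary layout and function name are just illustrative assumptions.

```python
from collections import defaultdict

# Cells from the cross-disaggregation example: (sex, age) -> value.
cross = {
    ("Male", "Child"): 10,
    ("Male", "Adult"): 20,
    ("Female", "Child"): 5,
    ("Female", "Adult"): 15,
}


def marginals(cells, axis):
    """Sum cross-disaggregated cells along one dimension (0 = sex, 1 = age)."""
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[key[axis]] += value
    return dict(totals)


by_sex = marginals(cross, 0)
by_age = marginals(cross, 1)
grand_total = sum(cross.values())
```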

 

Has there been any discussion on how to incorporate disaggregation into the IATI indicator element? If not, we'd be happy to draft a proposal, since this is a problem that we've already had to work through in our own software product. 

 

Cheers

 

Herb Caudill
Founder | Chief Technology Officer
202.294.7901

DevResults: A better way to manage international aid. | www.devresults.com

 

 

 

 

 

 

 

 

 

 

--
You received this message because you are subscribed to the
"IATI Technical" discussion list. Find out more at http://www.aidtransparency.net/governance/tag
 
To post to this group, send email to iati-te...@googlegroups.com
 
To unsubscribe from this group, send email to
iati-technica...@googlegroups.com
 
For more options, including the option to switch to a digest subscription, visit this group at http://groups.google.com/group/iati-technical
 
Tickets for the IATI technical secretariat can be posted to http://support.iatistandard.org
---
You received this message because you are subscribed to the Google Groups "IATI Technical Advisory Group (TAG) technical discussion list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iati-technica...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Tim Davies

Mar 4, 2015, 5:20:53 AM
to iati-te...@googlegroups.com
Hello Herb

Great to see really good concrete proposals on the list. 

I've had a very quick look - and some brief reflections:
  • In terms of disaggregated results for (2) it may be useful to think about how an OLAP Data Cube model might be used. Whilst more verbose, I suspect it will be easier for systems to ingest and make sense of OLAP Cube style data. (I've not found a nice simple intro reference to point to, but essentially rather than nesting all the dimensions, you would simply express the dimensions of a given value as siblings within a parent node). I can work up an example if useful.

  • In terms of existing codelists for disaggregation on issues like gender, age etc. we might want to look to SDMX (http://sdmx.org/), which is widely adopted in the statistical community and provides very thoroughly explored definitions for common concepts in statistical data. This would fit with IATI's practice of trying to avoid creating codelists where they already exist.

  • In terms of an Indicator Standard - I think this could be an incredibly interesting project. It might be one to look at in terms of a cross-standard effort (e.g. results indicators are also potentially needed in other standards like Open Contracting and emerging Budget data standards) - but some sort of rapid prototyping with real data on indicators would probably be a better starting point than drafting the schema in the abstract. If going this route, one possible approach to keep things a little decentralised would be for (1) to use a URL pointing to an Indicator Standard document, instead of a codelist, to identify indicator schemas. 
From experience with past updates to the standard like this, you/IATI Sec might find it useful to call a short Web Meeting of some sort, or perhaps there can be an in-depth session on this at the next TAG...

All the best

Tim
--


w: http://www.timdavies.org.uk | m: 07834 856 303 | twitter: timdavies

Co-director of Practical Participation: http://www.practicalparticipation.co.uk
--------------------------
Practical Participation Ltd is a registered company in England and Wales - #5381958.

Bill Anderson

Mar 4, 2015, 5:24:27 AM
to iati-te...@googlegroups.com

Hi Herb

 

Not clumsy at all. This is very useful.

 

The formal way to go about this is to post in the Modifications, Additions, Improvements forum on the IATI Consultations Knowledge Base. Could you post the three proposals separately so that they can be processed and tracked more easily?

 

I also think this merits a session at the next Technical Advisory Group meeting. It would be worth posting a proposal on the What Next? Theme.

 

As to your proposals:

 

1. Support an element referencing an indicator vocabulary and an indicator code within that vocabulary.
I support this 100%. An optional vocabulary attribute could be introduced in the next decimal upgrade. Some work will need to be done to create a core code list of recognised vocabularies.

2. Add geographic disaggregation to the results standard, as well as adding elements for disaggregating results data into other categories
I also support this 100% but I don't see a way of doing this that maintains the backward compatibility of the standard. This means that we cannot consider this until the next integer upgrade which is unlikely to happen until 2016.

3. Add an indicator schema to the IATI standard
I agree that there is a desperate need for such a standard. However I do not think that this is IATI's responsibility. Furthermore the content of this standard will need curation that is beyond IATI's knowledge and remit. This, I believe, is a problem for the M&E community to tackle - and we would support any attempt at a solution wholeheartedly.

 

Best

 

Bill


david C

Mar 4, 2015, 6:57:46 AM
to iati-te...@googlegroups.com
Just to follow up on what Bill is saying:

"2. Add geographic disaggregation to the results standard, as well as adding elements for disaggregating results data into other categories
I also support this 100% but I don't see a way of doing this that maintains the backward compatibility of the standard. This means that we cannot consider this until the next integer upgrade which is unlikely to happen until 2016."

I agree that a change to the IATI Standard to accommodate this would require an integer upgrade, but in the meantime I think this could be taken forward using a 'namespace'. This also gives you/us time to experiment with the namespace schema and work with others to try it out. Given the time scales that Bill is talking about here, there could potentially be something tried and tested in place by the time the Standard is ready to accommodate it.
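A small Python sketch of what the namespace route might look like in practice. The namespace URI, the "dr" prefix and the extension element are hypothetical placeholders, not an agreed extension; the idea shown is only that a consumer unaware of the namespace still reads the core value, while one that knows it can also read the extra elements.

```python
import xml.etree.ElementTree as ET

# The namespace URI and the "dr" prefix are hypothetical placeholders.
DOC = """
<period xmlns:dr="http://example.org/results-extension">
  <actual value="110">
    <dr:dimension name="sex" value="female" />
  </actual>
</period>
"""

NS = {"dr": "http://example.org/results-extension"}
period = ET.fromstring(DOC)
actual = period.find("actual")

# A consumer unaware of the extension reads only the core attribute.
core_value = int(actual.get("value"))

# A consumer that knows the namespace can read the extra elements as well.
extension_dims = [
    (d.get("name"), d.get("value"))
    for d in actual.findall("dr:dimension", NS)
]

# Looking for an un-namespaced <dimension> finds nothing, so the extension
# does not collide with core element names.
core_dims = actual.findall("dimension")
```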

I might be wrong!

Cheers
David

Joshua Powell

Mar 4, 2015, 8:47:42 AM
to iati-te...@googlegroups.com
Hi all,
Re: geographic disaggregation - if a publisher is already reporting sub-national data to the standard (which too few are, but hopefully a growing proportion will in coming years), it should be including unique IDs for each location geocoded. 

Perhaps an interim solution would merely be to reference that location ID in the results section when individual results entries are tied to one or more geocoded locations? This would avoid the need to provide (what should be) the same geocoded locations in multiple areas of the standard, given that presumably results locations would match project locations (or at the least represent a subset of the project locations).

Best,

Herb Caudill

Mar 4, 2015, 8:56:28 AM
to iati-te...@googlegroups.com
Thanks, all. Bill, I will post these proposals in the forum. 

A couple of thoughts on disaggregations:

1. Data model: My first stab at this was, I think, what Tim is suggesting:

<period>
  <actual value="100">
    <dimension name="sex" value="male" />
  </actual>
  <actual value="110">
    <dimension name="sex" value="female" />
  </actual>
</period>

This easily accommodates multiple dimensions:

<actual value="110">
  <dimension name="sex" value="female" />
  <dimension name="age" value="adult" />
</actual>
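To show how a consuming system might ingest this shape, here is a minimal Python sketch using only the standard library. It flattens each actual element into one record (its dimensions plus its value) and filters on any dimension. The function names are illustrative assumptions, and the sample XML simply extends the snippets above with an age dimension on both data points.

```python
import xml.etree.ElementTree as ET

DOC = """
<period>
  <actual value="100">
    <dimension name="sex" value="male" />
    <dimension name="age" value="adult" />
  </actual>
  <actual value="110">
    <dimension name="sex" value="female" />
    <dimension name="age" value="adult" />
  </actual>
</period>
"""


def flat_records(period):
    """Turn each <actual> into one flat record: its dimensions plus the value."""
    records = []
    for actual in period.findall("actual"):
        # Assumes no dimension is itself named "value".
        record = {d.get("name"): d.get("value") for d in actual.findall("dimension")}
        record["value"] = float(actual.get("value"))
        records.append(record)
    return records


def total_where(records, **filters):
    """Sum the values of records matching every given dimension=value filter."""
    return sum(
        r["value"]
        for r in records
        if all(r.get(k) == v for k, v in filters.items())
    )


records = flat_records(ET.fromstring(DOC))
```

For example, `total_where(records, sex="female")` picks out one data point, while `total_where(records, age="adult")` sums both.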

Where this seemed to break down was in specifying geography - it would be extremely verbose and redundant to have to specify the exact same location block for lots of different data points. 

But, we could just attribute results data to locations by reference to existing places already defined for the activity:

<actual value="110">
  <location ref="Clinic-001" />
  <dimension name="age" value="adult" />
</actual>

This assumes that the "Clinic-001" is the @ref of a location element defined elsewhere in the same document. 
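A rough Python sketch of resolving such a reference. The document below is heavily simplified (a real IATI location carries much more detail, and the name element here is a stand-in), and the helper name is an assumption; it only shows that a data point can point at a location defined once at activity level.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical activity: one location defined once, then
# referenced by @ref from a results data point.
DOC = """
<iati-activity>
  <location ref="Clinic-001">
    <name>Example clinic</name>
  </location>
  <result>
    <indicator>
      <period>
        <actual value="110">
          <location ref="Clinic-001" />
          <dimension name="age" value="adult" />
        </actual>
      </period>
    </indicator>
  </result>
</iati-activity>
"""

activity = ET.fromstring(DOC)

# Index the activity-level locations (direct children only) by their @ref.
locations = {loc.get("ref"): loc for loc in activity.findall("location")}


def resolve_locations(actual):
    """Return the full location elements referenced by one <actual>.

    Raises KeyError if a reference has no matching activity-level location.
    """
    return [locations[ref.get("ref")] for ref in actual.findall("location")]


actual = activity.find("./result/indicator/period/actual")
resolved = resolve_locations(actual)
```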

I would actually prefer this approach. 

2. Backwards compatibility: I'm not clear why either approach to disaggregated data would break backwards compatibility. All of these new elements would be optional - after all, lots of performance data isn't disaggregated at all, so the standard needs to support leaving these out altogether. 

Using the dimensional approach described above, the only schema changes required would be:
  • Define a dimension element in iati-common.xsd, and allow 0 or more dimensions within the target and actual elements. 
  • Change the target and actual elements themselves to maxOccurs="unbounded"
Existing documents would still validate. 
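As an informal illustration from the consumer side (this is not a schema validation, just a sketch), a reader written for the proposed shape can treat an existing, undisaggregated period as the trivial case of a single actual with no dimensions. Summing is only meaningful when the data points are cross-disaggregated, i.e. do not overlap; with parallel disaggregation the same people would be counted more than once.

```python
import xml.etree.ElementTree as ET

# An existing-style period: one <actual>, no dimensions.
OLD_STYLE = '<period><actual value="210" /></period>'

# The proposed shape: several <actual> elements, each with optional dimensions.
NEW_STYLE = """
<period>
  <actual value="100"><dimension name="sex" value="male" /></actual>
  <actual value="110"><dimension name="sex" value="female" /></actual>
</period>
"""


def period_total(xml_text):
    """Sum every <actual> in a period, assuming the data points do not overlap."""
    period = ET.fromstring(xml_text)
    return sum(float(a.get("value")) for a in period.findall("actual"))
```

Both `period_total(OLD_STYLE)` and `period_total(NEW_STYLE)` go through the same code path, which is the practical sense in which the new elements are optional.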

3. Disaggregation vocabularies: I'm all for using existing codelists, and the indicator standard should include a way of referencing an existing vocabulary. The variety of disaggregation attributes we see in the wild is mind-boggling, though. One small example: These are all the different ways that PEPFAR indicators are disaggregated in one country office we work with:
  • 0-14, 15+
  • <18, 18+
  • 0-4, 5-14, 15+
  • 0-14, 15-17, 18+
  • <12 months, 1-14 years, 15+ years
  • 0-14, 15-24, 25+
  • 0-4, 5-10, 11-14, 15+
  • 10-11, 12-14, 15-24, 25+
  • <12 months, 12-23 months, 2-14 years, 15+ years
  • <12 months, 1-4 years, 5-14 years, 15+ years
  • 0-9, 10-14, 15-17, 18-24, 25+
  • 10-14, 15-19, 20-24, 25-49, 50+
  • 0-9, 10-14, 15-19, 20-24, 25+
  • 0-4, 5-9, 10-14, 15-17, 18-24, 25+
  • <1, 1-4, 5-9, 10-14, 15-17, 18+
  • 0-9, 10-14, 15-19, 20-24, 25-49, 50+
  • <12 months, 1-9 yrs, 10-14 yrs, 15-19 yrs, 20-24 yrs, 25-49 yrs, 50+
  • Youth 18-35, 36+
That's one program in one country. Here's a random sampling of some other ways of disaggregating indicators currently in use in DevResults:
  • Innovation Pipeline Phase: Development, Initial Piloting, Early Adoption, Transition to Scale, Global Adoption
  • Partner Level of Engagement: Low, Medium, High, TBD
  • Post-Circumcision Adverse Event Severity: None, Moderate, Severe
  • Autoridad Étnica: Cabildo Indígena, Consejo Comunitario, Cabildo Mayor, Asociación de Autoridades Indígenas
  • Diarrhea Result: Dead, Recovered
The point is just that while we will want to steer people towards using unambiguous names and codes for things, we'll always need to support custom categories. As long as the indicator definition establishes canonical names, any results reported on that indicator will still be comparable. 
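A minimal Python sketch of that last point, under loudly stated assumptions: the indicator definition structure, the indicator code and the function name are all hypothetical, since no indicator schema exists yet. The category names are borrowed from the lists above. It checks reported categories against the canonical names an indicator definition declares.

```python
# Hypothetical indicator definition: the registry entry fixes the canonical
# category names for each disaggregation the indicator allows.
INDICATOR_DEFINITION = {
    "code": "IND-001",  # invented code, for illustration only
    "disaggregations": {
        "age": ["0-14", "15+"],
        "diarrhea_result": ["Dead", "Recovered"],
    },
}


def unknown_categories(definition, reported):
    """List (dimension, category) pairs the indicator definition does not declare."""
    allowed = definition["disaggregations"]
    return [
        (dim, cat)
        for dim, cat in reported
        if cat not in allowed.get(dim, [])
    ]


problems = unknown_categories(
    INDICATOR_DEFINITION,
    [("age", "0-14"), ("age", "<18"), ("diarrhea_result", "Recovered")],
)
```

Here "<18" is flagged because this particular indicator declares the 0-14 / 15+ age bands; a different indicator could declare "<18, 18+" and the same data would pass.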

Herb





    
