IATI Organization Search Tool

Dan Mihaila

Nov 24, 2014, 10:53:23 AM
to iati-te...@googlegroups.com

Hello,


A while ago there were some interesting discussions about IATI organization IDs. They motivated us to create a tool that provides a simple overview of how organization identifiers are used.


The following analysis tool (note that this is an alpha version) captures almost all the organization-related IATI data that is currently published and registered via the IATI Registry:

http://happy-devs.atman.ro/browse

An organization is defined as being unique by the name-code pair.  You can also see additional information for each organization, such as:

  • count (number of appearances in feeds)

  • locations (geographical information taken from the <location> element)

  • sources (names of the files in which the organization was published)

  • source links (URLs of the files in which the organization was published)
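
As a rough illustration of the name-code pairing described above, here is a minimal sketch of how such counts could be produced. It is not the tool's actual implementation. It assumes IATI 1.0x-style activity XML, where the organisation name is the element text and the code is the `ref` attribute; the four element names are the ones listed later in this thread, and the sample XML (including the `GB-01` variant) is made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Element names as listed later in this thread; the rest is illustrative.
ORG_ELEMENTS = ("reporting-org", "participating-org", "provider-org", "receiver-org")

# Made-up sample data in IATI 1.0x style (name as element text, code in @ref).
SAMPLE_XML = """
<iati-activities>
  <iati-activity>
    <reporting-org ref="GB-1">DFID</reporting-org>
    <participating-org ref="GB-1" role="Funding">DFID</participating-org>
    <participating-org ref="GB-01" role="Funding">DFID</participating-org>
  </iati-activity>
</iati-activities>
"""


def count_name_code_pairs(xml_text):
    """Count how often each (name, code) pair appears in the given XML."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for element in root.iter():
        if element.tag in ORG_ELEMENTS:
            name = (element.text or "").strip()
            code = (element.get("ref") or "").strip()
            counts[(name, code)] += 1
    return counts


pair_counts = count_name_code_pairs(SAMPLE_XML)
for (name, code), count in pair_counts.most_common():
    print(f"{name!r} / {code!r}: {count}")
```

Treating the pair, rather than the code alone, as the key is what makes inconsistent usage visible: the same name shows up once per distinct code.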


You can also use the filters in the left column (collapsed by default) to find useful information, or use the search box at the top of the page.


Each organization box has an interesting option, "More Organizations with same name", which displays a graph showing how many times each code is used.


Sample queries

http://happy-devs.atman.ro/solr/org_collection/browse?q=&fq=name:%22DFID%22&sort=count%20desc&show_chart=true  - it shows an organization called "DFID" (as it is published in all IATI feeds) with different codes and how many times they appear.


http://happy-devs.atman.ro/solr/org_collection/browse?q=&fq=name:%22Department%20for%20International%20Development%22&sort=count%20desc&show_chart=true  - it shows an organization called "Department for International Development" with different codes as they were found in all IATI feeds and how many times they appear.
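
For anyone who wants to generate links like the two sample queries for other names, a small sketch follows. The base URL and the parameters (`q`, `fq`, `sort`, `show_chart`) are copied from the sample queries; the helper name `build_browse_url` is hypothetical, and names that themselves contain double quotes are not handled.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the sample queries in this message.
BROWSE_URL = "http://happy-devs.atman.ro/solr/org_collection/browse"


def build_browse_url(org_name):
    """Build a browse link that filters on an exact organisation name."""
    params = [
        ("q", ""),
        ("fq", f'name:"{org_name}"'),
        ("sort", "count desc"),
        ("show_chart", "true"),
    ]
    # quote_via=quote encodes spaces as %20; safe=":" keeps the colon readable.
    return BROWSE_URL + "?" + urlencode(params, quote_via=quote, safe=":")


print(build_browse_url("DFID"))
print(build_browse_url("Department for International Development"))
```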


Do you think this is helpful when creating new IATI feeds and searching for information about existing organizations?

Are there any related features that would make this tool more useful?

All the best,
  Dan

Dan Mihaila, IT Consultant
(M) +40 722 502 304 • (GTalk) dan.m...@gmail.com (Skype) carcotelul
(Yahoo) carcotelul
 

david C

Nov 25, 2014, 6:06:40 AM
to iati-te...@googlegroups.com
Hi Dan

Good work - this looks pretty good to me. 

Couple of thoughts...

I think you could do with an 'About' page on the application that repeats what you've said in this email, so that people who visit the site directly can understand what is going on.

Can we split Organisations into two distinct types? Those that publish IATI data (i.e. are 'reporting-orgs'), and those that don't (these are reported as participating-orgs).
It looks (to me) as though you are using participating-org from activity files as the source of organisation data. 

I think that for publishers, there is (arguably) a hierarchy of data sources for finding out information about that publisher. So, e.g., if I look for GB-1 and it has an associated Organisation file, it would be good to find that easily. Next, I want to know how that identifier is used as a 'reporting-org', and then as a 'participating-org'. With the links that you have provided, there is a lot of useful information, but I can't easily find out that DFID does have an organisation file, for example.

For organisations in the data that are not publishers themselves, then this seems pretty good to me!

..and, with the current results, there is lots of interesting 'data quality' information. We can see where people are incorrectly using name/identifier combinations quite easily, so that's pretty useful - we just need more people able to follow those issues up!


So, to summarise, I think I'd like to see the following.
As a user I ask - what can you tell me about Organisation X?

I'd like to know about:
1) Organisation.xml files
2) Anything you can find about the organisation as a Reporting Organisation
3) Anything you can find out about it as a participating organisation
4) Other stuff you may find interesting (such as alternate spellings, mismatched identifiers and names)

My reading of the application is that it is currently doing 3 and 4 well, I'm not sure about 1 and 2.
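
To make the four-part request above a little more concrete, here is one possible shape for an answer to "what can you tell me about Organisation X?". This is only a sketch: the `OrgProfile` class, the `build_profile` helper, the `(kind, ref, name, source)` tuple layout and the file names are all hypothetical, not anything the tool currently exposes.

```python
from dataclasses import dataclass, field


@dataclass
class OrgProfile:
    """Hypothetical per-identifier summary, one field per item in the list."""
    identifier: str
    organisation_files: list = field(default_factory=list)    # item 1
    as_reporting_org: list = field(default_factory=list)      # item 2
    as_participating_org: list = field(default_factory=list)  # item 3
    names_seen: set = field(default_factory=set)               # item 4


def build_profile(identifier, mentions):
    """Group mentions of one identifier; each mention is (kind, ref, name, source)."""
    profile = OrgProfile(identifier)
    for kind, ref, name, source in mentions:
        if ref != identifier:
            continue
        if kind == "organisation-file":
            profile.organisation_files.append(source)
        elif kind == "reporting-org":
            profile.as_reporting_org.append(source)
        elif kind == "participating-org":
            profile.as_participating_org.append(source)
        if name:
            profile.names_seen.add(name)
    return profile


# Made-up mentions; the file names are placeholders.
mentions = [
    ("organisation-file", "GB-1", "Department for International Development", "org-file.xml"),
    ("reporting-org", "GB-1", "DFID", "activities-a.xml"),
    ("participating-org", "GB-1", "DFID", "activities-b.xml"),
    ("participating-org", "XX-9", "Some other org", "activities-b.xml"),
]
profile = build_profile("GB-1", mentions)
print(profile)
```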

Finally, (as the cherry on the cake!), I'm guessing there would be demand for this to be machine readable - some sort of api whereby I can request data on an organisation.

Good stuff Dan.

All the best
David

--
You received this message because you are subscribed to the
"IATI Technical" discussion list. Find out more at http://www.aidtransparency.net/governance/tag
 
To post to this group, send email to iati-te...@googlegroups.com
 
To unsubscribe from this group, send email to
iati-technica...@googlegroups.com
 
For more options, including the option to switch to a digest subscription, visit this group at http://groups.google.com/group/iati-technical
 
Tickets for the IATI technical secretariat can be posted to http://support.iatistandard.org

Alex Gartner

Nov 28, 2014, 5:54:02 PM
to iati-te...@googlegroups.com
Hi David, 

thank you so much for the feedback.
Dan is currently on leave until next week and he asked me to reply to the email as part of the team who developed this alpha version.

I will try to reply inline below:


On Tue, Nov 25, 2014 at 1:06 PM, david C <capr...@gmail.com> wrote:
> Hi Dan
>
> Good work - this looks pretty good to me.
>
> Couple of thoughts...
>
> I think you could do with an 'About' page on the application that explains everything you've said in the email to explain what is going on to those that visit the site directly.
We will definitely do a FAQ and About page in our next release (soon I hope)

> Can we split Organisations into two distinct types? Those that publish IATI data (i.e. are 'reporting-orgs'), and those that don't (these are reported as participating-orgs).
> It looks (to me) as though you are using participating-org from activity files as the source of organisation data.
Currently we are tracking 'participating-org', 'reporting-org', 'provider-org', 'receiver-org' elements from IATI sources. If I understand correctly, you would be interested to see the information coming from the 'reporting-org' field separately, right?
Right now we merge the information together (in one entity) no matter what field the respective organization is specified in and no matter what role that organization plays. A user can search by DFID and filter by roles=Funding and would find all mentions of organizations with DFID in the name and that have appeared at least once with the role "Funding"

> I think that for publishers, there is (arguably) a hierarchy of data sources for finding out information about that publisher. So, e.g if I look for GB-1, and it has an associated Organisation file, it would be good to find that easily. Next, I want to know about how that is used as a 'reporting-org', and then next as a 'participating-org' . With the links that you have provided, there is a lot of useful information, but I can't easily find that DFID does have an organisation file, for example.
In this version we are not using the Organization files but it's something that we hope will be part of the next one.

> For organisations in the data that are not publishers themselves, then this seems pretty good to me!
>
> ..and, with the current results, there is lots of interesting 'data quality' information. We can see where people are incorrectly using name/identifier combinations quite easily, so that's pretty useful - we just need more people able to follow those issues up!

We could think about some (semi) automatic reports that produce feedback to the publishers.

> So , to summarise, I think I'd like to see the following.
> As a user I ask - what can you tell me about Organisation X?
>
> I'd like to know about:
> 1) Organisation.xml files
Definitely part of the plan
> 2) Anything you can find about the organisation as a Reporting Organisation
> 3) Anything you can find out about it as a participating organisation
For both (2) and (3) - Should we show any other piece of information that we're currently missing?
> 4) Other stuff you may find interesting (such as alternate spellings, mismatched identifiers and names)
>
> My reading of the application is that it is currently doing 3 and 4 well, I'm not sure about 1 and 2.
>
> Finally, (as the cherry on the cake!), I'm guessing there would be demand for this to be machine readable - some sort of api whereby I can request data on an organisation.
We are using SOLR which itself provides a pretty powerful API (with support for xml and json). Here's an example: 
If there's a need we could also create additional API calls that are more specific or provide additional features.

> Good stuff Dan.
>
> All the best
> David

Thanks and have a nice weekend,
Alex
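
The Solr API mentioned in the reply above could be consumed along these lines. This is a sketch under assumptions rather than a description of the deployed service: it assumes Solr's standard `/select` handler is exposed for the `org_collection` core, and the document fields `name`, `code` and `count` are guesses (only `name` and `count` appear in the sample browse URLs). The sample response follows Solr's usual JSON layout but its contents are made up.

```python
import json
from urllib.parse import urlencode

# Assumption: the standard Solr /select handler is reachable for this core.
SELECT_URL = "http://happy-devs.atman.ro/solr/org_collection/select"


def build_select_url(org_name, rows=10):
    """Build a Solr query URL asking for JSON output (wt=json)."""
    params = {
        "q": f'name:"{org_name}"',
        "sort": "count desc",
        "rows": rows,
        "wt": "json",
    }
    return SELECT_URL + "?" + urlencode(params)


def extract_docs(response_text):
    """Return the list of matching documents from a Solr JSON response."""
    payload = json.loads(response_text)
    return payload["response"]["docs"]


# Made-up response in Solr's standard JSON shape; field names are assumed.
SAMPLE_RESPONSE = """
{"responseHeader": {"status": 0},
 "response": {"numFound": 1, "start": 0,
              "docs": [{"name": "DFID", "code": "GB-1", "count": 3}]}}
"""

print(build_select_url("DFID"))
for doc in extract_docs(SAMPLE_RESPONSE):
    print(doc["name"], doc["code"], doc["count"])
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so that the sketch runs without network access.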

david C

Dec 1, 2014, 5:33:58 AM
to iati-te...@googlegroups.com
Thanks Alex.

That all sounds spot on to me.

I do think that 'reporting-org' is a special case, because given a reporting-org, we do have a good chance of tracking the organisation down, and finding out more about them (and their organisation files should tell us a good deal about them), so I think worth looking into.

The API looks great!

Thanks again

David

Steven Flower

Dec 3, 2014, 1:14:29 PM
to iati-te...@googlegroups.com
Hi

+1 - this is very useful

I'd also echo the call around "reporting-org" identifiers.  This week, I started to clear up this list (believe me, it was much longer on Monday morning):  http://dashboard.iatistandard.org/reporting_orgs.html.  This compares the reporting-org ID(s) in the data, against the field that is (manually) recorded for the publisher in the IATI Registry.  I noticed quite a lot of data entry issues (spaces, _ instead of -, lowercase agency prefix), quite often through AidStream - so there could be room to tighten up on that at source.   Thankfully, there were also tickets raised at data tickets to help pinpoint and clarify further issues:

http://data.tickets.iatistandard.org/query?status=closed&col=id&col=summary&col=owner&col=component&col=version&col=resolution&col=reporter&report=14&order=priority
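
The three data entry issues mentioned above (spaces, `_` instead of `-`, lowercase agency prefix) are easy to flag mechanically. A minimal sketch follows; the function name and issue labels are made up, and treating everything before the final separator as the "prefix" is a simplifying heuristic, not a rule from the standard.

```python
import re


def find_entry_issues(ref):
    """Return labels for common data entry problems in an organisation ID."""
    issues = []
    if re.search(r"\s", ref):
        issues.append("whitespace")
    if "_" in ref:
        issues.append("underscore")
    # Heuristic: every segment except the last is treated as the agency prefix.
    parts = re.split(r"[-_\s]+", ref.strip())
    if any(part != part.upper() for part in parts[:-1]):
        issues.append("lowercase-prefix")
    return issues


for candidate in ["GB-1", "gb-1", "GB_COH_123", " GB-1"]:
    print(repr(candidate), find_entry_issues(candidate))
```

A check like this could run at the point of data entry, which is where the tightening-up suggested above would have the most effect.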

This is a slight tangent from the initial email sent by Dan.  The work with Solr illustrates the task ahead - with reporting-org being the tip of the iceberg.  A question I'd have would be to understand the workflow(s) this could provide:

- data quality via the lens of Org identifiers.  Our purpose is to isolate the identifiers that are ambiguous, and seek a resolution (or just flag it)
- lookup - I need to know how to identify X organisation in my data
- dereferencing - how do people identify *my* organisation
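
The three workflows above map fairly directly onto two indexes built from the same name/identifier pairs. The sketch below is illustrative only: the helper names are hypothetical, and the pairs simply reuse names and identifiers that appear in this thread.

```python
from collections import defaultdict


def build_indexes(pairs):
    """Build name -> identifiers and identifier -> names indexes."""
    ids_by_name = defaultdict(set)
    names_by_id = defaultdict(set)
    for name, ref in pairs:
        ids_by_name[name].add(ref)
        names_by_id[ref].add(name)
    return ids_by_name, names_by_id


def ambiguous_names(ids_by_name):
    """Data quality view: names that are published with more than one identifier."""
    return {name: refs for name, refs in ids_by_name.items() if len(refs) > 1}


# Illustrative pairs using names and identifiers mentioned in this thread.
pairs = [
    ("DFID", "GB-1"),
    ("Department for International Development", "GB-1"),
    ("DFID", "XM-DAC-12-1"),
]
ids_by_name, names_by_id = build_indexes(pairs)

print(ids_by_name["DFID"])           # lookup: how do I identify X in my data?
print(names_by_id["GB-1"])           # dereferencing: how do people describe this ID?
print(ambiguous_names(ids_by_name))  # data quality: which names need resolving?
```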

Do we therefore need to get to an authoritative identifier "list" (bearing in mind the methodology in place at http://iatistandard.org/organisation-identifiers/)?

Lastly - with IATI 2.01 not being "concerned" with the text associated with a code, this could also be interesting.  Granted, it is very useful to see the multiple ways that people describe "DfID", but there is some confidence to be had in the fact that GB-1 has gained some traction (or should that be XM-DAC-12-1 ?!)...

BTW - the work of Kit Wallace, OpenCirce, is also well worth checking: http://opencirce.org/org/

Thanks - great stuff Dan et al

Steven



------------------
skype: stevieflow
telephone: 441612981213

Jaap-Andre de Hoop

Dec 12, 2014, 8:18:24 AM
to iati-te...@googlegroups.com
On 12/03/2014 07:14 PM, Steven Flower wrote:
> Hi
>
> +1 - this is very useful
>
> I'd also echo the call around "reporting-org" identifiers. This week,
> I started to clear up this list (believe me, it was much longer on
> Monday morning):
> http://dashboard.iatistandard.org/reporting_orgs.html. This compares
> the reporting-org ID(s) in the data, against the field that is
> (manually) recorded for the publisher in the IATI Registry. I noticed
> quite a lot of data entry issues (spaces, _ instead of -, lowercase
> agency prefix), quite often through AidStream - so there could be room
> to tighten up on that at source. Thankfully, there were also tickets
> raised at data tickets to help pinpoint and clarify further issues:
But don't trust the reporting-org ID published in the IATI Registry. 233
IDs are correct according to the standard, while 54 aren't (checked
against the organisation codelist and the organisation agency codelist).
See the attached file for the incorrect IDs.
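
A check of this kind might look roughly like the sketch below. It is not the script used to produce the attached file. The two sets are small illustrative stand-ins for the real codelists (the codes shown are believed to be on the published lists, but verify against the current versions), and the rule "listed code, or agency prefix plus a non-empty registration number" is an assumption about how validity is judged.

```python
# Illustrative stand-ins for the real codelists; check the published lists.
AGENCY_CODES = {"GB-COH", "GB-CHC", "NL-KVK", "XM-DAC"}
ORGANISATION_CODES = {"GB-1"}


def is_valid_org_id(ref, agency_codes=AGENCY_CODES, organisation_codes=ORGANISATION_CODES):
    """True if ref is a listed organisation code or starts with a known agency prefix."""
    if ref in organisation_codes:
        return True
    for code in agency_codes:
        prefix = code + "-"
        if ref.startswith(prefix) and len(ref) > len(prefix):
            return True
    return False


def split_valid_invalid(refs):
    """Partition identifiers into (valid, invalid) lists, preserving order."""
    valid = [ref for ref in refs if is_valid_org_id(ref)]
    invalid = [ref for ref in refs if not is_valid_org_id(ref)]
    return valid, invalid


# Made-up identifiers for demonstration only.
valid, invalid = split_valid_invalid(["GB-1", "NL-KVK-12345678", "gb-coh-123", "GB-COH-"])
print("valid:", valid)
print("invalid:", invalid)
```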

Greetings,

Jaap-Andre (dreams about better data quality)

--
Data-Assist
Tubalaan 7
7577 LK Oldenzaal
06-16846315
skype: jaap-andre

registry-organisation-ref-check.csv

David Megginson

Dec 12, 2014, 11:01:17 AM
to iati-te...@googlegroups.com
I'm very interested in this thread, because I believe that an org-identifier ecosystem is one of the biggest wins we could pull off for aid data interoperability over the next couple of years.  This applies especially to IATI and HXL, of course, but also to associated efforts like Open Corporates. One thing I've realized in my years on both IATI and HXL is that shared codes and identifiers are, if anything, more important than shared formats -- it's not excessively complicated to combine data from CSV, XML, RDF N3, JSON, and <whatever> if we're actually using the same identifiers/codes to refer to our key business objects, especially orgs, subnational geographical areas, sectors, and populations (SADD). 
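
A tiny sketch of the point about shared identifiers mattering more than shared formats: once two sources agree on the organisation identifier, joining a CSV file with a JSON document takes only a few lines. All data here is made up for illustration (the activity counts in particular are invented), and the column and field names are hypothetical.

```python
import csv
import io
import json

# Made-up data: one source in CSV, another in JSON, sharing only the org ID.
CSV_TEXT = "org_id,activities\nGB-1,12\nXM-DAC-12-1,3\n"
JSON_TEXT = '[{"org_id": "GB-1", "name": "DFID"}]'


def join_on_org_id(csv_text, json_text):
    """Attach names from the JSON source to rows from the CSV source."""
    names = {record["org_id"]: record["name"] for record in json.loads(json_text)}
    joined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        joined.append({
            "org_id": row["org_id"],
            "activities": int(row["activities"]),
            "name": names.get(row["org_id"]),  # None when the identifiers differ
        })
    return joined


joined = join_on_org_id(CSV_TEXT, JSON_TEXT)
for record in joined:
    print(record)
```

The second row stays unnamed because the two sources use different identifiers for what may be the same organisation, which is exactly the gap an identifier ecosystem would close.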


Cheers, David

Bill Anderson

Dec 12, 2014, 3:42:52 PM
to <iati-technical@googlegroups.com>
Couldn't agree more David



Sarah Johns

Feb 4, 2015, 1:13:26 PM
to iati-te...@googlegroups.com

Coming a bit late to this, but I wanted to +1 the work you are all doing on IATI organisation identifiers, and the call to standardise, particularly across IATI, HXL and Open Corporates. Have you seen the work on open LEIs (legal entity identifiers) that Open Corporates have been involved with? I'm not sure how/whether you could use LEIs as another reference/lookup, but, for example, Save the Children have both an LEI and an IATI ID, both of which are based on their Companies House registration. The LEI stays static even if the organisation changes its name or organisation type, whereas the IATI ID would change if the organisation moved registration agencies, i.e. charity to company.

P.S. If any UK IATI organisations need a reminder to correct their IATI org ID, let me know and I'll contact them.

Cheers, Sarah
