sample data


James Wilson

Jan 31, 2010, 9:01:40 AM
to crisis-w...@googlegroups.com
I have access to the "database" that KROS (CROSE) created for the city of Jacmel to report on which buildings were damaged. It's actually not a database and not normalized, just an Excel worksheet. A few takeaway points:

1) Volunteers
There are plenty of volunteers (local Haitians) willing to go out and get data from people living in the streets.
2) The system for taking 'notes', as it were, was just a simple printout of the Excel worksheet.
3) The digitization process was very flawed. It usually required two people: one to read the printout and another, "technically inclined" person (someone who can use a keyboard and Excel) to enter the data.
4) A student recompiled all the Excel worksheets turned in by volunteers, copying them by hand into one large list of about 8000 lines.

I could probably send it to anyone interested, 

James Wilson

Jan 31, 2010, 9:33:12 AM
to crisis-w...@googlegroups.com
Oops, I hit send before the list was complete.... Here is a complete list...


I have access to the "database" that KROS (CROSE) created for the city of Jacmel to report on which buildings were damaged. It's actually not a database and not normalized, just an Excel worksheet. A few takeaway points:

1) Volunteers:

There are plenty of volunteers (local Haitians) willing to go out and get data from people living in the streets.

2) Data Collection Process:

The system for taking 'notes', as it were, was just a simple printout of the Excel worksheet.

3) The digitization process was very flawed. It usually required two people: one to read the printout and another, "technically inclined" person (someone who can use a keyboard and Excel) to enter the data.

4) A student recompiled all the Excel worksheets turned in by volunteers, copying them by hand into one large list of about 8000 lines.

5) Data Integrity:

The Excel worksheet is fraught with formatting errors, either from data entered in the wrong column (step 3 above) or from columns mismatched by copy/paste during the recompilation process (step 4 above).

6) Because of the way the 'database columns' were designed, a house may also have been classified as a hotel. No validation can be performed at the time of digitization in Excel without an experienced programmer to set up some sort of 'form' for entering information and validating it. Also, there are many cases where the Building Type field is missing, and I was told to assume that such a building would be counted as a 'House'.

7) The building data is often very incomplete: there is no field, such as the APN code (mentioned in other threads), that exists for all buildings. The information collected (although not always present) that could be used to identify a property is Neighborhood, Street, Number, Proprietor Name, and Tenant Name. Where the street number is unknown, the EDH code (Electricité d'Haïti) may be used, but often the entire Number field is left blank. The only real way to identify a property is by Street name, Neighborhood, and Owner. This may not be the case in PaP, but I would guess that it will be a problem there too.

8) Other fields of information collected are: Building Type (church, school, resto/hotel, home, other, etc.), Amount of damage (partial, complete), Building size (Small, Medium, Large), Number of Families living in the building, and Comments (often filled with information like 'under construction', 'cracked walls', or 'one person confirmed dead').
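
To make points 5 through 8 concrete, here is a rough Python sketch of the kind of entry-time check that Excel alone could not give us. It is only an illustration under assumptions: the class name BuildingRecord, the functions validate and property_key, and the exact lowercase spellings of the allowed values are all made up for this example and are not taken from the actual worksheet. It defaults a blank Building Type to 'home' (the 'House' assumption from point 6), flags values that look like they landed in the wrong column, and builds an identifier from Neighborhood, Street, and Owner as described in point 7.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Allowed values follow the field list in point 8; the exact spellings
# used in the real worksheet are an assumption.
BUILDING_TYPES = {"church", "school", "resto/hotel", "home", "other"}
DAMAGE_LEVELS = {"partial", "complete"}
SIZES = {"small", "medium", "large"}
DEFAULT_TYPE = "home"  # point 6: a missing Building Type is counted as a house


@dataclass
class BuildingRecord:
    """One worksheet row (hypothetical field names)."""
    neighborhood: str = ""
    street: str = ""
    number: str = ""        # street number or EDH code; often blank
    proprietor: str = ""
    tenant: str = ""
    building_type: str = ""
    damage: str = ""
    size: str = ""
    families: Optional[int] = None
    comments: str = ""


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so hand-typed values compare equal."""
    return " ".join(text.strip().lower().split())


def property_key(record: BuildingRecord) -> Tuple[str, str, str]:
    """Identifier built from Neighborhood, Street, and Owner (point 7)."""
    return (
        normalize(record.neighborhood),
        normalize(record.street),
        normalize(record.proprietor),
    )


def validate(record: BuildingRecord) -> List[str]:
    """Return a list of problems; an empty list means the row looks usable."""
    problems = []
    building_type = normalize(record.building_type) or DEFAULT_TYPE
    if building_type not in BUILDING_TYPES:
        # A damage or size value here usually means a shifted column.
        problems.append(f"unknown building type: {record.building_type!r}")
    if normalize(record.damage) not in DAMAGE_LEVELS:
        problems.append(f"unknown damage level: {record.damage!r}")
    size = normalize(record.size)
    if size and size not in SIZES:
        problems.append(f"unknown building size: {record.size!r}")
    if record.families is not None and record.families < 0:
        problems.append("number of families cannot be negative")
    if not all(property_key(record)):
        problems.append("cannot identify property: need neighborhood, street, and owner")
    return problems
```

Running validate on each row before it is merged into the big list would catch a value like 'partial' sitting in the Building Type column, and property_key gives a way to spot the same building entered twice. The Number field is deliberately left out of the key because it is so often blank.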

If anyone is interested in seeing this document, I could send it directly to you outside of this list.

Chris

Jan 31, 2010, 1:24:55 PM
to Crisis Workflows
EXCELLENT ...precisely what we need at this stage.

We must ask ourselves at every stage, as we model things in Drupal and
elsewhere, "how will this be sustained?" Do you have a link to the
KROS effort? Who is organizing, how widespread/sustained is the
effort, who is the customer, etc. What is their concept of the "master
database", and if they don't have one, are we "it"? If they don't have
Internet connectivity, and we want to engage Internet people to clean
up/geocode/etc., can thumb drives to/from hotel rooms work periodically?
etc.

Which in turn raises the question: What is the transition plan for the
'long haul'? We will put in place something now to aggregate field
reports from aerial/satellite, etc. And of course, we should always
keep in the back of our minds: "Can this cast a mould for better
transparency in government during the permit review cycle down the
road?" and help break the cycle of rampant corruption through
engineered roles for review, etc.

I will try to do more brain dumps to the wiki with notes from
yesterday, and try to sync with the World Bank assessment folks,
MapAction and ReliefWeb people either later today or tomorrow.

Thank you James!
