tiger_import: redirecting the generated SQL to a text file for debugging.

14 views
Skip to first unread message

John L. Poole

unread,
Mar 28, 2010, 6:25:16 PM3/28/10
to GeoCommons Geocoder
The tiger_import script elegantly assembles a massive SQL statement
that is piped into sqlite. Unfortunately, if you want to debug to
isolate where the loader errors out, you need to make modifications as
follows:

# original line below
# cat ${SQL}/convert.sql) | sqlite3 $DATABASE
#
# Debug line
#
cat ${SQL}/convert.sql) > /tmp/debug_tiger_import.sql

Simiply open tiger_import and save it under a different name and then
rem out the "cat..." line replacing it with the above.

When I did so for the Napa County, California, files only it generated
a 25 MB file.

Once you have your debug file, then you can load it into sqlite as
follows:

hermes geocoder # cat /tmp/debug_tiger_import.sql |sqlite3
debug.db

When I did so, I was able to recreate some error conditions which I'll
now need to isolate further and determine why the loading is not error
free:

hermes geocoder # cat /tmp/debug_tiger_import.sql |sqlite3 debug.db
memory
Error: near line 50693: no such table: feature
Error: near line 50704: no such table: feature_edge
Error: near line 50712: no such table: feature
Error: near line 50718: no such table: edge
Error: near line 50725: no such table: range
hermes geocoder #

Kate Chapman

unread,
Mar 29, 2010, 1:41:13 AM3/29/10
to GeoCommons Geocoder
John,

Have you figured out the issue? Looks like your tables weren't being
created before the import began.

-Kate

John L. Poole

unread,
Mar 29, 2010, 2:00:20 AM3/29/10
to GeoCommons Geocoder

On Mar 28, 10:41 pm, Kate Chapman <k8chap...@gmail.com> wrote:
> John,
>
> Have you figured out the issue? Looks like your tables weren't being
> created before the import began.
>
> -Kate

<snip>

Yes, you are right, the portion of the script:

# Initialize the database if it doesn't exist.
[ ! -r $DATABASE ] && cat ${SQL}/{create,place}.sql | sqlite3
$DATABASE

needed to be run first.

I was able to successfully load the Napa County data without any
error.

However, my results still came back with longitude and latitudes that
did not make sense. I had previously downloaded
only the files under the Napa County folder; I did not download any
files from directories that are parent or grandparent
to the Napa County folder.

I'm guessing that files at the state level, and national level may be
required for all counties and provide
a point of reference for the county data? At any rate, for the last
four hours I have been downloading the zip files directly
under http://www2.census.gov/geo/tiger/TIGER2009/ and then only all
files under the folder http://www2.census.gov/geo/tiger/TIGER2009/06_CALIFORNIA/
and when the downloads finish, I'll try a new seeding of a new
database.

Kate Chapman

unread,
Mar 29, 2010, 2:22:21 AM3/29/10
to geocommon...@googlegroups.com
Hi John,

You might try looking in the archives for this thread "Strange
Lat/Long numbers for Tiger 2008." It might help you resolve the
issLoading rich text...ue.

-Kate

> To unsubscribe from this group, send email to geocommons-geocode+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.
>

John L. Poole

unread,
Mar 29, 2010, 8:43:35 AM3/29/10
to GeoCommons Geocoder
On Mar 28, 11:22 pm, Kate Chapman <k8chap...@gmail.com> wrote:
> Hi John,
>
> You might try looking in the archives for this thread "Strange
> Lat/Long numbers for Tiger 2008."  It might help you resolve the
> issLoading rich text...ue.
>
> -Kate
>
> On Mon, Mar 29, 2010 at 2:00 AM, John L. Poole <jlpool...@gmail.com> wrote:

Thank you, Kate, your suggestion was helpful.

Yes, I'm on a 64 bit platform and I followed the suggested
modifications which were:

diff --git a/src/libsqlite3_geocoder/wkb_compress.c b/src/
libsqlite3_geocoder/wkb_compress.c
index fc717f3..a4643fd 100644
--- a/src/libsqlite3_geocoder/wkb_compress.c
+++ b/src/libsqlite3_geocoder/wkb_compress.c
@@ -4,6 +4,10 @@
uint32_t compress_wkb_line (void *dest, const void *src, uint32_t
len) {
uint32_t d, s;
double value;
+ /*
+ hack per http://groups.google.com/group/geocommons-geocode/msg/b471f9db7ce15c58?
+ */
+ exit(1);
if (!len) return 0;
for (s = 9, d = 0; s < len; d += 4, s += 8) {
value = *(double *)(src + s);


diff --git a/lib/geocoder/us/database.rb b/lib/geocoder/us/database.rb
index 77750f4..10b9a12 100644
--- a/lib/geocoder/us/database.rb
+++ b/lib/geocoder/us/database.rb
@@ -9,6 +9,9 @@ require 'time'

require 'geocoder/us/address'

+require 'geo_ruby'
+include GeoRuby::SimpleFeatures
+
module Geocoder
end

@@ -493,11 +496,9 @@ module Geocoder::US
def unpack_geometry (geom)
points = []
if !geom.nil?
- coords = geom.unpack "V*" # little-endian 4-byte long ints
-
- # now map them into signed floats
- coords.map! {|i| ( i > (1 << 31) ? i - (1 << 32) : i ) /
1_000_000.0}
- points << [coords.shift, coords.shift] until coords.empty?
+ g = Geometry.from_ewkb(geom)
+ g.points.each{|point|
+ points << [point.x, point.y]}
end
points
end

I successfully geocoded "2133 First Street, Napa, CA 94559".

(My intersection test of "First & Main, Napa, CA 94559" caused the
GeoRuby::SimpleFeatures:EWKBFormatError and that's another problem to
wrangle with later)

Reply all
Reply to author
Forward
0 new messages