MongoDB 2.0, spatial indexes and the Geonames database

Ivan Belmonte

Oct 5, 2011, 6:33:25 PM
to mongodb-user
Hello sirs,

I recently ran into a problem with MongoDB 2.0 and spatial indexes.

I'm hosting a copy of the Geonames database:

http://download.geonames.org/export/dump/

I downloaded the "allcountries.zip" file, which contains a tab-separated
text file with latitude and longitude (among various other fields)
for locations (cities, countries, points of interest…) around the
world.

I wrote a Ruby script to parse that file and insert every row into a
MongoDB collection, following the instructions in the MongoDB
documentation for the "loc" field.
I am building a Ruby on Rails application, so for that purpose I used
a Mongoid model, but a similar script in any other language should
work too.
I'm attaching my script at the end of this message so you can follow
my steps.

Once I had the whole dataset in a MongoDB collection, I ran the
command to build a 2d index on the "loc" field:

db.locations.ensureIndex({loc:"2d"});


The problem is as follows:

1) On MongoDB 2.0 I get this error:

point not in interval of [ -180, 180 )

2) On MongoDB 1.8.3 it works as expected, with no errors or problems
of any kind.

3) I tried dumping the working dataset from a 1.8.2 instance,
restoring it to a 2.0 one, and then running the command to build the 2d
index. I got the same error.

4) I also tried importing the dataset in CSV format, with the same
result for both versions.


I spent 2 days on this, trying all kinds of dumps and restores.
Is it a bug, or am I doing something wrong?


=== Attachments ===

1) A mongodump of my locations collection:

http://ivanhq.net/stuff/locations.bson.gz


2) My "location.rb" Mongoid model:

class Location
  include Mongoid::Document
  include Mongoid::Spacial::Document

  field :geonameid, type: Integer
  field :name
  field :ansiname
  field :alternatenames
  field :latitude, type: Float
  field :longitude, type: Float
  field :loc, type: Array, spacial: true
  field :feature_class
  field :feature_code
  field :country_code
  field :cc2
  field :admin1_code
  field :admin2_code
  field :admin3_code
  field :admin4_code
  field :population
  field :elevation
  field :gtopo30
  field :timezone
  field :modification_date

  spacial_index :loc
end


3) My Ruby script for loading the original "allcountries.txt" file
into MongoDB:

# Assumes the Location model above is already loaded
# (for example, when run inside the Rails environment).
input_file = ARGV[0]

unless input_file
  puts "*** Please specify <input_file>"
  exit
end

# File.foreach reads line by line and closes the file when done.
File.foreach(input_file) do |i_line|
  # Strip the trailing newline, then split the tab-separated columns.
  o_line = i_line.chomp.split("\t")

  location = Location.new

  location.geonameid = o_line[0]
  location.name = o_line[1]
  location.ansiname = o_line[2]
  location.alternatenames = o_line[3]
  # Stored as [longitude, latitude]: column 5 is longitude, column 4 is latitude.
  location.loc = [o_line[5].to_f, o_line[4].to_f]
  location.feature_class = o_line[6]
  location.feature_code = o_line[7]
  location.country_code = o_line[8]
  location.cc2 = o_line[9]
  location.admin1_code = o_line[10]
  location.admin2_code = o_line[11]
  location.admin3_code = o_line[12]
  location.admin4_code = o_line[13]
  location.population = o_line[14]
  location.elevation = o_line[15]
  location.gtopo30 = o_line[16]
  location.timezone = o_line[17]
  location.modification_date = o_line[18]

  location.save
end

===

Thanks for your time and attention!
Ivan

Karl Seguin

Oct 5, 2011, 9:46:14 PM
to mongod...@googlegroups.com
I'm surprised this works in 1.8.3.
By default, geospatial indexes have a range of [-180, 180), exclusive of the upper value.

There are a number of records in the data that fall on 180, like:

Glomar Challenger Basin (-77.75, 180)

Personally, I feel like the default range in MongoDB is wrong. I'll look more into it and raise an issue once I have a better feel for it.

You should either change your code so that a value of 180 is stored as 179.9 (ugly), or change your index's bounds so that the max is 181:

db.places.ensureIndex( { loc : "2d" } , { min : -180 , max : 181 } )
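
To see which records would trip the range check before building the index, a pre-scan along these lines may help. This is only a minimal sketch in Ruby, assuming the bounds described above (minimum inclusive, maximum exclusive) and points stored as [longitude, latitude]; the helper names `in_2d_range?` and `out_of_range_points` are hypothetical and not part of MongoDB or Mongoid, and the third sample point is made up for illustration.

```ruby
# Hypothetical helpers, not part of MongoDB or Mongoid.
# Assumes the default 2d bounds described above: min inclusive, max exclusive.
DEFAULT_MIN = -180.0
DEFAULT_MAX = 180.0

# True when both coordinates fall inside [min, max).
def in_2d_range?(lon, lat, min = DEFAULT_MIN, max = DEFAULT_MAX)
  [lon, lat].all? { |v| v >= min && v < max }
end

# Returns the [lon, lat] pairs that fall outside [min, max).
def out_of_range_points(points, min = DEFAULT_MIN, max = DEFAULT_MAX)
  points.reject { |lon, lat| in_2d_range?(lon, lat, min, max) }
end

points = [
  [180.0, -77.75],  # Glomar Challenger Basin, as [longitude, latitude]
  [-180.0, 10.0],   # lower bound is inclusive, so this one passes
  [2.17, 41.38]     # illustrative point well inside the range
]

p out_of_range_points(points)             # default bounds
p out_of_range_points(points, -180, 181)  # widened max, as in the command above
```

With the default bounds only the Glomar Challenger Basin point is reported; with the maximum widened to 181 the list comes back empty.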

Karl Seguin

Oct 6, 2011, 12:34:13 AM
to mongod...@googlegroups.com
I created the Jira issue at:

You can vote for it and/or keep a watch on it. It's already been flagged as a hopeful 2.1 fix.

Ivan Belmonte

Oct 6, 2011, 12:09:19 PM
to mongodb-user
Hey folks, thanks a lot :-)
I was going crazy with this.

Alvaro Muir

Oct 6, 2011, 9:39:42 PM
to mongodb-user
This is a very ugly answer, but...

I had the same issue. Even worse, when I wrote a Python script I found
that about 10% of the database wouldn't load.
Just to be transparent, I am hosting the entire 7.9 million records.

I think it was more a problem with my poor coding and the text file.

What I did was import the CSV into MySQL, then run a Python script to
move the data from MySQL to Mongo.

This way, I knew the data was sanitized, and I could track each row as
it was being inserted.

This script took about half an hour to run, but there were no more
errors on ensureIndex.
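
The idea of tracking each row as it is inserted could also be tried directly against the text file, without the MySQL detour. The following is only a sketch in Ruby, assuming the 19-column, tab-separated layout used by the import script earlier in the thread (latitude in column 4, longitude in column 5); `row_problem` is a hypothetical helper, and the sample rows are made up for illustration.

```ruby
# Hypothetical helper; the column layout is taken from the import script
# earlier in the thread (columns 0..18, latitude at 4, longitude at 5).
EXPECTED_FIELDS = 19

# Returns nil when the row looks loadable, or a short reason string otherwise.
def row_problem(line)
  fields = line.chomp.split("\t", -1) # -1 keeps trailing empty columns
  unless fields.size == EXPECTED_FIELDS
    return "expected #{EXPECTED_FIELDS} fields, got #{fields.size}"
  end

  begin
    lat = Float(fields[4]) # Float() raises on empty or non-numeric text
    lon = Float(fields[5])
  rescue ArgumentError
    return "latitude/longitude is not numeric"
  end

  return "latitude #{lat} out of range" unless lat >= -90 && lat <= 90
  return "longitude #{lon} out of range" unless lon >= -180 && lon <= 180
  nil
end

# Made-up sample rows: one valid, one truncated, one with a bad latitude.
good = Array.new(EXPECTED_FIELDS, "x").tap { |f| f[4] = "-77.75"; f[5] = "180" }.join("\t")
short = "1\tSomewhere"
bad_lat = Array.new(EXPECTED_FIELDS, "x").tap { |f| f[4] = "abc"; f[5] = "10" }.join("\t")

[good, short, bad_lat].each.with_index(1) do |line, number|
  problem = row_problem(line)
  puts "line #{number}: #{problem}" if problem
end
```

Running the same check over the real file, for example with `File.foreach`, would give a list of line numbers to inspect before inserting anything. Note that a longitude of exactly 180 passes this check as valid data, even though the default 2d index range discussed above rejects it.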