geoNear aggregation does not deliver the expected results with 2dsphere index


Koko

Oct 7, 2016, 3:16:29 PM
to mongodb-user
Hi,

I have 2,000,000 GPS locations in my collection. I've added a 2dsphere index on the location field.
The location field is a GeoJSON "object" and I know that I have to use coordinates in the lon/lat order for my queries.

So I have the same data in Cassandra, indexed with Apache Lucene (https://github.com/Stratio/cassandra-lucene-index). 

I want to get the entries that match a specific point within a radius of 3.9km.
With Cassandra, I get about 1.1 million rows, which sounds quite reasonable to me.

With MongoDB, I only get 320xxx results. 
Might it be the coordinate system that is used?

I'm running this query:

db.data.aggregate([
   {
      $geoNear: {
         near: {
            type: "Point",
            coordinates: [10.7xxxxx, 52.4xxxxx]
         },
         distanceField: "dist.calculated",
         maxDistance: 3900,
         num: 50000000,
         includeLocs: "dist.location",
         spherical: true
      }
   }
])


I'm querying from a Java application. The application just iterates over every entry the database returns and increments a counter.
That's all. I run a stopwatch on the client side and count the number of rows returned.
I can see that the MongoDB Java driver fetches more data several times, but after ~3xx.xxx rows it stops.
So I'm not sure whether this is related to the Java driver or to the geo queries.

Any suggestions? 
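
One way to rule out the coordinate system is to recompute the distance for a few returned and missing rows on the client. Below is a minimal sketch of a haversine check in plain JavaScript. The Earth radius constant and the sample coordinates are assumptions for illustration only (the real coordinates are elided above), and MongoDB's internal distance calculation may differ slightly.

```javascript
// Mean Earth radius in meters (assumed value; MongoDB may use a slightly different constant).
const MEAN_EARTH_RADIUS_M = 6371008.8;

function toRadians(deg) {
  return deg * Math.PI / 180;
}

// Great-circle distance in meters between two [lon, lat] pairs (GeoJSON order).
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * MEAN_EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Hypothetical points, not the real query coordinates.
const center = [10.75, 52.42];
const candidate = [10.80, 52.42];
console.log(haversineMeters(center, candidate) <= 3900);
```

If a row that Cassandra returns also passes this check but is missing from the MongoDB result, the difference is probably not caused by the lon/lat order.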

Koko

Oct 10, 2016, 7:21:33 AM
to mongodb-user
I have found out that the mongod log shows the following error: "too many results for query ..., truncating output", which means that the maximum document size of 16 MB is exceeded.

How can I get this query working nonetheless?

Lungang Fang

Oct 24, 2016, 7:32:43 PM
to mongodb-user

Hi Koko,

> I have found out that the mongod logs show the following error: “too many results for query …, truncating output” which means that the maximum document size of 16mb gets exceeded.
>
> How can I get this query working nonetheless?

Currently, the “geoNear” aggregation stage can only return results that fit within the 16MB BSON size limit. This is related to an issue with earlier versions of MongoDB (described in https://jira.mongodb.org/browse/SERVER-13486). Your query hits this issue because “geoNear” returns a single document (containing an array of result documents), and the “allowDiskUse” aggregation pipeline option unfortunately does not help in this case.

There are two options that could be considered:

  1. If you don’t need all the results, you could limit the “geoNear” aggregation result size using the num, limit, or maxDistance options.
  2. If you require all of the results, you can use the find() method, which is not subject to the BSON maximum size because it returns a cursor.

For your information, below is a test I ran on MongoDB 3.2.10.

  1. Create a “2dsphere” index on the collection:

         db.coll.createIndex({location: '2dsphere'})
    
  2. Create and insert several big documents.

         // Build a padding string of about 15MB so that two documents together exceed the 16MB limit.
         var padding = '';
         for (var j = 0; j < 15; j++) {
             for (var i = 1024*128; i > 0; --i) {
                 padding += '12345678';
             }
         }
    
         db.coll.insert({location:{type:"Point", coordinates:[-73.861, 40.73]}, padding:padding})
         db.coll.insert({location:{type:"Point", coordinates:[-73.862, 40.73]}, padding:padding})
         db.coll.insert({location:{type:"Point", coordinates:[-73.863, 40.73]}, padding:padding})
         db.coll.insert({location:{type:"Point", coordinates:[-73.864, 40.73]}, padding:padding})
         db.coll.insert({location:{type:"Point", coordinates:[-73.865, 40.73]}, padding:padding})
         db.coll.insert({location:{type:"Point", coordinates:[-73.866, 40.73]}, padding:padding})
    
  3. Query using “geoNear”; the server log shows “Too many geoNear results …, truncating output”:

         db.coll.aggregate(
             [
                 {
                     $geoNear:{
                         near:{type:"Point", coordinates:[-73.86, 40.73]},
                         distanceField:"dist.calculated",
                         maxDistance:150000000,
                         spherical:true
                     }
                 },
                 {$project: {location:1}}
             ]
         )
    
  4. Query using “find”; all expected documents are returned:

         // This and the following "var" are necessary to keep the padding string from flooding the screen.
         var cursor = db.coll.find(
             {
                 location: {
                     $near: {
                         $geometry: {type:"Point", coordinates:[-73.86, 40.73]},
                         $maxDistance: 150000
                     }
                 }
             }
         )
    
         // It is necessary to iterate through the cursor. Otherwise, the query is not actually executed.
         var x = cursor.next()
         x._id
         var x = cursor.next()
         x._id
         ...
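
As a quick check of the test setup above, the arithmetic behind the padding size can be sketched in plain JavaScript. It assumes the padding string is pure ASCII (one byte per character) and ignores the small overhead of the other fields; the constant names are only illustrative.

```javascript
// 16MB BSON document size limit, in bytes.
const BSON_LIMIT_BYTES = 16 * 1024 * 1024;

// Same shape as the padding loop in step 2: 15 outer iterations,
// 128 * 1024 inner iterations, 8 ASCII characters appended each time.
const chunk = '12345678';
const paddingBytes = 15 * (128 * 1024) * chunk.length;

// How many of these documents fit into a single 16MB response document.
const docsThatFit = Math.floor(BSON_LIMIT_BYTES / paddingBytes);

console.log(paddingBytes, docsThatFit); // prints "15728640 1"
```

Each test document therefore carries about 15MB of padding, so a single response document has room for only one of the six, which is why the “geoNear” result is truncated while the cursor returned by find() is not.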
    

Regards,
Lungang

Koko

Dec 5, 2016, 4:26:39 AM
to mongodb-user
Hi Lungang,

Sorry for getting back to you so late. Unfortunately, your proposed solution does not work for me because my collection is sharded:

Error: error: {
        "code" : 13501,
        "ok" : 0,
        "errmsg" : "use geoNear command rather than $near query"
}


Can you think of a different solution to this issue?
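
One direction that may be worth trying (it was not proposed in this thread) is a find() with $geoWithin and $centerSphere, which the MongoDB documentation lists as supported on sharded collections. Below is a minimal sketch that only builds the filter document. The helper name, the sample center point, and the field name are assumptions for illustration. $centerSphere expects the radius in radians, so the distance is divided by the equatorial Earth radius (about 6378.1 km in the MongoDB documentation).

```javascript
// Equatorial Earth radius in meters, used to convert a distance to radians.
const EQUATORIAL_EARTH_RADIUS_M = 6378100;

// Hypothetical helper: builds a filter that matches documents whose `field`
// lies within `radiusMeters` of the [lon, lat] center point.
function buildWithinRadiusFilter(field, lonLat, radiusMeters) {
  return {
    [field]: {
      $geoWithin: {
        $centerSphere: [lonLat, radiusMeters / EQUATORIAL_EARTH_RADIUS_M]
      }
    }
  };
}

// Hypothetical center point; the real coordinates are elided in the thread.
const filter = buildWithinRadiusFilter('location', [10.75, 52.42], 3900);
console.log(JSON.stringify(filter));
```

In the mongo shell the resulting object could be passed as db.data.find(filter), which returns a cursor. Unlike $near and $geoNear, $geoWithin neither sorts by distance nor returns the computed distance, so this only fits if counting or fetching the matching rows is enough.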

Greetings,