Java driver query slow, mongo console fast

chriskessel

Aug 4, 2010, 6:10:10 PM

to mongodb-user

I've written a tiny routine in Java (Groovy actually) to execute a
query:
BasicDBObject query = new BasicDBObject();
query.put("matchKey", "2538712337320987321");
DBCursor contacts = getMongoTelcoCollection().find(query).batchSize(250).limit(250);

When this executes, I see in the mongod console the following line:
Wed Aug 4 16:56:39 query cip.telco ntoreturn:250 reslen:270282 nscanned:250 { matchKey: "2538712337320987321" } nreturned:250 3201ms

However, running the same query in the mongo console with .explain()
reports a 0 millisecond execution time, and nothing is logged in the
mongod window.

> db.telco.find( {matchKey:"2538712337320987321"}).batchSize(250).limit(250).explain()
{
    "cursor" : "BtreeCursor matchKey_1",
    "nscanned" : 250,
    "nscannedObjects" : 250,
    "n" : 250,
    "millis" : 0,
    "indexBounds" : [
        [
            {
                "matchKey" : "2538712337320987321"
            },
            {
                "matchKey" : "2538712337320987321"
            }
        ]
    ]
}

Can someone explain why there's a difference both in performance and
in the mongod log output? I'm on 1.5.3.

Thanks,
Chris

Eliot Horowitz

Aug 4, 2010, 11:30:50 PM

to mongod...@googlegroups.com

When doing the explain, it's just looking at the index.
When actually doing the query, it's loading all the data off disk and
copying it to the network.
If you do it again, is it fast or slow? Probably reading all the
objects off disk and seeking around is the slow part.

chriskessel

Aug 5, 2010, 11:34:17 AM

to mongodb-user

Running it multiple times doesn't make much difference. I'm running
equivalent queries against MySQL and MongoDB, and the query returns 250
records. 250 results isn't the norm, but for service level reasons it's
a good test of the upper limit on the number of matches we'll allow per
query. The two systems have identical hardware, except that the MongoDB
machine has 6 GB of memory and the MySQL machine has 2 GB. I run the
same query 10 times against each system. MongoDB is indexed on the
"matchKey" array, and I query by a single matchKey.

The results aren't promising: MongoDB takes about twice as long per
query. Any thoughts on why, or on whether there's something performance
related I need to do besides indexing matchKey? An explain() shows
MongoDB is only scanning 250 records, so the index is working.

Queries: 10
-------- MySQL --------
Total Queries: 10
Total Results Found: 2500
Total Time: 5670
Millis per Query: 567

Queries: 10
-------- Mongo 1.5.3 --------
Total Queries: 10
Total Results Found: 2500
Total Time: 10951
Millis per Query: 1095.1
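
For anyone reproducing this kind of comparison, here is a minimal sketch of a timing loop that prints totals in the same shape as the report above. It is not the harness used in this thread. The names QueryTimer, Result, and time are hypothetical, and the lambda is a stand-in for a real query. One point worth checking in a real test: the Java driver's DBCursor fetches lazily, so the query callback should iterate over every result before returning its count, otherwise the network transfer is not included in the measurement.

```java
import java.util.function.IntSupplier;

class QueryTimer {
    /** Totals for one benchmark run (hypothetical helper type). */
    static final class Result {
        final int queries;
        final long totalResults;
        final long totalMillis;

        Result(int queries, long totalResults, long totalMillis) {
            this.queries = queries;
            this.totalResults = totalResults;
            this.totalMillis = totalMillis;
        }

        double millisPerQuery() {
            return (double) totalMillis / queries;
        }
    }

    /**
     * Runs the query the given number of times and measures wall-clock time.
     * The supplier must consume all results and return how many it read.
     */
    static Result time(int queries, IntSupplier query) {
        long totalResults = 0;
        long start = System.nanoTime();
        for (int i = 0; i < queries; i++) {
            totalResults += query.getAsInt();
        }
        long totalMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(queries, totalResults, totalMillis);
    }

    public static void main(String[] args) {
        // Stand-in for a real query: pretends every call read 250 documents.
        Result r = time(10, () -> 250);
        System.out.println("Total Queries: " + r.queries);
        System.out.println("Total Results Found: " + r.totalResults);
        System.out.println("Total Time: " + r.totalMillis);
        System.out.println("Millis per Query: " + r.millisPerQuery());
    }
}
```

Because the timer measures wall-clock time on the client, the numbers include network transfer and driver decoding as well as server-side execution.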

chriskessel

Aug 5, 2010, 11:49:44 AM

to mongodb-user

I tried a much smaller query that only returns 6 results and MongoDB
is comparable. I'm not sure what I was expecting, but for whatever
reason I thought MongoDB would be considerably faster for our
key:value lookup approach.

-------- MySQL --------
Total Queries: 10
Total Results Found: 60
Total Time: 1905
Millis per Query: 190.5

Queries: 10
-------- Mongo 1.5.3 --------
Total Queries: 10
Total Results Found: 60
Total Time: 1906
Millis per Query: 190.6

Eliot Horowitz

Aug 5, 2010, 12:01:11 PM

to mongod...@googlegroups.com

What do the queries/schema look like in both cases?
If you send both, I could see whether there is something going on.

chriskessel

Aug 5, 2010, 12:54:27 PM

to mongodb-user

The MongoDB schema is one collection with 40 million records, each of
which looks like:

{ "batchId" : 0, "listingKey" : "cea42fca6b7cb082", "lastUpdateYear" :
2005, "lastUpdateMonth" : 9, "lastUpdateDay" : 20,
"earliestReportedYear" : 2003, "earliestReportedMonth" : 3,
"earliestReportedDay" : 12, "privacyIndicatorEnum" : 3,
"listingTypeEnum" : 3, "serviceProvider" : "23", "dataProvider" :
"WC", "phoneNPA" : "503", "phoneNXX" : "848", "phoneLINE" : "9476",
"firstName" : "Chris", "lastName" : "Kessel", "deliveryPointBarCode" :
"36", "checkDigit" : "0", "congressionalDistrict" : "1",
"carrierRouteSortZone" : "D", "state" : "OR", "addressType" : "S",
"latitude" : "45.463616", "longitude" : "-122.89142",
"preDirectional" : "SW", "zip4" : "7542", "MSA" : "6440", "CMSA" :
"79", "FIPSCode" : "41067", "carrierRoute" : "R018", "zip5" : "97007",
"houseNumber" : "20836", "suffix" : "Ln", "streetName" : "Vicki",
"city" : "Beaverton", "fullAddress" : "20836 SW Vicki Ln",
"matchKey" : [ "2392798418644328169" , "2469488646933425020" ,
"2559616267708426042" , "2634775134312296806" ,
"2817462582529963327" , "2715342769193125472" ,
"2746945466986027237" , "2305843014252183428"] }

The SQL schema has multiple tables. The matchKeys are in one table,
the address in another, and the rest (name, lat, long, various dates,
etc.) in a third table. The SQL query is a bit pickier about the
values it returns, but only a little (it wouldn't return the matchKeys).
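
To put a rough number on what returning the matchKeys costs, the sketch below computes the encoded size of the "matchKey" field from the sample document, following the BSON layout for an array of strings. The class and method names are hypothetical. The sample document may not be typical of the collection, so treat the result as an illustration only.

```java
import java.nio.charset.StandardCharsets;

class MatchKeyBsonSize {
    /**
     * Encoded size in bytes of a BSON array field whose elements are all strings.
     * A BSON array is stored as an embedded document keyed by "0", "1", "2", ...
     */
    static int stringArrayFieldSize(String fieldName, String[] values) {
        int arrayDoc = 4; // int32 length prefix of the embedded document
        for (int i = 0; i < values.length; i++) {
            String index = Integer.toString(i);
            arrayDoc += 1                      // element type byte
                      + index.length() + 1     // index key as a NUL-terminated string
                      + 4                      // int32 string length
                      + values[i].getBytes(StandardCharsets.UTF_8).length
                      + 1;                     // trailing NUL of the string value
        }
        arrayDoc += 1; // document terminator
        // Outer element: type byte + NUL-terminated field name + embedded document.
        return 1 + fieldName.getBytes(StandardCharsets.UTF_8).length + 1 + arrayDoc;
    }

    public static void main(String[] args) {
        // The eight keys from the sample document in this post.
        String[] keys = {
            "2392798418644328169", "2469488646933425020",
            "2559616267708426042", "2634775134312296806",
            "2817462582529963327", "2715342769193125472",
            "2746945466986027237", "2305843014252183428"
        };
        System.out.println("matchKey field: "
                + stringArrayFieldSize("matchKey", keys) + " bytes");
    }
}
```

For the sample document this comes to 231 bytes, against an average of roughly 1081 bytes per returned document in the logged query (reslen 270282 over 250 documents). Asking MongoDB to return only the fields that are needed might therefore trim the response noticeably, though that was not tested in this thread.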

chriskessel

Aug 6, 2010, 2:02:48 PM

to mongodb-user

For what it's worth, I think I've found the answer. I set up profiling,
and according to the profiler the queries were extremely quick. That
didn't match my tests, though. I'm testing from my PC, querying a
MongoDB instance living on one of our Linux servers. The servers are in
another state and the bandwidth isn't particularly good.

So, I tried running the Java driver queries from another Linux box in
the same subnet as the MongoDB box. Huge, monumental difference. I
mean multiple orders of magnitude difference!

I'm guessing that MySQL is relatively compact in the results it sends
across the wire, while Mongo is more verbose. Mongo was really
suffering in my PC->server tests due to the slow network connection.
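
A back-of-the-envelope check using the numbers from the mongod log line in the first post (reslen:270282, nreturned:250, 3201ms) is consistent with this. The sketch below assumes that most of the logged time went into pushing the response over the wire and ignores latency and round trips, so it is an estimate and not a measurement. The two link speeds are hypothetical, chosen only for contrast.

```java
class TransferEstimate {
    /** Effective throughput in bytes per second for a payload sent in the given time. */
    static double bytesPerSecond(long bytes, long millis) {
        return bytes * 1000.0 / millis;
    }

    /** Time in milliseconds to move a payload over a link of the given speed. */
    static double transferMillis(long bytes, double bitsPerSecond) {
        return bytes * 8 * 1000.0 / bitsPerSecond;
    }

    public static void main(String[] args) {
        long reslen = 270282;  // response size in bytes, from the log line
        long nreturned = 250;  // documents returned, from the log line
        long millis = 3201;    // logged duration, from the log line

        System.out.printf("Average document size: %.0f bytes%n",
                (double) reslen / nreturned);
        System.out.printf("Implied throughput: %.0f bytes/s%n",
                bytesPerSecond(reslen, millis));

        // Hypothetical link speeds, for contrast only.
        System.out.printf("Same payload at 1 Mbit/s: %.0f ms%n",
                transferMillis(reslen, 1_000_000));
        System.out.printf("Same payload at 1 Gbit/s: %.2f ms%n",
                transferMillis(reslen, 1_000_000_000));
    }
}
```

The implied throughput is roughly 84 KB/s, which is plausible for a slow long-distance link. On a fast local network the same payload would move in a few milliseconds, in line with the large difference seen from the box in the same subnet.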
