Very slow query


marcandre

Sep 10, 2012, 8:50:17 AM
to mongod...@googlegroups.com
Good afternoon,
 
I am using MongoDB from PHP with a collection of about 350,000 documents. The problem is that PHP reads the documents very slowly.

When I update the data, I drop the database and re-insert all the documents; during this process the inserts take a very long time.

I don't understand why, since I have created indexes.

Here is the part of my PHP code that runs when I update the documents:
 
$cnx_mongo = new Mongo("127.0.0.1", array("persist" => "x"));
 
 $db = $cnx_mongo->my_database;
 $ma_collection = $db->my_collection;
 
 $response = $ma_collection->drop();
 $response = $db->drop();
 
 $db = $cnx_mongo->my_database;
 $ma_collection = $db->my_collection;
 
 
 /// Here I import the data from a PostgreSQL database


 /// Create an index on the field named tags
 $ma_collection->ensureIndex("tags");
 
This is the code PHP runs to read the documents:

/// Search for documents that contain the keywords
$cnx_mongo = new Mongo("127.0.0.1", array("persist" => "x"));
$db = $cnx_mongo->my_database;
$ma_collection = $db->my_collection;

 // Build the keyword search on the tags field
 $who = array();
 if(count($ArrayWord) > 1) {
  $tmp = array();
  foreach ($ArrayWord as $q) {
   $tmp[] = new MongoRegex( "/". strtolower($q) ."/" );
  }
  
  $who['tags'] = array('$all' => $tmp);
 
 }else{
  $who['tags'] = new MongoRegex( "/". strtolower($mot) ."/" );
 }

 $cursor = $ma_collection->find($who)->limit(30)->skip(60);
 
 $cursor->timeout(1000);
 
 print($cursor->count()); // display the number of matching documents

 foreach($cursor as $obj){
  /// Display the fields of each resulting document
 }
 
 
The tags field contains an array of keywords.

What should I do to resolve this problem?

I checked the RAM (6 GB of physical RAM, 4 GB available).

How can I find the cause of these slow reads and inserts?

Is there something to configure in MongoDB?

Best regards.

Gianfranco

Sep 10, 2012, 9:43:55 AM
to mongod...@googlegroups.com
Hi,

First of all, there is no need to drop the collection if you are going to drop the database that contains it.

You can turn profiling on to record the queries that take a long time (by default, more than 100 ms).

Is this a batch script that loads data from PostgreSQL into MongoDB?

Can you explain again what you are trying to do?
And which operations are slow: find(), insert(), or both?

Gianfranco
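
The profiling suggestion above could look roughly like this in the mongo shell. This is only a sketch: the helper name profileSlowOps and its default of 100 ms are assumptions made for illustration, and `db` stands for the shell's current database handle.

```javascript
// Sketch for the mongo shell, where `db` is the current database handle.
// The helper name and the 100 ms default are illustrative assumptions.
function profileSlowOps(db, slowMs) {
  var threshold = slowMs === undefined ? 100 : slowMs;
  // Level 1 records only operations slower than `threshold` milliseconds.
  db.setProfilingLevel(1, threshold);
  return threshold;
}

// In the shell, after running the slow page once, the recorded
// operations can be listed from the system.profile collection:
//   profileSlowOps(db);
//   db.system.profile.find().sort({ ts: -1 }).limit(5);
```

Each profile entry should include the operation, the namespace and the elapsed milliseconds, which helps tell a slow find() from a slow count() or insert().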

marcandre

Sep 10, 2012, 10:16:51 AM
to mongod...@googlegroups.com
Yes, it is a batch script which loads data from PostgreSQL into MongoDB:
 
 

$cnx_mongo = new Mongo("127.0.0.1", array("persist" => "x"));
 
$db = $cnx_mongo->my_database;
$ma_collection = $db->my_collection;
 
 $response = $ma_collection->drop();
 $response = $db->drop();
 
 
 $db = $cnx_mongo->my_database;
 $ma_collection = $db->my_collection;
 
 
 $SQL = "select id_marchand, titre_prod, img_prod_petit, img_prod_zoom, produit.prix_ht, produit.prix_ttc, ref_constr, marque,categorie, sous_categorie, prix_max, prix_min, nb_offres, max_dispo, min_dispo, id_offre, id_prod from product";
 
 $rs = $cnx->query($SQL);
 
 
 $tab_produit = NULL;
 if ($rs != FALSE) {
   $tab_produit = $rs->fetchAll();
 }
 $tab = array();
 $num_tab = 0;
 
 $num_ligne = 0;
 
 $tags = array();
 
 ///Loading data from Postgres to MongoDB
 for($ii = 0; $ii < count($tab_produit); $ii++){
  $id_prod = $tab_produit[$ii]['id_prod'];
  $titre_prod = utf8_decode($tab_produit[$ii]['titre_prod']);
  $img_prod_petit = $tab_produit[$ii]['img_prod_petit'];
  $img_prod_zoom = $tab_produit[$ii]['img_prod_zoom'];
  $ref_constr = $tab_produit[$ii]['ref_constr'];
  $marque = $tab_produit[$ii]['marque'];
  $categorie = utf8_decode($tab_produit[$ii]['categorie']);
  
  $sous_categorie = utf8_decode($tab_produit[$ii]['sous_categorie']);
  $prix_max = $tab_produit[$ii]['prix_max'];
  $prix_min = $tab_produit[$ii]['prix_min'];
  $nb_offres = $tab_produit[$ii]['nb_offres'];
  $max_dispo = $tab_produit[$ii]['max_dispo'];
  $min_dispo = $tab_produit[$ii]['min_dispo'];
  $id_offre = $tab_produit[$ii]['id_offre'];
  $id_assoc = $tab_produit[$ii]['id_assoc'];
  
  $titre_prod = utf8_encode(str_replace('"','&quot;',str_replace("É","E", strtolower($titre_prod))));
  $titre_recherche = $titre_prod . " " . strtolower($ref_constr);
  
  // explode() replaces the deprecated split(); the separator is a single space
  $tags = explode(" ", $titre_recherche);
  
  // Insert the document into the MongoDB collection
  $obj = array( "id_prod" => "$id_prod", "titre_prod" => "$titre_prod", "img_prod_petit" => "$img_prod_petit", "img_prod_zoom" => "$img_prod_zoom", "ref_constr" => "$ref_constr", "marque" => "$marque", "categorie" => "$categorie", "sous_categorie" => "$sous_categorie", "prix_max" => "$prix_max", "prix_min" => "$prix_min", "nb_offres" => "$nb_offres", "max_dispo" => "$max_dispo", "min_dispo" => "$min_dispo", "id_offre" => "$id_offre", "titre_recherche" => "$titre_recherche", "id_assoc" => "$id_assoc", "tags" => $tags  );
  $ma_collection->insert($obj);
 }
 
 $ma_collection->ensureIndex("tags");
 
 
 unset($ma_collection,$db,$cnx_mongo);
 
 
 
 $cnx = NULL;
 unset($cnx);
 
 
Thank you for your response.

Gianfranco

Sep 10, 2012, 10:42:35 AM
to mongod...@googlegroups.com
As you correctly did, if this is not a live production database, you can call ensureIndex() after the batch finishes.
If the index already exists during the load, the database has to update it every time a new document is inserted.

Is it possible, and I'm just wondering here, that PostgreSQL is one of the causes of the batch being slow?
If not, do you have any more information on why it is not?
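
The load order described above (insert everything first, build the index last) can be sketched as follows. Sending the documents in batches rather than one insert per row is an extra assumption that is not discussed in the thread; chunk, loadThenIndex and the batch size are hypothetical, and `collection` is assumed to expose insert(arrayOfDocs) and ensureIndex(keys) the way a mongo shell collection of that era does.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (size < 1) throw new Error("size must be at least 1");
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Insert all documents batch by batch, then build the index once at the end,
// so the index is not updated on every single insert.
// `collection` is an assumption: any object with insert(array) and ensureIndex(keys).
function loadThenIndex(collection, docs, batchSize) {
  chunk(docs, batchSize).forEach(function (batch) {
    collection.insert(batch);
  });
  collection.ensureIndex({ tags: 1 });
}
```

In the shell this might be called as loadThenIndex(db.my_collection, docs, 1000), where the value 1000 is arbitrary.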


marcandre

Sep 10, 2012, 11:07:30 AM
to mongod...@googlegroups.com
I don't think Postgres is what makes the batch slow, because when I use MongoDB as a search engine to look up documents by keywords,

the search is very slow or ends with a timeout error: Fatal error: Uncaught exception 'MongoCursorTimeoutException' with message 'cursor timed out (timeout: 30000, time left: 0:0, status: 0)'.

This is my code for searching documents:
 
$cnx_mongo = new Mongo("127.0.0.1", array("persist" => "x"));
$db = $cnx_mongo->my_database;
$ma_collection = $db->my_collection;
 
$who = array();
 if(count($ArrayWord) > 1) {
  $tmp = array();
  foreach ($ArrayWord as $q) {
   $tmp[] = new MongoRegex( "/". strtolower($q) ."/" );
  }
  $who['tags'] = array('$all' => $tmp); 
 }else{ 
  $who['tags'] = new MongoRegex( "/". strtolower($mot) ."/" );
 } 
 
 
 
$cursor = $ma_collection->find($who)->limit(30)->skip(($off_set - 1)*30);
 
$cursor->timeout(1000);
 
print($cursor->count());
 
foreach($cursor as $obj){
 
$id_prod = $obj["id_prod"];
$ref_constr = $obj["ref_constr"];
$img_prod = $obj["img_prod_petit"];
$zoom = $obj["img_prod_zoom"];
$titre = utf8_decode($obj["titre_prod"]);
 
$prix_max = $obj["prix_max"];
$prix_min = $obj["prix_min"];
$nb_offres = $obj["nb_offres"];
$max_dispo = $obj["max_dispo"];
$min_dispo = $obj["min_dispo"];
$id_assoc = $obj["id_assoc"];

Gianfranco

Sep 10, 2012, 11:31:40 AM
to mongod...@googlegroups.com
If you run explain() on the cursor you can see why it is taking that long.

var_dump($cursor->explain());

Do that on one query and see whether it is using an index (BtreeCursor).

Another problem could be that you are limiting the find() results to 30 and then doing a skip.
This is not very efficient, and I don't understand from your code exactly why you are doing that.

You can also try increasing the cursor timeout to see if that helps.
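
To make the skip/limit remark concrete: the server still has to walk past every skipped document, which is why the shell explain() later in this thread reports nscanned 60 for skip(30).limit(30). The sketch below shows the offset arithmetic used by the PHP code and one possible alternative that filters on the last _id already displayed. nextPageQuery is a hypothetical helper, and the approach only fits "next page" navigation over results sorted by _id.

```javascript
// Offset passed to skip() for a 1-based page number, as in the PHP code above.
function skipOffset(page, pageSize) {
  return (page - 1) * pageSize;
}

// Hypothetical alternative: instead of skipping, ask only for documents
// whose _id is greater than the last one already shown.
function nextPageQuery(filter, lastSeenId) {
  var query = {};
  // Shallow copy so the caller's filter object is left untouched.
  for (var key in filter) {
    if (Object.prototype.hasOwnProperty.call(filter, key)) {
      query[key] = filter[key];
    }
  }
  if (lastSeenId !== undefined && lastSeenId !== null) {
    query._id = { $gt: lastSeenId };
  }
  return query;
}
```

In the mongo shell this could be used as db.my_collection.find(nextPageQuery({ tags: /^windows/ }, lastId)).sort({ _id: 1 }).limit(30).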

marcandre

Sep 10, 2012, 11:48:11 AM
to mongod...@googlegroups.com
I get this response when I call var_dump($cursor->explain()):
 
 
array(15) { ["cursor"]=> string(11) "BasicCursor" ["isMultiKey"]=> bool(false) ["n"]=> int(30) ["nscannedObjects"]=> int(1148) ["nscanned"]=> int(1148) ["nscannedObjectsAllPlans"]=> int(1148) ["nscannedAllPlans"]=> int(1148) ["scanAndOrder"]=> bool(false) ["indexOnly"]=> bool(false) ["nYields"]=> int(0) ["nChunkSkips"]=> int(0) ["millis"]=> int(2) ["indexBounds"]=> array(0) { } ["allPlans"]=> array(1) { [0]=> array(5) { ["cursor"]=> string(11) "BasicCursor" ["n"]=> int(30) ["nscannedObjects"]=> int(1148) ["nscanned"]=> int(1148) ["indexBounds"]=> array(0) { } } } ["server"]=> string(12) "host:27017" }
 
I am limiting the find() so that not every document is displayed on the same page.

gelin yan

Sep 10, 2012, 11:52:28 AM
to mongod...@googlegroups.com


Hi,
"BasicCursor" means the query did not use your index.

Gianfranco

Sep 10, 2012, 12:01:33 PM
to mongod...@googlegroups.com
Yes, as Gelin said.
That query is not using an index; that is why it takes longer, since it has to go through the entire collection.

Make sure you call ensureIndex() on the correct fields.
Then run explain() again to check whether it worked.
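
A small sketch of the check described above, written against the explain() format shown in this thread (a "cursor" string plus "n" and "nscanned" counters). The helper names usesIndex and scanRatio are assumptions made for illustration.

```javascript
// True when the explain output reports an index cursor ("BtreeCursor ...")
// rather than a full collection scan ("BasicCursor").
function usesIndex(explain) {
  return typeof explain.cursor === "string" &&
    explain.cursor.indexOf("BtreeCursor") === 0;
}

// Entries scanned per document returned; values near 1 mean little wasted work.
function scanRatio(explain) {
  return explain.n === 0 ? Infinity : explain.nscanned / explain.n;
}

// Abbreviated from the explain() output posted earlier in the thread.
var posted = { cursor: "BasicCursor", n: 30, nscanned: 1148 };
console.log(usesIndex(posted), scanRatio(posted).toFixed(1));
```

For the posted output the query returns 30 documents after scanning 1148 without an index, so roughly 38 documents are examined per result.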

marcandre

Sep 10, 2012, 3:10:37 PM
to mongod...@googlegroups.com
Thank you very much for this information.

But how can I get a BtreeCursor with the ensureIndex() function?
Which arguments can I pass to this function? I know that the first argument has to be the field name.

Sam Millman

Sep 10, 2012, 3:14:02 PM
to mongod...@googlegroups.com
One of your problems is that non-prefixed regexes (ones not anchored with ^) do not use indexes:


$who = array();
 if(count($ArrayWord) > 1) {
  $tmp = array();
  foreach ($ArrayWord as $q) {
   $tmp[] = new MongoRegex( "/". strtolower($q) ."/" );
  }
  $who['tags'] = array('$all' => $tmp); 
 }else{ 
  $who['tags'] = new MongoRegex( "/". strtolower($mot) ."/" );
 }

What exactly are you trying to do here?


marcandre

Sep 10, 2012, 3:44:33 PM
to mongod...@googlegroups.com
In this code I am trying to find only the documents that contain all the keywords the user asked for.

For example, when a user types "windows 2008 Server R2 Datacenter" or "win 2008 Server R Data", the search engine has to
display the documents whose "tags" field (an array) contains the following keywords: array(0 => windows, 2008, Server, R2, Datacenter).

Sam Millman

Sep 10, 2012, 3:50:05 PM
to mongod...@googlegroups.com
If it is an array of words then you don't need the regex; try taking it out.

Sam Millman

Sep 10, 2012, 3:52:10 PM
to mongod...@googlegroups.com
OK, my last message was written a bit quickly.

Basically I understand what you're trying to do: essentially a full-text search (FTS). The problem is that the infixing on your fields is 2 characters minimum, so "win 2008 Server R Data" won't match without a regex. I strongly suggest you either do without the regex or split your words up further, though then they could lose their meaning and return a superfluous number of documents.

Sam Millman

Sep 10, 2012, 4:03:26 PM
to mongod...@googlegroups.com
Though, thinking twice here, you can use prefixes: win would match windows with /^win/ and r would match r2 with /^r/. That might work.
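
The prefix idea above could be sketched like this: each keyword the user typed becomes a regex anchored at the start of a tag, and several keywords are combined with $all as in the original PHP. The helpers escapeRegex and buildTagQuery are hypothetical names, and lowercasing the input assumes the tags were stored in lowercase, as the import script appears to do.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a filter on the "tags" array from the words the user typed.
// Each word becomes a prefix regex such as /^win/.
// Returns null when no usable keyword is left.
function buildTagQuery(words) {
  var patterns = words
    .map(function (w) { return w.trim().toLowerCase(); })
    .filter(function (w) { return w.length > 0; })
    .map(function (w) { return new RegExp("^" + escapeRegex(w)); });
  if (patterns.length === 0) return null;
  if (patterns.length === 1) return { tags: patterns[0] };
  return { tags: { $all: patterns } };
}
```

For the input "win 2008 Server R Data" split on spaces, this produces { tags: { $all: [/^win/, /^2008/, /^server/, /^r/, /^data/] } }. Whether every element of $all can use the index depends on the server version, so explain() is still worth checking.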

marcandre

Sep 11, 2012, 3:45:42 AM
to mongod...@googlegroups.com
Hello everybody,


After updating my code, explain() on the cursor gives:
 
array(16) { ["cursor"]=> string(24) "BtreeCursor tags_1 multi" ["isMultiKey"]=> bool(true) ["n"]=> int(30) ["nscannedObjects"]=> int(330) ["nscanned"]=> int(330) ["nscannedObjectsAllPlans"]=> int(330) ["nscannedAllPlans"]=> int(330) ["scanAndOrder"]=> bool(false) ["indexOnly"]=> bool(false) ["nYields"]=> int(0) ["nChunkSkips"]=> int(0) ["millis"]=> int(8) ["indexBounds"]=> array(1) { ["tags"]=> array(2) { [0]=> array(2) { [0]=> string(7) "windows" [1]=> string(7) "windowt" } [1]=> array(2) { [0]=> object(MongoRegex)#7 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } [1]=> object(MongoRegex)#8 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } } } } ["allPlans"]=> array(1) { [0]=> array(5) { ["cursor"]=> string(24) "BtreeCursor tags_1 multi" ["n"]=> int(30) ["nscannedObjects"]=> int(330) ["nscanned"]=> int(330) ["indexBounds"]=> array(1) { ["tags"]=> array(2) { [0]=> array(2) { [0]=> string(7) "windows" [1]=> string(7) "windowt" } [1]=> array(2) { [0]=> object(MongoRegex)#9 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } [1]=> object(MongoRegex)#10 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } } } } } } ["oldPlan"]=> array(2) { ["cursor"]=> string(24) "BtreeCursor tags_1 multi" ["indexBounds"]=> array(1) { ["tags"]=> array(2) { [0]=> array(2) { [0]=> string(7) "windows" [1]=> string(7) "windowt" } [1]=> array(2) { [0]=> object(MongoRegex)#11 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } [1]=> object(MongoRegex)#12 (2) { ["regex"]=> string(8) "^windows" ["flags"]=> string(0) "" } } } } } ["server"]=> string(12) "host:27017" } 
 
For the moment my code searches for a single keyword, using a prefix match such as /^windows/. But when I type a word (windows, for example), the first time the web page is very slow (it takes 1 or 2 minutes to display the results), and the second time the page responds very quickly (in less than 1 second).
 
Thank you very much
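
The indexBounds in the explain() output above ("windows" to "windowt") show why an anchored prefix can use the index: the regex is turned into a range over the sorted index keys. The sketch below reproduces that range under a simplifying assumption (the upper bound is the prefix with its last character incremented); prefixBounds and inPrefixRange are hypothetical helpers, not MongoDB functions.

```javascript
// Compute [lower, upper) string bounds covering every string that starts
// with `prefix`. Simplification: assumes the last character is not the
// highest possible code unit, so incrementing it is enough.
function prefixBounds(prefix) {
  if (prefix.length === 0) throw new Error("prefix must be non-empty");
  var last = prefix.charCodeAt(prefix.length - 1);
  var upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return [prefix, upper];
}

// True when `value` falls inside the range for `prefix`.
function inPrefixRange(value, prefix) {
  var bounds = prefixBounds(prefix);
  return value >= bounds[0] && value < bounds[1];
}

console.log(prefixBounds("windows"));
```

An unanchored pattern such as /windows/ has no such range, because a matching tag could start with any character.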

marcandre

Sep 11, 2012, 10:18:05 AM
to mongod...@googlegroups.com
Hello,

Can someone help me resolve the slowness of my code with MongoDB?

Thank you.

Sam Millman

Sep 11, 2012, 10:29:45 AM
to mongod...@googlegroups.com
Can you show us a formatted explain from the console using this command? That one is a bit hard to read.

From what I can see it is using the index now, so the query might not be the problem, since your nscanned is also quite small.

Also try running the same query in the console, without the explain, and see if it is slow there.

marcandre

Sep 11, 2012, 10:57:40 AM
to mongod...@googlegroups.com
OK, this is the formatted explain I get when I execute the same query with the "windows" keyword in my PHP page:
 
 
array(16) {
 ["cursor"]=> string(24) "BtreeCursor tags_1 multi"
 ["isMultiKey"]=> bool(true)
 ["n"]=> int(30)
 ["nscannedObjects"]=> int(30)
 ["nscanned"]=> int(30)
 ["nscannedObjectsAllPlans"]=> int(30)
 ["nscannedAllPlans"]=> int(30)
 ["scanAndOrder"]=> bool(false)
 ["indexOnly"]=> bool(false)
 ["nYields"]=> int(0)
 ["nChunkSkips"]=> int(0)
 ["millis"]=> int(0)
 ["indexBounds"]=> array(1) {
  ["tags"]=> array(2) {
   [0]=> array(2) {
    [0]=> string(7) "windows"
    [1]=> string(7) "windowt"
   }
   [1]=> array(2) {
    [0]=> object(MongoRegex)#6 (2) {
     ["regex"]=> string(8) "^windows"
     ["flags"]=> string(0) ""
    }
    [1]=> object(MongoRegex)#7 (2) {
     ["regex"]=> string(8) "^windows"
     ["flags"]=> string(0) ""
    }
   }
  }
 }
 ["allPlans"]=> array(1) {
  [0]=> array(5) {
   ["cursor"]=> string(24) "BtreeCursor tags_1 multi"
   ["n"]=> int(30)
   ["nscannedObjects"]=> int(30)
   ["nscanned"]=> int(30)
   ["indexBounds"]=> array(1) {
    ["tags"]=> array(2) {
     [0]=> array(2) {
      [0]=> string(7) "windows"
      [1]=> string(7) "windowt"
     }
     [1]=> array(2) {
      [0]=> object(MongoRegex)#8 (2) {
       ["regex"]=> string(8) "^windows"
       ["flags"]=> string(0) ""
      }
      [1]=> object(MongoRegex)#9 (2) {
       ["regex"]=> string(8) "^windows"
       ["flags"]=> string(0) ""
      }
     }
    }
   }
  }
 
 }
 ["oldPlan"]=> array(2) {
  ["cursor"]=> string(24) "BtreeCursor tags_1 multi"
  ["indexBounds"]=> array(1) {
   ["tags"]=> array(2) {
    [0]=> array(2) {
     [0]=> string(7) "windows"
     [1]=> string(7) "windowt"
    }
    [1]=> array(2) {
     [0]=> object(MongoRegex)#10 (2) {
      ["regex"]=> string(8) "^windows"
      ["flags"]=> string(0) ""
     }
     [1]=> object(MongoRegex)#11 (2) {
      ["regex"]=> string(8) "^windows"
      ["flags"]=> string(0) "" }
     }
    }
   }
 }
 ["server"]=> string(12) "host:27017"
}
 
When you say I should run the same query in the console, do you mean that I should run it in the mongo shell?

I have never used MongoDB through the shell console. If I have to use the Linux shell, how do I connect to my database?

Sam Millman

Sep 11, 2012, 11:23:28 AM
to mongod...@googlegroups.com
Mongo comes by default with two programs:

- mongod (the mongo daemon; the database itself, as far as you are concerned)
- mongo (the console app, a lot like mysql)

You can connect to it pretty much the same way as with mysql as well:

> sudo /path_to_my_mongo/mongo
> use my_db
> db.my_col.find({ /* my condition in JSON form */ })

You can add .explain() onto the end of the find() and see whether PHP is reading differently from the console.

In theory both should be the same speed, but we will see.

marcandre

Sep 11, 2012, 11:39:31 AM
to mongod...@googlegroups.com
OK, this is the response I get when I type this in the shell console:
 
db.my_collection.find({tags:/^windows/}).limit(30).skip(30).explain();
{
        "cursor" : "BtreeCursor tags_1 multi",
        "isMultiKey" : true,
        "n" : 30,
        "nscannedObjects" : 60,
        "nscanned" : 60,
        "nscannedObjectsAllPlans" : 60,
        "nscannedAllPlans" : 60,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 1,
        "indexBounds" : {
                "tags" : [
                        [
                                "windows",
                                "windowt"
                        ],
                        [
                                /^windows/,
                                /^windows/
                        ]
                ]
        },
        "server" : "host:27017"
}

marcandre

Sep 11, 2012, 11:45:45 AM
to mongod...@googlegroups.com
Another thing: if I type

db.my_collection.find({tags:/^windows/}).limit(30).skip(30);

(without explain()) I get a UTF-8 error in my console (my data contains French titles). I don't know whether this is the cause of the slowness.

marcandre

Sep 11, 2012, 3:50:06 PM
to mongod...@googlegroups.com
I don't know why there are so many differences between the PHP and shell console results when I run explain() for the same query.


From the console:
 
db.my_collection.find({tags:/^windows/}).limit(30).skip(30).explain();
{
"cursor" : "BtreeCursor tags_1 multi",
"isMultiKey" : true,
"n" : 30,
"nscannedObjects" : 60,
"nscanned" : 60,
"nscannedObjectsAllPlans" : 60,
"nscannedAllPlans" : 60,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 1,
"indexBounds" : {
"tags" : [
[
"windows",
"windowt"
],
[
/^windows/,
/^windows/
]
]
},
"server" : "host:27017"
}
 
 
From PHP:
 
 
Best regards.

marcandre

Sep 12, 2012, 10:24:42 AM
to mongod...@googlegroups.com
Hello everybody,

I am coming back to you because I am still looking for a solution to the slowness of my query from PHP.
Can someone suggest something about this problem?

Best regards.