Node-RED + mongodb find + large result sets + http response = dead RED, and other difficulties

2,198 views
Skip to first unread message

mfeb

unread,
Oct 13, 2015, 3:08:16 PM10/13/15
to Node-RED
I've burned a lot of time on this, and have very little to show.

First, when I issue a "find" mongodb query where the result set is large (341,669 lines of text), node-red seems to go out to lunch - it hangs and the client issuing the initiating http request times out, and node-red needs to be hard-restarted. When I reduce the resultset size to a subset of data (using projection), the 64,276 line resultset has no problem.

I've tried a larger memory allocation for node.js using
 
node --max-old-space-size=7168 my-node-script.js #increase to 7gb

nothing changes. The flow is fairly simple: an http get pushes a mongodb find query (in a template node) and a projection spec through the "mongodb in" node (set to "find") to an http response:



Now, the second problem is at least as concerning: the query in question performs a quasi-date-range query against records in mongo, returning records between designated dates:

 {_id:{$lt:new ObjectId( Math.floor(new Date(new Date("2015/10/11")).getTime()/1000).toString(16) + "0000000000000000"),
          $gt:new ObjectId( Math.floor(new Date(new Date("2015/10/10")).getTime()/1000).toString(16) + "0000000000000000")}}

Odd and "quasi-" because the actual dates in the records are not viable in a mongo query. The above query takes advantage of our awareness of how the mongo "_id" is generated.
set of records is returned - 3890 records.

 
Auto Generated Inline Image 1

mfeb

unread,
Oct 13, 2015, 3:14:52 PM10/13/15
to Node-RED

[Edit]

The clue is that the submitted query, when coming through the chain, is morphed into something that mongo doesn't process well:

{_id:{$lt:new ObjectId( Math.floor(new Date(new Date(\"2015/10/11\";)).getTime()/1000).toString(16) + \"0000000000000000\"),
        $gt:new ObjectId( Math.floor(new Date(new Date(\"2015/10/10)).getTime()/1000).toString(16) +\"0000000000000000\")}}

I've tried not using the template node and instead punching the whole thing into a string using a function node. Same thing.

I've also tried %22 and also \" and they too get sent through and not processed well by mongodb.

So, what can I do about the response size? And what can I do to get the query to go in as intended?



 

Nicholas O'Leary

unread,
Oct 13, 2015, 5:51:24 PM10/13/15
to Node-RED Mailing LIst
Hi,

Hmmm... good questions.

The find parameter, held in msg.payload, is expected to be a JavaScript object, not a String representation of the object. This is why it is failing when you set it via the Template node. You need to evaluate that stuff as part of a Function node.

However, the next hiccup will be your find parameter wants to use the ObjectId object that is part of the mongo driver and not part of the Function node sandbox - so you'd need to add the mongodb module to functionGlobalContext so you could access it. As I said, hmmmm, not ideal.


As for performance - the node is pulling down the complete result set and building it into a message object. Normally applications are able to use the mongo cursor to efficiently page through results as needed -  but that isn't the model we can easily do.

The fact you have the mongodb node wired into multiple subsequent nodes also means we have to clone the message before passing it to the second node - this will be an expensive operation if the message is very large.

Nick




 

--
http://nodered.org
---
You received this message because you are subscribed to the Google Groups "Node-RED" group.
To unsubscribe from this group and stop receiving emails from it, send an email to node-red+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

mfeb

unread,
Oct 13, 2015, 6:15:59 PM10/13/15
to Node-RED
Regarding the find expression, the whole thing gets evaluated, as javascript, on the mongodb side. I need to be able to send it through unadulterated. When I punch that exact string into the mongodb shell, it works.

As for the size - I don’t really want the whole set. If the date filtering works, the subset will be smaller, although perhaps as large as the full set, depending on the filter. The reason for that is that we have a set of records that we’re feeding to another node-red instance(!) which will then slice and dice it in a number of visualizations. So coping with the size will be an issue, likely on both servers.

I can prune it down a bit using projection filters - perhaps down to a reasonable size - but I still might need more than the mechanism can handle.

I can simplify the flow to have the "mongodb in” node feed directly to the http response node (only need to remove the debug node).

In contrast to the issues with node-RED, we have another server accessing mongodb directly (using php) and it has no trouble with the sizes that are problematic on node-RED.

Could it be that there’s an issue with http buffer size?

Thanks,

Mark

Nicholas O'Leary

unread,
Oct 13, 2015, 6:44:35 PM10/13/15
to Node-RED Mailing LIst
I can find no examples that show you can pass a string expression as the find parameter to the mongo api - everything I see says it has to be an object. The mongodb shell will be evaluating that expression before passing it to the api.

A (not ideal) workaround would be to add an additional date field, to avoid the need to use ObjectId entirely.

Mark Feblowitz

unread,
Oct 13, 2015, 7:39:56 PM10/13/15
to node...@googlegroups.com
Heh - 

Of course, that was my very first thought. Unfortunately, the system that posts the date/time can only do it in the bad format. Of course. We can fix that later. 

A string can be passed to find - the value of msg.payload can be that string. The real trouble is that the string I need has embedded quotes. If not for those, it would work as a simple string. Oh, and to complicate matters, it needs a ‘ to surround the payload string and a “ to quote the embedded strings, e.g.’{gt: …. “2015/10/11”  … }’ . 

Unfortunately, flowing the string along requires escaped strings and the "mongodb in” doesn’t recognize this and unescape them. 

I can manually take the emitted string, remove the \ in front of the “  and plug it into the mongodb shell. And it works.




You received this message because you are subscribed to a topic in the Google Groups "Node-RED" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/node-red/8oQvdYaBEbs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to node-red+u...@googlegroups.com.

Nicholas O'Leary

unread,
Oct 14, 2015, 3:16:32 AM10/14/15
to Node-RED Mailing LIst
Mark,

no, the mongo node requires the selector to be an object because the underlying mongodb node module requires one:

> col.find('{}')
MongoError: query selector must be an object


Nick

Mark Feblowitz

unread,
Oct 14, 2015, 12:11:23 PM10/14/15
to node...@googlegroups.com
Nick - 

The mongodb node doc says:

Input

Calls a MongoDB collection method based on the selected operator.

Find queries a collection using the msg.payload as the query statement as per the .find() function.

No restrictions on what is sent to the .find()

The restriction on objects appears in the Outputs section

Output

A simple MongoDB output node. Can save, insert, update and remove objects from a chosen collection.

MongoDB only accepts objects.

That appears to only apply to the "mongodb out” node.


With a bit of editing and careful removal of linefeeds (which mongo doesn’t like), I was able to send through a find string that the mongo node appears to pass through without error and apparently without escapes (but returning all records) and that the mongo shell processes without error (returning the correct number of records);

{_id:{$lt:new ObjectId( Math.floor(new Date('Sun Oct 11 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16) + '0000000000000000'),$gt:new ObjectId( Math.floor(new Date('Sat Oct 10 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16) + '0000000000000000')}}

Not sure I can explain this,  in terms of the difference between submitting this via the "mongo in" node versus directly to the mongo shell (nor, for that matter, using our php code, which send a find query statement that not only gets through mongo’s api but is also processed correctly).

My best guess is that the "mongodb in" node does something with the payload contents before sending (perhaps based on an untrapped parsing error?) and lets *something* through. But I’ve tried a few variants of the find, using the mongo shell, and nothing seems to bring me the same answer (all db elements).

I guess my only hope might be to insert some print statements into the mongo in node.

Other suggestions?

Thanks,

Mark

Dave C-J

unread,
Oct 14, 2015, 12:43:24 PM10/14/15
to node...@googlegroups.com
Mark

all the examples in the doc are objects... 

even the find all documents is an empty object..
db.inventory.find( {} )

Mark Feblowitz

unread,
Oct 14, 2015, 1:40:32 PM10/14/15
to node...@googlegroups.com
Yes. I’ve been told that already. And the docs are very heavy in simple examples like that. There are a couple of examples that make use of multiple commands to set variables for use in a subsequent find. And we have at least one proof point - the find statement that does away with the variables and embeds the logic in the find statement:

{_id:{$lt:new ObjectId( Math.floor(new Date('Sun Oct 11 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16) + '0000000000000000'),$gt:new ObjectId( Math.floor(new Date('Sat Oct 10 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16) + '0000000000000000')}}

It isn’t parseable JSON. But it is accepted at the mongo shell. And from the php code using “MongoClient"
 
So perhaps assuming it must be JSON is the problem. And another problem is passing something through to mongodb.

Guess I’ll have to look at the code.

Dave C-J

unread,
Oct 14, 2015, 1:48:15 PM10/14/15
to node...@googlegroups.com

That code is parseable JSON when quoted correctly.
{
    "_id": {
        "$lt": "newObjectId(Math.floor(newDate('SunOct11201500: 00: 00GMT-0400(EDT)').getTime()/1000).toString(16)+'0000000000000000')",
        "$gt": "newObjectId(Math.floor(newDate('SatOct10201500: 00: 00GMT-0400(EDT)').getTime()/1000).toString(16)+'0000000000000000')"
    }
}

Nicholas O'Leary

unread,
Oct 14, 2015, 2:10:37 PM10/14/15
to Node-RED Mailing LIst
1. The MongoClient node.js module does not accept strings as the argument to collection.find(). If you try to do so it throws the MongoError I posted this morning.

2. The node-red mongo node uses the node.js MongoClient module. If msg.payload is not an Object type, it gets replaced with the default empty object {} before being passed to MongoClient. Rightly or wrongly, this means the node will appear to accept anything, but if it was of an invalid type (ie a string) the result is the default find-everything result. Here is the function that does this: https://github.com/node-red/node-red-nodes/blob/master/storage/mongodb/66-mongodb.js#L45

3. MongoShell is an interpreted JS shell. It will properly evaluate JS expressions before passing to the underlying API. I do not believe the 'new ObjectId(Math.floor(....))' stuff goes anywhere near the mongo server - it is evaluated in the shell before being sent over the API.

4. In a simple node.js shell, I've created a MongoClient connection to an instance. I then passed the following query arguments to collection.find(), on the basis that a document with the _id value used here does exist and should get returned.


// Object type with _id as a string            result: no match
    var query = {_id:'55f821bba8d4c31d0052a57c'};

// Object type with _id as a string expression          result: no match
    var query = {_id:"new ObjectId('55f821bba8d4c31d0052a57c')"};

// Object type with _id as a properly instantiated ClientId object           result: MATCH
    var query = {_id: new mongo.ObjectId('55f821bba8d4c31d0052a57c')};

// String version of the one that works      result: error thrown "MongoError: query selector must be an object"
    var query = "{_id:new ObjectId('55f821bba8d4c31d0052a57c')}"


You point at the docs for the node-red mongo node and say they don't say there is a restriction on what can be passed. You are quite right - I will fix the docs to make it clear you cannot pass a string payload to the node as the query parameter. I will point to the mongodb documentation that explains the object format of queries.


Nick


You received this message because you are subscribed to the Google Groups "Node-RED" group.
To unsubscribe from this group and stop receiving emails from it, send an email to node-red+u...@googlegroups.com.

Mark Feblowitz

unread,
Oct 14, 2015, 2:41:24 PM10/14/15
to node...@googlegroups.com
Yes - that’s helpful. Getting closer.

If I send that straight into the mongo in node, it includes a whole lot of escape characters. 

If I first send it through the json operator, it looks far more reasonable:

{
 "_id": {
 "$lt": "new ObjectId(Math.floor(new Date('Sun Oct 11 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16)+'0000000000000000')”, 
"$gt": "new ObjectId(Math.floor(new Date('Sat Oct 10 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16)+'0000000000000000’)"
 }
}

However, querying against that object now reports 0 entries. So, according to what Nick reported, the mongodb in node is letting it through.
The same query outcome (zero results) can be seen when that exact object is entered at the mongodb shell, until I remove the quotes (but leave in the apostrophes). Then I get the correct count.

So, close…

Ready, however, to give up.

Thanks,

Mark

Mark Feblowitz

unread,
Oct 14, 2015, 2:45:47 PM10/14/15
to node...@googlegroups.com
Very comprehensive, and confirms my hunch. Thanks!

Perhaps line 45 (or the callers of that function) should test and warn or throw an exception? 

As for my case, it seems that converting the query to a proper object does help the query  pass through. However inserting quotes around the JS parts seems also to interfere. See my response to Dave’s note. 

I see the sample in #4 below that worked does not appear to be JSON object. 

Any way I can shove that through? Should I test by temporarily disabling the test at line 45?

Mark Feblowitz

unread,
Oct 14, 2015, 2:47:41 PM10/14/15
to node...@googlegroups.com
Yes - that’s helpful. Getting closer.

If I send that straight into the mongo in node, it includes a whole lot of escape characters. 

If I first send it through the json operator, it looks far more reasonable:

{
 "_id": {
 "$lt": "new ObjectId(Math.floor(new Date('Sun Oct 11 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16)+'0000000000000000')”, 
"$gt": "new ObjectId(Math.floor(new Date('Sat Oct 10 2015 00:00:00 GMT-0400 (EDT)').getTime()/1000).toString(16)+'0000000000000000’)"
 }
 }

However, querying against that object now reports 0 entries. So, according to what Nick reported, the mongodb in node is letting it through.
The same query outcome (zero results) can be seen when that exact object is entered at the mongodb shell, until I remove the quotes (but leave in the apostrophes). Then I get the correct count.

So, close…

Ready, however, to give up.

Thanks,

Mark
Reply all
Reply to author
Forward
0 new messages