facing problems with concurrency


manjunath

Aug 19, 2014, 8:43:51 AM8/19/14
to nod...@googlegroups.com
We have the following scenario; we are completely stuck trying to scale it and cannot find the problem.

Our process flow is as follows:
- a request is received from the client and the headers are collected
- initiate an async.waterfall
>> get some data from Redis
>> build an object based on the Redis data
>> build an array of tasks for async.parallel to send requests to 3-5 servers
>> run async.parallel using the above tasks
>>>> if any of the requests sent in parallel returns a result, send that to the client and end (the header is set to Connection: close for the client)
>>>> if not, run another async.waterfall to send requests to a different set of servers one after another, sequentially; we wait for one to close before the next one is initiated (wait.for is used with fibers)
>>>>>> same here: if a request in the waterfall returns a result, send it to the client and end

We are trying to run this at a concurrency of 200 (with plans to increase to 2000) but it is still failing: we see a gradual increase in the response time from our system.
We are also using cluster in this app. Our guess is that this might be due to a sockets issue; can anyone help us out?
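
Roughly, the handler looks like this (getFromRedis, buildRequestTasks and fallbackWaterfall below are placeholders, not our real code):

var async = require('async');

function handleRequest(req, res) {
  async.waterfall([
    function (cb) {
      getFromRedis(req.headers, cb);               // fetch data keyed off the request headers
    },
    function (redisData, cb) {
      var tasks = buildRequestTasks(redisData);    // one task per upstream server (3-5 of them)
      async.parallel(tasks, cb);                   // fires its callback once all tasks finish
    }
  ], function (err, results) {
    if (err) return res.end();
    var hit = results && results.filter(Boolean)[0];  // first non-empty result, if any
    if (hit) {
      res.setHeader('Connection', 'close');
      return res.end(JSON.stringify(hit));
    }
    // nothing from the parallel set: fall back to hitting the second
    // set of servers one after another (wait.for/fibers in our case)
    fallbackWaterfall(req, res);
  });
}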

alessioalex

Aug 20, 2014, 4:46:38 AM8/20/14
to nod...@googlegroups.com
It would be a lot better if you could isolate the problem and provide a snippet. Normally you shouldn't experience this problem, so there must be something wrong with the code.

Sam Roberts

Aug 20, 2014, 5:21:18 PM8/20/14
to nod...@googlegroups.com
Looks like a fun problem!

Zones is perfect for this use-case.

They can be used to encapsulate a number of parallel async activities,
and the first to complete can return a result for the Zone, and it
will tear down all the other ongoing activities... see
https://github.com/strongloop/zone. You could even try it... if it
works for you, who cares if it has bugs that you don't hit, right? :-)
And we could use the feedback.


But, bleeding edge concurrency primitives aside, a few suggestions:

- it is very unlikely that cluster is your problem; it doesn't have any
effect (you are on 0.10, right? you didn't say, and the node version is
important) except making sure listening sockets are shared across the
cluster, and it shouldn't do anything to outgoing client connections

- an easy way to verify: run without cluster. That's always a good idea
anyhow; if it doesn't work standalone, it won't work with cluster (see
the sketch below).
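
For example, one minimal way to make that comparison is to put the cluster setup behind a flag (USE_CLUSTER and startServer() below are just placeholders for your own setup):

var cluster = require('cluster');
var os = require('os');

if (process.env.USE_CLUSTER && cluster.isMaster) {
  // clustered mode: fork one worker per CPU
  os.cpus().forEach(function () { cluster.fork(); });
} else {
  // standalone mode (or a forked worker): start the app directly
  startServer();
}

If the response times still climb with USE_CLUSTER unset, cluster isn't the culprit.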

You don't mention any way to cancel your parallel actions, so when the
first of a set of parallel actions completes, all the others keep going.
With fast incoming connections, every completed request will "leak" (for
a little while) some set of incomplete outgoing connections: a brutal
multiplier effect, made worse the more "irregular" the response time is.
Incoming requests/sec will be as fast as the fastest outgoing request,
but the accumulation of un-needed outgoing connections will be driven by
the slowest of the outgoing response times. Ouch.

If you have fd limits, you can hit those (raise or remove them).

If you have connection pool max size limits (outgoing MySQL
connections, for example, or the HTTP outgoing concurrent request
limit, which defaults to 5:
http://nodejs.org/api/http.html#http_agent_maxsockets), your
dead-but-ongoing requests can slam into those barriers fast with the
multiplier effect you have. This might be capping your incoming
request completion rate.

You don't mention logging. You should count the current number of
incomplete incoming and outgoing requests, and log those counts.
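
Something along these lines, for example (the wrapper names and the logging interval are only illustrative; note the caller still has to consume each response for 'end' to fire):

var http = require('http');
var inflight = { incoming: 0, outgoing: 0 };

function once(fn) {                    // make sure each counter is only decremented once
  var called = false;
  return function () { if (!called) { called = true; fn(); } };
}

// wrap the app's request handler to count in-flight incoming requests
function countedHandler(handler) {
  return function (req, res) {
    inflight.incoming++;
    var dec = once(function () { inflight.incoming--; });
    res.on('finish', dec);
    res.on('close', dec);
    handler(req, res);
  };
}

// wrap http.request to count in-flight outgoing requests
function countedRequest(options, cb) {
  inflight.outgoing++;
  var dec = once(function () { inflight.outgoing--; });
  var req = http.request(options, function (resp) {
    resp.on('end', dec);
    if (cb) cb(resp);
  });
  req.on('error', dec);
  return req;
}

setInterval(function () {
  console.log('in-flight:', JSON.stringify(inflight));
}, 5000);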

Consider using vasync; it is more inspectable, which might make it much
easier to dump the current state. It also appears you may benefit from
its barrier().
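
A minimal sketch of what that could look like, assuming vasync's barrier() API with start()/done() and a 'drain' event (check the vasync docs before relying on the exact shape; sendUpstreamRequest is a placeholder):

var vasync = require('vasync');

var barrier = vasync.barrier();

barrier.on('drain', function () {
  // every started operation has reported done()
  console.log('all upstream requests finished');
});

['server1', 'server2', 'server3'].forEach(function (name) {
  barrier.start(name);
  sendUpstreamRequest(name, function (err, result) {
    barrier.done(name);
  });
});

// the barrier object itself can be dumped to see which operations are still pending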

For "interrupting", I'd suggest finding a way with all outgoing
requests to cancel, to teardown the connection and terminate it early,
if possible, returning an "interrupted" error result to async ASAP
after being cancelled.
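
A minimal sketch of that idea with the core http module (cancellableRequest and the "interrupted" error message are just illustrative names):

var http = require('http');

// Wraps one outgoing request as an async-style task with a cancel() method.
// cancel() aborts the socket and reports an "interrupted" error right away,
// so async sees the task as finished instead of waiting on a dead request.
function cancellableRequest(options) {
  var req, taskCallback, finished = false;

  function finish(err, result) {
    if (finished) return;
    finished = true;
    if (taskCallback) taskCallback(err, result);
  }

  var task = function (callback) {
    taskCallback = callback;
    if (finished) return callback(new Error('interrupted')); // cancelled before it started
    req = http.request(options, function (res) {
      var body = '';
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () { finish(null, body); });
    });
    req.on('error', function (err) { finish(err); });
    req.end();
  };

  task.cancel = function () {
    if (req) req.abort();              // tear down the connection early
    finish(new Error('interrupted'));  // report back to async ASAP
  };

  return task;
}

Whichever task completes first can then walk the remaining tasks and call cancel() on each of them.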

It'll be interesting to hear what you did once you figure this out!

Cheers,
Sam

William Dias

Aug 20, 2014, 5:21:23 PM8/20/14
to nod...@googlegroups.com
I'm sorry that I don't have an answer for that.
But you caught my attention: why do you think it is related to a sockets issue?





--
William Régis Drawanz Dias
Chief Technology Officer at itquetas
Blog: diaswrd.com

Matt

Aug 20, 2014, 5:22:54 PM8/20/14
to nod...@googlegroups.com

On Tue, Aug 19, 2014 at 8:43 AM, manjunath <manjua...@gmail.com> wrote:
send requests to 3-5 servers

Using the built-in http module? You're probably running up against the Agent's maxSockets limit.

manjunath

Aug 28, 2014, 4:51:50 AM8/28/14
to nod...@googlegroups.com
Thanks for all the inputs. After some serious research, we found that setting http.globalAgent.maxSockets = Infinity fixed our problems.
We also did some fd fine-tuning for the TCP open-files limit.
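
For anyone who hits this later, the maxSockets part amounts to something like this (raising the OS open-file limit, e.g. with ulimit -n, is a separate step done outside node):

var http = require('http');

// lift the default cap on concurrent outgoing HTTP requests
// (the global agent defaults to 5 sockets per host on node 0.10)
http.globalAgent.maxSockets = Infinity;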