Node.js long-running task scenario

MrNode

Mar 11, 2018, 12:17:46 PM
to nodejs
Create a mechanism to support a long-running task that solves for the following scenario:

A support user starts a baseline upload for an organization. The uploaded CSV file has 100,000 rows, and the support user - right after clicking the “Start” button - realizes that he accidentally swapped the data between two adjacent columns. He would now have to wait until the current request completes, which can take on the order of hours, and since we don't allow two concurrent baseline uploads in Labs right now, he cannot upload the correct file until this task completes. Once it does, he would need to remove those 100,000 entries and finally upload the correct file.


How can I write code in Node.js to solve this problem?

Zlatko

Mar 13, 2018, 11:26:57 AM
to nodejs
Hi MrNode,

This task description is too generic to be Node-specific. We don't know how the file is processed. We don't know whether the CSV processing needs to be atomic - it seems so, but it's not clear. We don't know whether you can abort it. We don't know whether you want to block the upload function completely, or whether you only block it because you can't abort the previous job. Such a thing needs to be designed with the task specifics in mind.

Here are a few scenarios, depending on your requirements.

Under the assumption that the 100,000 rows in the CSV are individual "work items" and the whole CSV is a "batch of work", you can, on upload, simply create a queue of 100,000 things to process. Also hold a hashmap of these work items, so you can address them by batch. Then have a work-dispatch protocol and a Node microservice that takes the rows for processing, one by one, moving them off the batch queue into another queue, "pending batch finish". Once everything is completed, mark those "pending finish" items as "completely completed" and clear the batch. Expose a "cancel batch work" function to the user: if they click "cancel current batch", you clean up all the pending tasks so that your worker microservice stops processing them. Also mark the batch as cancelled, so you can clean up the work items that were already completed.
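
To make that concrete, here is a rough in-memory sketch of that batch/queue/cancel flow. The names (createBatch, workOnBatch, cancelBatch) are made up for illustration, and in a real setup this state would live in Redis or a database rather than in process memory:

const batches = new Map(); // batchId -> { queue, pending, done, cancelled }

function createBatch(batchId, rows) {
  batches.set(batchId, {
    queue: [...rows],   // work items waiting to be processed
    pending: [],        // items taken off the queue but not yet confirmed done
    done: [],
    cancelled: false,
  });
}

async function workOnBatch(batchId, processRow) {
  const batch = batches.get(batchId);
  while (batch.queue.length > 0 && !batch.cancelled) {
    const row = batch.queue.shift();
    batch.pending.push(row);           // "pending batch finish"
    await processRow(row);             // the actual per-row work
    batch.pending.pop();
    batch.done.push(row);
  }
  if (batch.cancelled) {
    batch.queue.length = 0;            // drop the remaining work items
    // here you would also clean up whatever the completed items already wrote
  }
}

function cancelBatch(batchId) {
  const batch = batches.get(batchId);
  if (batch) batch.cancelled = true;   // the worker loop checks this flag per row
}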

If you can't break the processing into items (maybe you're aggregating things over those 100,000 rows), perhaps you can run the aggregation in an interruptible loop. Your "cancel batch work" would then first check whether there are running aggregations and interrupt/abort them, then proceed as planned and run the new file.
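
A small sketch of such an interruptible loop, assuming the rows are already loaded and a simple flag is enough to signal the abort (the aggregation itself is just a placeholder sum):

let aborted = false;

async function aggregateRows(rows) {
  let total = 0;                         // whatever you are aggregating
  for (let i = 0; i < rows.length; i += 1000) {
    if (aborted) throw new Error('aggregation aborted');
    for (const row of rows.slice(i, i + 1000)) {
      total += Number(row.amount) || 0;  // example aggregation step
    }
    // yield to the event loop so the "cancel" request can be handled
    await new Promise((resolve) => setImmediate(resolve));
  }
  return total;
}

function abortAggregation() {
  aborted = true;                        // checked at the start of each chunk
}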

Your least flexible option, where you cannot stop the processing once it has started, is to at least provide an upload queue: the first file uploaded gets processed, and you expose an endpoint where additional CSVs can be uploaded. They are just sent to the server and wait. You can still cancel those "pending" CSVs and upload new ones instead, even if you can't interrupt the main, running CSV. Then expose a simple "status" endpoint that reports progress to the user, e.g. "processed 40,000 of 100,000 rows, 3 CSVs pending processing".
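
A sketch of what those endpoints could look like, assuming Express; the route paths and payload shapes are only examples, and actually storing the uploaded file is left out:

const express = require('express');
const app = express();

const pendingUploads = [];               // CSVs waiting their turn
let current = null;                      // e.g. { name, processed, total }

// accept a new CSV; persisting the file itself (disk, S3, ...) is omitted here
app.post('/baseline-uploads', (req, res) => {
  pendingUploads.push({ name: req.query.name });
  res.status(202).json({ position: pendingUploads.length });
});

// cancel a *pending* CSV; the one currently running cannot be stopped
app.delete('/baseline-uploads/:name', (req, res) => {
  const i = pendingUploads.findIndex((u) => u.name === req.params.name);
  if (i >= 0) pendingUploads.splice(i, 1);
  res.sendStatus(i >= 0 ? 204 : 404);
});

// simple status endpoint: what is running, what is waiting
app.get('/baseline-uploads/status', (req, res) => {
  res.json({ current, pending: pendingUploads.map((u) => u.name) });
});

app.listen(3000);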

You would have to keep all those locks and things outside the running process - Redis is probably the simplest to use - because you might be running these tasks (upload, processing, statuses) on different servers, or at the very least on different workers in your Node cluster instance.
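
For example, a sketch of keeping the lock, progress counter and cancellation flag in Redis, using ioredis (the key names and expiry are just examples):

const Redis = require('ioredis');
const redis = new Redis();                 // defaults to localhost:6379

async function acquireUploadLock(orgId) {
  // NX = only set if the key does not exist, EX = expire after 6 hours
  const ok = await redis.set(`baseline:lock:${orgId}`, '1', 'EX', 6 * 3600, 'NX');
  return ok === 'OK';                      // false -> an upload is already running
}

async function reportProgress(batchId, rowsDone) {
  await redis.incrby(`baseline:progress:${batchId}`, rowsDone);
}

async function isCancelled(batchId) {
  return (await redis.get(`baseline:cancelled:${batchId}`)) === '1';
}

async function releaseUploadLock(orgId) {
  await redis.del(`baseline:lock:${orgId}`);
}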

But these are very rough guesses. With a lot more details, I could provide a better overview. Shoot if you have other questions.



Murukesh Sadasivan

Mar 14, 2018, 10:45:01 AM
to nodejs
Adding to Zlatko's answer,

You definitely need a queuing system (or you are probably using one already), because this is a long-running job from what I understand (you mentioned a magnitude of hours). The queuing system can be a simple custom solution based on a database or even the plain filesystem, or a full-fledged queueing system like ActiveMQ. You can implement this without a queueing system as well, but a queue makes the program very robust and lets it survive restarts.
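
For illustration, the simplest filesystem-backed queue could look something like this sketch, with one JSON file per work item so that a restart just re-reads the directory (paths and names are examples):

const fs = require('fs');
const path = require('path');

const QUEUE_DIR = path.join(__dirname, 'queue');
if (!fs.existsSync(QUEUE_DIR)) fs.mkdirSync(QUEUE_DIR);

// one file per work item; the zero-padded index keeps processing order stable
function enqueue(batchId, index, row) {
  const file = path.join(QUEUE_DIR, batchId + '-' + String(index).padStart(8, '0') + '.json');
  fs.writeFileSync(file, JSON.stringify(row));
}

// after a restart, simply re-read whatever is still on disk
function* pendingItems() {
  for (const name of fs.readdirSync(QUEUE_DIR).sort()) {
    yield { name, row: JSON.parse(fs.readFileSync(path.join(QUEUE_DIR, name), 'utf8')) };
  }
}

function complete(name) {
  fs.unlinkSync(path.join(QUEUE_DIR, name)); // remove the item once processed
}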

Now all you need to do is attach a batchId to each batch request that the user submits. The support user can then reload the web page and see that his batch is in progress, with an option to cancel it if needed. When he presses the cancel button to request cancellation, that marks the batch with that batchId as cancelled in a table or a variable. While processing, the batch job processor checks whether the batchId has been cancelled. It processes items from the batch only if the batchId is not cancelled; otherwise it drains the remaining work items if using a queuing system (or deletes them from the table, if using a database) and marks the batch as cancelled.
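
A sketch of that check inside the processor; a plain Map stands in here for the table or variable that stores each batch's status:

const batchStatus = new Map();  // batchId -> 'running' | 'cancelled' | 'completed'
const workItems = new Map();    // batchId -> array of remaining work items

function cancelBatch(batchId) {
  batchStatus.set(batchId, 'cancelled');      // what the cancel button triggers
}

async function processBatch(batchId, processItem) {
  const items = workItems.get(batchId) || [];
  while (items.length > 0) {
    if (batchStatus.get(batchId) === 'cancelled') {
      items.length = 0;                       // drain the remaining work items
      return;                                 // the batch is already marked cancelled
    }
    await processItem(items.shift());         // process one item, then re-check
  }
  batchStatus.set(batchId, 'completed');
}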

Cheers
Murukesh