In doing some performance testing with bolts written in Python, I found that it was taking ~7.1 seconds to emit 18,000 tuples. In looking at the code involved, I see that after every tuple emitted, storm.py is reading task IDs from Storm. In my bolt, I was not using the task IDs, so I decided to try and make them optional.
I made an experimental change to the handleEmit() function in ShellBolt.java as follows:
if(task==null) {
Object need_task_ids = action.get("need_task_ids");
List<Integer> outtasks = _collector.emit(stream, anchors, tuple);
if (need_task_ids == null || ((Boolean) need_task_ids).booleanValue())
_pendingWrites.put(outtasks);
} else {
_collector.emitDirect((int)task.longValue(), stream, anchors, tuple);
}
With this change (and a corresponding change to storm.py to set need_task_ids=false), the time to emit 18,000 tuples decreased to 2.8 seconds, for a 250% performance improvement. Are there any down sides to this change? Would it be a good change to put in the main Storm code base? Let me know, and I will be happy to create a pull request.
On a related note, has anyone looked into the possibility of implementing Storm's multilanguage support using ZeroMQ instead of stdout/stdin? It might create a big performance boost, especially in combination with the change above.
Barry