I need to read rows from a CSV file and HTTP POST them to a server as fast as I can. I think reading from the file is limiting my concurrency, but I don't know how to check that.
Would using the `gevent.fileobject` wrapper help? Is that wrapper used by default when monkey patching? Would using geventhttpclient help? I'm using requests with a single session shared by all the spawned threads.
I'm trying to fill up a KVS (key-value store) which has an HTTP interface. Saving a value means sending a POST whose body contains the key and the value.
The key-value pairs are stored in a CSV file.
I've tried different implementations. First I used a Pool and passed the iterator of the opened file to the `imap_unordered` function. Currently I use a queue: I `put` the items from the CSV file on it and have 10 workers that consume from the queue.
As you can see, I read the file line by line and pass each line to a gevent thread, which POSTs it to the server.
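For reference, the queue-and-workers setup described above can be sketched roughly as follows. This is a simplified illustration, not the exact code: `load`, `worker`, and the injected `post` callable are placeholder names, the queue size is arbitrary, and the CSV is assumed to hold the key and the value in its first two columns.

```python
import csv

import gevent
from gevent.queue import Queue

NUM_WORKERS = 10  # worker count mentioned in the description above


def worker(queue, post):
    """Consume (key, value) rows until the None sentinel arrives."""
    while True:
        row = queue.get()
        if row is None:
            break
        key, value = row
        post(key, value)  # hypothetical callable that performs the HTTP POST


def load(csv_path, post, num_workers=NUM_WORKERS):
    """Feed every CSV row to the workers; return the number of rows read."""
    # Bounded queue (size is an assumption) so the reader cannot run far ahead.
    queue = Queue(maxsize=1000)
    workers = [gevent.spawn(worker, queue, post) for _ in range(num_workers)]
    rows = 0
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            queue.put(row[:2])  # yields to the workers when the queue is full
            rows += 1
    for _ in workers:
        queue.put(None)  # one sentinel per worker
    gevent.joinall(workers)
    return rows
```

With requests, `post` would wrap something like `session.post(url, data=...)`, where the URL and the body format depend on the KVS. For the POSTs to overlap, `gevent.monkey.patch_all()` has to run before requests is imported.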
The server reports very similar throughput for the different implementations. Even though I tried increasing the concurrency (more threads), it had no effect on the throughput.
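One way to check whether the file is the limit would be to time the read on its own, with no network calls, and compare that rate with the throughput the server reports. A minimal sketch, where `measure_read_rate` is a hypothetical helper name:

```python
import csv
import time


def measure_read_rate(csv_path):
    """Read every row without any network I/O; return (rows, seconds)."""
    start = time.perf_counter()
    rows = 0
    with open(csv_path, newline="") as f:
        for _ in csv.reader(f):
            rows += 1
    elapsed = time.perf_counter() - start
    return rows, elapsed
```

If `rows / elapsed` comes out far above the POST rate seen by the server, reading the file is probably not what caps the throughput.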
Things I think I could try:
* chunking instead of reading line by line
* mmap file
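The chunking idea could look something like the sketch below, which yields lists of rows instead of single rows. `iter_chunks` and the default `chunk_size` are illustrative choices, not tuned values.

```python
import csv
from itertools import islice


def iter_chunks(csv_path, chunk_size=500):
    """Yield lists of up to chunk_size rows from the CSV file."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                return
            yield chunk
```

Each chunk could then be handed to one worker, so the queue is touched once per chunk rather than once per row.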
Any ideas or help?