This is a follow-up to my previous message about uploading records to a server. Each record is one line of a CSV file, and the number of records varies from one million to over 100 million. I would like to upload them as fast as possible.
pool.imap_unordered(post, [create_data(line) for line in csv.reader(csv_file)])
That got me about 10k requests per minute, which for 10 million records means waiting roughly 16 to 17 hours for the process to finish.
I then tried gipc and got about 20k requests per minute. (See my previous mail for the details.)
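To put those rates in perspective, here is a quick back-of-the-envelope calculation. It assumes one record per request and a steady rate, and the helper name hours_to_upload is made up for illustration:

```python
def hours_to_upload(records, requests_per_minute):
    """Estimated wall-clock hours, assuming one record per request at a steady rate."""
    return records / requests_per_minute / 60.0


# 10 million records at the two rates measured so far
print(round(hours_to_upload(10_000_000, 10_000), 1))   # 16.7
print(round(hours_to_upload(10_000_000, 20_000), 1))   # 8.3

# 100 million records at the better of the two rates
print(round(hours_to_upload(100_000_000, 20_000), 1))  # 83.3
```

So even at 20k requests per minute, the largest files would take more than three days.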
I think I can improve on this by having one thread read the file in chunks and push the records onto a queue, with several worker threads, each running gevent greenlets, taking records off the queue and posting them to the server. What I'm not sure about is whether a standard thread-safe queue (queue.Queue, or Queue.Queue on Python 2) can be shared directly by several greenlets. Or should each thread have a single greenlet that takes items off the thread-safe queue and passes them on to the other greenlets through a gevent.queue?
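To make the layout concrete, here is a rough sketch using only standard-library threads. It deliberately leaves gevent out, since that is the part I am unsure about; the comment in worker() marks where a gevent pool would go. create_data and post are placeholders for the real functions, and CHUNK_SIZE, NUM_WORKERS and the queue size are arbitrary assumed values:

```python
import csv
import io
import queue
import threading

CHUNK_SIZE = 1000   # records per queue item (assumed value)
NUM_WORKERS = 4     # number of worker threads (assumed value)
SENTINEL = None     # tells a worker that no more chunks are coming


def create_data(line):
    # Placeholder: the real function builds the request payload from a CSV row.
    return {"fields": line}


def post(record):
    # Placeholder: the real function sends one record to the server.
    return True


def reader(csv_file, q):
    """Read the CSV in chunks and put each chunk on the queue."""
    chunk = []
    for line in csv.reader(csv_file):
        chunk.append(create_data(line))
        if len(chunk) >= CHUNK_SIZE:
            q.put(chunk)
            chunk = []
    if chunk:
        q.put(chunk)
    # One sentinel per worker so that every worker shuts down.
    for _ in range(NUM_WORKERS):
        q.put(SENTINEL)


def worker(q, counts, index):
    """Take chunks off the queue and post every record in them."""
    sent = 0
    while True:
        chunk = q.get()
        if chunk is SENTINEL:
            break
        # In the gevent variant, the chunk would be handed to a gevent pool here.
        for record in chunk:
            if post(record):
                sent += 1
    counts[index] = sent


def upload(csv_file):
    """Run one reader thread and NUM_WORKERS worker threads; return records posted."""
    q = queue.Queue(maxsize=10)  # bounded, so the reader cannot run far ahead
    counts = [0] * NUM_WORKERS
    threads = [threading.Thread(target=reader, args=(csv_file, q))]
    threads += [threading.Thread(target=worker, args=(q, counts, i))
                for i in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts)


if __name__ == "__main__":
    sample = io.StringIO("a,b\n" * 2500)  # stand-in for the real CSV file
    print(upload(sample))
```

The bounded queue keeps memory use flat regardless of file size, unlike building the full list of records up front. As far as I understand, without monkey patching, a greenlet that blocks on q.get() would stall every other greenlet in the same thread, which is why I am asking about the extra gevent.queue hop.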
Has anyone done something similar?