I have directories with hundreds of CSV files, about 3 GB of data in total. I need to import this into a database, but without writing a massive file as an intermediary step. Can you recommend a good way to do this, e.g. an in-memory process where you can monitor the progress of the work in real time? I'm not fond of watching my current Drake step, which just stalls while awk does its work on 80 million lines of text data.
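For concreteness, this is roughly the kind of streaming import I have in mind (a minimal Python sketch using sqlite3 as a stand-in target database; the table name, glob pattern, and progress interval are all placeholders, not my actual setup):

```python
import csv
import glob
import sqlite3
import sys

def import_csvs(pattern, db_path, table="data", report_every=100000):
    """Stream every CSV matching `pattern` straight into the database,
    row by row, printing progress so the step never looks stalled."""
    conn = sqlite3.connect(db_path)
    total = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)  # assume every file has a header row
            cols = ", ".join('"%s"' % c for c in header)
            placeholders = ", ".join("?" for _ in header)
            conn.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (table, cols))
            count = 0
            for count, row in enumerate(reader, 1):
                conn.execute(
                    'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), row
                )
                if count % report_every == 0:
                    # live progress instead of a silent multi-minute stall
                    print("%s: %d rows" % (path, count), file=sys.stderr)
            total += count
        conn.commit()  # one transaction per file keeps inserts fast
        print("finished %s" % path, file=sys.stderr)
    conn.close()
    return total
```

Committing once per file rather than per row is what keeps this usable at the 80-million-line scale; the per-row INSERTs could be batched with `executemany` too, but the shape of the thing is what I'm after.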
I love the concept of Drake (really nice effort!), but the documentation is a bit of a slog, so excuse me for taking the lazy way out here. ;)
Thanks,
Peder J.