Hello,
Thanks for testing and sharing the benchmark results.
I looked at the weather data, e.g.:
year,month,day,hour,minute,Temperature,Precipitation - Tipping Bucket,Precipitation - Weighing,Solar - Incoming,Solar - Outgoing,Wind Speed - Average 4.4,Wind Speed - Gust 4.4,Wind Speed - Average 2.0,Wind Speed - Gust 2.0,Wind Direction,RH,Pressure,Soil Moisture - 5 cm,Soil Moisture - 10 cm,Soil Moisture - 20 cm,Battery,Dew Point
2017, 1, 1, 0, 0, -27.95667, 0.00000, 0.00000, 1.00000, 1.00000, 0.00000, 0.00000, 0.00000, 0.00000, 329.00000, 80.50667, 103.26868, -9999.90039, -9999.90039, 0.28500, 4.11733, -30.30333,
One plus of the new csvreader is that it supports many flavors /
formats / dialects out-of-the-box without any configuration. In the
case above that would be the CSV <3 Numerics format [1].
Try changing:
Csv.read(FILE, { headers: true, converters: :all })
to
Csv.num.read(FILE, headers: true)   ## or Csv.numeric.read(FILE, headers: true)
and it should be faster (in theory; always benchmark, of course)
because the data conversion pipeline is seriously broken (and will get
replaced / redone). See "What's Your Type?" [2] for the inside story /
details.
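To illustrate the idea behind the numeric flavor, here is a minimal plain-Ruby sketch. It is not csvreader's implementation: parse_numeric_line is a hypothetical helper, and it assumes that every unquoted value is a number, so each one is converted with Float() directly instead of going through a type-guessing step. The sample uses only the first six columns of the row above.

```ruby
# Hypothetical helper (not part of csvreader): parse one line of a
# "numeric" CSV, assuming every non-empty value is an unquoted number.
def parse_numeric_line(line)
  line.strip.split(",").map do |value|
    value = value.strip
    # Empty fields become nil; anything else must be a valid number,
    # otherwise Float() raises ArgumentError.
    value.empty? ? nil : Float(value)
  end
end

# First six columns of the weather data, shortened for the example.
HEADER = "year,month,day,hour,minute,Temperature"
ROW    = "2017, 1, 1, 0, 0, -27.95667,"

names  = HEADER.split(",").map(&:strip)
record = names.zip(parse_numeric_line(ROW)).to_h

puts record.inspect
```

Note that String#split drops the trailing empty field left by the final comma, so the record ends up with exactly one value per header name.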
Thanks again. Cheers. Prost.
[1] https://github.com/csvspecs/csv-numerics
[2] https://github.com/csvreader/docs/blob/master/csv-types.md