04/28/2013 13:47:04 49624.230 3450001624.245 12:47:04 SPEC:N2O,N2O,CO,CO,H2O
3450001623.890860 3.24297e2 3.25e2 1.79025e2 2.21956e3 6.90787e6
3450001623.990910 3.24163e2 3.25e2 1.79056e2 2.22118e3 6.92437e6
3450001624.090950 3.24244e2 3.25e2 1.78798e2 2.24119e3 6.94525e6
3450001624.191000 3.24314e2 3.25e2 1.78959e2 2.23028e3 6.978e6
3450001624.291050 3.24259e2 3.25e2 1.78645e2 2.22066e3 7.05206e6
3450001624.391100 3.243e2 3.25e2 1.78802e2 2.22303e3 7.01239e6
3450001624.491150 3.24294e2 3.25e2 1.78935e2 2.2222e3 7.02175e6
3450001624.591190 3.24172e2 3.25e2 1.7854e2 2.22271e3 7.0164e6
3450001624.691240 3.24257e2 3.25e2 1.79067e2 2.22521e3 6.99538e6
3450001624.791290 3.24277e2 3.25e2 1.78358e2 2.2416e3 7.06199e6
3450001624.891340 3.24382e2 3.25e2 1.7869e2 2.22428e3 7.05238e6
3450001624.991390 3.24035e2 3.25e2 1.78968e2 2.23876e3 6.97643e6
3450001625.091430 3.24257e2 3.25e2 1.79076e2 2.22578e3 7.03076e6
3450001625.191480 3.2415e2 3.25e2 1.78592e2 2.23124e3 6.95091e6
3450001625.291530 3.24096e2 3.25e2 1.78835e2 2.21803e3 6.95247e6
3450001625.391580 3.24197e2 3.25e2 1.78331e2 2.23744e3 6.93555e6
3450001625.491630 3.24094e2 3.25e2 1.78812e2 2.22798e3 6.93854e6
3450001625.591670 3.24315e2 3.25e2 1.78494e2 2.22471e3 6.99256e6
3450001625.691720 3.24108e2 3.25e2 1.78534e2 2.23321e3 6.93832e6
3450001625.791770 3.24202e2 3.25e2 1.78795e2 2.21372e3 6.93131e6
fid = open("file.dat","r")
(D,H) = readdlm(fid,' ',has_header=true)
I know that DataFrames' readtable() hasn't been set up to do this yet. It shouldn't be too hard: it's just ugly. Adding a switch that skips over consecutive whitespace characters, so that they count as a single delimiter, should be easy for both readtable() and readdlm().
We’d be happy to have a patch for this. I don’t see any way for us to efficiently support actual regexes as delimiters, so I’d prefer that we just provide a mechanism for treating a run of whitespace characters as one delimiter.
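In the meantime, here is a rough sketch of what "treat a run of whitespace as one delimiter" could look like if you parse the file by hand. It assumes a recent Julia (1.x syntax), a file that is purely numeric after a single header line, and the same number of columns on every row. The name read_whitespace_table is made up for illustration; it is not part of Base or DataFrames.

```julia
# Hypothetical helper, not part of Base or DataFrames: read a numeric table
# whose columns are separated by one or more spaces or tabs.
function read_whitespace_table(io::IO; skipheader::Bool=true)
    # Keep the first line as a raw string if the file has a header.
    header = skipheader ? readline(io) : ""
    rows = Vector{Vector{Float64}}()
    for line in eachline(io)
        # split with no delimiter argument splits on runs of whitespace,
        # so consecutive spaces or tabs never produce empty fields.
        fields = split(line)
        isempty(fields) && continue   # skip blank lines
        push!(rows, parse.(Float64, fields))
    end
    # Stack the row vectors into a rows-by-columns matrix.
    data = isempty(rows) ? Matrix{Float64}(undef, 0, 0) :
                           permutedims(reduce(hcat, rows))
    return data, header
end

# Usage on an in-memory sample with uneven spacing (first line is the header).
sample = "header line\n" *
         "3450001623.890860   3.24297e2 3.25e2\n" *
         "3450001623.990910 3.24163e2    3.25e2\n"
data, header = read_whitespace_table(IOBuffer(sample))
```

This goes through split and parse rather than a regex, which is the same idea as the switch discussed above. It is only a workaround and will throw on any field that is not a number.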
With all that said, I think you would do a great service to humanity by scrubbing any file with that formatting and using something like tabs instead.
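If you do want to scrub a file that way, something along these lines might do it. This is a sketch assuming a recent Julia (1.x) and a file in which no field contains a space of its own; retab is a made-up name, and "file.tsv" in the usage comment is just an example output path.

```julia
# Hypothetical helper: copy text from `input` to `output`, replacing every
# run of spaces or tabs between fields with a single tab character.
function retab(input::IO, output::IO)
    for line in eachline(input)
        # split(line) drops leading and trailing whitespace and collapses runs.
        println(output, join(split(line), '\t'))
    end
end

# Usage on files (paths are examples only):
# open("file.dat", "r") do src
#     open("file.tsv", "w") do dst
#         retab(src, dst)
#     end
# end
```

Note that the header line gets the same treatment, so a header whose fields contain spaces would need to be handled separately.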