wee_import csv problem "csv.Error: line contains NULL byte"

Janne Prokkola

Mar 22, 2020, 2:03:54 PM
to weewx-user
Hello
I've tried to import a csv file without luck. I get the following message:

A CSV import from source file '/var/tmp/data.csv' has been requested.
Using database binding 'wx_binding', which is bound to database 'weewx.sdb'
Destination table 'archive' unit system is '0x01' (US).
Missing derived observations will be calculated.
This is a dry run, imported data will not be saved to archive.
Traceback (most recent call last):
  File "/usr/bin/wee_import", line 834, in <module>
    main()
  File "/usr/bin/wee_import", line 784, in main
    source_obj.run()
  File "/usr/share/weewx/weeimport/weeimport.py", line 350, in run
    _mapped_data = self.mapRawData(_raw_data, self.archive_unit_sys)
  File "/usr/share/weewx/weeimport/weeimport.py", line 558, in mapRawData
    for _row in data:
  File "/usr/lib/python2.7/csv.py", line 107, in next
    self.fieldnames
  File "/usr/lib/python2.7/csv.py", line 90, in fieldnames
    self._fieldnames = self.reader.next()
_csv.Error: line contains NULL byte

Any help or ideas? data.csv has been stripped of practically everything, and in csv.conf I have commented out nearly all fields in FieldMap. The original data3.csv is attached as well.


best regards
Janne

data.csv
csv.conf
data3.csv

gjr80

Mar 23, 2020, 6:32:36 AM
to weewx-user
Hi,

The clue is in the error message. Whatever process you used to create your csv files has resulted in the UTF-8 byte order mark (BOM) being included at the start of the file, and that is upsetting the Python csv reader. From memory this issue has occurred before (or perhaps it was some other non-displaying sequence of bytes). In either case it is probably worthwhile adding some code to wee_import to strip out any BOMs during the pre-processing of any csv files being imported. In the meantime you should be able to import your data by opening your csv file(s) in a text editor, moving to the start of the file and deleting characters one at a time until the first displayable character (" in data3.csv, t in data.csv) is deleted. Re-type the just-deleted character (the file should again look just like it did when opened) and save the file. It should now import without problem (well, without the BOM/null byte problem anyway).
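
Before editing the file by hand, it can help to look at its first bytes. The following is a minimal sketch and not part of wee_import: the function names `sniff` and `inspect_file` are made up for illustration, and it only recognises the standard BOMs exposed by Python's `codecs` module.

```python
import codecs

# Known byte order marks. UTF-32 comes first because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff(raw):
    """Return (codec name or None, True if NUL bytes are present)."""
    codec = None
    for bom, name in BOMS:
        if raw.startswith(bom):
            codec = name
            break
    return codec, b"\x00" in raw


def inspect_file(path, size=4096):
    """Apply sniff() to the first `size` bytes of a file."""
    with open(path, "rb") as f:
        return sniff(f.read(size))
```

Calling `inspect_file()` on the csv file returns a codec name that could be passed to `open(..., encoding=...)`, or `None` when no BOM is found, together with a flag saying whether any NUL bytes were seen.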

Gary

Janne Prokkola

Mar 23, 2020, 10:57:57 AM
to weewx-user
hi

Thanks for a good answer. Yesterday, after posting, I also thought the cause might be in the csv file. I tried several different programs such as Gedit and LibreOffice, and in the end I managed to import my csv into weewx. Unfortunately I did not understand why it worked.

Now I have to find a reasonable workaround to get the csv imported into weewx. You might ask why I need this. I'm going to install my Ventus at our cottage without a PC connection. When I visit the cottage I'd like to read the history data from my Ventus W835. The manufacturer of the Ventus provides a Windows program (Weather Tool v1.exe) to do this, and the output is csv.

My steps are as follows:
1) download history-data from weather station (Windows)
2) open file in Linux
3) change wind directions (N, E, etc) to degrees
4) merge date and time to one column (maybe not necessary to convert it to timestamp?)
5) get rid of BOM
6) import to weewx

Just too many steps to do it regularly. A nice script would help. Maybe I have to try to write one.
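
Steps 3 to 5 could be scripted roughly as follows. This is only a sketch under assumptions: the column names `Date`, `Time` and `Wind Direction` are hypothetical (the actual headers in the Weather Tool export are not shown in this thread), the function name `convert` is made up, and the text is assumed to have been decoded already.

```python
import csv
import io

# 16-point compass rose, 22.5 degrees per point, starting at north.
POINTS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
          "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]
DEGREES = {name: i * 22.5 for i, name in enumerate(POINTS)}


def convert(text):
    """Return CSV text with merged date/time and numeric wind direction."""
    text = text.lstrip("\ufeff")  # drop a leading BOM if one survived decoding
    reader = csv.DictReader(io.StringIO(text))
    # Hypothetical column names: "Date", "Time", "Wind Direction".
    fields = ["datetime"] + [f for f in reader.fieldnames
                             if f not in ("Date", "Time")]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Step 4: merge date and time into a single column.
        row["datetime"] = row.pop("Date") + " " + row.pop("Time")
        # Step 3: compass point to degrees; unknown values become empty.
        compass = row["Wind Direction"].strip().upper()
        row["Wind Direction"] = DEGREES.get(compass, "")
        writer.writerow(row)
    return out.getvalue()
```

The input text could come from something like `open(path, encoding=...).read()` with whatever encoding the export actually uses, and the returned string could be written to a new file for import.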

Or will there be an option in the future to download the history data stored in the weather station directly to weewx?

regards
Janne

Bob Atchley

Mar 23, 2020, 3:12:14 PM
to weewx-user
Hi Janne,

I think it is possible. The driver includes a function "genArchiveRecords" which I have not yet implemented for the ws6in1 driver (getting the driver working was the important thing first), but I think its purpose is to do precisely what you want. I'm not sure how it is invoked or what it does with the archive records, but I think I have the required information to implement the function.
Be warned, though, that the manual for my Youshiko YC9388 (so the same applies to your Ventus) says that the history can only be reset at the console, and once the buffer is full it will not keep any more history. So once you have successfully imported the data you need to clear the history at the console (step 7 in your sequence should be to reset the buffer, maybe after you have taken a backup of the database).

Regards

Bob

gjr80

Mar 24, 2020, 7:19:56 AM
to weewx-user
On Tuesday, 24 March 2020 00:57:57 UTC+10, Janne Prokkola wrote:
> hi
>
> Thanks for a good answer. Yesterday, after posting, I also thought the cause might be in the csv file. I tried several different programs such as Gedit and LibreOffice, and in the end I managed to import my csv into weewx. Unfortunately I did not understand why it worked.

As I said, the issue was some non-displaying characters/bytes at the start of the file, specifically the UTF-8 BOM. They were most likely inserted by the program that created the file or by an editor used to edit it. At present wee_import cannot handle those characters, so the only way to import such a file is to delete them.
 
> Now I have to find a reasonable workaround to get the csv imported into weewx. You might ask why I need this. I'm going to install my Ventus at our cottage without a PC connection. When I visit the cottage I'd like to read the history data from my Ventus W835. The manufacturer of the Ventus provides a Windows program (Weather Tool v1.exe) to do this, and the output is csv.
>
> My steps are as follows:
> 1) download history-data from weather station (Windows)
> 2) open file in Linux
> 3) change wind directions (N, E, etc) to degrees
> 4) merge date and time to one column (maybe not necessary to convert it to timestamp?)
> 5) get rid of BOM
> 6) import to weewx

Date and time need to be in a single field/column but can be in any format that can be represented by Python strptime() format codes. There is no need to convert to Unix epoch timestamps. I have a long-outstanding task to allow compass point directions to be used in CSV imports. Seems I might have some spare time on my hands now, so I will see if I can get that implemented in the not too distant future. The BOM issue is being worked on and should be solved in WeeWX 4.0. So if you have a lot of data to import, waiting a short while should make the task somewhat easier.
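
A format string can be checked against a sample value before running an import. This is a small sketch only; the sample value and the format below are hypothetical and need to be adjusted to whatever the merged date/time column actually looks like.

```python
from datetime import datetime

# Hypothetical sample value and format; adjust the codes to match the file.
raw = "22/03/2020 14:03:54"
fmt = "%d/%m/%Y %H:%M:%S"

# strptime() raises ValueError if the value does not match the format.
parsed = datetime.strptime(raw, fmt)
print(parsed.isoformat())  # 2020-03-22T14:03:54
```

If `strptime()` raises `ValueError`, the format codes do not describe the value and need adjusting.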

> Just too many steps to do it regularly. A nice script would help. Maybe I have to try to write one.

A script or a decent (code) editor with some well thought out regexes/searches/replaces will make life easier.

> Or will there be an option in the future to download the history data stored in the weather station directly to weewx?

Downloading history stored in the station is a driver issue not a 'wee_import' issue. Some station hardware supports it, some does not. Of the stations that do support it some drivers implement it and some do not (for a variety of reasons). By the sounds of it your station may support downloading the history but the driver is yet to implement the feature.

Gary

gjr80

Mar 24, 2020, 7:33:36 AM
to weewx-user
Don't get too wrapped around the axles about what happens with the output from genArchiveRecords(), WeeWX will take care of it. Just like a driver emits loop packets and WeeWX accumulates the data from these loop packets and (if necessary) emits an archive record, WeeWX will take care of the archive records emitted from genArchiveRecords(). By the sounds of it the need to reset the history via a button on the console may be the real problem, guess it will come down to exactly how the station works and what it can/can't do. Sounds like some experimentation will be required.
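
As a rough illustration of the idea, a driver's genArchiveRecords() can be thought of as a generator that yields stored records newer than a given timestamp. The sketch below is a stand-alone toy and makes several assumptions: the class `SketchDriver` and its `history` list are hypothetical, a real driver would live inside the WeeWX driver framework, and how the ws6in1 console actually exposes its history is not covered in this thread.

```python
class SketchDriver:
    """Toy stand-in for a driver class (hypothetical, not WeeWX code)."""

    def __init__(self, history):
        # history: list of record dicts as read from the console buffer
        # (hypothetical; reading the buffer is hardware specific).
        self.history = history

    def genArchiveRecords(self, since_ts):
        """Yield stored records newer than since_ts, oldest first."""
        for rec in sorted(self.history, key=lambda r: r["dateTime"]):
            if since_ts is None or rec["dateTime"] > since_ts:
                yield rec
```

The caller simply iterates over the generator and decides what to do with each record, which is why the driver does not need to know what happens to the records afterwards.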

Gary

gjr80

Mar 25, 2020, 7:44:41 AM
to weewx-user
Just to wrap the wee_import issue up: the wee_import to be included in WeeWX 4.0 will handle files that include the BOM, and CSV imports will accept numeric degrees or one-, two- or three-letter abbreviations or words representing the cardinal, intercardinal and secondary intercardinal directions for direction data being imported. If you can't wait for the 4.0 release and you are up for it, you can download and install the current 4.0 beta by cloning and installing from the WeeWX master branch on GitHub.

Gary

Bob Atchley

Mar 28, 2020, 1:31:26 PM
to weewx-user
Hi Janne,

I've carried out a similar import from my Youshiko weather station and came across exactly the same issue. I'm an Emacs user, and looking at the csv file exported from the Youshiko software in hexl-mode, it appears that the file is exported in UTF-16 format rather than ASCII. I can think of no conceivable reason for doing this and assume it is a bug (of course LibreOffice and even Emacs render it perfectly, so it is quite difficult to detect). If you open the file in a simpler editor such as nano it is much clearer that something is very wrong.

So a conversion from UTF-16 to ASCII is needed before attempting to use the weewx import utility. 

I found 2 ways of doing this:
1) Open a new file in nano. Open the csv file in Emacs (or probably any editor that renders it correctly), copy the contents of the file, paste them into the nano editor and save.
2) From the command line:
$ iconv -f UTF-16 -t ASCII//TRANSLIT input.csv -o output.csv

I used method 1) (I expect you did something similar as well, even if accidentally). Method 2) is not perfect, and probably needs more work.
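
The same conversion can also be sketched in Python. This is a minimal example under assumptions: the function names `reencode` and `convert_file` are made up for illustration, the source is assumed to be UTF-16 with a BOM, and the output is written as UTF-8, which is byte-for-byte identical to ASCII as long as the data contains only ASCII characters.

```python
def reencode(raw, source="utf-16", target="utf-8"):
    """Decode raw bytes from the source encoding and re-encode them."""
    text = raw.decode(source)  # the utf-16 codec consumes the BOM
    return text.encode(target)


def convert_file(src, dst, source="utf-16", target="utf-8"):
    """Re-encode a whole file; binary mode keeps line endings untouched."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(reencode(data, source, target))
```

For example, `convert_file("input.csv", "output.csv")` would write a copy of the file without the BOM and without the interleaved NUL bytes.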

But to be clear a CSV file would normally be expected to be ASCII, so this is not a weewx import issue.

Hope this helps if you need to do another import in the future

Regards

Bob

Thomas Keffer

Mar 28, 2020, 3:11:41 PM
to weewx-user
Good sleuthing, Bob!

With Gary's modifications, the user will be able to specify the encoding of the import file, avoiding the need to translate it.

gjr80

Mar 29, 2020, 3:30:21 AM
to weewx-user
Yes, once 4.0 is released wee_import will automatically decode UTF-8 CSV files. Files in any other Python-supported encoding will also be supported; however, in these cases the user will need to specify the encoding being used via an option in the import config file.

Gary
