cmon v0.16 crashes in weewx version 3.9.1


Luc Heijst

Feb 14, 2019, 11:41:19 AM
to weewx-user
cmon v0.16 crashes in weewx version 3.9.1

See crashlog:
Feb 14 11:10:15 pi21 vpro[28175]: manager: Added record 2019-02-14 11:10:15 -03 (1550153415) to database 'cmon21'
Feb 14 11:10:15 pi21 vpro[28175]: engine: Main loop exiting. Shutting engine down.
Feb 14 11:10:15 pi21 vpro[28175]: engine: Shutting down StdReport thread
Feb 14 11:10:15 pi21 vpro[28175]: engine: StdReport thread has been terminated
Feb 14 11:10:15 pi21 vpro[28175]: engine: Caught unrecoverable exception in engine:
Feb 14 11:10:15 pi21 vpro[28175]:     ****  accum: ScalarStats.addHiLo expected float or int, got 3809297
Feb 14 11:10:15 pi21 vpro[28175]:     ****  Traceback (most recent call last):
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 890, in main
Feb 14 11:10:15 pi21 vpro[28175]:     ****      engine.run()
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 202, in run
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self.dispatchEvent(weewx.Event(weewx.POST_LOOP))
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 224, in dispatchEvent
Feb 14 11:10:15 pi21 vpro[28175]:     ****      callback(event)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 580, in post_loop
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self._catchup(self.engine.console.genArchiveRecords)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 635, in _catchup
Feb 14 11:10:15 pi21 vpro[28175]:     ****      origin='hardware'))
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/engine.py", line 224, in dispatchEvent
Feb 14 11:10:15 pi21 vpro[28175]:     ****      callback(event)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/user/cmon.py", line 714, in new_archive_record
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self.save_data(self.get_data(now, self.last_ts))
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/user/cmon.py", line 721, in save_data
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self.dbm.addRecord(record)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/manager.py", line 246, in addRecord
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self._addSingleRecord(record, cursor, log_level)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/manager.py", line 1216, in _addSingleRecord
Feb 14 11:10:15 pi21 vpro[28175]:     ****      _day_summary.addRecord(record, weight=_weight)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/accum.py", line 256, in addRecord
Feb 14 11:10:15 pi21 vpro[28175]:     ****      func(self, record, obs_type, add_hilo, weight)
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/accum.py", line 314, in add_value
Feb 14 11:10:15 pi21 vpro[28175]:     ****      self[obs_type].addHiLo(val, record['dateTime'])
Feb 14 11:10:15 pi21 vpro[28175]:     ****    File "/home/weewx/bin/weewx/accum.py", line 77, in addHiLo
Feb 14 11:10:15 pi21 vpro[28175]:     ****      raise ValueError("accum: ScalarStats.addHiLo expected float or int, got %s" % val)
Feb 14 11:10:15 pi21 vpro[28175]:     ****  ValueError: accum: ScalarStats.addHiLo expected float or int, got 3809297
Feb 14 11:10:15 pi21 vpro[28175]:     ****  Exiting.

The problem is caused by the net_eth0_tbytes calculation in cmon.py, see below:
        # get network usage
        fn = '/proc/net/dev'
        try:
            netinfo = self._readproc_dict(fn)
            if netinfo:
                for iface in netinfo:
                    values = netinfo[iface].split()
                    for i, k in enumerate(self._NET_KEYS):
                        if iface not in self.last_net:
                            self.last_net[iface] = {}
                        if k in self.last_net[iface]:
                            x = int(values[i]) - self.last_net[iface][k]
                            if x < 0:
                                maxcnt = 0x100000000 # 32-bit counter
                                if x + maxcnt < 0:
                                    maxcnt = 0x10000000000000000 # 64-bit counter
                                x += maxcnt
                            record['net_' + iface + '_' + k] = x
                        self.last_net[iface][k] = int(values[i])
        except Exception, e:
            logdbg("read failed for %s: %s" % (fn, e))

The value of net_eth0_tbytes is a long integer ('net_eth0_tbytes': 108233L,) and accum.py (of version 3.9.1) expects a float or an integer, but NOT a long.

To solve this, the cmon value must be converted to an int in line 463 of cmon.py:
                            record['net_' + iface + '_' + k] = int(x)
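
The wraparound handling in the snippet above can be isolated into a small function for experimenting outside weewx. This is only a sketch: `counter_delta` is a hypothetical name, not something that exists in cmon.py, and it assumes at most one wrap between two readings.

```python
def counter_delta(current, previous):
    """Bytes counted between two readings, allowing for one wraparound.

    Hypothetical helper mirroring the logic quoted from cmon.py.
    """
    x = current - previous
    if x < 0:
        maxcnt = 0x100000000  # assume a 32-bit counter wrapped
        if x + maxcnt < 0:
            maxcnt = 0x10000000000000000  # otherwise assume a 64-bit counter
        x += maxcnt
    return x
```

On Python 2, a constant that does not fit in a plain int is a long, and adding it produces a long even when the result is small. That is consistent with the small long value shown above, and it is why wrapping the result in int() brings it back to a plain int when it fits.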

Luc

Sef Konings

Feb 16, 2019, 4:06:32 AM
to weewx-user
Hi Luc,

I had the same problem here last night: the main engine shut down, caused by:

accum: ScalarStats.addHiLo expected float or int, got 308902

I have adapted line 463 of cmon.py. After this patch, the problem has been solved.

Thanks for your advice!

Sef  (PA3SK)

Echt (NL)

On Thursday, 14 February 2019 17:41:19 UTC+1, Luc Heijst wrote:

Luc Heijst

Feb 16, 2019, 4:39:45 AM
to weewx-user
Hello Sef,

The problem will occur only on systems with 64-bit net counters, when the counter reaches its maximum value and starts from zero again. When you have low network traffic this may take a while.
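
How long "a while" is depends on the counter width and the traffic rate. A rough estimate, assuming a constant transfer rate and a counter that starts at zero (`seconds_until_wrap` is a hypothetical helper, not part of cmon or weewx):

```python
def seconds_until_wrap(counter_bits, bytes_per_second):
    """Seconds for a byte counter of the given width to count from zero to wraparound.

    Assumes a constant transfer rate, which real traffic never has.
    """
    return float(2 ** counter_bits) / bytes_per_second
```

Plugging in an interface's typical throughput for 32 and 64 bits gives a rough idea of how often a wrap can be expected.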

Luc

Sef Konings

Feb 17, 2019, 8:16:51 AM
to weewx-user
Thanks,

I am using 64-bit counters and the system had been up for about 8 days. I presume that the counters are reset after a server reboot.
But the problem is solved now!

Thanks for your answer!


On Thursday, 14 February 2019 17:41:19 UTC+1, Luc Heijst wrote:
cmon v0.16 crashes in weewx version 3.9.1

gjr80

Feb 17, 2019, 8:35:16 PM
to weewx-user
Hmm, I can't help wondering what has changed. There were only insignificant changes to accum.py between 3.8.2 and 3.9.1. Likewise, there were no significant changes to cmon from 0.15 to 0.16. 64-bit systems have been around for a while, so I am wondering why this only seems to be rearing its head now. I've seen the long vs. int issue once before but don't believe it was in the context of cmon network traffic.

Makes no difference to the solution but just wondering.

Gary

Luc Heijst

Feb 17, 2019, 8:47:47 PM
to weewx-user
Hi Gary,

Accum v.3.9 throws an exception if the parameter is not a float or an int where accum v.3.8.2 didn’t.
So I believe the long parameter of cmon itself is not the problem because it didn’t cause any exceptions before.

Luc

Luc Heijst

Feb 17, 2019, 9:01:08 PM
to weewx-user
I forgot to say that the net stat value of cmon (the number of bytes transmitted in the last archive period) gets type long when the 64-bit net counter wraps around from max-long to zero, but the net stat value itself was always in the range of 1 to max-int. So only the new type check in accum caused the exception.

Luc

gjr80

Feb 17, 2019, 9:48:11 PM
to weewx-user
OK, I see why I missed it: the actual commit that made the change was dated quite some time ago, 21 November 2017 (v3.8.0 vintage), but it appears it did not make a release until 3.9.0 for some reason. Perhaps it is my limited appreciation of the finer points of git, or maybe there was a disturbance in the git Force. The accumulators previously worked quite happily accepting longs, so the check in accum.py could be changed to accept longs, but that is Python 2 specific and Tom is slowly moving the code base to be Python 3 compatible (Python 3 does away with longs; everything is an int). So the good news is that when a Python 3 version of WeeWX is released, cmon 0.16 will work just fine :). In the meantime I guess the cmon change needs to stay.
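
For illustration only, a numeric type check that accepts longs on Python 2 and still runs on Python 3 could look roughly like this. It is a sketch, not the actual code in accum.py; `INTEGER_TYPES` and `is_accumulable` are hypothetical names.

```python
try:
    # Python 2: plain ints and longs are distinct types.
    INTEGER_TYPES = (int, long)  # noqa: F821
except NameError:
    # Python 3: there is only int, so 'long' is undefined.
    INTEGER_TYPES = (int,)


def is_accumulable(val):
    """Hypothetical check: True if val is an int, a long (Python 2), or a float."""
    return isinstance(val, INTEGER_TYPES + (float,))
```

The try/except keeps the Python 2 specific name in one place, so the rest of the code does not need to know which interpreter it runs on.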

Mystery solved.

Gary

Peter Hurn

Mar 16, 2019, 10:11:10 AM
to weewx-user
Unfortunately, despite adding the int() conversion to the line in cmon.py in my /usr/share/weewx/user folder, I am still getting the following error. Any ideas?

Mar 16 13:01:11 moode weewx[2938]: engine: Caught unrecoverable exception in engine:
Mar 16 13:01:11 moode weewx[2938]:     ****  accum: ScalarStats.addHiLo expected float or int, got 3392911048
Mar 16 13:01:11 moode weewx[2938]:     ****  Traceback (most recent call last):
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 890, in main
Mar 16 13:01:11 moode weewx[2938]:     ****      engine.run()
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 202, in run
Mar 16 13:01:11 moode weewx[2938]:     ****      self.dispatchEvent(weewx.Event(weewx.POST_LOOP))
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 224, in dispatchEvent
Mar 16 13:01:11 moode weewx[2938]:     ****      callback(event)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 574, in post_loop
Mar 16 13:01:11 moode weewx[2938]:     ****      self._software_catchup()
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 646, in _software_catchup
Mar 16 13:01:11 moode weewx[2938]:     ****      self.engine.dispatchEvent(weewx.Event(weewx.NEW_ARCHIVE_RECORD, record=record, origin='software'))
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/engine.py", line 224, in dispatchEvent
Mar 16 13:01:11 moode weewx[2938]:     ****      callback(event)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/user/cmon.py", line 704, in new_archive_record
Mar 16 13:01:11 moode weewx[2938]:     ****      self.save_data(self.get_data(now, self.last_ts))
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/user/cmon.py", line 711, in save_data
Mar 16 13:01:11 moode weewx[2938]:     ****      self.dbm.addRecord(record)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/manager.py", line 246, in addRecord
Mar 16 13:01:11 moode weewx[2938]:     ****      self._addSingleRecord(record, cursor, log_level)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/manager.py", line 1216, in _addSingleRecord
Mar 16 13:01:11 moode weewx[2938]:     ****      _day_summary.addRecord(record, weight=_weight)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/accum.py", line 256, in addRecord
Mar 16 13:01:11 moode weewx[2938]:     ****      func(self, record, obs_type, add_hilo, weight)
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/accum.py", line 314, in add_value
Mar 16 13:01:11 moode weewx[2938]:     ****      self[obs_type].addHiLo(val, record['dateTime'])
Mar 16 13:01:11 moode weewx[2938]:     ****    File "/usr/share/weewx/weewx/accum.py", line 77, in addHiLo
Mar 16 13:01:11 moode weewx[2938]:     ****      raise ValueError("accum: ScalarStats.addHiLo expected float or int, got %s" % val)
Mar 16 13:01:11 moode weewx[2938]:     ****  ValueError: accum: ScalarStats.addHiLo expected float or int, got 3392911048
Mar 16 13:01:11 moode weewx[2938]:     ****  Exiting.

Luc Heijst

Mar 16, 2019, 10:37:14 AM
to weewx-user
On Saturday, 16 March 2019 11:11:10 UTC-3, Peter Hurn wrote:
Unfortunately, despite adding the int() conversion to the line in cmon.py in my /usr/share/weewx/user folder, I am still getting the following error. Any ideas?
... 
Feb 14 11:10:15 pi21 vpro[28175]:     ****  ValueError: accum: ScalarStats.addHiLo expected float or int, got 3392911048

Hi Peter,

The maximum positive value for an int in Python 2 on a 32-bit system is 2147483647 (2^31 - 1). The value at the time of the crash is bigger (3392911048), so this value is still a long integer.
Such big numbers should only occur if you have very heavy network traffic and/or a long archive period.

A quick-and-dirty workaround might be to clamp the value of x to the maximum int when it is bigger, like this (not tested):
                        if k in self.last_net[iface]:
                            x = int(values[i]) - self.last_net[iface][k]
                            if x < 0:
                                maxcnt = 0x100000000 # 32-bit counter
                                if x + maxcnt < 0:
                                    maxcnt = 0x10000000000000000 # 64-bit counter
                                x += maxcnt
                            if x > 2147483647: # clamp to the largest 32-bit int
                                x = 2147483647
                            record['net_' + iface + '_' + k] = int(x)
                        self.last_net[iface][k] = int(values[i])
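
Clamping discards the real byte count. Since the error message says accum accepts a float as well as an int, another option might be to fall back to a float when the value does not fit in a 32-bit int. This is an untested sketch under that assumption; `to_accum_safe` and `MAX_INT32` are hypothetical names, not part of cmon.py or weewx.

```python
# Largest value a plain Python 2 int can hold on a 32-bit build (2**31 - 1).
MAX_INT32 = 2**31 - 1


def to_accum_safe(x):
    """Return x as an int when it fits in 32 bits, otherwise as a float.

    Hypothetical helper: keeps the magnitude instead of clamping it.
    """
    if -MAX_INT32 - 1 <= x <= MAX_INT32:
        return int(x)
    return float(x)
```

In the workaround above, the assignment would then become record['net_' + iface + '_' + k] = to_accum_safe(x). A float represents integers exactly up to 2^53, which is far beyond any realistic per-archive-period byte count.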

Luc