data-collector-service.log exceptions, is it normal?

89 views
Skip to first unread message

Albert Vonpupp

unread,
Sep 28, 2013, 11:37:43 AM9/28/13
to opensta...@googlegroups.com
Hello Anton,

I just noticed the first lines of data-collector-service.log are showing an exception, this might be the reason why Neat doesn't seems to hibernate any machine, perhaps the local-manager on the compute nodes is not reporting properly to the global-manager on the controller.

I forgot to tell you on my last post that my advisor gave me a deadline until wednesday (Oct 2nd) to have Neat running, if I'm not able to do so he will change the scope of my research, and probably drop out Neat of my experiments which would be a real shame since I'm really interested in your project.

Any help would be greatly appreciated.

Many thanks!

Herewith the log:

[root@jupiter neat]# cat data-collector-service.log
Traceback (most recent call last):
  File "/usr/bin/neat-data-collector", line 9, in <module>
    load_entry_point('openstack-neat==0.1', 'console_scripts', 'neat-data-collector')()
  File "<string>", line 2, in start
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 296, in contracts_checker
    result = function_(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/locals/collector.py", line 140, in start
    int(interval))
  File "<string>", line 2, in start
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 296, in contracts_checker
    result = function_(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/common.py", line 58, in start
    state = init_state(config)
  File "<string>", line 2, in init_state
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 296, in contracts_checker
    result = function_(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/locals/collector.py", line 168, in init_state
    host_ram)
  File "<string>", line 2, in update_host
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 305, in contracts_checker
    raise e
contracts.interface.ContractNotRespected: Breach for return value of Database:update_host().
Could not satisfy any of the 3 clauses in float|int|uint.
 ---- Clause #1:   float
 | Could not satisfy any of the 3 clauses in Float|np_scalar_float|np_scalar,array(float).
 |  ---- Clause #1:   Float
 |  | Expected type 'float', got 'long'.
 |  | checking: Float   for value: Instance of long: 1L  
 |  | Variables bound in inner context:
 |  | - self: Instance of Database: <neat.db.Database object at 0x35c3d50>
 |  ---- Clause #2:   np_scalar_float
 |  | Could not satisfy any of the 2 clauses in np_float32|np_float64.
 |  |  ---- Clause #1:   np_float32
 |  |  | Expected type 'float32', got 'long'.
 |  |  | checking: np_float32   for value: Instance of long: 1L  
 |  |  | Variables bound in inner context:
 |  |  | - self: Instance of Database: <neat.db.Database object at 0x35c3d50>
... so on ...
 | checking: np_scalar_uint|np_scalar,array(uint)      for value: Instance of long: 1L  
 | checking: $(np_scalar_uint|np_scalar,array(uint))   for value: Instance of long: 1L  
 | checking: uint                                      for value: Instance of long: 1L  
 | Variables bound in inner context:
 | - self: Instance of Database: <neat.db.Database object at 0x35c3d50>
 ------- (end clauses) -------
checking: float|int|uint      for value: Instance of long: 1L  
checking: $(float|int|uint)   for value: Instance of long: 1L  
checking: number              for value: Instance of long: 1L  
Variables bound in inner context:
- self: Instance of Database: <neat.db.Database object at 0x35c3d50>

Albert Vonpupp

unread,
Sep 29, 2013, 8:20:27 PM9/29/13
to opensta...@googlegroups.com
Hello Anton,

I just got over this issue too but I had to comment a @contract decorator on line 142 @db.py, you can check my commit here: https://github.com/mscs-usp/openstack-neat/commit/be690275e640ae0c78867f89fe1df3f61523ffb3. I'm don't know why this produced an error

My best guess is that perhaps all nodes should have the same specs (proc, mem, etc), or perhaps a processor extension (or similar). Is there a restriction on the nodes hardware? I'm using regular PC's not real servers.

Here is a query on neat.hosts:
MySQL [(none)]> select * from neat.hosts;
+----+----------+---------+-----------+------+
| id | hostname | cpu_mhz | cpu_cores | ram  |
+----+----------+---------+-----------+------+
|  1 | jupiter  |    6000 |         2 | 3952 |
|  2 | saturno  |    4788 |         4 | 5960 |
|  3 | venus    |    4788 |         4 | 5903 |
+----+----------+---------+-----------+------+
3 rows in set (0.02 sec)


Right now Venus is off and I'm just working with Jupiter and Saturno, and I noticed another unexpected behavior this time only with Saturno. I had another contract violation on locals/collector.py on lines 553 and 628, and by commenting those lines I get a little bit farther but it seems so that some parameters get wrong values, perhaps python somehow is not getting some values properly (pls check the last lines on data-collector.log, I'm not sure, but those CPU time values are right?), here are some logs:

[root@saturno neat]# cat /var/log/neat/data-collector.log
2013-09-29 12:47:28,863 INFO     neat.locals.collector Creaned up the local data directory: /var/lib/neat
2013-09-29 12:47:28,863 INFO     neat.locals.collector Starting the data collector, iterations every 300 seconds
2013-09-29 12:47:28,944 DEBUG    neat.db Instantiated a Database object
2013-09-29 12:47:28,944 DEBUG    neat.db_utils Initialized a DB connection to mysql://root:badpa...@143.107.45.200/neat
2013-09-29 12:47:29,187 INFO     neat.locals.collector Started an iteration
2013-09-29 12:47:29,196 DEBUG    neat.locals.collector Added VMs: ['f640b590-d105-4302-8b89-55d79905a2a6', '1305f01a-d2fc-42c8-a0cf-63fd9838fa21', '98e6ce60-fc5c-4c20-bff9-5fe3c746167e']
2013-09-29 12:47:29,203 DEBUG    neat.locals.collector Fetched remote data: {'f640b590-d105-4302-8b89-55d79905a2a6': [], '1305f01a-d2fc-42c8-a0cf-63fd9838fa21': [], '98e6ce60-fc5c-4c20-bff9-5fe3c746167e': []}
2013-09-29 12:47:29,204 INFO     neat.locals.collector Started VM data collection
2013-09-29 12:47:29,226 INFO     neat.locals.collector Completed VM data collection
2013-09-29 12:47:29,226 INFO     neat.locals.collector Started host data collection
2013-09-29 12:47:29,227 INFO     neat.locals.collector Completed host data collection
2013-09-29 12:47:29,227 INFO     neat.locals.collector Completed an iteration
2013-09-29 12:52:29,328 INFO     neat.locals.collector Started an iteration
2013-09-29 12:52:29,339 INFO     neat.locals.collector Started VM data collection
2013-09-29 12:52:29,350 DEBUG    neat.locals.collector VM f640b590-d105-4302-8b89-55d79905a2a6: previous CPU time 1747643595255, current CPU time 1756590658099, previous time 1380469649.2045040131, current time 1380469949.3401300907

[root@saturno neat]# head -50 /var/log/neat/data-collector-service.log

Traceback (most recent call last):
  File "/usr/bin/neat-data-collector", line 9, in <module>
    load_entry_point('openstack-neat==0.1', 'console_scripts', 'neat-data-collector')()
  File "<string>", line 2, in start
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 296, in contracts_checker
    result = function_(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/locals/collector.py", line 140, in start
    int(interval))
  File "<string>", line 2, in start
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 296, in contracts_checker
    result = function_(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/common.py", line 62, in start
    state = execute(config, state)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/locals/collector.py", line 271, in execute
    added_vm_data)
  File "/usr/lib/python2.7/site-packages/openstack_neat-0.1-py2.7.egg/neat/locals/collector.py", line 615, in get_cpu_mhz
    current_cpu_time)
  File "<string>", line 2, in calculate_cpu_mhz
  File "/usr/lib/python2.7/site-packages/contracts/main.py", line 294, in contracts_checker
    raise e
contracts.interface.ContractNotRespected: Breach for argument 'previous_cpu_time' to calculate_cpu_mhz().

Could not satisfy any of the 3 clauses in float|int|uint.
 ---- Clause #1:   float
 | Could not satisfy any of the 3 clauses in Float|np_scalar_float|np_scalar,array(float).
 |  ---- Clause #1:   Float
 |  | Expected type 'float', got 'long'.

[root@saturno neat]# head -50 /var/log/neat/local-manager.log
2013-09-29 12:47:31,999 INFO     neat.locals.manager Starting the local manager, iterations every 300 seconds
2013-09-29 12:47:32,068 DEBUG    neat.db Instantiated a Database object
2013-09-29 12:47:32,069 DEBUG    neat.db_utils Initialized a DB connection to mysql://root:badpa...@143.107.45.200/neat
2013-09-29 12:47:32,071 INFO     neat.locals.manager Started an iteration
2013-09-29 12:47:32,078 DEBUG    neat.locals.manager The total physical CPU Mhz: 10108
2013-09-29 12:47:32,078 DEBUG    neat.locals.manager VM CPU MHz: {'f640b590-d105-4302-8b89-55d79905a2a6': [], '1305f01a-d2fc-42c8-a0cf-63fd9838fa21': [], '98e6ce60-fc5c-4c20-bff9-5fe3c746167e': []}
2013-09-29 12:47:32,078 DEBUG    neat.locals.manager Host CPU MHz: []
2013-09-29 12:47:32,078 DEBUG    neat.locals.manager CPU utilization: []
2013-09-29 12:47:32,079 INFO     neat.locals.manager Not enough data yet - skipping to the next iteration
2013-09-29 12:47:32,079 INFO     neat.locals.manager Skipped an iteration
2013-09-29 12:52:32,177 INFO     neat.locals.manager Started an iteration
2013-09-29 12:52:32,186 DEBUG    neat.locals.manager The total physical CPU Mhz: 10108
2013-09-29 12:52:32,186 DEBUG    neat.locals.manager VM CPU MHz: {'f640b590-d105-4302-8b89-55d79905a2a6': [], '1305f01a-d2fc-42c8-a0cf-63fd9838fa21': [], '98e6ce60-fc5c-4c20-bff9-5fe3c746167e': []}
2013-09-29 12:52:32,186 DEBUG    neat.locals.manager Host CPU MHz: []
2013-09-29 12:52:32,187 DEBUG    neat.locals.manager CPU utilization: []
2013-09-29 12:52:32,187 INFO     neat.locals.manager Not enough data yet - skipping to the next iteration
2013-09-29 12:52:32,187 INFO     neat.locals.manager Skipped an iteration
2013-09-29 12:57:32,288 INFO     neat.locals.manager Started an iteration
2013-09-29 12:57:32,296 DEBUG    neat.locals.manager The total physical CPU Mhz: 10108
2013-09-29 12:57:32,296 DEBUG    neat.locals.manager VM CPU MHz: {'f640b590-d105-4302-8b89-55d79905a2a6': [], '1305f01a-d2fc-42c8-a0cf-63fd9838fa21': [], '98e6ce60-fc5c-4c20-bff9-5fe3c746167e': []}

[root@saturno neat]# head -50 /var/log/neat/local-manager-service.log
(empty)

It thought that perhaps replacing Saturno with another host with similar specs than Jupiter, but... I noticed that VM's doesn't seems to report Mhz to the manager, I'm not sure if that is normal, I guess not. Herewith the log:

[root@jupiter neat]# cat data-collector.log
2013-09-29 12:47:49,297 INFO     neat.locals.collector Creaned up the local data directory: /var/lib/neat
2013-09-29 12:47:49,297 INFO     neat.locals.collector Starting the data collector, iterations every 300 seconds
2013-09-29 12:47:49,353 DEBUG    neat.db Instantiated a Database object
2013-09-29 12:47:49,353 DEBUG    neat.db_utils Initialized a DB connection to mysql://root:badpa...@143.107.45.200/neat
2013-09-29 12:47:49,371 INFO     neat.locals.collector Started an iteration
2013-09-29 12:47:49,372 INFO     neat.locals.collector Started VM data collection
2013-09-29 12:47:49,373 INFO     neat.locals.collector Completed VM data collection
2013-09-29 12:47:49,373 INFO     neat.locals.collector Started host data collection
2013-09-29 12:47:49,373 INFO     neat.locals.collector Completed host data collection
2013-09-29 12:47:49,373 INFO     neat.locals.collector Completed an iteration
2013-09-29 12:52:49,473 INFO     neat.locals.collector Started an iteration
2013-09-29 12:52:49,475 INFO     neat.locals.collector Started VM data collection
2013-09-29 12:52:49,475 INFO     neat.locals.collector Completed VM data collection
2013-09-29 12:52:49,475 INFO     neat.locals.collector Started host data collection
2013-09-29 12:52:49,476 INFO     neat.locals.collector Completed host data collection
2013-09-29 12:52:49,493 DEBUG    neat.locals.collector Collected VM CPU MHz: {}
2013-09-29 12:52:49,493 DEBUG    neat.locals.collector Collected total VMs CPU MHz: 0
2013-09-29 12:52:49,493 DEBUG    neat.locals.collector Collected hypervisor CPU MHz: 35
2013-09-29 12:52:49,493 DEBUG    neat.locals.collector Collected host CPU MHz: 35
2013-09-29 12:52:49,493 DEBUG    neat.locals.collector Collected total CPU MHz: 35
2013-09-29 12:52:49,502 DEBUG    neat.locals.collector Overload state logged: False
2013-09-29 12:52:49,502 INFO     neat.locals.collector Completed an iteration
2013-09-29 12:57:49,602 INFO     neat.locals.collector Started an iteration
2013-09-29 12:57:49,603 INFO     neat.locals.collector Started VM data collection
2013-09-29 12:57:49,603 INFO     neat.locals.collector Completed VM data collection
2013-09-29 12:57:49,604 INFO     neat.locals.collector Started host data collection
2013-09-29 12:57:49,604 INFO     neat.locals.collector Completed host data collection
2013-09-29 12:57:57,388 DEBUG    neat.locals.collector Collected VM CPU MHz: {}
2013-09-29 12:57:57,388 DEBUG    neat.locals.collector Collected total VMs CPU MHz: 0
2013-09-29 12:57:57,388 DEBUG    neat.locals.collector Collected hypervisor CPU MHz: 31
2013-09-29 12:57:57,388 DEBUG    neat.locals.collector Collected host CPU MHz: 31
2013-09-29 12:57:57,388 DEBUG    neat.locals.collector Collected total CPU MHz: 31
2013-09-29 12:57:57,388 INFO     neat.locals.collector Completed an iteration
2013-09-29 13:02:57,441 INFO     neat.locals.collector Started an iteration
2013-09-29 13:02:57,442 INFO     neat.locals.collector Started VM data collection
2013-09-29 13:02:57,443 INFO     neat.locals.collector Completed VM data collection
2013-09-29 13:02:57,443 INFO     neat.locals.collector Started host data collection
2013-09-29 13:02:57,443 INFO     neat.locals.collector Completed host data collection
2013-09-29 13:03:03,751 DEBUG    neat.locals.collector Collected VM CPU MHz: {}
2013-09-29 13:03:03,751 DEBUG    neat.locals.collector Collected total VMs CPU MHz: 0
2013-09-29 13:03:03,751 DEBUG    neat.locals.collector Collected hypervisor CPU MHz: 26
2013-09-29 13:03:03,751 DEBUG    neat.locals.collector Collected host CPU MHz: 26
2013-09-29 13:03:03,751 DEBUG    neat.locals.collector Collected total CPU MHz: 26
2013-09-29 13:03:03,752 INFO     neat.locals.collector Completed an iteration

[root@jupiter ~]# cat /var/log/neat/local-manager.log
2013-09-29 12:47:51,795 INFO     neat.locals.manager Starting the local manager, iterations every 300 seconds
2013-09-29 12:47:51,853 DEBUG    neat.db Instantiated a Database object
2013-09-29 12:47:51,853 DEBUG    neat.db_utils Initialized a DB connection to mysql://root:badpa...@143.107.45.200/neat
2013-09-29 12:47:51,854 INFO     neat.locals.manager Started an iteration
2013-09-29 12:47:51,854 INFO     neat.locals.manager The host is idle
2013-09-29 12:47:51,854 INFO     neat.locals.manager Skipped an iteration
2013-09-29 12:52:51,941 INFO     neat.locals.manager Started an iteration
2013-09-29 12:52:51,941 INFO     neat.locals.manager The host is idle
2013-09-29 12:52:51,941 INFO     neat.locals.manager Skipped an iteration
2013-09-29 12:57:52,042 INFO     neat.locals.manager Started an iteration
2013-09-29 12:57:52,042 INFO     neat.locals.manager The host is idle
2013-09-29 12:57:52,042 INFO     neat.locals.manager Skipped an iteration
2013-09-29 13:02:52,090 INFO     neat.locals.manager Started an iteration
2013-09-29 13:02:52,090 INFO     neat.locals.manager The host is idle
2013-09-29 13:02:52,090 INFO     neat.locals.manager Skipped an iteration
2013-09-29 13:07:52,191 INFO     neat.locals.manager Started an iteration
2013-09-29 13:07:52,191 INFO     neat.locals.manager The host is idle
2013-09-29 13:07:52,191 INFO     neat.locals.manager Skipped an iteration

Both *-services.log are empty on Jupiter.

To be honest I'm completely clueless on how to proceed. Am I doing something wrong? Is a hardware issue?

Many thanks.

With best regards,
Albert.

Anton Beloglazov

unread,
Sep 29, 2013, 8:50:14 PM9/29/13
to opensta...@googlegroups.com
Hi Albert,

Those contract violations happened because host and VM IDs, as well as CPU time were expected to be integers, but in reality were longs. I've added explicit conversions to ints where appropriate. The unit tests run OK, but I can't test it on a complete environment at the moment. I've pushed the changes, please try to pull them and run the system again. Please let me know if something fails again. I hope you can make it run by your deadline.

Cheers,
Anton


--
You received this message because you are subscribed to the Google Groups "OpenStack Neat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openstack-nea...@googlegroups.com.
Visit this group at http://groups.google.com/group/openstack-neat.
For more options, visit https://groups.google.com/groups/opt_out.

Reply all
Reply to author
Forward
0 new messages