Hello Anton,
Sorry to annoy you again. I switched back to this thread since I believe that my questions are more related to it.
I started Neat last night and it has been running for about 9h. OpenStack has been running without any VM, so the three nodes are idle. Herewith the logs:
[root@marte ~]# cat /var/log/neat/db-cleaner-service.log
(empty)
[root@marte ~]# cat /var/log/neat/db-cleaner.log
2013-09-26 22:06:42,096 INFO neat.globals.db_cleaner Starting the database cleaner, iterations every 7200 seconds
2013-09-26 22:06:44,546 DEBUG neat.db Instantiated a Database object
2013-09-26 22:06:44,546 DEBUG neat.db_utils Initialized a DB connection to mysql://root:badpa...@marte.eclipse.ime.usp.br/neat
2013-09-26 22:06:44,547 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-26 20:06:44
2013-09-27 00:06:44,655 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-26 22:06:44
2013-09-27 02:06:44,713 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-27 00:06:44
2013-09-27 04:06:44,748 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-27 02:06:44
2013-09-27 06:06:44,844 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-27 04:06:44
2013-09-27 08:06:44,947 INFO neat.globals.db_cleaner Cleaned up data older than 2013-09-27 06:06:44
[root@marte ~]# cat /var/log/neat/global-manager-service.log
Bottle v0.11.6 server starting up (using WSGIRefServer())...
Listening on http://marte.eclipse.ime.usp.br:60080/
Hit Ctrl-C to quit.
[root@marte ~]# cat /var/log/neat/global-manager.log
2013-09-26 22:06:44,726 DEBUG neat.db Instantiated a Database object
2013-09-26 22:06:44,726 DEBUG neat.db_utils Initialized a DB connection to mysql://root:badpa...@marte.eclipse.ime.usp.br/neat
2013-09-26 22:06:45,108 DEBUG neat.globals.manager Calling: ether-wake -i em1 00:1c:c0:c3:f3:1f
2013-09-26 22:06:45,141 DEBUG neat.globals.manager Calling: ether-wake -i em1 00:27:0e:23:06:e9
2013-09-26 22:06:45,157 DEBUG neat.globals.manager Calling: ether-wake -i em1 70:71:bc:08:55:eb
2013-09-26 22:06:45,167 INFO neat.globals.manager Switched on hosts: ['jupiter', 'saturno', 'venus']
2013-09-26 22:06:45,329 INFO neat.globals.manager Starting the global manager listening to marte.eclipse.ime.usp.br:60080
I'm working remotely today, so I cannot physically check whether the compute nodes were suspended, but it seems they weren't (SSH connects to them very quickly). Here is the log of one compute node:
[root@jupiter ~]# cat /var/log/neat/local-manager.log
2013-09-27 00:02:01,924 INFO neat.locals.manager Started an iteration
2013-09-27 00:02:01,925 INFO neat.locals.manager The host is idle
2013-09-27 00:02:01,925 INFO neat.locals.manager Skipped an iteration
2013-09-27 00:07:02,020 INFO neat.locals.manager Started an iteration
...
2013-09-27 09:07:09,477 INFO neat.locals.manager Skipped an iteration
2013-09-27 09:12:09,547 INFO neat.locals.manager Started an iteration
2013-09-27 09:12:09,547 INFO neat.locals.manager The host is idle
2013-09-27 09:12:09,547 INFO neat.locals.manager Skipped an iteration
Question 1: Shouldn't global-manager.log show pm-suspend events, something like "neat.globals.manager Calling: pm-suspend"?
I also tried to reproduce the experiments with the following procedure:
https://github.com/mscs-usp/2013-mac5910-neat-experiments/blob/master/400-start-experiments.sh
On the third step (cd /opt/stack/spe-2013-experiments && python workload-distributor.py full-utilization-02) I'm using two traces with 99% load, as you did. When I execute it, I get this:
Bottle v0.11.6 server starting up (using WSGIRefServer())...
Listening on http://marte:8081/
Hit Ctrl-C to quit.
Question 2: I'm wondering how the system is going to send that load to the VMs. My key is called "test", like yours, but does the name of the SSH key matter, and if so, what should it be called?
I noticed that workload-distributor.py contains hardcoded paths and files such as cpu-load-generator.py (https://github.com/beloglazov/spe-2013-experiments/blob/master/workload-distributor.py#L40-L41). I changed the paths accordingly in my repo (https://github.com/mscs-usp/spe-2013-experiments/blob/master/workload-distributor.py#L40-L41).
I also guess that lookbusy should be installed on the VMs.
Question 3: Is lookbusy installed on the VMs automatically by your scripts, or should I install it manually? And how does the controller send the load to the VMs: via Bottle's API or via SSH?
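To make my question concrete, here is how I currently picture the pull model you described, as a minimal sketch. The names, the server reply format, and the command line are my own assumptions, not your actual workload-starter.py; please correct me if the mechanism is different:

```python
# My assumption of the pull model (hypothetical, not the real workload-starter.py):
# each VM asks the distributor for its current utilization value and passes
# it to lookbusy, which must already be installed inside the VM image.
from urllib.request import urlopen

SERVER = "http://143.107.45.200:8081"  # controller address from my setup


def fetch_utilization(url=SERVER):
    # Assumes the distributor replies with a plain CPU utilization percentage
    with urlopen(url) as resp:
        return int(resp.read())


def lookbusy_command(percent):
    # Build the lookbusy invocation for the requested CPU utilization
    return ["lookbusy", "--cpu-util", str(percent)]


print(lookbusy_command(99))  # ['lookbusy', '--cpu-util', '99']
```

Is this roughly what happens, or does the controller push the load over SSH instead?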
I followed your comments: "The workload-starter.py script should be deployed, configured with the server IP, and automatically started in the VM image on boot. This way you can create a single image, and then create multiple instances which will automatically request the server for the workload. Then, with the workload-distributor.py script you can distribute workload traces to all the VMs at the same time." on step 3.1 of this script (https://github.com/mscs-usp/2013-mac5910-neat-experiments/blob/master/400-start-experiments.sh#L16). I'm using the controller address (I assume the lookbusy server you are referring to is also the controller) and 7 minutes of time (200 traces), but when I try to start it I get the following error:
requests.exceptions.MissingSchema: Invalid URL u'143.107.45.200': No schema supplied
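For now I worked around it by prepending the scheme to the address before passing it to requests, with a small helper of my own (stdlib only, not part of your scripts); I assume plain HTTP is what the script expects:

```python
from urllib.parse import urlparse


def with_scheme(addr):
    # requests raises MissingSchema for bare addresses like '143.107.45.200';
    # prepending http:// (assuming HTTP is intended) avoids the exception
    if not urlparse(addr).scheme:
        return "http://" + addr
    return addr


print(with_scheme("143.107.45.200"))  # http://143.107.45.200
```

Is configuring the server IP with an explicit http:// prefix the intended usage?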
In general, I could start the workload-distributor on step 3 (https://github.com/mscs-usp/2013-mac5910-neat-experiments/blob/master/400-start-experiments.sh#L12), but I cannot see whether all the VMs are ready and sending requests for the workload as you said; I think this might be related to Questions 2 and 3.
I also don't understand why I must start the distributor twice, as you said in the procedure. Is this correct?
I hope my message is not too long and confusing.
Thanks a lot Anton!
Best regards,
Albert.