Bosh deploy cfv194 with multi-cloud_controller


bly1...@gmail.com

Dec 10, 2014, 9:49:49 PM
to bosh-...@cloudfoundry.org
Hi everyone,
    When I try to deploy Cloud Foundry v194 with BOSH and scale out to multiple cloud controllers, I find that the nfs_mounter job (which monit checks as a File, not a Process) has to be deployed together with the cloud controller. On the first VM that runs the cloud controller and nfs_mounter, nfs_mounter's status is "accessible" while all the other jobs show "running", yet BOSH marks that VM as failing and stops deploying the remaining cloud controllers. In fact the first VM works and the cluster works, so I can push apps, but the other cloud controllers never come up. How can I solve this problem?
Thanks!
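For reference, the cloud controller VM colocates its job templates roughly like this (an illustrative sketch only; the template names and instance count are my assumptions about what runs on that VM, not a copy of my cfv194.yml):

jobs:
- name: cloud_controller_ng
  instances: 2
  templates:
  - name: cloud_controller_ng
  - name: cloud_controller_worker
  - name: cloud_controller_clock
  - name: metron_agent
  - name: nfs_mounter    # monit watches this as a File check (the mount point), not a Process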
root@U:~# bosh -v
BOSH 1.2751.0
root@U:~# bosh status
Config
             /root/.bosh_config

Director
  Name       bosh_director
  URL        https://10.10.103.8:25555
  Version    1.0000.0 (00000000)
  User       admin
  UUID       5c61133c-b2e2-4e60-8c56-a08d896ba9a0
  CPI        vsphere
  dns        disabled
  compiled_package_cache disabled
  snapshots  disabled

Deployment
  Manifest   /root/cfv194.yml
root@U:~# bosh stemcells

+--------------------------+---------+-----------------------------------------+
| Name                     | Version | CID                                     |
+--------------------------+---------+-----------------------------------------+
| bosh-vsphere-esxi-ubuntu | 2427*   | sc-6f997de4-5d8e-43e7-a0c9-e567359ef1f7 |
+--------------------------+---------+-----------------------------------------+

(*) Currently in-use

Stemcells total: 1
root@U:~# bosh releases

+------+------------+-------------+
| Name | Versions   | Commit Hash |
+------+------------+-------------+
| cf   | 194+dev.3* | 345a8b3e+   |
+------+------------+-------------+
(*) Currently deployed
(+) Uncommitted changes

Releases total: 1

Attachments:
bosh_vms.png
cc_can_worker.png
not_running_after_update.png

bly1...@gmail.com

Dec 10, 2014, 9:52:32 PM
to bosh-...@cloudfoundry.org, bly1...@gmail.com
This is my deployment file, cfv194.yml.

On Thursday, December 11, 2014 at 10:49:49 AM UTC+8, bly1...@gmail.com wrote:
Attachment: cfv194.yml

James Bayer

Dec 11, 2014, 1:45:01 PM
to bosh-users, bly1...@gmail.com
Maybe try "bosh recreate" on the failing VMs, or "bosh ssh" onto the VMs and check the logs for why the file system mount may not be working. Logs are generally under /var/vcap/* and you can use the "find" command to discover the various log file locations.
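For example, something along these lines (a sketch only; the job name and index come from the error in this thread, and the paths are the usual BOSH locations):

bosh recreate cloud_controller_ng 0              # rebuild the failing VM
bosh ssh cloud_controller_ng 0                   # open a shell on it
sudo find /var/vcap -name '*.log' 2>/dev/null    # discover log file locations
sudo /var/vcap/bosh/bin/monit summary            # see which jobs monit thinks are running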

--
Thank you,

James Bayer

jmy...@pivotal.io

Dec 11, 2014, 7:09:41 PM
to bosh-...@cloudfoundry.org, bly1...@gmail.com
Hi,

If bosh recreate does not solve your problem, could you please provide all the logs from the Cloud Controller VM under the /var/vcap/sys/log directory and its subdirectories? With these logs we should be better able to help you figure out what is happening.
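For example (a sketch assuming the v1 CLI "bosh logs" syntax and the job name from the error above):

bosh logs cloud_controller_ng 0              # downloads a tarball of that VM's job logs
# or, from inside the VM:
bosh ssh cloud_controller_ng 0
sudo tar czf /tmp/cc_logs.tgz /var/vcap/sys/log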

Thanks,

Jim && Dan

shepherdboy

Dec 11, 2014, 8:47:00 PM
to bosh-...@cloudfoundry.org, bly1...@gmail.com, jmy...@pivotal.io
Hi,
When I log in to the Cloud Controller VM, I find that the cloud_controller_ng and cloud_controller_worker_1 jobs are not monitored, but I can get them running with the command 'monit start'.
Below is the output of the command bosh logs 161 --debug. I have also attached the other logs.
E, [2014-12-10T02:13:06.999948 #15285] [canary_update(cloud_controller_ng/0)] ERROR -- : Error updating canary instance: #<Bosh::Director::AgentJobNotRunning: `cloud_controller_ng/0' is not running after update>
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/instance_updater.rb:85:in `update'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:74:in `block (2 levels) in update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_formatter.rb:46:in `with_thread_name'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:72:in `block in update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/event_log.rb:83:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/event_log.rb:83:in `advance_and_track'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:71:in `update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:65:in `block (2 levels) in update_canaries'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:77:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:77:in `block (2 levels) in create_thread'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:63:in `loop'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:63:in `block in create_thread'
D, [2014-12-10T02:13:07.000159 #15285] [0x3fa4aeb6b024] DEBUG -- : Worker thread raised exception: `cloud_controller_ng/0' is not running after update - /var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/instance_updater.rb:85:in `update'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:74:in `block (2 levels) in update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_formatter.rb:46:in `with_thread_name'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:72:in `block in update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/event_log.rb:83:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/event_log.rb:83:in `advance_and_track'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:71:in `update_canary_instance'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.0000.0/lib/bosh/director/job_updater.rb:65:in `block (2 levels) in update_canaries'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:77:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:77:in `block (2 levels) in create_thread'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:63:in `loop'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.0000.0/lib/common/thread_pool.rb:63:in `block in create_thread'
D, [2014-12-10T02:13:07.000222 #15285] [0x3fa4aeb6b024] DEBUG -- : Thread is no longer needed, cleaning up
D, [2014-12-10T02:13:07.000328 #15285] [task:161] DEBUG -- : Shutting down pool
D, [2014-12-10T02:13:07.002938 #15285] [task:161] DEBUG -- : (0.000973s) SELECT "stemcells".* FROM "stemcells" INNER JOIN "deployments_stemcells" ON (("deployments_stemcells"."stemcell_id" = "stemcells"."id") AND ("deployments_stemcells"."deployment_id" = 1))
D, [2014-12-10T02:13:07.003174 #15285] [task:161] DEBUG -- : Deleting lock: lock:deployment:cfv194
D, [2014-12-10T02:13:07.003419 #15285] [0x3fa4ae1ec7c8] DEBUG -- : Lock renewal thread exiting
D, [2014-12-10T02:13:07.004454 #15285] [task:161] DEBUG -- : Deleted lock: lock:deployment:cfv194
I, [2014-12-10T02:13:07.004619 #15285] [task:161]  INFO -- : sending update deployment error event
D, [2014-12-10T02:13:07.004700 #15285] [task:161] DEBUG -- : SENT: hm.director.alert {"id":"2f5f6c65-a702-4b25-9cc8-b71bc213404b","severity":3,"title":"director - error during update deployment","summary":"Error during update deployment for cfv194 against Director 5c61133c-b2e2-4e60-8c56-a08d896ba9a0: #<Bosh::Director::AgentJobNotRunning: `cloud_controller_ng/0' is not running after update>","created_at":1418177587}

Thanks.

On Friday, December 12, 2014 at 8:09:41 AM UTC+8, jmy...@pivotal.io wrote:
Attachments:
monit.log
nfs_mounter_ctl.log
nginx_ctl.err.log
nginx_ctl.log
cloud_controller_clock_ctl.err.log
cloud_controller_clock_ctl.log
cloud_controller_ng_ctl.err.log
cloud_controller_ng_ctl.log
cloud_controller_worker_ctl.err.log
metron_agent_ctl.err.log
metron_agent_ctl.log
nfs_mounter_ctl.err.log

gwenn.e...@gmail.com

Dec 12, 2014, 1:35:25 AM
to bosh-...@cloudfoundry.org, bly1...@gmail.com, jmy...@pivotal.io
Hi,

Are you sure about this line in your manifest?

ccng.logging_level: debug2

shepherdboy

Dec 12, 2014, 3:58:47 AM
to bosh-...@cloudfoundry.org, bly1...@gmail.com, jmy...@pivotal.io, gwenn.e...@gmail.com

Hi,
I'm sorry, that was my mistake. The line should be 'logging_level: debug2'. However, fixing it does not solve the problem, because logging_level already defaults to debug2: with ccng.logging_level: debug2 in the manifest, the logging_level property simply fell back to that same default, so correcting the line changes nothing and the original problem remains.
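To illustrate the nesting issue (a hypothetical snippet, not the actual cfv194.yml; the ccng block name is an assumption):

properties:
  ccng:
    logging_level: debug2   # nested form the job actually reads
    # a flat "ccng.logging_level: debug2" key here is just a literal key name,
    # so the real property silently falls back to its default (also debug2)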
Thanks.
On Friday, December 12, 2014 at 2:35:25 PM UTC+8, gwenn.e...@gmail.com wrote:

Dmitriy Kalinin

Dec 23, 2014, 6:25:57 PM
to bosh-...@cloudfoundry.org, bly1...@gmail.com, jmy...@pivotal.io, gwenn.e...@gmail.com
Sorry for the late response. Does this issue still happen?

I would recommend debugging it this way:
- ssh into the cloud controller machine before starting a deploy and run `watch monit summary`
- run bosh deploy
- when bosh deploy fails with the error (not running after update...), keep watching monit summary

If monit summary shows that everything is running after bosh deploy ends, it means that not enough time was allocated for waiting for the job to come up. You can change the update section in the deployment manifest and bump the update_watch_time/canary_watch_time ranges.
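For example, in the update section of the manifest (the numbers are only an example; both watch times are millisecond ranges):

update:
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 30000-600000      # wait 30s to 10min for the canary to report running
  update_watch_time: 30000-600000      # same window for the remaining instances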