can't stage apps, and cloud_controller can't connect to redis

179 views
Skip to first unread message

jake...@gmail.com

unread,
Sep 25, 2012, 5:19:43 PM9/25/12
to vcap...@cloudfoundry.org
Greetings,

I have 2 issues with a new CF multi-node install, which may or may not be related (I suspect they are).

deployment:
  name: "admin"
jobs:
  install:
    - nats_server
    - router
    - cloud_controller:
        builtin_services:
          - mysql
          - redis
    - health_manager
    - mysql_gateway
    - ccdb
    - redis_node:
        index: "0"
    - redis_gateway

cloud_controller.log looks like this:
[2012-09-25 14:13:32.853448] cc - pid=31764 tid=393a fid=373d  DEBUG -- Trying to reconnect to Redis
[2012-09-25 14:13:32.853709] cc - pid=31764 tid=393a fid=373d   INFO -- Disconnected from Redis
[2012-09-25 14:13:33.896621] cc - pid=31764 tid=393a fid=373d  DEBUG -- Trying to reconnect to Redis
[2012-09-25 14:13:33.897722] cc - pid=31764 tid=393a fid=373d   INFO -- Disconnected from Redis
[2012-09-25 14:13:34.980036] cc - pid=31764 tid=393a fid=373d  DEBUG -- Trying to reconnect to Redis
[2012-09-25 14:13:34.980272] cc - pid=31764 tid=393a fid=373d   INFO -- Disconnected from Redis
repeated indefinitely.

"ps ax | grep redis" shows redis_node and redis_gateway running. "netstat -apn | grep LISTEN" does not show a redis daemon listening.


2nd issue... when attempting to deploy an app, I get this error:

Staging Application 'env': ...............................Error (JSON 504): <html>
<head><title>504 Gatewa...

"status" shows this:

cloud_controller              :     RUNNING
router                        :     RUNNING
mysql_gateway                 :     RUNNING
health_manager                :     RUNNING
redis_node                    :     RUNNING
redis_gateway                 :     RUNNING

(dea is on another node... no obvious errors there that I can see)

Any ideas? Thanks in advance.

johnrd...@gmail.com

unread,
Sep 26, 2012, 2:46:27 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com
Hi, just thought I'd chime in to say that I appear to be having exactly the same issue. No luck finding a fix yet so if anyone has any ideas it would be much appreciated. I'll be sure to post here if I find anything.

Brandon Burton

unread,
Sep 26, 2012, 3:33:20 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
I'm also stuck at this point.

I did a fresh checkout from github this morning, I installed everything onto a single node just to simply things.

All my services start happily

ro...@testcf.local:/data/cloudfoundry/.deployments/testcf.local/config#.
/data/cloudfoundry/vcap/dev_setup/bin/vcap_dev -n testcf.local -d /data/cloudfoundry start
Targeting deployment "testcf.local" with cloudfoundry home "/data/cloudfoundry"
Setting up cloud controller environment
Using cloudfoundry config from /data/cloudfoundry/.deployments/testcf.local/config
Executing /data/cloudfoundry/.deployments/testcf.local/deploy/rubies/ruby-1.9.2-p180/bin/ruby.
  /data/cloudfoundry/vcap/dev_setup/bin/vcap start cloud_controller mysql_gateway redis_gateway router redis_node mysql_node health_manager dea.
  -c /data/cloudfoundry/.deployments/testcf.local/config -v /data/cloudfoundry -l /data/cloudfoundry/.deployments/testcf.local/log
cloud_controller              : RUNNING
mysql_gateway                  : RUNNING
redis_gateway                 : RUNNING
router                            : RUNNING
redis_node                    : RUNNING
mysql_nodeql_node                    : RUNNING
health_manager                :     RUNNING
dea                           : RUNNING


The redis part of cloud_controller.yml shows

# redis for new stager
redis:
  host: 10.1.1.79
  port: 5454
  password: PoIxbL98RWpwBuUJvKNojnpIcRb1ot2


The password seems to have come from the installer, I did not generate it, but I don't see a redis listening on that port


ro...@testcf.local:/# lsof -i -P -n |grep 5454
ro...@testcf.local:/#


but I do see the redis stuff for services running


root@solarcesmemedepot:/# ps waux | grep redis
root     26298  1.0  1.3 209344 53132 pts/0    Sl   12:25   0:02 ruby /data/cloudfoundry/vcap/bin/../services/redis/bin/redis_gateway -c /data/cloudfoundry/.deployments/solarcesmemedepot.private.scl3.mozilla.com/config/redis_gateway.yml
root     26424  1.0  1.3 772084 55588 pts/0    Sl   12:25   0:02 ruby /data/cloudfoundry/vcap/bin/../services/redis/bin/redis_node -c /data/cloudfoundry/.deployments/testcf.local/config/redis_node.yml
root     27528  0.0  0.0   8100   932 pts/0    S+   12:28   0:00 grep --color=auto redis


In .deployments/log/cloud_controller.log I see this over and over and over

[2012-09-26 12:31:42.281744] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:42.281903] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:43.363758] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:43.363944] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:44.442402] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:44.442590] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:45.524284] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:45.524468] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:46.606590] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:46.606792] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:47.688563] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:47.688798] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:48.770696] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:48.770912] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:49.815036] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:49.815262] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis
[2012-09-26 12:31:50.897159] cc - pid=26135 tid=d2ee fid=15e2  DEBUG -- Trying to reconnect to Redis
[2012-09-26 12:31:50.897352] cc - pid=26135 tid=d2ee fid=15e2   INFO -- Disconnected from Redis


I am using Ubuntu 12.04, and I realize this could make a difference, but since everything seems to come from the chef-solo recipes and so much is compiled, it seems like either something is missing or broken in the setup script or setup recipes

Nicholas Kushmerick

unread,
Sep 26, 2012, 4:13:05 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
I don't have an answer for you, but to help you avoid a red herring in case it isn't clear: redis_{gateway, node} are completely unrelated to the Redis to which CC is trying to connect.    Redis_{gateway,node} are the components behind "vmc create-service redis"; whereas CC uses Redis to get staging logs.

jake...@gmail.com

unread,
Sep 26, 2012, 4:30:26 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
Yeah, that's definitely not made clear in the documentation, but we'd figured that out. From a bit of googling about it became evident that there's supposed to be a sort of embedded Redis that CC uses for its own purposes. However, as far as I can tell that's not getting installed/started, and so staging an app fails.

I found some references to a "redis-server" that needs to be running (and even some files for it in the vcap repo), but no indication on the proper way to start it. I suppose one could always run a standalone Redis instance/daemon on port 5454 completely apart from CF, but even if that works it's still a hack IMO. It should "just work".

Jake

Brandon Burton

unread,
Sep 26, 2012, 5:43:40 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
Exactly, like the Postgres that gets setup for the system level services, it seems like there should be a system level Redis, but it's not getting installed and configured

jake...@gmail.com

unread,
Sep 26, 2012, 8:02:30 PM9/26/12
to vcap...@cloudfoundry.org, jake...@gmail.com
I was able to fix the "Disconnected from Redis" issue by installing a system-level Redis daemon manually and configuring it to listen on 0.0.0.0:5454. This is a hack, but at least it gets me past that particular error for now... good enough for testing.

However, this didn't actually help much... I still can't start the very simple Sinatra "env.rb" app.

Here's the output from "vmc start env --trace":

Staging Application 'env': ...............................>>>
REQUEST: put http://api.paas.mydomain.org/apps/env
RESPONSE_HEADERS:
    server : nginx
    content_length : 176
    date : Thu, 27 Sep 2012 00:00:00 GMT
    content_type : text/html
    connection : close
    keep_alive : timeout=20
REQUEST_BODY: {"console":false,"meta":{"console":false,"debug":null,"created":1348703970,"version":7},"debug":null,"env":[],"runningInstances":0,"instances":1,"services":[],"version":"c6b17f958eb350d04056e7ab4922d15295a488d8-6","staging":{"model":"sinatra","stack":"ruby18"},"name":"env","uris":["env.paas.mydomain.org"],"state":"STARTED","resources":{"disk":2048,"memory":128,"fds":256}}
RESPONSE: [504]
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>
<<<
Error (JSON 504): <html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>


Any ideas?

Jake

Haipeng Wu

unread,
Sep 27, 2012, 12:10:42 AM9/27/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
Yes, it seems the system redis was not installed and configured.

It ought to be a bug when doing dev-setup on multi node, the system redis should be installed along with cc.
I'll make a fix ASAP.

Thanks for reporting the issue.

haipeng

John Davidge

unread,
Sep 27, 2012, 6:17:34 PM9/27/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
Has anyone has any luck with this fix? I pulled the source after the commit 18 hours ago and I'm still getting Error (JSON 504) at the staging step when trying a vmc push.

Figo Feng

unread,
Sep 27, 2012, 6:56:49 PM9/27/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com, Haipeng Wu
Could you please make sure with the fix vcap-redis is running well? Any error logs you can get from cloud controller when you push the application?

Figo


From: "John Davidge" <johnrd...@gmail.com>
To: vcap...@cloudfoundry.org
Cc: jake...@gmail.com, johnrd...@gmail.com
Sent: Friday, September 28, 2012 6:17:34 AM
Subject: Re: [vcap-dev] Re: can't stage apps, and cloud_controller can't connect to redis

Haipeng Wu

unread,
Sep 27, 2012, 9:11:28 PM9/27/12
to vcap...@cloudfoundry.org
cloud you please check if you have a redis process? 
If exists, check if the cc config file(cloudfoundry/.deployments/devbox/config/cloud_controller.yml) have right host name of redis.

If both are ok, your problem is not cause by redis, you can take a look at the cc log, maybe you will find some discovery.

haipeng

jake...@gmail.com

unread,
Sep 28, 2012, 2:18:13 AM9/28/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com, Haipeng Wu
I re-installed clean, and yes redis-server is running now. The "Disconnected from Redis" repeating message is gone from the cloud_controller.log file... instead I get a "Connected to Redis".

But I still get a 504 during the "Staging app..." step. There are no errors in the cloud_controller.log... just a couple WARNs once it finally times out and gives the 504:

[2012-09-27 23:07:51.500225] cc - pid=12285 tid=6985 fid=c55c   WARN -- Staging for app_id=2 failed
[2012-09-27 23:07:51.500480] cc - pid=12285 tid=6985 fid=c55c   WARN -- Error: Timed out after 120s.

No idea *what* is timing out, though. I thought it was redis, but that appears not to be the case now.

In the cloud_controller.yml file, the host/port for Redis is valid.

It does have this (pw is default, haven't done any hardening since it doesn't currently work at all):

# App staging parameters
staging:
  # Max duration for staging process
  max_staging_runtime:    120 # secs

  auth:
    user: zxsfhgjg
    password: ZNVfdase9


Is there supposed to be a "staging" or "stager" process of some kind running? There isn't anything by a name like that. This might explain why the Redis wasn't installed and running initially... my read of that commit indicates it was expected to be set up as part of the "stager" installation (lines moved into cloud_controller and out of stager recipes). How can I verify that stager is installed and operational? Or is there something better to check?

I may be missing some log output. At the start of the cloud_controller.log file, right after starting up all the services, I get these two lines:

Rails Error: Unable to access log file. Please ensure that  exists and is chmod 0666. The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.
Rails Error: Unable to access log file. Please ensure that  exists and is chmod 0666. The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.

Note the double space between "that  exists"... there's a file path missing from those errors... possibly an unset variable. I have no way to identify what log entries I might be missing.

Any ideas?

jake...@gmail.com

unread,
Sep 28, 2012, 2:38:49 AM9/28/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com, Haipeng Wu
Here's a couple more interesting logfile entries, this time from mysql_gateway.log and redis_gateway.log, when I tried to *stop* those services (since the app I'm trying to use doesn't use either, this should be safe). This may or may not be at all related to the 504-on-staging issue, which is the real problem. But since I have no other error messages other than "timeout" on that, I figure every little bit might help.

[2012-09-27 23:26:06.523208] mysql_gateway - pid=24650 tid=9501 fid=5023  ERROR -- Failed deactivation with cloud controller, status=400
[2012-09-27 23:26:21.409462] redis_gateway - pid=24839 tid=229d fid=4172  ERROR -- Failed deactivation with cloud controller, status=400


And another 4 from cloud_controller.log, also related to mysql and redis services (not the internal redis that CC uses). I'm not sure what exactly triggered these, but I can't reproduce them. May be related to the above, may not be related to the 504-on-staging problem.

[2012-09-27 23:26:06.520566] cc - pid=24524 tid=e687 fid=05f6  DEBUG -- Create service request: {"label":"mysql-5.1","url":"http://10.22.81.169:48063","active":false}
[2012-09-27 23:26:06.520833] cc - pid=24524 tid=e687 fid=05f6  ERROR -- Failure decoding service offering request: Field: supported_versions, Error: Missing field supported_versions, Field: version_aliases, Error: Missing field version_aliases

[2012-09-27 23:26:21.406943] cc - pid=24524 tid=e687 fid=05f6  DEBUG -- Create service request: {"label":"redis-2.2","url":"http://10.22.81.169:52022","active":false}
[2012-09-27 23:26:21.407237] cc - pid=24524 tid=e687 fid=05f6  ERROR -- Failure decoding service offering request: Field: supported_versions, Error: Missing field supported_versions, Field: version_aliases, Error: Missing field version_aliases


Seems like something is wrong within cloud_controller.

Haipeng Wu

unread,
Sep 28, 2012, 2:51:11 AM9/28/12
to jake...@gmail.com, vcap...@cloudfoundry.org, johnrd...@gmail.com
could you please verify if your stager was started? 

CC will let a component called "stager" to staging your app, if staging process takes over more than 120s, cc will think it to be time out.

haipeng

jake...@gmail.com

unread,
Sep 28, 2012, 3:13:19 AM9/28/12
to vcap...@cloudfoundry.org, jake...@gmail.com, johnrd...@gmail.com
Wow.

So umm... see my yaml file at the beginning of this thread. Note specifically the lack of "stager" in there. My complete "status" output was as follows:

cloud_controller              :     RUNNING
redis_gateway                 :     STOPPED
mysql_gateway                 :     STOPPED
router                        :     RUNNING
redis_node                    :     STOPPED
health_manager                :     RUNNING


This was created in accordance with a whole lot of documentation. None of these pages say *anything* about a stager component being necessary:

https://github.com/cloudfoundry/vcap/tree/master/dev_setup
https://github.com/cloudfoundry/vcap/tree/master/dev_setup/deployments
https://github.com/cloudfoundry/vcap/tree/master/dev_setup/deployments/sample/multihost_mysql
https://github.com/cloudfoundry/oss-docs/tree/master/vcap/single_and_multi_node_deployments_with_dev_setup

It is mentioned in the devbox.yml sample file... if one were attempting to set up a single-node deployment using that exact/complete file, it would work.

To me that's bug #1... highly inconsistent documentation.

Bug #2 is that it should not fail in such an esoteric way... if a component is needed but missing, it should be highly evident *which* component is failed or missing. A simple timeout with no accompanying information is very unhelpful to troubleshooting.

Bug #3 is that it shouldn't be possible to install CC without stager, should it? Surely on install, or at least on startup, the lack of such a critical component should be highly visible.


I've re-installed this node with "stager", and it does indeed start up. I can now "stage" an app... still can't "start" it though, but I'll open a new thread on that.

Thanks for getting me pointed in the right direction... hopefully this thread can lead to some improvements in the documentation and codebase. :)

webh

unread,
Jan 9, 2013, 3:05:10 PM1/9/13
to vcap...@cloudfoundry.org, jake...@gmail.com
If I run vmc start myapp --trace

<<<
Staging Application 'uec': ................................>>>
RESPONSE_HEADERS:
    date : Wed, 09 Jan 2013 20:01:02 GMT
    content_length : 176
    keep_alive : timeout=20
    connection : keep-alive
    content_type : text/html
    server : nginx
REQUEST_BODY: {"console":false,"meta":{"console":false,"debug":null,"version":4,"created":1357761630},"debug":null,"runningInstances":0,"staging":{"model":"sinatra","stack":"ruby18"},"instances":1,"uris":["uec.cfuec.com"],"version":"4362471c5307f10f32d3623e00475c2a3a761766-2","services":[],"resources":{"fds":256,"disk":2048,"memory":128},"name":"uec","env":[],"state":"STARTED"}
RESPONSE: [504]
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>
<<<
Error (JSON 504): <html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

Haipeng Wu

unread,
Jan 9, 2013, 10:19:09 PM1/9/13
to vcap...@cloudfoundry.org, jake...@gmail.com
Could you please check if the process "vcap_redis" exists?
Cloud_controller will connect "vcap_redis" while staging.

If vcap_redis works, you may take a look at cloud_controller log and stager log to find some clue.

haipeng

student

unread,
Jan 13, 2013, 7:36:38 AM1/13/13
to vcap...@cloudfoundry.org, jake...@gmail.com
I encountered same problem.

1. modify redis config file. It should be listen from all IPs.(at netstat command, it seems as 0.0.0.0 )
2. start 'stager' manually. (maybe you can modify yml file before install. just add ' - stager' and run dev_setup )

I don't know whether these are real solution, just wanted to make this work.

2012년 9월 26일 수요일 오전 6시 19분 44초 UTC+9, jake...@gmail.com 님의 말:

student

unread,
Jan 13, 2013, 7:48:25 AM1/13/13
to vcap...@cloudfoundry.org, jake...@gmail.com
And maybe you must check postgres.
like this,
netstat -anp | grep LISTEN #get postgres pid
ps -fp postgres_pid # use pid at upper line. you get posgres's conf file at this time. maybe you would need -ww option with ps command
vi ~~~ #modiry conf file
sudo /etc/init.d/postgresql-#.# restart #restart.



2013년 1월 13일 일요일 오후 9시 36분 38초 UTC+9, student 님의 말:
Reply all
Reply to author
Forward
0 new messages