AWS CloudFormation Install Fails

Ronan Mc Nulty

May 21, 2021, 7:02:36 AM
to Kill Bill users mailing-list
The CloudFormation installation is failing while creating the Auto Scaling groups. Is this a known issue?

stephane brossier

May 21, 2021, 1:56:36 PM
to Ronan Mc Nulty, Kill Bill users mailing-list
Hi Ronan,

We have recently started receiving reports of occasional failures, and it seems that the Kill Bill startup sequence hangs in some situations. It is a bit of a mystery why this started happening suddenly, since the image has not changed for a while.

In any case, we have investigated the issue and, assuming yours is the same, we have a fix in place. We will be releasing a new image and will take the opportunity to upgrade to our latest stack at the same time, which should hopefully address this bug. We hope to have something within 2 weeks. If you would like to verify whether the issue you are facing is the one we have fixed, I can provide some instructions. The short story is that the systemd service in charge of starting Kill Bill appears to hang because a service dependency is not met.

Stéphane

stephane brossier

May 21, 2021, 2:11:00 PM
to Ronan Mc Nulty, Kill Bill users mailing-list
[+ mailing list]

You need to log in to the Kill Bill EC2 instance by following these instructions: https://docs.killbill.io/latest/aws-cf.html#_practical_tips

Then check whether the following command is stuck: 'systemctl start killbill.service'. If so, kill it and then rerun the command as root.
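
For readers who want to script that check, here is a minimal Python sketch. It assumes the process list is obtained with `ps -eo pid=,args=` (one process per line, PID first, no header); the helper names `find_matching_pids` and `list_processes` are made up for this illustration and are not part of Kill Bill or its AMI.

```python
import subprocess

# Command line quoted in the message above.
STUCK_COMMAND = "systemctl start killbill.service"


def find_matching_pids(ps_output, needle=STUCK_COMMAND):
    """Return (pid, command line) pairs whose command line contains `needle`.

    `ps_output` is expected to look like the output of `ps -eo pid=,args=`.
    """
    matches = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # split PID from the rest of the line
        if len(parts) != 2:
            continue  # blank or malformed line
        pid, args = parts
        if needle in args:
            matches.append((int(pid), args.strip()))
    return matches


def list_processes():
    """Run ps and return its raw text output (assumes a Linux host)."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,args="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the instance, `find_matching_pids(list_processes())` returns an empty list when no such command is running. A match only shows that the command is present; whether it is actually hung still has to be judged by how long it has been running.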



On Fri, May 21, 2021 at 11:02 AM Ronan Mc Nulty <ronanm...@gmail.com> wrote:
Hi Stephane,

I really need to get the system up and running ASAP, so whatever help you could give me would be appreciated.

Regards,
Ronan 

Ronan Mc Nulty

May 24, 2021, 5:19:10 AM
to Kill Bill users mailing-list
Thanks Stephane,

The command was not stuck, and the health check came back healthy:
{"com.codahale.metrics.health.jvm.ThreadDeadlockHealthCheck":{"healthy":true},"main.pool.Connection99Percent":{"healthy":true},"main.pool.ConnectivityCheck":{"healthy":true},"org.killbill.billing.server.healthchecks.KillbillHealthcheck":{"healthy":true,"message":"OK"},"org.killbill.billing.server.healthchecks.KillbillPluginsHealthcheck":{"healthy":true},"org.killbill.billing.server.healthchecks.KillbillQueuesHealthcheck":{"healthy":true,"externalBus":{"growing":false},"bus":{"growing":false}},"osgi.pool.Connection99Percent":{"healthy":true},"osgi.pool.ConnectivityCheck":{"healthy":true},"shiro.pool.Connection99Percent":{"healthy":true},"shiro.pool.ConnectivityCheck":{"healthy":true}}
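
A response like the one above is hard to scan by eye. As a rough sketch, assuming only the shape visible in that output (a flat JSON object mapping each check name to an object with a boolean "healthy" field), the following Python function lists any checks that are not healthy. The name `unhealthy_checks` is hypothetical.

```python
import json


def unhealthy_checks(healthcheck_json):
    """Return the sorted names of checks whose "healthy" flag is not true.

    A check with no "healthy" field is treated as unhealthy.
    """
    report = json.loads(healthcheck_json)
    return sorted(
        name
        for name, result in report.items()
        if not result.get("healthy", False)
    )
```

For the output pasted above, the function returns an empty list, which matches the observation that Kill Bill itself was up even though the stack creation failed.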

The two Auto Scaling groups fail to create:
The following resource(s) failed to create: [KauiAutoScalingGroup, KBAutoScalingGroup]. Rollback requested by user.
Received 0 SUCCESS signal(s) out of 1. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement

cfn-init.log:
2021-05-24 09:04:43,053 [WARNING] Timeout of 60 seconds breached
2021-05-24 09:04:43,054 [ERROR] Client-side timeout
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 162, in _retry
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 231, in _timeout
    raise TimeoutError("Execution did not succeed after %s seconds" % duration)
TimeoutError
2021-05-24 09:05:57,237 [WARNING] Timeout of 60 seconds breached
2021-05-24 09:05:57,237 [ERROR] Client-side timeout
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 162, in _retry
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 231, in _timeout
    raise TimeoutError("Execution did not succeed after %s seconds" % duration)
TimeoutError

Ronan Mc Nulty

May 24, 2021, 6:22:50 AM
to Kill Bill users mailing-list
The problem turned out to be a routing issue in the VPC.
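
The thread does not say which route was missing. Still, the client-side timeouts in cfn-init.log and the "0 SUCCESS signal(s)" message are consistent with the instances being unable to reach AWS endpoints from their subnet. A minimal Python sketch for testing outbound TCP reachability from an instance follows. The function name `can_connect` is hypothetical, and the endpoint in the usage note is an assumption because the region is not given in the thread.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    DNS failures, refused connections, and timeouts all count as unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("cloudformation.us-east-1.amazonaws.com")` could be run from the instance, with the region replaced by the one the stack is deployed in. A `False` result suggests looking at the subnet's route table, NAT or internet gateway, and security group egress rules, but it does not identify which of those is at fault.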

stephane brossier

May 24, 2021, 7:52:00 PM
to Ronan Mc Nulty, Kill Bill users mailing-list
Thanks for letting me know - do you think there is something we could improve in our AWS documentation to avoid this?
