How can I install a highly available version of AWX? I want to test a failover scenario where the current installation server is down but I can still reach another instance. Any tips?
--
You received this message because you are subscribed to the Google Groups "AWX Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to awx-project+unsubscribe@googlegroups.com.
To post to this group, send email to awx-p...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/awx-project/65554be7-f42a-4f56-ad58-76cfc6e369f7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
[2018-03-08 03:17:34,613: ERROR/MainProcess] Unrecoverable error: AccessRefused(403, u"ACCESS_REFUSED - access to exchange 'celeryev' in vhost 'awx' refused for user 'awx'", (40, 10), 'Exchange.declare')
[root@host rabbitmq]# rabbitmqctl delete_user guest
[root@host rabbitmq]# rabbitmqctl add_vhost awx
[root@host rabbitmq]# rabbitmqctl add_user awx <password>
[root@host rabbitmq]# rabbitmqctl set_permissions -p awx awx ".*" ".*" ".*"
[root@host rabbitmq]# rabbitmqctl set_policy -p awx ha-all ".*" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
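As a side note, the policy body handed to rabbitmqctl set_policy is plain JSON, so it can be sanity-checked before use; a minimal sketch:

```python
import json

# The mirrored-queue policy passed to `rabbitmqctl set_policy` above.
# "ha-mode": "all" mirrors each matching queue to every node in the
# RabbitMQ cluster; "ha-sync-mode": "automatic" syncs new mirrors
# without manual intervention.
policy = {"ha-mode": "all", "ha-sync-mode": "automatic"}

# Serialize exactly as it appears on the rabbitmqctl command line.
policy_json = json.dumps(policy, separators=(",", ":"))
print(policy_json)
```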
image_build/files/supervisor_task.conf
command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l DEBUG --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s
It's probably worth pointing out at this point that clustered AWX is now supported on OpenShift and Kubernetes without needing to hand-roll your own solution.
On Thu, Mar 8, 2018 at 1:36 PM, 'Philipp Wiesner' via AWX Project <awx-p...@googlegroups.com> wrote:
Hi Dnc92301,

you should see both of your nodes below the AWX instance group. Here is a link describing how Ansible Tower is set up as a cluster: http://docs.ansible.com/ansible-tower/latest/html/administration/clustering.html#job-runtime-behavior

The image there shows that RabbitMQ runs on each node, and these instances provide the cluster functionality. With AWX, each node then runs Docker containers with the awx_web, awx_task and memcached images. On each node, settings.py is configured so that rabbitmq_host points to that machine: with two nodes, node1 and node2, settings.py on node1 points rabbitmq_host to node1, and on node2 it points to node2. The same goes for the Celery worker. That is why we added the cluster_node inventory variable: we keep the installation directory on each separate node, set cluster_node to the machine name, and run the installation playbook on each node.

The cluster works in such a way that when the capacity of one node is full, the next triggered job is started on the next node of the cluster with free capacity. A failover of a running job, as in your example, will not work, because the playbook run is driven by one worker process on one node; when you restart that node, the worker process is lost.

The page I linked gives you a good overview of what the cluster functionality of Ansible Tower/AWX provides and how it works. For our setup I basically reengineered this into AWX.
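The capacity-based routing described above can be sketched roughly like this (node names and the capacity representation are illustrative toy values, not AWX's actual data model):

```python
def pick_node(nodes):
    """Return the first node with free capacity, or None if all are full.

    `nodes` maps node name -> (running_jobs, capacity); this is a toy
    model of the behaviour described above, not AWX's real scheduler.
    """
    for name, (running, capacity) in nodes.items():
        if running < capacity:
            return name
    return None

# node1 is full, so a newly triggered job lands on node2; a job already
# running on node1 is NOT moved if node1 goes down.
print(pick_node({"node1": (2, 2), "node2": (0, 2)}))  # -> node2
```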
On Thursday, March 8, 2018 at 19:15:21 UTC+1, dnc92301 wrote:
Hi Philipp,
Yes, it looks like it's working somewhat now after setting the proper permissions for the awx user. However, how do you ensure the cluster is configured correctly? Under Instance Groups, do you see both nodes of the cluster? Right now I only see one node. I've also tried kicking off a job and rebooting the server in the middle of the run; does the job fail over automatically to the other node in the cluster?
Thanks
command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l ERROR --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s"?
In addition, I do not see any references to rabbitmq in main.yml or set_image.yml (the only task imported in main.yml) in local_docker/tasks/.

We are running AWX in a clustered HA environment, but for this, some manual adjustments to the installation roles had to be made. Furthermore, you need to create a RabbitMQ cluster. For this we disabled the RabbitMQ containers during installation and set them up beforehand on all the nodes. After the RabbitMQ cluster was running, we changed the RabbitMQ connection details in the roles.
The following files were changed:
image_build/files/launch_awx_task.sh
awx-manage provision_instance --hostname=$CLUSTER_NODE
awx-manage register_queue --queuename=tower --hostnames=$CLUSTER_NODE

image_build/files/settings.py
CLUSTER_HOST_ID = os.getenv("CLUSTER_NODE", "awx")
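The settings.py change just reads the node name from the environment, falling back to "awx" when it is unset; a minimal illustration:

```python
import os

# With CLUSTER_NODE set (as the containers' env blocks do), the node
# registers under its own name ("node1" here is illustrative).
os.environ["CLUSTER_NODE"] = "node1"
print(os.getenv("CLUSTER_NODE", "awx"))  # -> node1

# Without it, the default "awx" is used.
del os.environ["CLUSTER_NODE"]
print(os.getenv("CLUSTER_NODE", "awx"))  # -> awx
```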
image_build/files/supervisor_task.conf
command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l ERROR --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s
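In that command line, --autoscale=50,4 lets Celery scale between 4 and 50 worker processes, and the -Q argument subscribes each worker to the shared queues plus a queue named after the node. The queue list can be sketched roughly like this (a sketch, not AWX code):

```python
def node_queues(cluster_node):
    # Shared queues plus the node-specific queue, mirroring the -Q list
    # in supervisor_task.conf; %(ENV_CLUSTER_NODE)s expands to the value
    # of the CLUSTER_NODE environment variable on that node.
    return ["tower_scheduler", "tower_broadcast_all", "tower", cluster_node]

# "node1" is an illustrative node name.
print(",".join(node_queues("node1")))
# -> tower_scheduler,tower_broadcast_all,tower,node1
```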
local_docker/tasks/main.yml
comment out every RabbitMQ container reference:

- name: Activate AWX Web Container
  ...
  env:
    CLUSTER_NODE: "{{ cluster_node | default('localhost') }}"
    ...
    RABBITMQ_USER: "awx"
    RABBITMQ_PASSWORD: "<password>"
    RABBITMQ_HOST: "{{ cluster_node | default('localhost') }}"
    RABBITMQ_PORT: "5672"
    RABBITMQ_VHOST: "awx"
The same changes were made for the AWX Task container. In front of the nodes, an HAProxy instance is running with round-robin load balancing.
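Given those env vars, each node's containers effectively end up talking to a broker URL like the following (a sketch under the env block above; the function itself and the "node1" name are illustrative, not AWX code):

```python
def broker_url(env):
    # Build the AMQP URL from the container env shown above; the
    # defaults mirror the values in the env block.
    host = env.get("RABBITMQ_HOST", env.get("CLUSTER_NODE", "localhost"))
    return "amqp://{user}:{pw}@{host}:{port}/{vhost}".format(
        user=env.get("RABBITMQ_USER", "awx"),
        pw=env.get("RABBITMQ_PASSWORD", "<password>"),
        host=host,
        port=env.get("RABBITMQ_PORT", "5672"),
        vhost=env.get("RABBITMQ_VHOST", "awx"),
    )

# On node1, RABBITMQ_HOST resolves to node1 itself, so each node talks
# to its local RabbitMQ cluster member while HAProxy balances the web UI.
print(broker_url({"CLUSTER_NODE": "node1"}))
# -> amqp://awx:<password>@node1:5672/awx
```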