Cleaning up the Ansible monolith deployment path

tay...@openfn.org

Oct 5, 2017, 10:37:21 AM
to CommCare Developers
Hey guys,

Hope all is well. Let me preface this with a thank you—I know you've got a lot going on and don't rely on ansible monolith deployments for your core work, so I realize that any help you provide here is going above and beyond. Thank you for that!

My objective is to get ansible-playbook -i inventories/monolith -u root -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml running on a freshly provisioned Ubuntu 14.04.5 LTS (GNU/Linux 3.13.0-125-generic x86_64) droplet with 2 gigs of memory.

While I think that's a solid goal for the whole CommCare open-source community, I'd like to disclose that we've also got a client at Open Function that wants to connect CommCare to another system using OpenFn, but CommCare needs to be hosted on their servers due to regulatory issues.

Note that we made a couple of changes to the vagrant setup and edited some ansible scripts. You can see this work here: https://github.com/rorymckinley/commcare-sandbox/pull/1/files. One significant change is that we are running the vagrant stuff as root.

To the issues:

Issue #1:
TASK [couchdb : Set CouchDB username and password] *****************************
ok: [165.227.172.214] => (item={u'username': u'commcarehq', u'name': u'commcarehq', u'is_https': False, u'host': u'165.227.172.214', u'password': u'commcarehq', u'port': 5984})
failed: [165.227.172.214] (item={u'username': u'commcarehq', u'name': u'commcarehq__users', u'is_https': False, u'host': u'165.227.172.214', u'password': u'commcarehq', u'port': 5984}) => {"cache_control": "must-revalidate", "content": "{\"error\":\"unauthorized\",\"reason\":\"You are not a server admin.\"}\n", "content_length": "64", "content_type": "text/plain; charset=utf-8", "date": "Thu, 05 Oct 2017 11:10:34 GMT", "failed": true, "item": {"host": "165.227.172.214", "is_https": false, "name": "commcarehq__users", "password": "commcarehq", "port": 5984, "username": "commcarehq"}, "msg": "Status code was not [200]: HTTP Error 401: Unauthorized", "redirected": false, "server": "CouchDB/1.6.1 (Erlang OTP/R16B03)", "status": 401, "url": "http://165.227.172.214:5984/_config/admins/commcarehq"}
to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************
165.227.172.214            : ok=135  changed=90   unreachable=0    failed=1  

Possible solution 1: This task runs twice, but each user in "items" has the same username and password. The failure can be stepped over, as we don't need to (and can't) set up two different couchdb users with commcarehq:commcarehq on the same box.
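
To illustrate the step-over idea: the two items in the log differ only in their database `name`, so collapsing them by host, port and username leaves a single admin credential to set. This is a minimal sketch that assumes the item shape shown in the log output; `unique_admins` is a hypothetical helper and not part of the ansible role.

```python
def unique_admins(items):
    """Collapse CouchDB entries that share the same admin credentials."""
    seen = set()
    admins = []
    for item in items:
        key = (item["host"], item["port"], item["username"])
        if key not in seen:
            seen.add(key)
            admins.append({
                "host": item["host"],
                "port": item["port"],
                "username": item["username"],
                "password": item["password"],
            })
    return admins


# Item shape copied from the failing task's log output.
couch_dbs = [
    {"name": "commcarehq", "host": "165.227.172.214", "port": 5984,
     "username": "commcarehq", "password": "commcarehq"},
    {"name": "commcarehq__users", "host": "165.227.172.214", "port": 5984,
     "username": "commcarehq", "password": "commcarehq"},
]
admins = unique_admins(couch_dbs)
```

With the two items from the log, only one admin entry remains, so the admin would be created once rather than attempted twice.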

Issue #2&3: For both couchdb2 and redis, monit fails. After I reboot the system and start monit manually they pass and redis is running, but couchdb2 still shows "Execution failed".  After another system reboot, and manually starting monit, both now show as running and being monitored.

monit status: Process 'couchdb2'
  status                            Execution failed
  monitoring status                 Monitored
  data collected                    Thu, 05 Oct 2017 11:59:49

TASK [couchdb2 : monit] ********************************************************
fatal: [165.227.172.214]: FAILED! => {"changed": false, "failed": true, "msg": "couchdb2 process not presently configured with monit", "name": "couchdb2", "state": "monitored"}

RUNNING HANDLER [monit : reload monit] *****************************************
to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************
165.227.172.214            : ok=36   changed=20   unreachable=0    failed=1  

TASK [redis : monit] ***********************************************************
fatal: [165.227.172.214]: FAILED! => {"changed": false, "failed": true, "msg": "redis process not presently configured with monit", "name": "redis", "state": "monitored"}

RUNNING HANDLER [monit : reload monit] *****************************************

RUNNING HANDLER [redis : restart redis] ****************************************

RUNNING HANDLER [redis : restart rsyslog] **************************************
to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************
165.227.172.214            : ok=17   changed=10   unreachable=0    failed=1 

Issue 4:
TASK [touchforms : Touchforms user] ********************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ImportError: No module named django
fatal: [165.227.172.214 -> 165.227.172.214]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 144, in <module>\n    main()\n  File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 125, in main\n    user.create_user()\n  File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 84, in create_user\n    superuser=repr(self.superuser),\n  File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 1427, in __call__\n    return RunningCommand(cmd, call_args, stdin, stdout, stderr)\n  File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 774, in __init__\n    self.wait()\n  File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 792, in wait\n    self.handle_command_exit_code(exit_code)\n  File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 815, in handle_command_exit_code\n    raise exc\nsh.ErrorReturnCode_1: \n\n  RAN: /home/cchq/www/dev/current/python_env/bin/python manage.py shell --plain\n\n  STDOUT:\n\n\n  STDERR:\nTraceback (most recent call last):\n  File \"manage.py\", line 9, in <module>\n    import django\nImportError: No module named django\n\n", "module_stdout": "Traceback (most recent call last):\n  File \"manage.py\", line 9, in <module>\n    import django\nImportError: No module named django\n\n", "msg": "MODULE FAILURE"}
to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

Possible solution: Here, we need to SSH in and then:
# su - cchq
# cd www/dev/current
# source python_env/bin/activate
# pip install -r requirements/requirements.txt

At this point the whole ansible playbook succeeds, but when we visit our IP, we get the maintenance page and see this in the nginx logs:
2017/10/05 13:56:16 [error] 1064#1064: *18 connect() failed (111: Connection refused) while connecting to upstream, client: 186.106.251.211, server: 165.227.172.214, request: "GET /favicon.ico HTTP/1.1", upstream: "http://165.227.172.214:9010/favicon.ico", host: "165.227.172.214", referrer: "https://165.227.172.214/solutions/"

After activating the python_env we run runserver as `cchq`:
./manage.py runserver 0.0.0.0:9010
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/django/db/backends/postgresql/base.py", line 176, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: ERROR:  pgbouncer cannot connect to server

At this point, we're wondering:
  1. Why isn't the server running itself?
  2. And how do we get it to run?
Best,
Taylor

tay...@openfn.org

Oct 5, 2017, 11:36:25 AM
to CommCare Developers
Update: Rory found that one issue lay in the encrypted fs stuff. We ran:

/etc/init.d/postgresql start
/etc/init.d/pgbouncer stop
/etc/init.d/pgbouncer start

and we can run the server. This was probably due to us having to reboot during the deployment process.
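
A quick way to tell which of the two services is down after a reboot is to probe their TCP ports. The sketch below assumes the common default ports (5432 for PostgreSQL, 6432 for pgbouncer), which may not match this deployment; `port_open` and `service_status` are hypothetical helpers. A "pgbouncer cannot connect to server" error usually suggests pgbouncer itself is reachable while the PostgreSQL behind it is not.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed default ports; adjust to whatever the ansible vars configure.
SERVICES = {"postgresql": 5432, "pgbouncer": 6432}


def service_status(host="127.0.0.1", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `service_status()` on the box returns a dictionary such as `{"postgresql": False, "pgbouncer": True}` when only pgbouncer is listening.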

We run migrations (CCHQ_IS_FRESH_INSTALL=1 python manage.py migrate) and get:
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/botocore/client.py", line 599, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

This appears to be an S3 issue, but I'm fairly certain I've configured my bucket properly and granted access via the access key and secret. (These are not part of version control in the shared repo, of course.) Will update as we go.

FWIW, python manage.py compress fails because it can't find the Font Awesome less file:
CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-05_12.28/corehq/apps/registration/templates/registration/domain_request.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-05_12.28/staticfiles' or with staticfiles.

Simon Kelly

Oct 6, 2017, 8:19:09 AM
to CommCare Developers
Hi Taylor

Our general process is as follows:
  1. Configure blank VMs (just OS)
  2. Create inventory file and vars files
  3. Run ansible deploy - there are often a few hiccoughs here since we don't do fresh installs that often
  4. Once everything is set up, we deploy our code with fabric scripts as follows:

    fab <environment> deploy

    environment is the name of an inventory file here: https://github.com/dimagi/commcare-hq-deploy/tree/master/fab/inventory

    This also makes use of this 'environments.yml' file which tells the deploy scripts which services to run where and a few other things: https://github.com/dimagi/commcare-hq-deploy/blob/master/fab/environments.yml

  5. That deploy will checkout the latest code, do the static file compression etc and also create the supervisor files needed to run the servers.

We've recently made some improvements to our couchdb setup (you should use couchdb2). I've linked them in comments on your PR.

We are about to do a whole new cluster setup so it's likely that there will be some more changes coming soon.

Re the issues:
1. Switch to using couchdb2
2&3. Resolved in latest master + this PR (https://github.com/dimagi/commcarehq-ansible/pull/971)
4. The virtual env should already have been set up by the deploy_commcarehq playbook, which should execute prior to the touchforms playbook. Also, touchforms is only necessary if you're going to be doing SMS surveys.

Re the encrypted drives. We run the deploy_stack playbook with 'after-reboot' tag limited to the rebooted host. This should remount the encrypted drive and perform a few other actions.
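
The after-reboot run described above could be assembled roughly like this. It is a sketch only: it uses the standard `--tags` and `--limit` options of `ansible-playbook`, takes the inventory path and host from earlier in this thread, and leaves out the `-u` and `-e` arguments that the full command would also need. `after_reboot_command` is a hypothetical helper.

```python
def after_reboot_command(inventory, host, playbook="deploy_stack.yml"):
    """Build the ansible-playbook argv for re-running only the after-reboot tasks."""
    return [
        "ansible-playbook",
        "-i", inventory,
        playbook,
        "--tags", "after-reboot",  # only tasks tagged after-reboot
        "--limit", host,           # only the host that was rebooted
    ]


cmd = after_reboot_command("inventories/monolith", "165.227.172.214")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` or copied to a shell once the user and vars-file arguments are appended.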

I hope that helps and thanks for the feedback!

Simon Kelly
Director of Server Engineer | Dimagi

--

---
You received this message because you are subscribed to the Google Groups "CommCare Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to commcare-developers+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

tay...@openfn.org

Oct 9, 2017, 12:22:29 PM
to CommCare Developers
Hey Simon, thanks so much. We've got the fab deploy scripts running now (albeit with lots of warnings: sudo received non-zero exit codes*) and finishing successfully. However, when we ssh into our box, go to the newly created release, activate the python env and run `runserver`, the server starts but throws this 500** whenever it's accessed via the web:

OfflineGenerationError: You have offline compression enabled but key "89af02fe109c09d9c74742e99d8f3fea" is missing from offline manifest. You may need to run "python manage.py compress".
2017-10-09 16:15:37,638 ERROR "GET /accounts/login/ HTTP/1.0" 500 59

When running compress, we get this font-awesome package error:
CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-09_16.04/corehq/motech/openmrs/templates/openmrs/importers.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-09_16.04/staticfiles' or with staticfiles.

Have you bumped into this before? Thanks!

*The non-zero exit codes all look pretty much like this:
[165.227.172.214] sudo: /home/cchq/www/dev/releases/2017-10-09_16.04/python_env/bin/python /home/cchq/www/dev/releases/2017-10-09_16.04/manage.py preindex_everything --check
[165.227.172.214] out: 2017-10-09 16:08:10,599 INFO Raven is not configured (logging is disabled). Please see the documentation for more information.
[165.227.172.214] out: 2017-10-09 16:08:12,031 INFO AXES: BEGIN LOG
[165.227.172.214] out: 


Warning: sudo() received nonzero return code 1 while executing '/home/cchq/www/dev/releases/2017-10-09_16.04/python_env/bin/python /home/cchq/www/dev/releases/2017-10-09_16.04/manage.py preindex_everything --check'!

tay...@openfn.org

Oct 9, 2017, 5:36:24 PM
to CommCare Developers
Simon, my last update for the day:


I cannot get compress to run using either option 2 or option 3, and with option 1 (as you can probably see from the linked photo) I'm not actually getting the static assets I need from a CDN.

The error on my compress command is no longer on motech, it's now on "hqadmin":
CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-09_18.02/corehq/apps/hqadmin/templates/hqadmin/loadtest.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-09_18.02/staticfiles' or with staticfiles.

Thanks again for all your help. Speak soon!

Taylor

P.S. — In an effort to make this repeatable, we've got a fork of the ansible repo going that includes a git submodule with your commcare-deploy repo. Our goal is to get this down to a single git clone and a few shell commands! Would love any feedback on the directory structure you use locally.

Jenny Schweers

Oct 10, 2017, 11:27:31 AM
to commcare-...@googlegroups.com
Hi Taylor,

About that compress error: Have you run `bower update` recently? I'd run that, verify that the file ./bower_components/font-awesome/less/font-awesome.less does indeed exist afterwards, and then run collectstatic and compress again.

You can also double-check that your STATICFILES_DIRS contains bower_components (it should be set up by https://github.com/dimagi/commcare-hq/blob/master/settings.py#L87-L97)
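
The file check suggested above can be scripted so it is easy to repeat after each `bower update`. This is a minimal sketch; `missing_sources` and the `REQUIRED` list are hypothetical, and the only path taken from this thread is the Font Awesome less file.

```python
import os


def missing_sources(release_dir, relative_paths):
    """Return the relative paths that do not exist as files under release_dir."""
    return [
        rel for rel in relative_paths
        if not os.path.isfile(os.path.join(release_dir, rel))
    ]


# Path from the compress error, relative to the release directory.
REQUIRED = ["bower_components/font-awesome/less/font-awesome.less"]
```

Running `missing_sources("/home/cchq/www/dev/current", REQUIRED)` should return an empty list before collectstatic and compress are attempted again.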

-Jenny

Simon Kelly

Oct 10, 2017, 5:46:31 PM
to CommCare Developers
Been offline travelling so sorry for the slow response. Strange that you get that error if you're using the fabric deploy script since it should do a bower update but I'd check what Jenny suggested to make sure.

Re the "sudo received non-zero exit codes" messages, as long as it's only for the 'preindex' command that should be fine. If there are any other errors during deploy then it won't complete. (also PR to remove those warnings: https://github.com/dimagi/commcare-hq-deploy/pull/393)
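
To confirm that the warnings really are limited to the preindex command, the fab output can be scanned for the warning line quoted earlier in the thread. A sketch, assuming the warning text always follows that exact format; `unexpected_failures` is a hypothetical helper.

```python
import re

# Matches the Fabric warning format quoted earlier in this thread.
WARNING_RE = re.compile(
    r"Warning: sudo\(\) received nonzero return code (\d+) "
    r"while executing '(.+?)'!"
)


def unexpected_failures(log_text, allowed=("preindex",)):
    """Return (code, command) pairs for warnings whose command is not allowed."""
    failures = []
    for code, command in WARNING_RE.findall(log_text):
        if not any(name in command for name in allowed):
            failures.append((int(code), command))
    return failures
```

An empty result means every non-zero exit code came from a preindex command, which is the harmless case described above.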



Simon Kelly
Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 11, 2017, 11:31:54 AM
to CommCare Developers
Hi Simon

Yes, Jenny's advice helped us out immensely - we now have commcare up and serving the static assets. 

We are seeing what we think are errors connecting to the riak-cs instance - and I tried running `./manage.py ptop_preindex` which produces some initial success, but then:

Starting pillow preindex ledgers
Traceback (most recent call last):
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/cchq/www/dev/releases/2017-10-09_18.02/corehq/apps/hqcase/management/commands/ptop_preindex.py", line 53, in do_reindex
    FACTORIES_BY_SLUG[reindex_command](**kwargs).build().reindex()
  File "/home/cchq/www/dev/releases/2017-10-09_18.02/corehq/pillows/case_search.py", line 137, in build
    initialize_index_and_mapping(get_es_new(), CASE_SEARCH_INDEX_INFO)
  File "./corehq/ex-submodules/pillowtop/es_utils.py", line 87, in initialize_index_and_mapping
    initialize_index(es, index_info)
  File "./corehq/ex-submodules/pillowtop/es_utils.py", line 92, in initialize_index
    return create_index_and_set_settings_normal(es, index_info.index, index_info.meta)
  File "./corehq/ex-submodules/pillowtop/es_utils.py", line 73, in create_index_and_set_settings_normal
    es.indices.create(index=index, body=metadata)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/client/indices.py", line 103, in create
    params=params, body=body)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 93, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 105, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
NotFoundError: TransportError(404, u'<HTML><HEAD><TITLE>404 Not Found</TITLE></HEAD><BODY><H1>Not Found</H1>The requested document was not found on this server.<P><HR><ADDRESS>mochiweb+webmachine web server</ADDRESS></BODY></HTML>')
<Greenlet at 0x7f9713dac2d0: do_reindex(u'case_search', False)> failed with NotFoundError

There are more errors of this ilk; the above is merely the first (note: I have added some debugging print statements, so line numbers may be slightly out). Does the above point to us doing something that is obviously wrong?

Thanks in advance.

Rory

Simon Kelly

Oct 11, 2017, 2:09:47 PM
to CommCare Developers
That seems like the Elasticsearch address may be incorrect. This error is happening when the command is trying to create a new index in elasticsearch.

I'd check that you've got your ES connection details correct in localsettings:
  • ELASTICSEARCH_HOST
  • ELASTICSEARCH_PORT
You can test the connection using curl:

$ curl <host>:<port>

{
  "status" : 200,
  "name" : "Albino",
  "cluster_name" : "agrajag",
  "version" : {
    "number" : "1.7.4",
    "build_hash" : "0d3159b9fc8bc8e367c5c40c09c2a57c0032b32e",
    "build_timestamp" : "2015-12-15T16:45:04Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
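
The same check can be done on the response body in Python, which makes it easy to tell an Elasticsearch banner apart from the HTML 404 page in the traceback above. This is a sketch that relies only on the `tagline` field shown in the sample output; `looks_like_elasticsearch` is a hypothetical helper.

```python
import json


def looks_like_elasticsearch(body):
    """Return True if body is the JSON banner Elasticsearch serves at its root URL."""
    try:
        doc = json.loads(body)
    except ValueError:
        # HTML error pages and other non-JSON responses end up here.
        return False
    return isinstance(doc, dict) and doc.get("tagline") == "You Know, for Search"
```

Feeding it the body returned by `curl <host>:<port>` gives a quick yes or no on whether the configured address points at Elasticsearch at all.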


Simon Kelly
Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 12, 2017, 2:01:26 PM
to CommCare Developers
Thanks Simon.

Just to make sure I am not missing something really obvious ("missing something really obvious" is in fact, quite an accurate summation of my adventure so far) - the ansible scripts set up riak-cs, and so I can point those ES connection strings at the local riak-cs instance?

Regards

Rory

Simon Kelly

Oct 12, 2017, 3:30:09 PM
to CommCare Developers
Hey

So riak-cs and elasticsearch are completely different systems. You can think of Riak-CS as an S3 service. Elasticsearch is a distributed search index.

In localsettings.py the settings for Elasticsearch are the ones I mentioned before. For Riak the settings are:
S3_BLOB_DB_SETTINGS = {
    "url": "http://localhost:9980/",
    "access_key": "admin-key",
    "secret_key": "admin-secret",
    "config": {"connect_timeout": 3, "read_timeout": 5},
}
Note that if you are just running a monolith then it's not necessary to have riak at all since you can just use the local filesystem. If you want to go that route then you should just remove the 'riak-cs' group from your inventory file completely. That should result in the above settings being removed from your localsettings file, which will cause CommCare HQ to switch to using the filesystem to store binary objects (e.g. form xml).

You should also then set `shared_drive_enabled` to 'false' in your ansible vars file since you don't need a NFS drive for just one machine.
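
The fallback described above can be pictured as a simple presence check on the settings. The sketch below is an illustration of that behaviour only, not CommCare HQ's actual code; `blob_backend` and the backend names it returns are hypothetical.

```python
def blob_backend(localsettings):
    """Pick a storage backend name from a dict of localsettings values."""
    if localsettings.get("S3_BLOB_DB_SETTINGS"):
        return "s3"
    return "filesystem"


with_riak = {"S3_BLOB_DB_SETTINGS": {"url": "http://localhost:9980/"}}
without_riak = {}
print(blob_backend(with_riak))     # s3
print(blob_backend(without_riak))  # filesystem
```

Elasticsearch is not involved in this choice; its host and port are configured separately through the settings mentioned earlier.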

Sorry for the complexities here and the lack of docs.

Simon Kelly
Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 13, 2017, 12:56:19 AM
to CommCare Developers
D'oh! Thanks Simon, no this is totally my fault - at some point in the process my brain conflated elasticsearch and S3, and then never let go :( - I am not sure why - old age I guess ;).

Thanks for the tips - we will definitely factor them in.

R

Simon Kelly

Oct 13, 2017, 7:52:55 AM
to CommCare Developers
👍

rorymc...@capefox.co

Oct 16, 2017, 11:25:57 AM
to CommCare Developers
Hi Simon

Quick question: 

We set up a trial account with an ES provider (just so we would not get distracted by the ElasticSearch rabbithole right now) - but the only way I could get `./manage.py ptop_preindex` to connect was to hack in the necessary params for an SSL connection in _es_hosts() in corehq/elastic.py.

Is there a way to get commcare to work with ES using SSL?

Regards

Rory

Simon Kelly

Oct 16, 2017, 12:51:16 PM
to CommCare Developers
We don't use SSL for ES since we don't use external ES service. But you could submit a PR that adds the ability to provide the necessary parameters.
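
Such a PR might boil down to building the host list from an extra setting. The following is a rough sketch under stated assumptions: it assumes the elasticsearch-py client of that era, which as far as I know accepted per-host dictionaries containing a `use_ssl` key, and the `use_ssl` argument would be fed from a hypothetical new setting such as `ELASTICSEARCH_USE_SSL`. The function `es_hosts` is illustrative and is not the existing `_es_hosts()` in corehq/elastic.py.

```python
def es_hosts(host, port, use_ssl=False):
    """Build a host list in the dictionary form passed to the ES client."""
    entry = {"host": host, "port": port}
    if use_ssl:
        # Only add the key when SSL is requested, so the default
        # non-SSL behaviour stays exactly as it was.
        entry["use_ssl"] = True
    return [entry]


plain = es_hosts("localhost", 9200)
secure = es_hosts("es.example.org", 443, use_ssl=True)
```

Keeping the SSL key optional means existing deployments that talk to a local Elasticsearch over plain HTTP would not need any settings change.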

Simon Kelly
Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 18, 2017, 12:04:29 PM
to CommCare Developers
Thanks Simon - it turns out that the customer wants to use local ES, so we won't be using the offboard service after all (not even for testing).

It feels as if we have commcare most of the way there now: most pages seem to load and the obvious errors :) are very few.

Taylor has tried sending out invites to users, but he says he never receives the mails. There are no obvious signs that anything is going awry - the only clue I have found in the logs is as follows:

a.b.c.d - - - - - [18/Oct/2017:13:00:57 +0000] "POST /hq/notifications/service/ HTTP/1.1" 200 94 "https://y.y.y.y/a/xxxxxx/settings/users/web/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"

I am not sure if this is at all related to what Taylor is trying to do? The other logs have quite a regular complaint about something called toggle.js, but I am not sure if that is related either (as you may be picking up here, there is a lot I am not sure about :) ).

Thanks in advance

Rory

PS I sanitised some of the log entry.

Simon Kelly

Oct 19, 2017, 8:56:17 AM
to CommCare Developers
Hi Rory

You can customize email with these settings (you should set them in your 'localsettings.py' file): https://github.com/dimagi/commcare-hq/blob/6238482bace149b57b13ebaf66b669edd6e372f4/settings.py#L470-L499

You will also need to set the EMAIL_BACKEND according to your specific needs: https://docs.djangoproject.com/en/1.11/topics/email/#topic-email-backends
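
For an SMTP relay, a localsettings fragment could look something like the sketch below. It uses the standard Django email setting names; the block linked above may define its own names for some of these, so check it before copying. Every value here is a placeholder.

```python
# localsettings.py (fragment); all values below are placeholders.
EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = "smtp.example.org"
EMAIL_PORT = 587
EMAIL_HOST_USER = "commcare@example.org"
EMAIL_HOST_PASSWORD = "change-me"
EMAIL_USE_TLS = True
```

Note that Django's console backend only writes messages to standard output, so if that backend is configured, invites will appear in the server output rather than in anyone's inbox.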

Simon Kelly
Director of Server Engineer | Dimagi

tay...@openfn.org

unread,
Oct 23, 2017, 2:37:57 PM10/23/17
to CommCare Developers
Hi Simon, Jenny, and team,

The new master on commcarehq-ansible works much better. Thank you! I've got a branch that almost runs through on the very first go.

Three quick questions:
  1. Right now, we're only able to get ansible to run if we set a FORMPLAYER_INTERNAL_AUTH_KEY to an empty string, but since we later have issues with Formplayer I'm worried that this isn't the right move. What should we do here?
  2. With which user do you run ansible on a Digital Ocean box? (I noticed an "ansible" user gets configured, but presumably you've got to first run as root. Are you meant to run once as root and then after it fails subsequently run as ansible?)
  3. Where do you clone the commcare-hq-deploy repo and with which user do you run fab <env> deploy?
And then one higher-level question to make sure I'm understanding things correctly: It seems as though deployment on a new box requires the cloning and configuration of three separate repos: (1) commcarehq-ansible, (2) commcare-hq-deploy, and (3) formplayer. If we're trying to get this down to a single repo (or at least a single README) can you describe the relationship between these three repos theoretically and in user/directory terms? It would be amazing to know where you clone each of them and how you run them in relation to each other. We've been doing all the ansible stuff as root and the django stuff as cchq, but that may not be right.

Again, thank you so much. I owe you so many coffees/beers/loaves of bread/pretty-much-you-name-it next time I'm in South Africa.

Taylor

Simon Kelly

unread,
Oct 30, 2017, 4:18:30 AM10/30/17
to CommCare Developers
Hey, answers inline:
  1. Right now, we're only able to get ansible to run if we set a FORMPLAYER_INTERNAL_AUTH_KEY to an empty string, but since we later have issues with Formplayer I'm worried that this isn't the right move. What should we do here?
That is a shared secret between HQ and Formplayer which allows Formplayer to authenticate API calls to HQ. You should set it to a secret key with a reasonable amount of entropy.
  2. With which user do you run ansible on a Digital Ocean box? (I noticed an "ansible" user gets configured, but presumably you've got to first run as root. Are you meant to run once as root and then after it fails subsequently run as ansible?)
It's usually necessary to run it once as a privileged user to set up the user accounts. We would normally run just the 'users' tag as root, and from then on we can run it as the 'ansible' user.
  3. Where do you clone the commcare-hq-deploy repo and with which user do you run fab <env> deploy?
For running deploy you can have the repo anywhere you like, as long as you have access from there to the machines you're deploying to. Deploy doesn't require any external dependencies (other than those defined in requirements.txt).
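On the FORMPLAYER_INTERNAL_AUTH_KEY answer above: one possible way to produce a key with plenty of entropy is Python's standard `secrets` module (Python 3.6 or later). This is only a sketch; the function name `generate_auth_key` and the 32-byte length are assumptions for illustration, not something HQ or Formplayer prescribes.

```python
import secrets


def generate_auth_key(num_bytes=32):
    """Return a random hex string drawn from the OS's secure random source.

    32 bytes gives 256 bits of entropy, rendered as 64 hex characters.
    """
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a fresh key to paste into the private vars or vault file.
    print(generate_auth_key())
```

The same value would then need to be configured on both the HQ side and the Formplayer side, since it is a shared secret.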
 
And then one higher-level question to make sure I'm understanding things correctly: It seems as though deployment on a new box requires the cloning and configuration of three separate repos: (1) commcarehq-ansible, (2) commcare-hq-deploy, and (3) formplayer. If we're trying to get this down to a single repo (or at least a single README) can you describe the relationship between these three repos theoretically and in user/directory terms? It would be amazing to know where you clone each of them and how you run them in relation to each other. We've been doing all the ansible stuff as root and the django stuff as cchq, but that may not be right.

None of these repos needs to be anywhere specific. The formplayer repo in particular should not be needed for anything. Currently when we deploy Formplayer it pulls the latest version from our Jenkins build server.

The other two are related as follows (at least for our setup):
  • ansible repo: stores all the ansible playbooks and vault files (with all the secret keys etc).
  • deploy repo: has the deploy scripts and the ansible inventory files (required for both deploy and ansible)
We always set up one of the VMs in our clusters as a 'control' machine from where we can run the ansible playbooks (and also normal deploys if we want). Once you have an account on this machine you can follow the instructions in the readme: https://github.com/dimagi/commcarehq-ansible#setting-up-a-dev-account-on-ansible-control-machine

This should set up the 'commcare-hq-deploy' repo and also the python virtualenv for ansible. It will also create some bash aliases that make it easier to run ansible playbooks.

I hope that answers your questions. Let me know if you have follow-ups.

Cheers
Simon

tay...@openfn.org

unread,
Oct 30, 2017, 7:11:57 AM10/30/17
to CommCare Developers
Simon, this is fantastic—thank you. Very quick one:
  1. I ran ansible-playbook -i inventories/monolith -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml -u root --tags=users
  2. Then ansible-playbook -i inventories/monolith -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml -u ansible
on a brand new box, but get:

TASK [apt] ***********************************************************************************************************************************************************************************
fatal: [159.203.132.215]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to 159.203.132.215 closed.\r\n", "module_stdout": "sudo: a password is required\r\n", "msg": "MODULE FAILURE", "rc": 1}

Was I misinterpreting the suggestion? Or is there a change that must be made either to the ansible user setup or with visudo to get this working? Conceivably, we could set up password-less sudo privs for ansible like this: ansible ALL=(ALL) NOPASSWD:ALL

Best,
Taylor

Simon Kelly

unread,
Oct 30, 2017, 8:54:41 AM10/30/17
to CommCare Developers
The basic workflow is to add `-K` or `--ask-become-pass` to the command line which will prompt you for the 'sudo' password.

If you're also using ansible vault you will want `--ask-vault-pass` as well. Supplying two passwords is inconvenient so you can actually store the 'become' password in the vault file as "ansible_become_pass".


Simon Kelly
Director of Server Engineer | Dimagi

John Harper

unread,
Dec 1, 2017, 7:42:45 PM12/1/17
to CommCare Developers
Simon, I have been following along and have tried the ansible load as well.

The core ansible load went very well. There were a few hiccups along the way, but I made it through.

I am at the deployment stage but I have run into this error:

  File "/home/harperjo/commcarehq-ansible/commcare-hq-deploy/fab/fabfile.py", line 272, in read_inventory_file
    return get_inventory(filename).get_group_dict()
AttributeError: 'Inventory' object has no attribute 'get_group_dict'
harperjo@GCS-1:~/commcarehq-ansible/commcare-hq-deploy$


This is on my fab dev deploy execution. I am wondering if I missed something in the fabfile that points to my ansible environment file?

Your thoughts?

Simon Kelly

unread,
Dec 2, 2017, 12:46:34 AM12/2/17
to CommCare Developers
Hi John

I'm glad it's gone well. The error you're seeing is a bug introduced when we upgraded to a new version of ansible. If you update the deploy repo to the latest version and make sure your requirements are up to date it should work.

Cheers
Simon

John Harper

unread,
Dec 2, 2017, 9:30:58 AM12/2/17
to CommCare Developers
Okay, thanks.

I will check the ansible version I have on the control and host machines to ensure it is at the correct level.

Should I just blow away my commcare-hq-deploy folder and pull the new one down using wget?

I'll have to configure my fabfile.py, environment.yml and others again, correct?


Simon Kelly

unread,
Dec 3, 2017, 1:21:03 AM12/3/17
to CommCare Developers
You shouldn't need to blow it away. You should just be able to do a git pull to get the latest version.

Simon Kelly

unread,
Dec 3, 2017, 1:22:32 AM12/3/17
to CommCare Developers
Though if you've made changes to the fabfile you'll want to commit those in a branch and then merge the updated master branch into your environment branch.
