Cleaning up the Ansible monolith deployment path

tay...@openfn.org

Oct 5, 2017, 10:37:21 AM

to CommCare Developers

Hey guys,

Hope all is well. Let me preface this with a thank you—I know you've got a lot going on and don't rely on ansible monolith deployments for your core work, so I realize that any help you provide here is going above and beyond. Thank you for that!

My objective is to get `ansible-playbook -i inventories/monolith -u root -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml` running on a freshly provisioned Ubuntu 14.04.5 LTS (GNU/Linux 3.13.0-125-generic x86_64) droplet with 2 GB of memory.

While I think that's a solid goal for the whole CommCare open-source community, I'd like to disclose that we've also got a client at Open Function that wants to connect CommCare to another system using OpenFn, but CommCare needs to be hosted on their servers due to regulatory issues.

Note that we made a couple of changes to the Vagrant setup and edited some Ansible scripts. You can see this work here: https://github.com/rorymckinley/commcare-sandbox/pull/1/files. One significant change is that we are running the Vagrant stuff as root.

To the issues:

Issue #1:

TASK [couchdb : Set CouchDB username and password] *****************************

ok: [165.227.172.214] => (item={u'username': u'commcarehq', u'name': u'commcarehq', u'is_https': False, u'host': u'165.227.172.214', u'password': u'commcarehq', u'port': 5984})

failed: [165.227.172.214] (item={u'username': u'commcarehq', u'name': u'commcarehq__users', u'is_https': False, u'host': u'165.227.172.214', u'password': u'commcarehq', u'port': 5984}) => {"cache_control": "must-revalidate", "content": "{\"error\":\"unauthorized\",\"reason\":\"You are not a server admin.\"}\n", "content_length": "64", "content_type": "text/plain; charset=utf-8", "date": "Thu, 05 Oct 2017 11:10:34 GMT", "failed": true, "item": {"host": "165.227.172.214", "is_https": false, "name": "commcarehq__users", "password": "commcarehq", "port": 5984, "username": "commcarehq"}, "msg": "Status code was not [200]: HTTP Error 401: Unauthorized", "redirected": false, "server": "CouchDB/1.6.1 (Erlang OTP/R16B03)", "status": 401, "url": "http://165.227.172.214:5984/_config/admins/commcarehq"}

to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************

165.227.172.214 : ok=135 changed=90 unreachable=0 failed=1

Possible solution 1: This task runs twice, but each user in "items" has the same username and password. The failure can be stepped over, as we don't need to (and can't) set up two different couchdb users with commcarehq:commcarehq on the same box.
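
A minimal Python sketch of that idea, assuming the task loops over a list of dicts shaped like the `item` values in the log above. The helper name `unique_admins` is hypothetical and is not part of commcarehq-ansible; it only shows how the list could be collapsed to one admin per (host, port, username) before the request is made.

```python
def unique_admins(couch_dbs):
    """Return one entry per (host, port, username), keeping the first seen."""
    seen = set()
    result = []
    for db in couch_dbs:
        key = (db["host"], db["port"], db["username"])
        if key not in seen:
            seen.add(key)
            result.append(db)
    return result


# Items copied from the failing task output above.
couch_dbs = [
    {"username": "commcarehq", "name": "commcarehq", "is_https": False,
     "host": "165.227.172.214", "password": "commcarehq", "port": 5984},
    {"username": "commcarehq", "name": "commcarehq__users", "is_https": False,
     "host": "165.227.172.214", "password": "commcarehq", "port": 5984},
]

for admin in unique_admins(couch_dbs):
    print(admin["host"], admin["port"], admin["username"])
```

With the two items from the log, only the first survives, so the admin user would be created once.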

Issue #2&3: For both couchdb2 and redis, monit fails. After I reboot the system and start monit manually they pass and redis is running, but couchdb2 still shows "Execution failed". After another system reboot, and manually starting monit, both now show as running and being monitored.

monit status: Process 'couchdb2'

status Execution failed

monitoring status Monitored

data collected Thu, 05 Oct 2017 11:59:49

TASK [couchdb2 : monit] ********************************************************

fatal: [165.227.172.214]: FAILED! => {"changed": false, "failed": true, "msg": "couchdb2 process not presently configured with monit", "name": "couchdb2", "state": "monitored"}

RUNNING HANDLER [monit : reload monit] *****************************************

to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************

165.227.172.214 : ok=36 changed=20 unreachable=0 failed=1

TASK [redis : monit] ***********************************************************

fatal: [165.227.172.214]: FAILED! => {"changed": false, "failed": true, "msg": "redis process not presently configured with monit", "name": "redis", "state": "monitored"}

RUNNING HANDLER [monit : reload monit] *****************************************

RUNNING HANDLER [redis : restart redis] ****************************************

RUNNING HANDLER [redis : restart rsyslog] **************************************

to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

PLAY RECAP *********************************************************************

165.227.172.214            : ok=17   changed=10   unreachable=0    failed=1 

Issue #4:

TASK [touchforms : Touchforms user] ********************************************

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ImportError: No module named django

fatal: [165.227.172.214 -> 165.227.172.214]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 144, in <module>\n main()\n File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 125, in main\n user.create_user()\n File \"/tmp/ansible_iUft9p/ansible_module_django_user.py\", line 84, in create_user\n superuser=repr(self.superuser),\n File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 1427, in __call__\n return RunningCommand(cmd, call_args, stdin, stdout, stderr)\n File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 774, in __init__\n self.wait()\n File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 792, in wait\n self.handle_command_exit_code(exit_code)\n File \"/usr/local/lib/python2.7/dist-packages/sh.py\", line 815, in handle_command_exit_code\n raise exc\nsh.ErrorReturnCode_1: \n\n RAN: /home/cchq/www/dev/current/python_env/bin/python manage.py shell --plain\n\n STDOUT:\n\n\n STDERR:\nTraceback (most recent call last):\n File \"manage.py\", line 9, in <module>\n import django\nImportError: No module named django\n\n", "module_stdout": "Traceback (most recent call last):\n File \"manage.py\", line 9, in <module>\n import django\nImportError: No module named django\n\n", "msg": "MODULE FAILURE"}

to retry, use: --limit @/vagrant/ansible/deploy_stack.retry

Possible solution: Here, we need to SSH in and then:

# su - cchq

# cd www/dev/current

# source python_env/bin/activate

# pip install -r requirements/requirements.txt

At this point the whole ansible playbook succeeds, but when we visit our IP, we get the maintenance page and see this in the nginx logs:

2017/10/05 13:56:16 [error] 1064#1064: *18 connect() failed (111: Connection refused) while connecting to upstream, client: 186.106.251.211, server: 165.227.172.214, request: "GET /favicon.ico HTTP/1.1", upstream: "http://165.227.172.214:9010/favicon.ico", host: "165.227.172.214", referrer: "https://165.227.172.214/solutions/"
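
The `111: Connection refused` in that log line means nothing accepted the connection on the upstream port. A small Python 3 sketch for confirming that from the box itself; the helper name `port_is_open` is made up for this example, and 9010 is simply the upstream port shown in the nginx error.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # 9010 is the upstream port from the nginx error above.
    print("upstream listening:", port_is_open("127.0.0.1", 9010))
```

If this prints False, nginx is fine and the Django process behind it is simply not running.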

After activating the python_env we run runserver as `cchq`:

./manage.py runserver 0.0.0.0:9010

  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/django/db/backends/postgresql/base.py", line 176, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: ERROR:  pgbouncer cannot connect to server

At this point, we're wondering:

Why isn't the server running itself?
And how do we get it to run?

Best,

Taylor

tay...@openfn.org

Oct 5, 2017, 11:36:25 AM

to CommCare Developers

Update: Rory found that one issue lay in the encrypted fs stuff. We ran:

/etc/init.d/postgresql start
/etc/init.d/pgbouncer stop
/etc/init.d/pgbouncer start

and we can run the server. This was probably due to us having to reboot during the deployment process.

We run migrations (CCHQ_IS_FRESH_INSTALL=1 python manage.py migrate) and get:

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/botocore/client.py", line 599, in _make_api_call

raise error_class(parsed_response, operation_name)

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

This appears to be an S3 issue, but I'm fairly certain I've configured my bucket properly and granted access via the access key and secret. (These are not part of version control in the shared repo, of course.) Will update as we go.
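
One common cause of `AccessDenied` on `ListObjects` (not confirmed as the cause here) is a policy that grants object-level actions but not the bucket-level `s3:ListBucket` action. The Python sketch below only builds and prints such a policy document; the bucket name `example-commcare-blobs` and the helper `bucket_policy` are placeholders for illustration.

```python
import json


def bucket_policy(bucket):
    """Build an IAM policy allowing listing plus object read/write."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is a bucket-level action: no "/*" suffix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": ["arn:aws:s3:::%s" % bucket],
            },
            {
                # Reading and writing objects are object-level actions.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": ["arn:aws:s3:::%s/*" % bucket],
            },
        ],
    }


print(json.dumps(bucket_policy("example-commcare-blobs"), indent=2))
```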

FWIW, python manage.py compress fails because it can't find the Font Awesome less file:

CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-05_12.28/corehq/apps/registration/templates/registration/domain_request.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-05_12.28/staticfiles' or with staticfiles.

Simon Kelly

Oct 6, 2017, 8:19:09 AM

to CommCare Developers

Hi Taylor

Our general process is as follows:

1. Configure blank VMs (just OS)
2. Create inventory file and vars files
3. Run ansible deploy - there are often a few hiccoughs here since we don't do fresh installs that often
4. Once everything is set up we deploy our code with fabric scripts as follows

fab <environment> deploy

environment is the name of an inventory file here: https://github.com/dimagi/commcare-hq-deploy/tree/master/fab/inventory

This also makes use of this 'environments.yml' file which tells the deploy scripts which services to run where and a few other things: https://github.com/dimagi/commcare-hq-deploy/blob/master/fab/environments.yml
That deploy will check out the latest code, do the static file compression etc. and also create the supervisor files needed to run the servers.

We've recently made some improvements to our couchdb setup (you should use couchdb2). I've linked them in comments on your PR.

We are about to do a whole new cluster setup so it's likely that there will be some more changes coming soon.

Re the issues:

1. Switch to using couchdb2

2&3. Resolved in latest master + this PR (https://github.com/dimagi/commcarehq-ansible/pull/971)

4. The virtual env should already have been set up by the deploy_commcarehq playbook, which should execute prior to the touchforms playbook. Also, touchforms is only necessary if you're going to be doing SMS surveys.

Re the encrypted drives. We run the deploy_stack playbook with 'after-reboot' tag limited to the rebooted host. This should remount the encrypted drive and perform a few other actions.
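
As a rough illustration of that invocation, the Python sketch below assembles the command line by reusing the inventory and vars paths from the first message in this thread and adding ansible-playbook's `--tags` and `--limit` options. The helper `after_reboot_command` and its defaults are assumptions for the example, not something shipped in commcarehq-ansible.

```python
import shlex


def after_reboot_command(host, env="dev", inventory="inventories/monolith"):
    """Return the argv list for re-running only the after-reboot tasks."""
    return [
        "ansible-playbook",
        "-i", inventory,
        "-u", "root",
        "-e", "@vars/%s/%s_private.yml" % (env, env),
        "-e", "@vars/%s/%s_public.yml" % (env, env),
        "deploy_stack.yml",
        "--tags", "after-reboot",   # only tasks carrying this tag
        "--limit", host,            # only the host that was rebooted
    ]


cmd = after_reboot_command("165.227.172.214")
print(" ".join(shlex.quote(part) for part in cmd))
```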

I hope that helps and thanks for the feedback!

Simon Kelly

Director of Server Engineer | Dimagi

--

---
You received this message because you are subscribed to the Google Groups "CommCare Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to commcare-developers+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

tay...@openfn.org

Oct 9, 2017, 12:22:29 PM

to CommCare Developers

Hey Simon, thanks so much. We've got the fab deploy scripts running now (albeit with lots of warnings that sudo received non-zero exit codes*) and finishing successfully. However, when we ssh into our box, go to the newly created release, activate the Python environment and run `runserver`, the server starts but throws this 500** whenever it's accessed via the web:

OfflineGenerationError: You have offline compression enabled but key "89af02fe109c09d9c74742e99d8f3fea" is missing from offline manifest. You may need to run "python manage.py compress".

2017-10-09 16:15:37,638 ERROR "GET /accounts/login/ HTTP/1.0" 500 59

When running compress, we get this font-awesome package error:

CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-09_16.04/corehq/motech/openmrs/templates/openmrs/importers.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-09_16.04/staticfiles' or with staticfiles.

Have you bumped into this before? Thanks!

*The non-zero exit codes all look pretty much like this:

[165.227.172.214] sudo: /home/cchq/www/dev/releases/2017-10-09_16.04/python_env/bin/python /home/cchq/www/dev/releases/2017-10-09_16.04/manage.py preindex_everything --check

[165.227.172.214] out: 2017-10-09 16:08:10,599 INFO Raven is not configured (logging is disabled). Please see the documentation for more information.

[165.227.172.214] out: 2017-10-09 16:08:12,031 INFO AXES: BEGIN LOG

[165.227.172.214] out:

Warning: sudo() received nonzero return code 1 while executing '/home/cchq/www/dev/releases/2017-10-09_16.04/python_env/bin/python /home/cchq/www/dev/releases/2017-10-09_16.04/manage.py preindex_everything --check'!

**Here's the full 500 error: https://gist.github.com/taylordowns2000/cebc671a34431826a326b66cadccee9d

tay...@openfn.org

Oct 9, 2017, 5:36:24 PM

to CommCare Developers

Simon, my last update for the day:

I've got the server running (and serving html!) when I follow LESS option 1: https://github.com/dimagi/commcare-hq#option-1-let-client-side-javascript-lessjs-handle-it-for-you.

I cannot get compress to run using either option 2 or option 3, and with option 1 (as you can probably see from the linked photo) I'm not actually getting the static assets I need from a CDN.

The error on my compress command is no longer on motech, it's now on "hqadmin":

CommandError: An error occurred during rendering /home/cchq/www/dev/releases/2017-10-09_18.02/corehq/apps/hqadmin/templates/hqadmin/loadtest.html: 'font-awesome/less/font-awesome.less' could not be found in the COMPRESS_ROOT '/home/cchq/www/dev/releases/2017-10-09_18.02/staticfiles' or with staticfiles.

Thanks again for all your help. Speak soon!

Taylor

P.S. — In an effort to make this repeatable, we've got a fork of the ansible repo going that includes a git submodule with your commcare-deploy repo. Our goal is to get this down to a single git clone and a few shell commands! Would love any feedback on the directory structure you use locally.

Jenny Schweers

Oct 10, 2017, 11:27:31 AM

to commcare-...@googlegroups.com

Hi Taylor,

About that compress error: Have you run `bower update` recently? I'd run that, verify that the file ./bower_components/font-awesome/less/font-awesome.less does indeed exist afterwards, and then run collectstatic and compress again.

You can also double-check that your STATICFILES_DIRS contains bower_components (it should be set up by https://github.com/dimagi/commcare-hq/blob/master/settings.py#L87-L97)
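
Both checks can be scripted. The Python sketch below is only an illustration: it assumes STATICFILES_DIRS is a plain list of directory paths (no prefix tuples), and the helper name `find_less_file` is invented for this example.

```python
import os

LESS_PATH = os.path.join("font-awesome", "less", "font-awesome.less")


def find_less_file(staticfiles_dirs, relative_path=LESS_PATH):
    """Return the first full path where relative_path exists, else None."""
    for root in staticfiles_dirs:
        candidate = os.path.join(root, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Run from the release directory; pass settings.STATICFILES_DIRS
    # instead of this single relative path for a real check.
    found = find_less_file(["bower_components"])
    print(found or "font-awesome.less not found; try `bower update`")
```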

-Jenny

Simon Kelly

Oct 10, 2017, 5:46:31 PM

to CommCare Developers

Been offline travelling so sorry for the slow response. Strange that you get that error if you're using the fabric deploy script since it should do a bower update but I'd check what Jenny suggested to make sure.

Re the "sudo received non-zero exit codes" messages, as long as it's only for the 'preindex' command that should be fine. If there are any other errors during deploy then it won't complete. (also PR to remove those warnings: https://github.com/dimagi/commcare-hq-deploy/pull/393)

Simon Kelly

Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 11, 2017, 11:31:54 AM

to CommCare Developers

Hi Simon

Yes, Jenny's advice helped us out immensely - we now have commcare up and serving the static assets.

We are seeing what we think are errors connecting to the riak-cs instance - and I tried running `./manage.py ptop_preindex`, which produces some initial success, but then:

Starting pillow preindex ledgers

Traceback (most recent call last):

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run

result = self._run(*self.args, **self.kwargs)

File "/home/cchq/www/dev/releases/2017-10-09_18.02/corehq/apps/hqcase/management/commands/ptop_preindex.py", line 53, in do_reindex

FACTORIES_BY_SLUG[reindex_command](**kwargs).build().reindex()

File "/home/cchq/www/dev/releases/2017-10-09_18.02/corehq/pillows/case_search.py", line 137, in build

initialize_index_and_mapping(get_es_new(), CASE_SEARCH_INDEX_INFO)

File "./corehq/ex-submodules/pillowtop/es_utils.py", line 87, in initialize_index_and_mapping

initialize_index(es, index_info)

File "./corehq/ex-submodules/pillowtop/es_utils.py", line 92, in initialize_index

return create_index_and_set_settings_normal(es, index_info.index, index_info.meta)

File "./corehq/ex-submodules/pillowtop/es_utils.py", line 73, in create_index_and_set_settings_normal

es.indices.create(index=index, body=metadata)

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped

return func(*args, params=params, **kwargs)

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/client/indices.py", line 103, in create

params=params, body=body)

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request

status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 93, in perform_request

self._raise_error(response.status, raw_data)

File "/home/cchq/www/dev/current/python_env/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 105, in _raise_error

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)

NotFoundError: TransportError(404, u'<HTML><HEAD><TITLE>404 Not Found</TITLE></HEAD><BODY><H1>Not Found</H1>The requested document was not found on this server.<P><HR><ADDRESS>mochiweb+webmachine web server</ADDRESS></BODY></HTML>')

<Greenlet at 0x7f9713dac2d0: do_reindex(u'case_search', False)> failed with NotFoundError

There are more errors of this ilk; the above is merely the first (note: I have added some debugging print statements, so line numbers may be slightly out). Does the above point to us doing something that is obviously wrong?

Thanks in advance.

Rory

Simon Kelly

Oct 11, 2017, 2:09:47 PM

to CommCare Developers

That seems like the Elasticsearch address may be incorrect. This error is happening when the command is trying to create a new index in elasticsearch.

I'd check that you've got your ES connection details correct in localsettings:

ELASTICSEARCH_HOST
ELASTICSEARCH_PORT

You can test the connection using curl:

$ curl <host>:<port>

{
  "status" : 200,
  "name" : "Albino",
  "cluster_name" : "agrajag",
  "version" : {
    "number" : "1.7.4",
    "build_hash" : "0d3159b9fc8bc8e367c5c40c09c2a57c0032b32e",
    "build_timestamp" : "2015-12-15T16:45:04Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
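
The same check can be made in code. This Python sketch only inspects a response body you have already fetched; the helper name `looks_like_elasticsearch` is invented for the example. A JSON body with the tagline shown above passes, while an HTML error page like the one in the earlier traceback does not.

```python
import json


def looks_like_elasticsearch(body):
    """Return True if body parses as JSON and carries the ES tagline."""
    try:
        info = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page from another service.
        return False
    return isinstance(info, dict) and info.get("tagline") == "You Know, for Search"


es_body = '{"status": 200, "version": {"number": "1.7.4"}, "tagline": "You Know, for Search"}'
html_body = "<HTML><HEAD><TITLE>404 Not Found</TITLE></HEAD><BODY>Not Found</BODY></HTML>"

print(looks_like_elasticsearch(es_body))    # True
print(looks_like_elasticsearch(html_body))  # False
```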

Simon Kelly

Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 12, 2017, 2:01:26 PM

to CommCare Developers

Thanks Simon.

Just to make sure I am not missing something really obvious ("missing something really obvious" is in fact, quite an accurate summation of my adventure so far) - the ansible scripts set up riak-cs, and so I can point those ES connection strings at the local riak-cs instance?

Regards

Rory

Simon Kelly

Oct 12, 2017, 3:30:09 PM

to CommCare Developers

Hey

So riak-cs and elasticsearch are completely different systems. You can think of Riak-CS as an S3 service. Elasticsearch is a distributed search index.

In localsettings.py the settings for Elasticsearch are the ones I mentioned before. For Riak the settings are:

S3_BLOB_DB_SETTINGS = {
    "url": "http://localhost:9980/",
    "access_key": "admin-key",
    "secret_key": "admin-secret",
    "config": {"connect_timeout": 3, "read_timeout": 5},
}

Note that if you are just running a monolith then it's not necessary to have riak at all since you can just use the local filesystem. If you want to go that route then you should just remove the 'riak-cs' group from your inventory file completely. That should result in the above settings being removed from your localsettings file, which will cause CommCare HQ to switch to using the filesystem to store binary objects (e.g. form xml).

You should also then set `shared_drive_enabled` to 'false' in your ansible vars file since you don't need an NFS drive for just one machine.

Sorry for the complexities here and the lack of docs.

Simon Kelly

Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 13, 2017, 12:56:19 AM

to CommCare Developers

D'oh! Thanks Simon, no this is totally my fault - at some point in the process my brain conflated elasticsearch and S3, and then never let go :( - I am not sure why - old age I guess ;).

Thanks for the tips - we will definitely factor them in.

R

Simon Kelly

Oct 13, 2017, 7:52:55 AM

to CommCare Developers

👍

rorymc...@capefox.co

Oct 16, 2017, 11:25:57 AM

to CommCare Developers

Hi Simon

Quick question:

We set up a trial account with an ES provider (just so we would not get distracted by the ElasticSearch rabbithole right now) - but the only way I could get `./manage.py ptop_preindex` to connect was to hack in the necessary params for an SSL connection in _es_hosts() in corehq/elastic.py.

Is there a way to get commcare to work with ES using SSL?

Regards

Rory

Simon Kelly

Oct 16, 2017, 12:51:16 PM

to CommCare Developers

We don't use SSL for ES since we don't use an external ES service. But you could submit a PR that adds the ability to provide the necessary parameters.
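
For anyone attempting such a PR, here is a hedged Python sketch of what the parameters could look like. It assumes the elasticsearch-py client, whose host dicts accept `use_ssl`, `verify_certs` and `http_auth`. The helper `es_hosts`, the hostname and the credentials are placeholders, and this is not how `_es_hosts()` in corehq/elastic.py is actually written.

```python
def es_hosts(host, port, use_ssl=False, http_auth=None):
    """Build the list of host dicts to pass to elasticsearch.Elasticsearch()."""
    entry = {"host": host, "port": port}
    if use_ssl:
        entry["use_ssl"] = True
        entry["verify_certs"] = True  # reject invalid certificates
    if http_auth:
        entry["http_auth"] = http_auth  # (username, password) tuple
    return [entry]


# Placeholder values; a hosted provider would supply the real ones.
hosts = es_hosts("es.example.com", 443, use_ssl=True,
                 http_auth=("user", "secret"))
print(hosts)
# With elasticsearch-py installed: Elasticsearch(hosts)
```

Keeping the SSL options off by default leaves the existing plain-HTTP setup unchanged for local clusters.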

Simon Kelly

Director of Server Engineer | Dimagi

rorymc...@capefox.co

Oct 18, 2017, 12:04:29 PM

to CommCare Developers

Thanks Simon - it turns out that the customer wants to use local ES, so we won't be using the offboard service after all (not even for testing).

It feels as if we have commcare most of the way there now, most pages seem to load and the number of obvious errors :) are very few.

Taylor has tried sending out invites to users, but he says he never receives the mails. There are no obvious signs that anything is going awry - the only clue I have found in the logs is as follows:

a.b.c.d - - - - - [18/Oct/2017:13:00:57 +0000] "POST /hq/notifications/service/ HTTP/1.1" 200 94 "https://y.y.y.y/a/xxxxxx/settings/users/web/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"

I am not sure if this is at all related to what Taylor is trying to do? The other logs have quite a regular complaint about something called toggle.js, but I am not sure if that is related either (as you may be picking up here, there is a lot I am not sure about :) ).

Thanks in advance

Rory

PS I sanitised some of the log entry.

Simon Kelly

Oct 19, 2017, 8:56:17 AM

to CommCare Developers

Hi Rory

You can customize email with these settings (you should set them in your 'localsettings.py' file): https://github.com/dimagi/commcare-hq/blob/6238482bace149b57b13ebaf66b669edd6e372f4/settings.py#L470-L499

You will also need to set the EMAIL_BACKEND according to your specific needs: https://docs.djangoproject.com/en/1.11/topics/email/#topic-email-backends
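
As a small illustration of the backend choice, the Python fragment below switches between two backends that ship with Django. The flag `PRINT_EMAILS_TO_CONSOLE` is made up for this sketch, and the CommCare-specific SMTP settings from the linked settings.py block are deliberately not reproduced here.

```python
# Hypothetical flag for this sketch only; not a CommCare HQ or Django setting.
PRINT_EMAILS_TO_CONSOLE = True

if PRINT_EMAILS_TO_CONSOLE:
    # Writes each outgoing message to stdout instead of sending it.
    EMAIL_BACKEND = "django.core.mail.backends.console.EmailBackend"
else:
    # Django's default backend; delivers via the configured SMTP server.
    EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
```

The console backend can help confirm that an invite is generated at all before debugging SMTP delivery.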

Simon Kelly

Director of Server Engineer | Dimagi

tay...@openfn.org

Oct 23, 2017, 2:37:57 PM

to CommCare Developers

Hi Simon, Jenny, and team,

The new master on commcarehq-ansible works much better. Thank you! I've got a branch that almost runs through on the very first go.

Three quick questions:

1. Right now, we're only able to get ansible to run if we set a FORMPLAYER_INTERNAL_AUTH_KEY to an empty string, but since we later have issues with Formplayer I'm worried that this isn't the right move. What should we do here?
2. With which user do you run ansible on a Digital Ocean box? (I noticed an "ansible" user gets configured, but presumably you've got to first run as root. Are you meant to run once as root and then after it fails subsequently run as ansible?)
3. Where do you clone the commcare-hq-deploy repo and with which user do you run fab <env> deploy?

And then one higher-level question to make sure I'm understanding things correctly: It seems as though deployment on a new box requires the cloning and configuration of three separate repos: (1) commcarehq-ansible, (2) commcare-hq-deploy, and (3) formplayer. If we're trying to get this down to a single repo (or at least a single README) can you describe the relationship between these three repos theoretically and in user/directory terms? It would be amazing to know where you clone each of them and how you run them in relation to each other. We've been doing all the ansible stuff as root and the django stuff as cchq, but that may not be right.

Again, thank you so much. I owe you so many coffees/beers/loaves of bread/pretty-much-you-name-it next time I'm in South Africa.

Taylor

Simon Kelly

Oct 30, 2017, 4:18:30 AM

to CommCare Developers

Hey, answers inline:

Right now, we're only able to get ansible to run if we set a FORMPLAYER_INTERNAL_AUTH_KEY to an empty string, but since we later have issues with Formplayer I'm worried that this isn't the right move. What should we do here?

That is a shared secret between HQ and Formplayer which allows formplayer to authenticate API calls to HQ. You should set it to a secret key with a reasonable amount of entropy.
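One way to produce such a key, as a minimal sketch: Python's standard `secrets` module (Python 3.6+) can generate a random URL-safe string. The function name and the 32-byte length are my own choices for illustration, not something HQ or Formplayer requires.

```python
import secrets


def generate_auth_key(num_bytes=32):
    """Return a URL-safe random string built from num_bytes of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a fresh key to paste into your private vars or vault file.
    print(generate_auth_key())
```

The same value then has to be configured on both the HQ side and the Formplayer side, since it is a shared secret.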

With which user do you run ansible on a Digital Ocean box? (I noticed an "ansible" user gets configured, but presumably you've got to first run as root. Are you meant to run once as root and then after it fails subsequently run as ansible?)

It's usually necessary to run it as a privileged user once to set up the user accounts. We would normally run just the 'users' tag as root, and from then on we can run it as the 'ansible' user.

Where do you clone the commcare-hq-deploy repo and with which user do you run fab <env> deploy?

For running deploy you can keep the repo anywhere you like, as long as you can reach the machines you're deploying to from there. Deploy doesn't require any external dependencies (other than those defined in requirements.txt).

And then one higher-level question to make sure I'm understanding things correctly: It seems as though deployment on a new box requires the cloning and configuration of three separate repos: (1) commcarehq-ansible, (2) commcare-hq-deploy, and (3) formplayer. If we're trying to get this down to a single repo (or at least a single README) can you describe the relationship between these three repos theoretically and in user/directory terms? It would be amazing to know where you clone each of them and how you run them in relation to each other. We've been doing all the ansible stuff as root and the django stuff as cchq, but that may not be right.

None of these repos needs to be anywhere specific. The formplayer repo in particular should not be needed for anything. Currently, when we deploy formplayer, it pulls the latest version from our Jenkins build server.

The other two are related as follows (at least for our setup):

ansible repo: stores all the ansible playbooks and vault files (with all the secret keys etc).
deploy repo: has the deploy scripts and the ansible inventory files (required for both deploy and ansible)

We always set up one of the VMs in our clusters as a 'control' machine from which we can run the ansible playbooks (and also normal deploys if we want). Once you have an account on this machine you can follow the instructions in the readme: https://github.com/dimagi/commcarehq-ansible#setting-up-a-dev-account-on-ansible-control-machine

This should set up the 'commcare-hq-deploy' repo and also the python virtualenv for ansible. It will also create some bash aliases that make it easier to run ansible playbooks.

I hope that answers your questions. Let me know if you have follow-ups.

Cheers

Simon

tay...@openfn.org

Oct 30, 2017, 7:11:57 AM

to CommCare Developers

Simon, this is fantastic—thank you. Very quick one:

I ran ansible-playbook -i inventories/monolith -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml -u root --tags=users
Then ansible-playbook -i inventories/monolith -e '@vars/dev/dev_private.yml' -e '@vars/dev/dev_public.yml' deploy_stack.yml -u ansible

on a brand new box, but get:

TASK [apt] ***********************************************************************************************************************************************************************************

fatal: [159.203.132.215]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to 159.203.132.215 closed.\r\n", "module_stdout": "sudo: a password is required\r\n", "msg": "MODULE FAILURE", "rc": 1}

Was I misinterpreting the suggestion? Or is there a change that must be made either to the ansible user setup or with visudo to get this working? Conceivably, we could set up password-less sudo privs for ansible like this: ansible ALL=(ALL) NOPASSWD:ALL

Best,

Taylor

Simon Kelly

Oct 30, 2017, 8:54:41 AM

to CommCare Developers

The basic workflow is to add `-K` or `--ask-become-pass` to the command line which will prompt you for the 'sudo' password.

If you're also using ansible vault you will want `--ask-vault-pass` as well. Supplying two passwords is inconvenient so you can actually store the 'become' password in the vault file as "ansible_become_pass".
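To make the two-phase workflow concrete, here is a small sketch that only builds the command lines (it does not run anything). The inventory, vars files, and playbook name are the ones quoted earlier in this thread; the helper function and its parameters are made up for illustration.

```python
import shlex


def playbook_command(user, tags=None, ask_become_pass=False, ask_vault_pass=False):
    """Build an ansible-playbook argv for the monolith setup discussed in this thread."""
    cmd = [
        "ansible-playbook",
        "-i", "inventories/monolith",
        "-e", "@vars/dev/dev_private.yml",
        "-e", "@vars/dev/dev_public.yml",
        "deploy_stack.yml",
        "-u", user,
    ]
    if tags:
        cmd.append("--tags=" + ",".join(tags))
    if ask_become_pass:
        cmd.append("--ask-become-pass")  # long form of -K, prompts for the sudo password
    if ask_vault_pass:
        cmd.append("--ask-vault-pass")   # prompts for the vault password
    return cmd


# Phase 1: run only the 'users' tag as root to create the accounts.
bootstrap = playbook_command("root", tags=["users"])
# Phase 2: run everything as the 'ansible' user, prompting for the sudo password.
full_run = playbook_command("ansible", ask_become_pass=True)

for argv in (bootstrap, full_run):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

If the become password is stored in the vault as "ansible_become_pass", the second command would use `ask_vault_pass=True` instead of `ask_become_pass=True`.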

Simon Kelly

Director of Server Engineer | Dimagi

John Harper

Dec 1, 2017, 7:42:45 PM

to CommCare Developers

Simon, I have been following along and have tried the ansible load as well...

The core ansible load went very well... a few hiccups along the way, but I made it through.

I am at the deployment stage but I have run into this error:

  File "/home/harperjo/commcarehq-ansible/commcare-hq-deploy/fab/fabfile.py", line 272, in read_inventory_file
    return get_inventory(filename).get_group_dict()
AttributeError: 'Inventory' object has no attribute 'get_group_dict'
harperjo@GCS-1:~/commcarehq-ansible/commcare-hq-deploy$

This is on my fab dev deploy execution. I am wondering if I missed something in the fabfile that points to my ansible environment file?

Your thoughts?

Simon Kelly

Dec 2, 2017, 12:46:34 AM

to CommCare Developers

Hi John

I'm glad it's gone well. The error you're seeing is a bug introduced when we upgraded to a new version of ansible. If you update the deploy repo to the latest version and make sure your requirements are up to date, it should work.
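If it helps to check whether the requirements are actually up to date, a rough sketch like the one below compares pinned `name==version` lines against what is installed. The function names are hypothetical and the package versions in the example are made up purely for illustration; they are not the versions the deploy repo pins.

```python
def parse_pins(requirements_text):
    """Return {package_name: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blank lines, options, and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for packages that are missing or at another version."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


# Made-up example data: the version numbers are illustrative only.
example_requirements = "ansible==2.0.0\nfabric==1.0.0  # deploy tool\n"
example_installed = {"ansible": "1.0.0", "fabric": "1.0.0"}
print(find_mismatches(parse_pins(example_requirements), example_installed))
```

Since `pip freeze` prints the same `name==version` format, its output can be fed through `parse_pins` to build the installed mapping.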

Cheers

Simon


John Harper

Dec 2, 2017, 9:30:58 AM

to CommCare Developers

Okay, thanks.

I will check the ansible version I have on the control and my host machines to ensure it is at the correct level.

Should I just blow away my commcare-hq-deploy folder and pull the new one down using wget?

I'll have to configure my fabfile.py, environment.yml and others again, correct?


Simon Kelly

Dec 3, 2017, 1:21:03 AM

to CommCare Developers

You shouldn't need to blow it away. You should just be able to do a git pull to get the latest version.


Simon Kelly

Dec 3, 2017, 1:22:32 AM

to CommCare Developers

Though if you've made changes to the fabfile you'll want to commit those in a branch and then merge the updated master branch into your environment branch.

On 03 Dec 2017 08:21, "Simon Kelly" <ske...@dimagi.com> wrote:

You shouldn't need to blow it away. You should just be able to do a git pull to get the latest version.
