Docker and AtoM Server Configuration


tat...@gmail.com

Feb 13, 2019, 3:33:04 PM
to AtoM Users
Dear Dan, 

Regarding the instructions for AtoM Server Configuration: how can I translate them so that I can use them for a Docker installation of AtoM and Archivematica?

Thanks,
Tatiana Canelhas

Dan Gillean

Feb 14, 2019, 12:05:08 PM
to ICA-AtoM Users
Hi Tatiana, 

This is not something we have tested, which is why it's not currently documented. I can ask our team if they have some suggestions, but would need to know more about your installation environment. For example: 
  • Are you installing both Archivematica and AtoM using Docker?
  • Are they part of the same container, or separate containers?
  • Are the Docker instances on the same server or different ones?
  • Did you follow these instructions for Archivematica and these ones for AtoM? Or have you made changes? If changes, can you tell us more?
Any other information you can provide about the installation environment will help! I'm not sure if we will be able to provide an exact configuration, but hopefully with more information I can get some suggestions from our team to point you in the right direction. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory



Tatiana

Feb 15, 2019, 9:16:01 AM
to ica-ato...@googlegroups.com
Dear Dan,

Thank you for your reply!

Are you installing both Archivematica and AtoM using Docker?
Yes.

Are they part of the same container, or separate containers?
Separate containers, since I am using both docker-compose.yml files to start AtoM and Archivematica separately (so, yes, one Percona for each, one Elasticsearch for each, etc.)

Are the Docker instances on the same server or different ones?
Same server.

Did you follow these instructions for Archivematica and these ones for AtoM? Or have you made changes? If changes, can you tell us more?
We are following both instructions, exactly, and then trying to integrate AtoM and Archivematica from there, but without much success.

We initially tried integrating by following the regular integration instructions. However, we did not get far with those instructions due to a few key problems:

- We are not sure what to use as AtoM's hostname on Archivematica's configuration
- We are not sure how to transfer files from Archivematica to AtoM. We tried the rsync approach, but the instructions for installing rsync on AtoM do not work (since the AtoM image is based on Alpine Linux, which is quite different from Ubuntu).

It seems to us that the regular integration instructions might not apply, correct?

We are currently mapping out better strategies. So far, we have two:

1. Author a new docker-compose.yml file with AtoM *and* Archivematica together. That would mean we could use only one Percona server, one Elasticsearch, etc., and we could work out the communication by putting the AtoM and Archivematica containers on the same internal network
2. Create a bridge network between AtoM and Archivematica (declaring it as "external" in both docker-compose.yml files)

Approach #1 seems a bit too far-fetched for us at the moment, especially since we do not wish to wander too far from the "official" docker-compose files (but maybe you could consider authoring such a docker-compose file in the future).

Approach #2 seems promising. We were able to connect AtoM and Archivematica with an external bridged network (simple modifications to both docker-compose.yml files), and with that in place we can simply use "atom" as the hostname for AtoM on Archivematica, which gets resolved to the correct IP address. However, that only solves the API calls. We are not sure how to proceed with the file transfer. We are also not sure if the API calls from Archivematica should be pointed directly to the AtoM container or to AtoM's nginx container.

Those are our thoughts so far. Could you please point us in the right direction?

Thanks again for your invaluable assistance, it is greatly appreciated as always!

Tatiana Canelhas


Dan Gillean

Feb 15, 2019, 10:53:29 AM
to ICA-AtoM Users
Hi Tatiana, 

I checked in with one of our developers (who has worked the most with our AtoM Docker instance). He said that you are on a good and promising path, but at this point you have surpassed anything he has explored. However, you have apparently piqued his interest, and if he's able, he may take some time to look into this further over the weekend. Hopefully at that point, he might be able to provide some general guidance or suggestions of things to try. 

In the meantime, I wish you the best of luck! Please keep us posted on your progress! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

Karl Goetz

Feb 18, 2019, 5:13:05 PM
to ica-ato...@googlegroups.com
Hi Tatiana,
I don't have enough Docker experience to help with the question posed, but I want to observe that you seem to be building an atypical and possibly slightly awkward custom setup on the basis of two very solvable-looking problems. (I'm going to ignore that you aren't actually using Ubuntu as recommended; I assume you're trying to run diskless or in a similar configuration.)

All that aside, it may be beneficial for future readers if you could give some sort of overview of the issues you faced - someone may try this with Alpine in future, and having something to fall back on may help them with their project too.

Karl.


-- 
Karl Goetz,  Senior Library Officer (Library Systems)
University of Tasmania, Private Bag 25, Hobart 7001
Available Tuesday, Wednesday, Thursday




Tatiana

Feb 18, 2019, 11:21:55 PM
to ica-ato...@googlegroups.com
Dear Dan,

I am happy to report that we have been able to integrate AtoM and Archivematica in Docker!

I'll document the steps we've taken here. There's a lot of room for improvement, but I feel I should document this procedure as-is nonetheless, since it was our first successful attempt out of many, many failures.

First, a side note regarding Karl's reply: we are not using Alpine Linux as the host system; we are using Ubuntu 16. When we mentioned Alpine Linux before, we were referring to the image on which AtoM's Dockerfile is based, as stated here.

The steps below should be performed immediately before creating the containers (before "docker-compose up -d"), that is, after cloning the repositories and their submodules.

1. Create a bridged network to connect AtoM and Archivematica:

docker network create -d bridge atom-am

2. Create a volume to send DIP packages from Archivematica to AtoM:

docker volume create dip-uploads
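
Optionally, you can confirm that both resources exist before wiring them into the compose files:

docker network inspect atom-am
docker volume inspect dip-uploads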

3. Change file am/src/archivematica/MCPServer.Dockerfile, adding this to the end, before "USER archivematica":

RUN set -ex \
&& mkdir -p /var/dip-uploads \
&& chown -R archivematica:archivematica /var/dip-uploads

Note: this is necessary because we'll mount the "dip-uploads" volume at this folder, and if this action is not performed then Docker mounts the volume with root as the owner. Since Archivematica runs as the unprivileged "archivematica" user, in that case it is not able to write to the mounted volume (even though the volume is mounted in "read-write" mode).

4. Change the file am/compose/docker-compose.yml, adding the external network and volume. The complete docker-compose.yml file is below (please see the notes after it regarding the changes):

---
version: "2.1"

volumes:

  # Internal named volumes.
  # These are not accessible outside of the docker host and are maintained by
  # Docker.
  mysql_data:
  elasticsearch_data:
  archivematica_storage_service_staging_data:

  # External named volumes.
  # These are intended to be accessible beyond the docker host (e.g. via NFS).
  # They use bind mounts to mount a specific "local" directory on the docker
  # host - the expectation being that these directories are actually mounted
  # filesystems from elsewhere.
  archivematica_pipeline_data:
    external:
      name: "am-pipeline-data"
  archivematica_storage_service_location_data:
    external:
      name: "ss-location-data"
  dip_uploads:
    external:
      name: "dip-uploads"

networks:

  common:
  atom_am:
    external:
      name: "atom-am"


services:

  mysql:
    image: "percona:5.6"
    user: "mysql"
    environment:
      MYSQL_ROOT_PASSWORD: "12345"
    volumes:
      - "mysql_data:/var/lib/mysql"
    ports:
      - "127.0.0.1:62001:3306"
    networks:
      - "common"

  elasticsearch:
    environment:
      - cluster.name=am-cluster
      - node.name=am-node
      - network.host=0.0.0.0
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - "elasticsearch_data:/usr/share/elasticsearch/data"
    ports:
      - "127.0.0.1:62002:9200"
    networks:
      - "common"

  redis:
    image: "redis:3.2-alpine"
    command: '--save "" --appendonly no'  # Persistency disabled
    user: "redis"
    ports:
      - "127.0.0.1:62003:6379"
    networks:
      - "common"

  gearmand:
    image: "artefactual/gearmand:1.1.17-alpine"
    command: "--queue-type=redis --redis-server=redis --redis-port=6379"
    user: "gearman"
    ports:
      - "127.0.0.1:62004:4730"
    links:
      - "redis"
    networks:
      - "common"

  fits:
    image: "artefactual/fits-ngserver:0.8.4"
    ports:
      - "127.0.0.1:62005:2113"
    volumes:
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:rw"  # Read and write needed!
    networks:
      - "common"

  clamavd:
    image: "artefactual/clamav:latest"
    environment:
      CLAMAV_MAX_FILE_SIZE: "${CLAMAV_MAX_FILE_SIZE}"
      CLAMAV_MAX_SCAN_SIZE: "${CLAMAV_MAX_SCAN_SIZE}"
      CLAMAV_MAX_STREAM_LENGTH: "${CLAMAV_MAX_STREAM_LENGTH}"
    ports:
      - "127.0.0.1:62006:3310"
    volumes:
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:ro"
    networks:
      - "common"

  nginx:
    image: "nginx:stable-alpine"
    volumes:
      - "./etc/nginx/nginx.conf:/etc/nginx/nginx.conf:ro"
      - "./etc/nginx/conf.d/archivematica.conf:/etc/nginx/conf.d/archivematica.conf:ro"
      - "./etc/nginx/conf.d/default.conf:/etc/nginx/conf.d/default.conf:ro"
    ports:
      - "62080:80"
      - "62081:8000"
    networks:
      - "common"

  archivematica-mcp-server:
    build:
      context: "../src/archivematica/src"
      dockerfile: "MCPServer.Dockerfile"
    environment:
      DJANGO_SECRET_KEY: "12345"
      DJANGO_SETTINGS_MODULE: "settings.common"
      ARCHIVEMATICA_MCPSERVER_CLIENT_USER: "archivematica"
      ARCHIVEMATICA_MCPSERVER_CLIENT_PASSWORD: "demo"
      ARCHIVEMATICA_MCPSERVER_CLIENT_HOST: "mysql"
      ARCHIVEMATICA_MCPSERVER_CLIENT_DATABASE: "MCP"
      ARCHIVEMATICA_MCPSERVER_MCPSERVER_MCPARCHIVEMATICASERVER: "gearmand:4730"
      ARCHIVEMATICA_MCPSERVER_SEARCH_ENABLED: "${AM_SEARCH_ENABLED:-true}"
    volumes:
      - "../src/archivematica/src/archivematicaCommon/:/src/archivematicaCommon/"
      - "../src/archivematica/src/dashboard/:/src/dashboard/"
      - "../src/archivematica/src/MCPServer/:/src/MCPServer/"
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    links:
      - "mysql"
      - "gearmand"
    networks:
      - "common"
      - "atom_am"

  archivematica-mcp-client:
    build:
      context: "../src/archivematica/src"
      dockerfile: "MCPClient.Dockerfile"
    environment:
      DJANGO_SECRET_KEY: "12345"
      DJANGO_SETTINGS_MODULE: "settings.common"
      NAILGUN_SERVER: "fits"
      NAILGUN_PORT: "2113"
      ARCHIVEMATICA_MCPCLIENT_CLIENT_USER: "archivematica"
      ARCHIVEMATICA_MCPCLIENT_CLIENT_PASSWORD: "demo"
      ARCHIVEMATICA_MCPCLIENT_CLIENT_HOST: "mysql"
      ARCHIVEMATICA_MCPCLIENT_CLIENT_DATABASE: "MCP"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_ELASTICSEARCHSERVER: "elasticsearch:9200"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_MCPARCHIVEMATICASERVER: "gearmand:4730"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_SEARCH_ENABLED: "${AM_SEARCH_ENABLED:-true}"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CAPTURE_CLIENT_SCRIPT_OUTPUT: "${AM_CAPTURE_CLIENT_SCRIPT_OUTPUT:-true}"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_SERVER: "clamavd:3310"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_MAX_FILE_SIZE: "${CLAMAV_MAX_FILE_SIZE}"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_MAX_SCAN_SIZE: "${CLAMAV_MAX_SCAN_SIZE}"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_MAX_STREAM_LENGTH: "${CLAMAV_MAX_STREAM_LENGTH}"
      ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_BACKEND: "clamdscanner" # Option: clamdscanner or clamscan;
    volumes:
      - "../src/archivematica/src/archivematicaCommon/:/src/archivematicaCommon/"
      - "../src/archivematica/src/dashboard/:/src/dashboard/"
      - "../src/archivematica/src/MCPClient/:/src/MCPClient/"
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    links:
      - "fits"
      - "clamavd"
      - "mysql"
      - "gearmand"
      - "elasticsearch"
      - "archivematica-storage-service"
    networks:
      - "common"
      - "atom_am"

  archivematica-dashboard:
    build:
      context: "../src/archivematica/src"
      dockerfile: "dashboard.Dockerfile"
    environment:
      FORWARDED_ALLOW_IPS: "*"
      AM_GUNICORN_ACCESSLOG: "/dev/null"
      AM_GUNICORN_RELOAD: "true"
      AM_GUNICORN_RELOAD_ENGINE: "auto"
      DJANGO_SETTINGS_MODULE: "settings.local"
      ARCHIVEMATICA_DASHBOARD_DASHBOARD_GEARMAN_SERVER: "gearmand:4730"
      ARCHIVEMATICA_DASHBOARD_DASHBOARD_ELASTICSEARCH_SERVER: "elasticsearch:9200"
      ARCHIVEMATICA_DASHBOARD_CLIENT_USER: "archivematica"
      ARCHIVEMATICA_DASHBOARD_CLIENT_PASSWORD: "demo"
      ARCHIVEMATICA_DASHBOARD_CLIENT_HOST: "mysql"
      ARCHIVEMATICA_DASHBOARD_CLIENT_DATABASE: "MCP"
      ARCHIVEMATICA_DASHBOARD_SEARCH_ENABLED: "${AM_SEARCH_ENABLED:-true}"
    volumes:
      - "../src/archivematica/src/archivematicaCommon/:/src/archivematicaCommon/"
      - "../src/archivematica/src/dashboard/:/src/dashboard/"
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    links:
      - "mysql"
      - "gearmand"
      - "elasticsearch"
      - "archivematica-storage-service"
    networks:
      - "common"
      - "atom_am"

  archivematica-storage-service:
    build:
      context: "../src/archivematica-storage-service"
    environment:
      FORWARDED_ALLOW_IPS: "*"
      SS_GUNICORN_ACCESSLOG: "/dev/null"
      SS_GUNICORN_RELOAD: "true"
      SS_GUNICORN_RELOAD_ENGINE: "auto"
      DJANGO_SETTINGS_MODULE: "storage_service.settings.local"
      SS_DB_URL: "mysql://archivematica:demo@mysql/SS"
      SS_GNUPG_HOME_PATH: "/var/archivematica/storage_service/.gnupg"
    volumes:
      - "../src/archivematica-storage-service/:/src/"
      - "../src/archivematica-sampledata/:/home/archivematica/archivematica-sampledata/:ro"
      - "archivematica_pipeline_data:/var/archivematica/sharedDirectory:rw"
      - "archivematica_storage_service_staging_data:/var/archivematica/storage_service:rw"
      - "archivematica_storage_service_location_data:/home:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    links:
      - "mysql"
    networks:
      - "common"
      - "atom_am"


Notes:

- Apparently, if the "networks" section is omitted altogether, as in the original docker-compose.yml file, all containers are placed on the same "implicit" network. Because of this, we had to change the docker-compose.yml file much more than we'd like, adding a clause to each container putting it on a "common" network (otherwise the containers wouldn't communicate after adding our external network). Perhaps there's another way to declare the external network that doesn't cause this, maybe inside the "links" section? See the sketch after these notes.

- Since we weren't sure which of the containers -- archivematica-dashboard, archivematica-mcp-server or archivematica-mcp-client -- actually call AtoM's API, we connected the external volume and network to all 3 containers. We plan to pinpoint which containers actually participate in the communication, so as to reduce the necessary changes to the docker-compose.yml file
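
A possible lighter-touch alternative (untested here; it assumes the Compose file format 2.x behaviour where a service that lists any networks joins only the ones it lists): keep every other service in the original file untouched and attach only the services that need AtoM to both the implicit "default" network and the external one. A minimal sketch:

networks:
  atom_am:
    external:
      name: "atom-am"

services:
  archivematica-mcp-server:
    # ...build, environment, volumes and links unchanged...
    networks:
      - "default"   # stay on the implicit Compose network so the other services remain reachable
      - "atom_am"   # external bridge shared with the AtoM containers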

5. Change the file atom/docker/docker-compose.dev.yml, adding the external volume and network. The complete file is below (far fewer changes in this file, thankfully):

---
version: "2"

volumes:

  elasticsearch_data:
    driver: "local"

  percona_data:
    driver: "local"

  dip_uploads:
    external:
      name: "dip-uploads"

networks:

  net_cache:
  net_db:
  net_jobs:
  net_http:
  net_search:
  atom_am:
    external:
      name: "atom-am"

services:

  elasticsearch:
    image: "elasticsearch:1.7"
    # chown only seems to be solving a problem happening with osx+boot2docker and vboxsf
    command: "bash -c 'chown -R elasticsearch:elasticsearch /elasticsearch-data && elasticsearch -Des.network.host=0.0.0.0 -Des.path.data=/elasticsearch-data'"
    volumes:
      - "elasticsearch_data:/elasticsearch-data:rw"
    ports:
      - "63002:9200"
    expose:
      - "9300"
    networks:
      - "net_search"

  percona:
    image: "percona:5.6"
    environment:
      - "MYSQL_ROOT_PASSWORD=my-secret-pw"
      - "MYSQL_DATABASE=atom"
      - "MYSQL_USER=atom"
      - "MYSQL_PASSWORD=atom_12345"
    volumes:
      - "percona_data:/var/lib/mysql:rw"
      - "./etc/mysql/conf.d/:/etc/mysql/conf.d:ro"
    expose:
      - "3306"
    networks:
      - "net_db"

  memcached:
    image: "memcached"
    command: "-p 11211 -m 128 -u memcache"
    expose:
      - "11211"
    networks:
      - "net_cache"
      - "net_jobs"

  gearmand:
    image: "artefactual/gearmand"
    expose:
      - "4730"
    networks:
      - "net_cache"
      - "net_jobs"

  atom:
    build:
      context: "../"
      dockerfile: "./docker/Dockerfile"
    command: "fpm"
    volumes:
      - "../:/atom/src:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    networks:
      - "net_cache"
      - "net_db"
      - "net_http"
      - "net_jobs"
      - "net_search"
    environment:
      - "ATOM_DEVELOPMENT_MODE=on"
      - "ATOM_ELASTICSEARCH_HOST=elasticsearch"
      - "ATOM_MEMCACHED_HOST=memcached"
      - "ATOM_GEARMAND_HOST=gearmand"
      - "ATOM_MYSQL_DSN=mysql:host=percona;port=3306;dbname=atom;charset=utf8"
      - "ATOM_MYSQL_USERNAME=atom"
      - "ATOM_MYSQL_PASSWORD=atom_12345"
      - "ATOM_DEBUG_IP=172.22.0.1"
    ports:
      - "63022:22"

  atom_worker:
    build:
      context: "../"
      dockerfile: "./docker/Dockerfile"
    command: "worker"
    volumes:
      - "../:/atom/src:rw"
      - "dip_uploads:/var/dip-uploads:rw"
    networks:
      - "net_cache"
      - "net_db"
      - "net_jobs"
      - "net_search"
    environment:
      - "ATOM_DEVELOPMENT_MODE=on"
      - "ATOM_ELASTICSEARCH_HOST=elasticsearch"
      - "ATOM_MEMCACHED_HOST=memcached"
      - "ATOM_GEARMAND_HOST=gearmand"
      - "ATOM_MYSQL_DSN=mysql:host=percona;port=3306;dbname=atom;charset=utf8"
      - "ATOM_MYSQL_USERNAME=atom"
      - "ATOM_MYSQL_PASSWORD=atom_12345"

  nginx:
    image: "nginx:latest"
    ports:
      - "63001:80"
    volumes:
      - "../:/atom/src:ro"
      - "./etc/nginx/prod.conf:/etc/nginx/nginx.conf:ro"
    networks:
      - "net_http"
      - "atom_am"
    depends_on:
      - "atom"


Notes:

- Since the API calls will be directed to the nginx container, we only needed to connect that container to the external network

- We guessed it was only necessary to connect the atom_worker container to the external volume, but ended up connecting the atom container to it as well

6. Proceed with the instructions for creating the Archivematica and AtoM containers ("docker-compose up -d" and the rest of the instructions for each).

7. Test HTTP communication from Archivematica's MCP Server container to AtoM's nginx container:

docker exec -u archivematica compose_archivematica-mcp-server_1 curl http://docker_nginx_1/

Expected output: AtoM's index HTML page, showing that from Archivematica we can reach AtoM by referring only to its container name (no local ephemeral IP addresses involved!)
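
Since we were not yet sure which Archivematica container actually makes the API calls (see the notes on step 4), the same check can be repeated from the MCP client container, assuming curl is also available in that image:

docker exec -u archivematica compose_archivematica-mcp-client_1 curl http://docker_nginx_1/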

8. Test access to the shared volume:

docker exec -u archivematica compose_archivematica-mcp-server_1 touch /var/dip-uploads/foo
docker exec docker_atom_worker_1 rm /var/dip-uploads/foo

Expected output: none (no errors means the user "archivematica" was able to create a file on the /var/dip-uploads directory from the MCP Server container and that from the atom worker container we can read and remove said file).
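
It can also be useful to confirm that the chown added in step 3 took effect, i.e. that the mount point inside the MCP Server container is owned by the archivematica user:

docker exec compose_archivematica-mcp-server_1 ls -ld /var/dip-uploads

Expected output: a single line listing archivematica as the owner and group of the directory.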

9. Workaround for problem on atom worker with qtSwordPlugin plugin (this was the really challenging part)

Check your atom_worker's log:

docker-compose logs atom_worker

Look for this warning message:

Ability not defined: qtSwordPluginWorker. Please ensure the job is in the lib/task/job directory or that the plugin is enabled.

If you don't see the message above, but see the message "New ability: qtSwordPluginWorker" (which is unlikely at this point), then you can skip the rest of this step.

If you do see the warning message above, you must first solve it. The integration will not work until this issue is addressed. If you continue with the integration anyway, you'll later encounter an error when Archivematica calls AtoM stating that no worker is able to process the request (I can't find the exact message, but it is something along those lines).

Searching the web for the warning message above, we found it on several log outputs posted by users. Stranger still, we noticed that:

- Restarting the container (docker-compose restart atom_worker) does not solve the issue
- If we repeat the entrypoint command while the container is running, the warning comes up again: docker exec docker_atom_worker_1 php symfony jobs:worker
- If we first access the plugins configuration page on AtoM's dashboard and then repeat the command above, the warning goes away

To repeat: merely accessing the plugins configuration page causes the qtSwordPluginWorker to be loaded correctly! We noticed this by accessing the page to ensure the plugin was enabled (per Dan's recommendation in several related threads). The plugin is enabled by default in the Docker environment, but merely accessing the page is enough to get the qtSwordPlugin loaded (clicking the "Save" button is not required).

This, of course, does not solve the problem for our container, since we need the worker started by the initial entrypoint command to succeed, not an additional worker executed "on the side".

By looking at the source code of the sfPluginAdminPlugin, responsible for showing the plugin configuration page, we guessed that the method sfPluginAdminPluginConfiguration.initialize(), executed when the plugin configuration page is first accessed, was making the difference -- especially this part. For some reason, it seems the qtSwordPluginWorker is not loaded if only php symfony jobs:worker is executed. However, some actions during the normal operation of AtoM's dashboard cause it to be loaded. This might explain why so many log outputs seem to have that warning, and why a simple restart of the worker usually solves the issue.

Anyway, we were painstakingly able to come up with a workaround for this issue. I must say it is quite ugly and we do not expect this to be a permanent solution at all!

The fix consists of manually changing the file ~/atom/lib/task/jobs/jobWorkerTask.class.php on the host machine, adding the lines marked with the "workaround" comments below:

<?php

// Part of the workaround for qtSwordPlugin autoload problem
require_once "/atom/src/plugins/qtSwordPlugin/config/qtSwordPluginConfiguration.class.php";

(...)

  protected function execute($arguments = array(), $options = array())
  {
    $configuration = ProjectConfiguration::getApplicationConfiguration($options['application'], $options['env'], false);
    $context = sfContext::createInstance($configuration);

    // Part of the workaround for qtSwordPlugin autoload problem
    $pluginConfig = new qtSwordPluginConfiguration($configuration, "/atom/src/plugins/qtSwordPlugin", "qtSwordPlugin");
    $pluginConfig->initializeAutoload();
    $pluginConfig->initialize();

    // Using the current context, get the event dispatcher and subscribe an event in it
    $context->getEventDispatcher()->connect('gearman.worker.log', array($this, 'gearmanWorkerLogger'));

(...)

PS: Did I mention it was ugly?

After saving that file, the warning is gone:

docker-compose restart atom_worker
docker-compose logs atom_worker

atom_worker_1    | 2019-02-18 19:52:37 > New ability: arFindingAidJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arInheritRightsJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arObjectMoveJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arInformationObjectCsvExportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: qtSwordPluginWorker
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arUpdatePublicationStatusJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arFileImportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arInformationObjectXmlExportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arXmlExportSingleFileJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arGenerateReportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arActorCsvExportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arActorXmlExportJob
atom_worker_1    | 2019-02-18 19:52:37 > New ability: arRepositoryCsvExportJob
atom_worker_1    | 2019-02-18 19:52:37 > Running worker...
atom_worker_1    | 2019-02-18 19:52:37 > PID 12


10. Configure the DIP upload on Archivematica with the following settings:

- Upload URL: http://docker_nginx_1
- Rsync target: /var/dip-uploads
- Rsync command: (empty)

The rest of the settings (login email, password, REST API key) should be filled out as usual with credentials for AtoM.


11. Configure AtoM

The default setting is for AtoM to look for DIP uploads in the "/tmp" directory. We must change that under Admin > Settings > Global > SWORD deposit directory. Set it to "/var/dip-uploads".


12. Restart atom_worker

docker-compose restart atom_worker

Note: the atom worker reads the setting above when it starts, so we must restart it once after changing the setting.


That's it, now the DIP upload should work. Phew!


Like I said, this is very much a work in progress. IMO the main steps that need improving are:

- Reduce the "customization footprint" on Archivemetica's docker-compose.yml file by better understanding the whole "implicit networking" configuration, which seem to be related to the usage of the "links" configuration instead of "network"

- Fix the mentioned issue with the loading of qtSwordPlugin on atom worker (Dan, do you think we should go ahead and report this as a bug?)

- Find a way to avoid needing to change the file MCPServer.Dockerfile (perhaps if we mount the volume to /archivematica/dip-uploads it will be created having archivematica as owner?)


That's it for now. Please let me know if any of this can be of use to you at Artefactual, and if you need anything from me.

Cheers,

Tatiana Canelhas










Ricardo Pinho

Feb 19, 2019, 5:19:39 AM
to ica-ato...@googlegroups.com
Hi Tatiana,
Thank you so much for sharing!
We have been using AtoM on Docker 2.4 on a production environment for several months now.
We have found that the 2.5 version is not stable enough for production.

We are also starting to implement Archivematica and intend to go further with the Docker environment and integrate it with AtoM.
So we appreciate you sharing this and are open to further direct interaction, in Portuguese if you wish!
Cheers,
Ricardo Pinho







--
Ricardo Pinho

José Raddaoui

Feb 19, 2019, 12:12:02 PM
to AtoM Users
Hi Tatiana,

That's great detective work and a nice integration! Thank you for sharing!

As Dan mentioned, I wanted to give it a try too, and I came up with a different solution, more oriented towards a one-off setup in a development or testing instance than a solid integration, since we don't recommend using the Docker environments as they are in production.

I think the network situation could be simplified a lot in our Docker Compose configuration and for this integration. We should definitely stop using links (a Docker legacy feature) and, at least for AtoM, we should use just the default network instead of specific ones. However, with the current implementation, most of the ports are exposed to the host and you could connect the required AM and AtoM containers without adding an external network. For example, in a default setup of both environments with everything running, executing the following command from the `am/compose` folder should show you a 200 status code and the headers from the AtoM site:

docker-compose exec archivematica-mcp-client curl -I http://192.168.1.135:63001

Where `192.168.1.135` is the local IP of the host where the containers are running. In the DIP upload process, by default, there are two interactions between the AM and AtoM containers: an HTTP request from the MCPClient to the AtoM site (Nginx container), and an SSH connection between the MCPClient and the AtoM worker container to `rsync` the DIP folder. The above command checks that the HTTP request can be performed.

For the `rsync` process, I really like your solution of using a shared external volume and not performing the SSH connection. That aligns more with a full Docker environment, and it avoids passing the SSH key and creating the `archivematica` user in the AtoM worker container. Nevertheless, if an external shared volume is not possible or not desirable, it's still possible to `rsync` the files through an SSH connection between the containers. As mentioned, this is more for a one-off setup, but a similar setup could be added to the Dockerfiles for a more permanent installation.

First, expose the SSH port to the host from the AtoM worker container, by adding at the end of the `atom_worker` service in the `docker-compose` file:

ports:
  - "63022:22"

You'll need to recreate the container, if it's already built, to reflect the changes.
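
For example, from the AtoM folder on the host, something like the following should recreate only the worker container (exact flags may vary with your docker-compose version):

docker-compose up -d --force-recreate atom_worker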

Second, create a SSH key for the `archivematica` user in the MCPClient container. From the `am/compose` folder in the host:

docker-compose exec -u root archivematica-mcp-client bash
apt-get update
apt-get install openssh-client
su archivematica
ssh-keygen
cat ~/.ssh/id_rsa.pub
exit
exit

Copy the public key from the `cat` output, as it will be needed later in the AtoM worker container.

Third, enable SSH connections in the AtoM worker container. In the Alpine Docker container `openrc` is not installed; if you want to run the SSH daemon as a service you can install `openrc` too, but it's not actually a requirement. From the AtoM folder on the host, enter the shell of the AtoM worker container:

docker-compose exec atom_worker sh
apk add --no-cache rsync openssh openrc
rc-update add sshd
rc-status
mkdir -p /run/openrc
touch /run/openrc/softlevel
service sshd restart

Fourth, create the `archivematica` user in the AtoM worker container and authorize its key. Still inside the AtoM worker container:

addgroup archivematica
adduser -h /home/archivematica -s /bin/sh -G archivematica -D archivematica
passwd -u archivematica
mkdir /home/archivematica/.ssh
cat <<EOF >/home/archivematica/.ssh/authorized_keys
<key_from_cat_output>
EOF
chmod 0700 /home/archivematica/.ssh
chmod 0600 /home/archivematica/.ssh/authorized_keys
chown -R archivematica:archivematica /home/archivematica/.ssh
exit

Fifth, verify the SSH connection and accept the host in the MCPClient container. Back in the `am/compose` folder:

docker-compose exec archivematica-mcp-client bash
ssh 192.168.1.135 -p 63022

Verify host and exit.
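
Optionally, still inside the MCPClient container shell, you can also verify that `rsync` itself works over that connection, using any small file as a test (assuming rsync is already present in the MCPClient image, which it should be since the DIP upload needs it):

rsync -av -e "ssh -p 63022" /etc/hostname 192.168.1.135:/tmp/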

For this setup, the parameters entered for the AtoM/Binder DIP upload configuration in AM should be:

  - Upload URL:   http://192.168.1.135:63001
  - Set email and pass from an AtoM user
  - Rsync target:   192.168.1.135:/tmp
  - Rsync command:   ssh -p 63022

The SWORD plugin needs to be enabled in AtoM and the worker restarted. As you noted, there is currently an issue in the Docker environment where the required ability never gets added to the worker. I faced the same issue in my tests and just filed this Redmine ticket with a proposed fix, similar to yours but a little cleaner. We'll add that fix to the `stable/2.4.x` and `qa/2.5.x` branches soon, enabling the SWORD plugin by default.

Also, for those using the `qa/2.5.x` branch, the following issues have been recently fixed (thanks to Ricardo and Steve):


And I also faced a new one in these tests. It is still to be fixed, but it is reported in:


Thanks again for the contribution, Tatiana! I think that with the simplified network connection and the external shared volume we are at a great point to move forward.

Best regards,
Radda.

tat...@gmail.com

Mar 12, 2019, 3:48:29 PM
to AtoM Users
Hi José, thanks for your reply. I will try your code too.

I was reading AtoM's Docker page and I found that it says it is suitable for production environments. Here:

Docker Compose

Linux containers and Docker are radically changing the way that applications are developed, built, distributed and deployed. The AtoM team is experimenting with new workflows that make use of containers. This document introduces our new development workflow based on Docker and Docker Compose. The latter is a tool that help us to run multi-container applications like AtoM and it is suitable for both development and production environments.


I know the Archivematica Docker environment is only for development or testing instances, as you mentioned, but what about AtoM? I was trying to find anywhere it says that AtoM's Docker setup is also only for development, but I couldn't.


Thanks,

Tatiana Canelhas


Ricardo Pinho

Mar 12, 2019, 6:03:02 PM
to ica-ato...@googlegroups.com
+1 vote for AtoM and Archivematica production Docker installation!
Containers (Docker or Kubernetes), for their advantages, are certainly the choice for the future, in particular for complex multi-component solutions like these!

Just one (good) example:
"The only officially supported installs of Discourse are Docker based"
https://github.com/discourse/discourse/blob/master/docs/INSTALL.md

Why do you only officially support Docker?

Hosting Rails applications is complicated. Even if you already have Postgres, Redis and Ruby installed on your server, you still need to worry about running and monitoring your Sidekiq and Rails processes, as well as configuring Nginx. With Docker, our fully optimized Discourse configuration is available to you in a simple container, along with a web-based GUI that makes upgrading to new versions of Discourse as easy as clicking a button.

Cheers,
--
Ricardo Pinho


Tatiana

Mar 15, 2019, 6:23:42 PM
to ica-ato...@googlegroups.com
Oops. Thanks José, I misunderstood it. You are right.
:)

