Archivematica storage service


BrzI Channel

Apr 9, 2019, 5:24:27 PM
to archivematica
Hi folks,

I am planning to split my installation across two servers...one will be the dashboard while the other one will run the storage service and will store all the AIPs

Question : Do I need to install the archivematica storage service on the machine that runs the dashboard ? (sudo apt-get install archivematica-storage-service)

Thanks







rspe...@artefactual.com

Apr 15, 2019, 12:33:16 PM
to archivematica
Hi Brzl,

You should be able to connect Archivematica to a Storage Service on a different machine, e.g. using RSync: https://www.archivematica.org/en/docs/storage-service-0.14/administrators/#nfs 

You would need to configure the Storage Service URL, username and API key, via Admin -> General in the dashboard itself.
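
One way to confirm those three values before entering them in the dashboard is to call the Storage Service API directly from the pipeline machine. The sketch below assumes the Storage Service listens on port 8000 and accepts an `Authorization: ApiKey <username>:<key>` header; the host name, user name and key are placeholders, and both function names are made up for illustration.

```shell
#!/bin/sh
# Build the Authorization header for the Storage Service API.
# Assumption: the API expects "ApiKey <username>:<api_key>".
ss_auth_header() {
    printf 'Authorization: ApiKey %s:%s\n' "$1" "$2"
}

# List the locations the Storage Service knows about.
# $1 = base URL, $2 = username, $3 = API key
ss_list_locations() {
    curl --silent --fail --header "$(ss_auth_header "$2" "$3")" \
        "$1/api/v2/location/"
}

# Example (placeholder values):
# ss_list_locations http://storage.example.org:8000 test 0123456789abcdef
```

If the call returns JSON rather than an HTTP error, the URL and credentials are at least accepted by the Storage Service.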

I haven't much experience of this yet, but have been working with it recently. I'll be interested to hear how it goes for you.

Cheers,
Ross

BrzI Channel

Apr 15, 2019, 3:35:37 PM
to archivematica
Hi,

I have configured the Storage Service URL, username and API key correctly. When I log in to the MCP server I can see the folder containing the files that I want to transfer. But it seems there is more to it than that.

When I click on "Start Transfer" the process hangs and I get this error on the MCP server machine :

/var/log/archivematica/MCPServer/MCPServer.debug.log:14873:ERROR     2019-04-15 19:22:20  archivematica.mcp.server:package:wrap:353:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpzyVfVI/123 does not exist.

So....would I be correct in stating that I need to create some SPACES on the storage server machine ?

Thanks



Ross Spencer

Apr 15, 2019, 3:52:36 PM
to archiv...@googlegroups.com
Hi again Brzl, 

That's right. The documentation link I provided might help elaborate a little, but let us know if it's helpful and how it goes - it sounds like you're making lots of great progress on your installation!

Ross


--
You received this message because you are subscribed to a topic in the Google Groups "archivematica" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/archivematica/q4GdcXW6Ttw/unsubscribe.
To unsubscribe from this group and all its topics, send an email to archivematic...@googlegroups.com.
To post to this group, send email to archiv...@googlegroups.com.
Visit this group at https://groups.google.com/group/archivematica.
For more options, visit https://groups.google.com/d/optout.


--
Ross Spencer
Software Developer, Artefactual Systems Inc.

BrzI Channel

Apr 15, 2019, 3:57:03 PM
to archivematica
Not sure I would call it great progress :-)

So my thinking is:

Install NFS on the storage server
Create a shared network location for the file transfers
Specify a new SPACE that points to the shared location and set it up as the default Transfer Source...
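
The plan above could look roughly like this on Ubuntu. This is a sketch only: the export path `/srv/transfers`, the client address, the host names and the export options are assumptions, not values from this thread, and `nfs_export_line` is a hypothetical helper.

```shell
#!/bin/sh
# Print an /etc/exports entry for one directory and one client.
# The options (rw,sync,no_subtree_check) are a common starting point, not a requirement.
nfs_export_line() {
    printf '%s %s(rw,sync,no_subtree_check)\n' "$1" "$2"
}

# On the storage server (hypothetical path and client IP):
# sudo apt-get install nfs-kernel-server
# nfs_export_line /srv/transfers 192.0.2.10 | sudo tee -a /etc/exports
# sudo exportfs -ra
#
# On the pipeline server:
# sudo apt-get install nfs-common
# sudo mount -t nfs storage.example.org:/srv/transfers /mnt/transfers
```

The mounted directory would then be the path given to the new Space and its Transfer Source location in the Storage Service.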

Ross Spencer

Apr 15, 2019, 4:22:19 PM
to archiv...@googlegroups.com
That sounds right - and you'll need to configure an AIP (and DIP) location as well and you should be able to get an end-to-end transfer going. 


BrzI Channel

Apr 26, 2019, 1:53:54 PM
to archivematica
So, after looking at the logs when starting transfer:

No warnings or errors on Storage Service server.

On the MCP server I get this :
/var/log/archivematica/MCPServer/MCPServer.debug.log:174:ERROR     2019-04-26 17:46:58  archivematica.mcp.server:package:wrap:353: 
Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpMWywOw/213 does not exist.

When I examine the actual path I see that  /var/archivematica/sharedDirectory/tmp/tmpMWywOw gets created but the /213 subfolder does not get created.

Every time I start the process the same thing happens. The last folder in the path does not get created. This looks like a permissions issue but I am not sure.

Would anybody have any ideas how to overcome this ?
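
A quick way to test the permissions theory is to check whether a subdirectory can be created under the shared path at all. The helper below is a sketch with a made-up name; it tests as whichever user runs it, so on a real system it would need to be run as the service account (assumed here to be `archivematica`).

```shell
#!/bin/sh
# Return 0 if a subdirectory can be created (and removed again) inside "$1".
can_create_subdir() {
    probe="$1/.probe.$$"
    mkdir "$probe" 2>/dev/null && rmdir "$probe"
}

# Example:
# can_create_subdir /var/archivematica/sharedDirectory/tmp && echo "writable"
```

If this succeeds for the service account, plain filesystem permissions on the pipeline are unlikely to be the cause.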

rspe...@artefactual.com

Apr 29, 2019, 10:59:11 AM
to archivematica
Hi Brzl, 

I am wondering if there are any additional logs that precede the error in the MCP Server, or any parallel logs in the Storage Service? I feel like, with the tmp directory being created as expected, permissions should be okay.

Is the issue for a single transfer? (transfer 213?) - if so, can you maybe provide some details about that transfer? Directory tree? Transfer size? etc. 

Cheers,
Ross

BrzI Channel

Apr 29, 2019, 3:08:02 PM
to archivematica
Hi Ross,

This happens for all transfers. This particular one is a folder with three JPGs.

On restart I get this in the logs :

/var/log/archivematica/dashboard/dashboard.log:1:WARNING   2019-04-29 19:01:34  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.024s]
/var/log/archivematica/dashboard/dashboard.log:28:WARNING   2019-04-29 19:01:36  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.log:55:WARNING   2019-04-29 19:01:39  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.log:82:WARNING   2019-04-29 19:01:46  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.debug.log:1:WARNING   2019-04-29 19:01:34  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.024s]
/var/log/archivematica/dashboard/dashboard.debug.log:28:WARNING   2019-04-29 19:01:36  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.debug.log:55:WARNING   2019-04-29 19:01:39  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.debug.log:82:WARNING   2019-04-29 19:01:46  elasticsearch:base:log_request_fail:97:  HEAD http://127.0.0.1:9200/aips,aipfiles,transfers,transferfiles [status:N/A request:0.001s]

I am guessing this is just Elasticsearch trying to start - which it eventually does succeed in doing. When I check its status it says Active / Loaded.

No errors...

Thanks

rspe...@artefactual.com

Apr 30, 2019, 4:34:47 AM
to archivematica
Hi Brzl, 

I'll ask around the team. I am surprised there isn't a storage service log if it is permissions, but if we proceed on that basis for now, can I ask what the permissions are for the transfer source that you're working with? 

I'm sure we can figure it out.
Ross

BrzI Channel

May 8, 2019, 1:43:23 PM
to archivematica
Sorry, I just saw your post. Will do another 2-server build and respond to this question then...

BrzI Channel

May 8, 2019, 5:19:30 PM
to archivematica
Here we go...

innopac@amaticastore:~$ ls -la /home
total 12
drwxr-xr-x  3 root    root    4096 Apr  9 11:25 .
drwxr-xr-x 23 root    root    4096 Apr  9 13:33 ..
drwxr-xr-x 15 innopac innopac 4096 May  8 13:44 innopac

Once again: I am running the dashboard on one VM and trying to import from /home on another VM
No errors or warnings on the remote storage VM
On the dashboard VM I get this :

/var/log/archivematica/MCPServer/MCPServer.log:28:ERROR     2019-05-08 21:12:14  archivematica.mcp.server:package:wrap:353:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpuBdNVM/123 does not exist.
/var/log/archivematica/MCPServer/MCPServer.debug.log:193:ERROR     2019-05-08 21:12:14  archivematica.mcp.server:package:wrap:353:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpuBdNVM/123 does not exist.

Thanks

BrzI Channel

May 10, 2019, 3:57:32 PM
to archivematica
Just to test things out I changed the permissions on the transfer folder to www-data:www-data.

Same thing happens: The processing server attempts to make a folder in /var/archivematica/sharedDirectory/tmp/%tmpfoldername%/****

The tmp folder gets created but not the folder within it. No errors on the storage server. Firewalls are off on both machines.

Here is my processing server setup...this is all local as my storage service server has 4 TB

Here is my storage server config - all these paths are on the local storage service server (4TB)

Note: AMATICA is the processing server

Any tips welcome.

Thanks


Donald Mennerich

May 15, 2019, 1:40:27 PM
to archivematica
Hi,

I'm running into the same problem with the Currently Processing space. I've created a location and spaces on our NAS, which works fine for all of the other purposes (AIP Storage, Transfer Source, etc.). Here's what my configuration looks like:



If I try to do a transfer with this configuration I get the following error in the MCPServer log:

ERROR     2019-05-15 15:39:38  archivematica.mcp.server:package:wrap:353:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpT9lcKT/StorageServiceRCS does not exist.

The system seems to be trying to write to the default location on the host even though that location is disabled in the storage service. Could this be a bug? I'm running Archivematica 1.9.1 on CentOS 7.6.

Also, I noticed that the system created the temp directory in the default location:

drwxrwx---. 2 archivematica archivematica 4096 May 15 11:39 tmpT9lcKT

Thanks!

Don

Ross Spencer

May 15, 2019, 4:18:52 PM
to archivematica
Hi Both,

I went back to the team and Miguel suggested that you may need to configure a new Storage Service Space (Pipeline Local Filesystem) or use a shared filesystem for /var/archivematica (NFS, for example).

I've extracted a section from our deployment notes for you if it helps.


An example storage service instance we're running for testing looks like this (note the separate space for Currently Processing):



Miguel also noted that it is very important that the storage service can ssh the pipeline virtual machine as the Archivematica user without the password.
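
Whether that passwordless login works can be checked without an interactive session. The sketch below uses OpenSSH's `BatchMode` option, which makes `ssh` fail instead of prompting for a password; the function name and host name are placeholders.

```shell
#!/bin/sh
# Return 0 if "$1" (user@host) accepts a key-based login and can run a command.
# BatchMode=yes disables password prompts, so a missing key fails fast.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example, run on the Storage Service machine as the archivematica user:
# can_ssh_without_password archivematica@pipeline.example.org && echo "key login OK"
```

Because the check runs a remote command (`true`), it also fails when the key is accepted but the remote account's login shell refuses to run commands.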

Unfortunately I haven't had to do this in a development environment yet, hopefully this helps you get on your way. 

Please let us know how it goes and keep us posted.
All the best,
Ross

BrzI Channel

May 15, 2019, 4:54:43 PM
to archivematica
Thanks for that ssh setup document. I find that a pain in the neck...Will come in very useful.

BrzI Channel

May 15, 2019, 5:48:09 PM
to archivematica
Just a quick question:

On the AM machine the archivematica account is defined in /etc/passwd as
archivematica:x:333:333::/var/lib/archivematica/:/usr/sbin/nologin

What exactly do I change that to ?
archivematica:x:333:333::/var/lib/archivematica/:/usr/sbin/sh
or
something else ?

Thanks


Miguel Angel Medinilla Luque

May 16, 2019, 5:55:19 AM
to archivematica
Hi,

You can switch to archivematica user on pipeline VM with command:

sudo su - archivematica -s /bin/bash

This way you can add the SS pub key to the authorized_keys file.

Miguel Angel Medinilla Luque

May 16, 2019, 6:05:44 AM
to archivematica
There are several ways to add the ssh key, but I think this is an easy one:

A) On SS VM:

switch to archivematica user: 

sudo su - archivematica -s /bin/bash

Create ssh pub key:

ssh-keygen -t rsa 

(and hit enter several times)

Print and copy the new ssh key:

cat .ssh/id_rsa.pub

B) On Pipeline VM:

Switch to archivematica user:

sudo su - archivematica -s /bin/bash

Create the .ssh directory with the right permissions (running ssh-keygen does this as a side effect):

ssh-keygen -t rsa 

(and hit enter several times)

Create .ssh/authorized_keys file with the SS ssh pub key:

vi .ssh/authorized_keys

Change authorized_keys permissions:

chmod 0600 .ssh/authorized_keys

C) Test, from SS VM:

sudo su - archivematica -s /bin/bash
ssh archivematica@PIPELINE_IP_ADDRESS_OR_HOSTNAME
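
Steps A and B can also be done without an editor. The sketch below is one possible non-interactive variant, not the procedure from the message above: `install_pubkey` is a hypothetical helper that appends a public key file to `authorized_keys` under a given home directory and sets restrictive permissions, and the `/tmp/ss_id_rsa.pub` path is a placeholder.

```shell
#!/bin/sh
# Append the public key in file "$1" to "$2/.ssh/authorized_keys",
# creating the directory and file with restrictive permissions if needed.
install_pubkey() {
    pubkey_file="$1"
    home_dir="$2"
    mkdir -p "$home_dir/.ssh" &&
    chmod 700 "$home_dir/.ssh" &&
    cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys" &&
    chmod 600 "$home_dir/.ssh/authorized_keys"
}

# On the SS VM, as archivematica (empty passphrase, default key path):
# ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
#
# On the pipeline VM, as archivematica, after copying id_rsa.pub over:
# install_pubkey /tmp/ss_id_rsa.pub "$HOME"
```

Appending rather than overwriting keeps any keys that are already authorized for the account.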

BrzI Channel

May 16, 2019, 4:53:02 PM
to archivematica
I have a problem with the last bit :

innopac@AMATICA-STORAGE:~$ sudo su - archivematica -s /bin/bash
archivematica@AMATICA-STORAGE:~$ ssh archiv...@1XX.2XX.43.170
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-48-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Thu May 16 13:48:04 PDT 2019

  System load:  0.04               Processes:             270
  Usage of /:   21.4% of 38.14GB   Users logged in:       0
  Memory usage: 70%                IP address for ens160: 1XX.2XX.43.170
  Swap usage:   36%

 * Ubuntu's Kubernetes 1.14 distributions can bypass Docker and use containerd
   directly, see https://bit.ly/ubuntu-containerd or try it now with

     snap install microk8s --classic

 * Canonical Livepatch is available for installation.
   - Reduce system reboots and improve kernel security. Activate at:
     https://ubuntu.com/livepatch

8 packages can be updated.
0 updates are security updates.


Last login: Thu May 16 13:41:07 2019 from 1XX.2XX.43.171
This account is currently not available.
Connection to 1XX.2XX.43.170 closed.


I was expecting a login prompt from the Pipeline VM (1XX.2XX.43.170) - instead it closes the connection.

Does this look good to you ?

Thanks

BrzI Channel

May 16, 2019, 7:17:09 PM
to archivematica
This may or may not be relevant :

After setting all this up (including the test above this post) and rebooting I find that ElasticSearch does not start - among other things...

innopac@AMATICA-MCP:~$ grep -rn WARNING /var/log/archivematica
/var/log/archivematica/MCPServer/MCPServer.log:1:WARNING   2019-05-16 22:35:52  py.warnings:RPCServer:<module>:34:  /usr/lib/archivematica/MCPServer/package.py:168: SyntaxWarning: name '_default_location_uuid' is used prior to global declaration
/var/log/archivematica/MCPServer/MCPServer.log:4:WARNING   2019-05-16 22:35:55  py.warnings:__init__:<module>:80:  /usr/share/archivematica/virtualenvs/archivematica-mcp-server/local/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
/var/log/archivematica/MCPServer/MCPServer.debug.log:1:WARNING   2019-05-16 22:35:52  py.warnings:RPCServer:<module>:34:  /usr/lib/archivematica/MCPServer/package.py:168: SyntaxWarning: name '_default_location_uuid' is used prior to global declaration
/var/log/archivematica/MCPServer/MCPServer.debug.log:4:WARNING   2019-05-16 22:35:55  py.warnings:__init__:<module>:80:  /usr/share/archivematica/virtualenvs/archivematica-mcp-server/local/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
/var/log/archivematica/dashboard/dashboard.log:1:WARNING   2019-05-16 23:06:25  elasticsearch:base:log_request_fail:82:  PUT http://127.0.0.1:9200/aips [status:N/A request:0.001s]
/var/log/archivematica/dashboard/dashboard.debug.log:7:WARNING   2019-05-16 23:06:25  elasticsearch:base:log_request_fail:82:  PUT http://127.0.0.1:9200/aips [status:N/A request:0.001s]

I am able to start ElasticSearch manually...but still no data transfer after folder selection....

I get this on data transfer start :

/var/log/archivematica/MCPServer/MCPServer.log:1:WARNING   2019-05-16 22:35:52  py.warnings:RPCServer:<module>:34:  /usr/lib/archivematica/MCPServer/package.py:168: SyntaxWarning: name '_default_location_uuid' is used prior to global declaration
/var/log/archivematica/MCPServer/MCPServer.log:4:WARNING   2019-05-16 22:35:55  py.warnings:__init__:<module>:80:  /usr/share/archivematica/virtualenvs/archivematica-mcp-server/local/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
/var/log/archivematica/MCPServer/MCPServer.debug.log:1:WARNING   2019-05-16 22:35:52  py.warnings:RPCServer:<module>:34:  /usr/lib/archivematica/MCPServer/package.py:168: SyntaxWarning: name '_default_location_uuid' is used prior to global declaration
/var/log/archivematica/MCPServer/MCPServer.debug.log:4:WARNING   2019-05-16 22:35:55  py.warnings:__init__:<module>:80:  /usr/share/archivematica/virtualenvs/archivematica-mcp-server/local/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!

/var/log/archivematica/MCPServer/MCPServer.log:34:ERROR     2019-05-16 23:15:22  archivematica.mcp.server:package:wrap:347:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpQA83e2/123 does not exist.
/var/log/archivematica/MCPServer/MCPServer.debug.log:280:ERROR     2019-05-16 23:15:22  archivematica.mcp.server:package:wrap:347:  Exception: Filepath /var/archivematica/sharedDirectory/tmp/tmpQA83e2/123 does not exist.

Something is still not right...

BrzI Channel

May 16, 2019, 7:25:32 PM
to archivematica
And here is an illustration of what is happening :
tmp folders with corresponding files get created on the SS (right)
but it seems like the folder contents do not get created on the AM Pipeline machine (left) 

BrzI Channel

Jun 7, 2019, 6:59:52 PM
to archivematica
Hi Miguel,

I followed your instructions and it all seems OK until the test at the end.

When I try to log in from the storage server to the dashboard server I get a brief welcome screen and a message that the account is not available and that the session is closed - but it certainly looks like there was a brief connection.

Then I changed the shell for archivematica on the dashboard server, and on my next attempt to test ssh I get prompted for a password for the archivematica user (which is not set).

PS. I changed the archivematica shell from /usr/sbin/nologin to /usr/sbin/bash. Should I have done something else ?

Thanks


BrzI Channel

Jun 10, 2019, 6:09:43 PM
to archivematica
I am still having the same issue - no errors or warnings anywhere in /var/log/archivematica on either machine.

However nginx on the storage server has this :


2019/06/10 14:09:51 [error] 1810#1810: *321 upstream prematurely closed connection while reading response header from upstream, client: %AMserverIP%, server: , request: "GET /api/v2/location/5c9d858b-0d98-4a1f-9c10-3c1a8386bb5d/browse/?path=L2hvbWUvaW5ub3BhYw%3D%3D HTTP/1.1", upstream: "http://127.0.0.1:8001/api/v2/location/5c9d858b-0d98-4a1f-9c10-3c1a8386bb5d/browse/?path=L2hvbWUvaW5ub3BhYw%3D%3D", host: "%AMATICASTORAGESERVER%:8000"

This looks very similar to what I think is an error in the SSH configuration posted earlier...

This account is currently not available.
Connection to %AMserverIP% closed.


I think my problem is likely the ssh key setup...but I cannot figure out what I am missing...
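
The `path` query parameter in the nginx error quoted above is not opaque: it is a directory path, base64-encoded and then URL-escaped. Decoding it shows which directory was being browsed when the upstream closed the connection. A small helper for this (the function name is made up for illustration):

```shell
#!/bin/sh
# Decode the "path" query parameter from a Storage Service browse request.
# Step 1: turn URL-escaped padding (%3D) back into "=".
# Step 2: base64-decode the result.
decode_browse_path() {
    printf '%s' "$1" | sed 's/%3D/=/g' | base64 --decode
}

decode_browse_path 'L2hvbWUvaW5ub3BhYw%3D%3D'; echo
# prints: /home/innopac
```

The helper only handles escaped `=` padding, which is all that appears in this request; other escaped characters would need a fuller URL decoder.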


 

BrzI Channel

Jul 3, 2019, 1:47:46 PM
to archivematica
Issue resolved. Ross's snapshot of the configuration cleared things up.

The other minor issue was that, for passwordless data transfer, the login shell of the archivematica user on the dashboard server had to be changed as follows:

archivematica:x:333:333::/var/lib/archivematica:/bin/bash

This is for Ubuntu 18.04 in the /etc/passwd file...
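
For anyone checking their own setup, the login shell is the seventh colon-separated field of the passwd entry. The sketch below parses a passwd-format line; `login_shell_of` is a hypothetical helper, and the commented `usermod` call is an alternative to editing /etc/passwd by hand that assumes the account is named `archivematica`.

```shell
#!/bin/sh
# Print the login shell (7th colon-separated field) of a passwd-format line.
login_shell_of() {
    printf '%s\n' "$1" | cut -d: -f7
}

login_shell_of 'archivematica:x:333:333::/var/lib/archivematica:/bin/bash'
# prints: /bin/bash

# On a live system:
# login_shell_of "$(getent passwd archivematica)"
# sudo usermod -s /bin/bash archivematica
```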

Thanks folks !!