"Store the AIP" failing with a 504 error

362 views
Skip to first unread message

Andrew Berger

unread,
Jul 16, 2014, 3:47:08 PM7/16/14
to archiv...@googlegroups.com
Hi all,

We've recently set up two separate installs of Archivematica 1.1 for testing: both are running Ubuntu 12.04.4, we installed using the packages, and in both cases Archivematica and the Storage Service are running on the same machine. One is running in VirtualBox on a Mac running Mavericks, the other is on a dedicated server (I can get more details on this configuration if necessary).

In both installations, everything seems to be working fine when we try to ingest smaller SIPs (small in terms of total SIP size, not necessarily number of files). However, we're running into a problem where SIPs above a certain size fail at the "Store the AIP" job with a Server Error 504. Has anyone else run into this?

More, possibly excessive, details below:

The problem: When we try to ingest SIPs that have a total size of around 1.5 GB (uncompressed) or larger, they consistently fail at the "Store the AIP" job during the Store AIP microservice. Archivematica then puts the SIP into the failed directory and stops processing that SIP. According to the Dashboard, at least, no AIP was created.

The strange thing is that what appears to be a valid AIP corresponding to the failed SIP - named transfername-UUID.7z - actually does get put into the AIP store at some point in this process. It does not appear on the list of AIPs in the Archival Storage tab, but it's in /var/archivematica/sharedDirectory/www/AIPsStore/ and it's in the list of packages in the Storage Service. It's also possible to find it by searching the AIPs in the Archival Storage tab, but the search results list shows "(data missing)" where the AIP name would be. The UUID listed in the search result matches the SIP UUID and the package can be downloaded (by clicking on "data missing") just like any other AIP, although the "view raw" output is essentially blank:

{
  "_id": "None", 
  "_index": "aips", 
  "_type": "aip", 
  "exists": false
}

What this means is that, for a given failed SIP, a package named transfername-UUID.7z is placed in the AIP store and a failed SIP directory is created with the following structure:

sharedDirectory/failed/transfername-UUID/
├── METS.UUID.xml.validatorTester.html
├── pointer.xml
├── processingMCP.xml
└── transfername-UUID.7z

I've spot-checked a few of these and the md5 checksum on the 7zip package in the failed SIP directory always matches the checksum on the corresponding 7zip package in the AIP Store. I've also extracted a few of the AIP Store packages and run checksums for the files in /data/objects and they match the checksums for the original files along with the checksums in the BagIt manifests (as I would expect since they all pass the Verify AIP job earlier in the micro-service chain). 

We've looked through the error messages/logs and here is what we've found (here I'm using the output from our VirtualBox installation): 

The output of STDERR for the Store the AIP job is the same each time:

AIP creation failed.  See Storage Service logs for more details

However, the JSON file at http://localhost:8000/api/v2/file/?format=json actually shows the AIP that just failed with a status of "UPLOADED". 

There doesn't appear to be anything wrong in /tmp/storage-service.log

The /tmp/archivematica.log shows a 504 error, which suggests it's a problem with nginx:

INFO:root:Creating file with {'size': 1203858511, 'package_type': 'AIP', 'current_path': 'soundtest-a0a73c9c-fa20-4f71-a711-e45af6d95957.7z', 'origin_pipeline': '/api/v2/pipeline/9ae836f6-7e6f-454a-b950-650e8d1208e4/', 'uuid': 'a0a73c9c-fa20-4f71-a711-e45af6d95957', 'current_location': '/api/v2/location/45ea03e3-85c2-4262-b9a4-d9dba13a8d32/', 'origin_path': '/var/archivematica/sharedDirectory/currentlyProcessing/soundtest-a0a73c9c-fa20-4f71-a711-e45af6d95957/soundtest-a0a73c9c-fa20-4f71-a711-e45af6d95957.7z', 'origin_location': '/api/v2/location/04fed302-0938-44a0-b0ca-c35b37b1e700/'}
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): localhost
WARNING:root:Could not connect to storage service: Server Error 504: http://localhost:8000/api/v2/file/ (<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.1.19</center>
</body>
</html>
)
WARNING:root:AIP unabled to be created: Server Error 504: http://localhost:8000/api/v2/file/.  See logs for more details.

The log at /var/log/nginx/error.log shows an error during a POST request that corresponds to the time of the failed AIP:

2014/07/15 15:21:38 [error] 2634#0: *64 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 127.0.0.1, server: , request: "POST /api/v2/file/ HTTP/1.1", upstream: "uwsgi://unix:/tmp/storage.uwsgi.sock:", host: "localhost:8000"

Finally, the log at /var/log/uwsgi/app/storage.log shows closed/broken pipe errors at around the same time:

Tue Jul 15 15:22:13 2014 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /api/v2/file/ (ip 127.0.0.1) !!!
Tue Jul 15 15:22:13 2014 - writev(): Broken pipe [proto/uwsgi.c line 120] during POST /api/v2/file/ (127.0.0.1)
Tue Jul 15 15:22:13 2014 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /api/v2/file/ (ip 127.0.0.1) !!!
Tue Jul 15 15:22:13 2014 - write(): Broken pipe [proto/uwsgi.c line 138] during POST /api/v2/file/ (127.0.0.1)

This is about as far as we've gotten investigating the problem. I've tried upgrading uwsgi to a newer version on the VM installation but still ran into the same issue. I also tried upgrading nginx as well, but haven't been able to get Archivematica working at all after the upgrade. Any advice would be appreciated.

Thanks,
Andrew

Justin Simpson

unread,
Jul 16, 2014, 4:03:43 PM7/16/14
to archiv...@googlegroups.com
Hi Andrew,

The issue is probably the default setting for http timeouts in nginx.  The Store AIP microservice in the dashboard does an HTTP POST, to send the AIP to the storage service.  By default http POSTS timeout after 60 seconds I believe.  For a 1.5 GB aip, it is probaby taking longer than that, so you need to increase the value for timeouts and restart nginx.

The Archivematica Storage Service uses nginx, when it is installed, it creates a configuration file at

/etc/nginx/sites-available/storage

You can add 2 lines to it like so:

change:

    location / {
        uwsgi_pass storage;
        include /etc/nginx/uwsgi_params;
    }

to:

    location / {
        uwsgi_pass storage;
        uwsgi_read_timeout 3600;
        uwsgi_send_timeout 3600;
        include /etc/nginx/uwsgi_params;
    }

3600 means 1 hour.  You could use a smaller number, like 600, and see if that works.  Using 3600 means if the storage service is down, you won't know about it for 1 hour, so there is a downside to setting very large timeout values. 

In a future version of the storage service, we plan to move to an asynchronous method of moving AIP's from the Archivematica pipeline to the storage service, which will avoid this issue entirely.




Justin Simpson
Director of Archivematica Technical Services
www.artefactual.com
604-527-2056


--
You received this message because you are subscribed to the Google Groups "archivematica" group.
To unsubscribe from this group and stop receiving emails from it, send an email to archivematic...@googlegroups.com.
To post to this group, send email to archiv...@googlegroups.com.
Visit this group at http://groups.google.com/group/archivematica.
For more options, visit https://groups.google.com/d/optout.

Andrew Berger

unread,
Jul 16, 2014, 6:33:04 PM7/16/14
to archiv...@googlegroups.com
Hi Justin,

Thank you for responding so quickly. Changing the timeout setting to an hour fixed the problem. As for the risks involved in having a longer timeout period, I think until there's a new version of the storage service, we might end up incorporating changes to that setting into our workflow for larger SIPs, as we have files much larger than 1.5 GB to ingest down the line.

Andrew

Courtney Mumma

unread,
Jul 16, 2014, 6:35:56 PM7/16/14
to archiv...@googlegroups.com

That's great news Andrew! Let us know how your continued use progresses and if we can be of more help.
Courtney

Justin Simpson

unread,
Jul 16, 2014, 6:37:39 PM7/16/14
to archiv...@googlegroups.com
Great, glad that helped.  As a note, we have found that a timeout setting of 1 hour will work for AIP's between 40GB and 100GB or even larger, depending very much on available cpu, memory and the speed of the underlying filesystem.



Justin Simpson
Director of Archivematica Technical Services
www.artefactual.com
604-527-2056


Reply all
Reply to author
Forward
0 new messages