Default Size of Thumbnails and Display Copies


catharin...@gmail.com

Mar 16, 2018, 9:10:19 AM
to AtoM Users
Hi everybody,
I am looking for information about the default sizes of thumbnails and display copies in AtoM. Can anybody help me with that?
Thanks and best regards
Catharina

Dan Gillean

Mar 16, 2018, 11:43:44 AM
to ICA-AtoM Users
Hi Catharina, 

In AtoM, digital object derivatives are managed by a series of external libraries such as imagemagick, ffmpeg, ghostscript, etc. See: 
For image thumbnails, I believe that the defaults we are using are: 
  • max width: 270px
  • max height: 1024px
They may be set elsewhere (I'm not a developer), but I found this in our code here: 
This also conforms with what I found by downloading some of our thumbnails and inspecting their properties. 

For the reference image, the maximum width is a user-configurable setting - in 2.4 you will find this in Admin > Settings > Global. See: 
In 2.5, I believe this setting is being moved to Admin > Settings > Digital object derivatives. 

The default value at installation for the maximum width is 480px. I believe that this value is also used for max height - if you upload a very long vertical image, it will be scaled to a maximum of 480px height for the reference image. 
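As a rough illustration of the sizes above, here is a minimal Python sketch of fitting an image inside a bounding box. It is not AtoM's actual code: the function name fit_within is hypothetical, and it assumes that derivatives keep the original aspect ratio and are never scaled up. The 270x1024 and 480x480 boxes are the values mentioned in this post.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit inside the box, keeping the aspect ratio."""
    # Use the most restrictive ratio; cap at 1.0 so small images are not enlarged
    # (the no-upscaling behaviour is an assumption, not confirmed in this thread).
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


THUMBNAIL_BOX = (270, 1024)   # max width, max height reported for thumbnails
REFERENCE_BOX = (480, 480)    # default reference size, assumed to apply to both sides

print(fit_within(3000, 2000, *THUMBNAIL_BOX))  # wide image, limited by width
print(fit_within(600, 4800, *REFERENCE_BOX))   # long vertical image, limited by height
```

With these assumptions, a wide image is limited by the width of the box and a long vertical image by its height.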

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-users+unsubscribe@googlegroups.com.
To post to this group, send email to ica-atom-users@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/01c0a671-5fee-40b5-b81f-60a0d4e5ea39%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ombric...@gmail.com

Oct 16, 2018, 6:32:08 AM
to AtoM Users
Hi Dan!

I'm experiencing a problem with my DIP Upload from Archivematica 1.7.1 to my AtoM 2.4.

My package contains only one digital object, a multipage PDF (927 pages, 26 MB), which AtoM is unable to display.
The DIP is correctly created, but in AtoM I only see the thumbnail and when I try to open it this is the result:
With a consequent AtoM Worker crash.

I've tried with a smaller PDF (only 3 pages) and it works, so I'm trying to understand whether there is a size limit, what it is, and whether it can be raised.

p.s. Does the pdfinfo tool have any influence on this?

Thanks for any help!

Giò.

Dan Gillean

Oct 16, 2018, 8:15:18 PM
to ICA-AtoM Users
Hi Giò,

That is a pretty big PDF!

There are a few places in AtoM where a limit to the size of digital objects can be set, and quite a few factors that can impact the size of uploads. 

In your case, the first thing I would suggest is checking the webserver error logs, and sharing with us the exact error message you got from that upload. See: 
Without knowing what the error is, my first guess would be that you've run into one of the PHP execution limits. See: 
My suspicion is that either the current max_execution_time or memory_limit values were surpassed, and the upload was aborted. 

Note that there is also an upload_max_filesize value - this is generally set by default to 64MB, so it shouldn't be the issue in this case. 

There are also a couple other places in AtoM where file size limits can be set - so if none of the above turns out to be the issue, you might want to check the following: 

First, there is a limit that can be set in the config/app.yml configuration file. See: 

Next, in Admin > Settings, there is a place where you can set a default upload limit for all repositories. See: 

If you are using AtoM in a multi-repository environment, you might also want to make sure there is not a specific limit on your repository. See: 

As I said, I would start by getting more information on the 500 error from the webserver logs, and checking the PHP execution limit values and potentially increasing them if needed. Let us know what you find! 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

arth...@gmail.com

Oct 17, 2018, 3:03:57 PM
to ica-ato...@googlegroups.com

Hi Dan,

I'm a colleague of Giò's. I'll try to add some info about our issue, and sorry for the very long post.

 

First of all, note that:

1.       The environment consists of Archivematica 1.7.1 and AtoM 2.4.0-156, both on RedHat 7.5, and the issue appears with a big PDF of 26MB (we expect to have more files of this type and size, some even bigger than this one).

2.       Starting from the same 26MB PDF file, I deleted a lot of pages, reduced the file to 8MB and retried the entire process: no issues found, the PDF file is complete and readable in AtoM.

3.       I retried the entire process in a similar environment but using Ubuntu 16 instead of RedHat 7.5 and all worked well: the 26MB file was transferred, ingested and uploaded to AtoM. I didn't change anything in PHP or in any other configuration of the environment.

 

Before making the last try shown below, I followed your suggestions…

 

LOGGING WITH QUBIT

We are on RedHat 7.5 (Maipo) and we can't find any atom.conf file. Strange but true!

[root@atom /]# cd ..

[root@atom /]# ls

bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

[root@atom /]# sudo find . -name atom.conf

[root@atom /]#

So we can't add the client address to env[ATOM_DEBUG_IP] in order to use "qubit_dev.php", and all of the following logs are empty:

·         /usr/share/nginx/atom/log/qubit_cli.log

·         /usr/share/nginx/atom/log/qubit_prod.log

·         /usr/share/nginx/atom/log/qubit_worker.log

We also can't follow the suggestions in https://www.accesstomemory.org/en/docs/2.5/admin-manual/maintenance/logging/#maintenance-webserver, as we think that modifying /usr/share/nginx/atom/config/factories.yml has no effect because we can't use "qubit_dev.php".

 

LOGGING WITH NGINX

We use Nginx so we hoped to collect some info from /var/log/nginx/error.log

 

PHP.INI SETTINGS

In the RedHat installation we found the php.ini file in

·         ./etc/opt/rh/rh-php70/php.ini

·         ./opt/rh/rh-php70/register.content/etc/opt/rh/rh-php70/php.ini

We suppose the active php.ini is the first one but, to be thorough, we also changed the second one. We changed the following parameters:

·         max_execution_time = 30, changed to 120

·         memory_limit = 128M, changed to 1024M

·         post_max_size = 8M, changed to 512M

·         upload_max_filesize = 2M, changed to 100M

·         max_file_uploads = 20, changed to 100

We then followed your suggestions in http://php.net/manual/en/features.file-upload.common-pitfalls.php and noted that:

·         MAX_FILE_SIZE is not in our php.ini, so we assumed that upload_max_filesize is enough

·         memory_limit = 1024M seems to be large enough compared to post_max_size = 512M

·         max_execution_time = 120 is large enough (see the above considerations regarding Ubuntu)

·         post_max_size = 512M is large enough

·         max_file_uploads = 100 may be large enough, or even too large, if the "DIP Upload to AtoM" procedure does only one "post" for every single file.
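The relationship between these values can be checked mechanically. The following is a minimal Python sketch, not part of PHP or AtoM: the helper names parse_size and check_upload_limits are hypothetical, and it only understands plain numbers with an optional K, M or G suffix (it does not handle special values such as -1). It encodes the guidance from the PHP manual that post_max_size should be larger than upload_max_filesize and that memory_limit should generally be larger than post_max_size.

```python
import re

# Multipliers for PHP shorthand size suffixes.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a PHP shorthand size such as '512M' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit]


def check_upload_limits(settings):
    """Return a list of warnings about the ordering of the three size limits."""
    memory = parse_size(settings["memory_limit"])
    post = parse_size(settings["post_max_size"])
    upload = parse_size(settings["upload_max_filesize"])
    problems = []
    if post <= upload:
        problems.append("post_max_size should be larger than upload_max_filesize")
    if memory <= post:
        problems.append("memory_limit should be larger than post_max_size")
    return problems


# The changed values from the list above.
changed = {
    "memory_limit": "1024M",
    "post_max_size": "512M",
    "upload_max_filesize": "100M",
}
print(check_upload_limits(changed))
```

An empty list means the three limits are ordered as the PHP manual recommends; it says nothing about whether they are large enough for a given file.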

 

ATOM SPECIFIC LIMITS - APP.YML

The file is in /usr/share/nginx/atom/config/app.yml

·         upload_limit: -1

·         download_timeout: 120 (default is 10; we changed it to match the PHP max_execution_time)

·         cache_engine: sfAPCCache

·         read_only: false

·         htmlpurifier_enabled: false

 

ATOM SPECIFIC LIMITS - ARCHIVAL INSTITUTION UPLOAD LIMIT

We still don't have an Archival Institution definition in AtoM. Is this a problem?

We found the upload working in another similar environment without this definition (see the above considerations regarding Ubuntu).

AtoM is multi repository (if we are not wrong it's a default setting) but - as already mentioned - we didn't set any specific Archival Institution.


continues in the post below...

arth...@gmail.com

Oct 17, 2018, 3:10:42 PM
to ica-ato...@googlegroups.com

... continued from the previous post


This is what still happens…


1.       Atom-worker is active…

[root@atom ~]# sudo systemctl status atom-worker

● atom-worker.service - AtoM worker

   Loaded: loaded (/usr/lib/systemd/system/atom-worker.service; enabled; vendor preset: disabled)

   Active: active (running) since Wed 2018-10-17 16:35:45 CEST; 3min 35s ago

 Main PID: 1654 (php)

   CGroup: /system.slice/atom-worker.service

           └─1654 /opt/rh/rh-php70/root/bin/php -d memory_limit=-1 -d error_reporting="E_ALL" symfony jobs:worker

 

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arUpdatePublicationStatusJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arFileImportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arInformationObjectXmlExportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arXmlExportSingleFileJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arGenerateReportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arActorCsvExportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arActorXmlExportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > New ability: arRepositoryCsvExportJob

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > Running worker...

Oct 17 16:35:49 atom.xxx.yyy php[1654]: 2018-10-17 07:35:49 > PID 1654

[root@atom ~]#

2.       All AtoM plugins are enabled, particularly the qtSwordPlugin

3.       We start the Archivematica Transfer and Ingest.

Archivematica reaches the "Upload DIP" job and we choose AtoM DIP Upload setting the slug.

4.       Archivematica does its job and the big file is transferred to AtoM (all AM jobs are green and we can see the new upload for a moment in the AtoM /tmp directory)

5.       Meanwhile in AtoM we launched a "sudo journalctl -f -u atom-worker" and this is the result…

[root@atom ~]# sudo journalctl -f -u atom-worker

-- Logs begin at Wed 2018-10-17 16:54:42 CEST. --

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arUpdatePublicationStatusJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arFileImportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arInformationObjectXmlExportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arXmlExportSingleFileJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arGenerateReportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arActorCsvExportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arActorXmlExportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > New ability: arRepositoryCsvExportJob

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > Running worker...

Oct 17 17:00:26 atom.xxx.yyy php[1784]: 2018-10-17 08:00:26 > PID 1784

Oct 17 17:06:11 atom.xxx.yyy php[1784]: 2018-10-17 08:06:11 > Job started.

Oct 17 17:06:11 atom.xxx.yyy php[1784]: 2018-10-17 08:06:11 > A package was deposited by reference.

Oct 17 17:06:11 atom.xxx.yyy php[1784]: 2018-10-17 08:06:11 > Location: file:///BEC-1-ORG-002-768698ca-f1ec-4638-ad44-c3bed5a6b808

Oct 17 17:06:11 atom.xxx.yyy php[1784]: 2018-10-17 08:06:11 > Processing...

Oct 17 17:06:11 atom.xxx.yyy php[1784]: 2018-10-17 08:06:11 > Object slug: bec-3

Oct 17 17:06:19 atom.xxx.yyy php[1784]: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

Oct 17 17:06:19 atom.xxx.yyy systemd[1]: atom-worker.service: main process exited, code=exited, status=1/FAILURE

Oct 17 17:06:19 atom.xxx.yyy systemd[1]: atom-worker.service: control process exited, code=exited status=1

Oct 17 17:06:19 atom.xxx.yyy systemd[1]: Unit atom-worker.service entered failed state.

Oct 17 17:06:19 atom.xxx.yyy systemd[1]: atom-worker.service failed.

The “General error: 2006 MySQL server has gone away” suggests a problem with MariaDB, but knowing that DIPs are not stored as blobs in the database, and noting that only 8 seconds elapsed between the preceding log line and the error, we can’t imagine what the problem may be.

6.       The /var/log/nginx/error.log reports only two lines…

2018/10/17 17:08:58 [error] 1323#0: *50 FastCGI sent in stderr: "PHP message: No publication status set for information object id: 3546" while reading response header from upstream, client: 192.168.12.105, server: _, request: "GET /index.php/bec-01-pdf-4 HTTP/1.1", upstream:   "fastcgi://unix:/run/php7.0-fpm.atom.sock:", host: "192.168.XXX.YYY",   referrer: "http://192.168.XXX.YYY/index.php/bec-3"

2018/10/17 17:09:06 [error] 1323#0: *50 FastCGI sent in stderr: "PHP message: No publication status set for information object id: 3546" while reading response header from upstream, client: 192.168.12.105, server: _, request: "GET /index.php/bec-01-pdf-4 HTTP/1.1", upstream:   "fastcgi://unix:/run/php7.0-fpm.atom.sock:", host: "192.168.XXX.YYY",   referrer: "http://192.168.XXX.YYY/index.php/bec-3"

However, there does not seem to be a relationship between these errors and the problem encountered.

7.       Strangely, in AtoM we can see the first-page thumbnail of the document under the correct slug, but when we try to view the document, AtoM responds with:

500 | Internal Server Error | sfException

No publication status set for information object id: 3546

And in the /var/log/nginx/error.log a new line appears, again with

2018/10/17 17:41:48 [error] 1322#0: *66 FastCGI sent in stderr: "PHP message: No publication status set for information object id: 3546" while reading response header from upstream, client: 192.168.12.105, server: _, request: "GET /index.php/bec-01-pdf-4 HTTP/1.1", upstream:   "fastcgi://unix:/run/php7.0-fpm.atom.sock:", host: "192.168.XXX.YYY",   referrer: "http://192.168.XXX.YYY/index.php/bec-3"

 

 Thanks in advance for your suggestions.


Arthy

Dan Gillean

Oct 17, 2018, 4:51:59 PM
to ICA-AtoM Users
Hi Arthy, 

Thank you for the wealth of detail here! I see two things that are worth looking into further. First, on this point: 

3.       I retried the entire process in a similar environment but using Ubuntu 16 instead of RedHat 7.5 and all worked well: the 26MB file was transferred, ingested and uploaded to AtoM. I didn't change anything in PHP or in any other configuration of the environment.

This suggests to me that there is something about your RHEL installation that needs to be investigated, rather than a problem in AtoM itself. Unfortunately, we don't test or develop AtoM with CentOS or RHEL, so I'm not sure what to recommend. I would ensure that you have all the necessary PHP extensions and other dependencies in place, as a starting point. We haven't tested it ourselves, but someone did previously share a guide to how they got AtoM working on RHEL, here: 
If you want to browse all posts in the last year that relate to CentOS/RHEL, you can use our tags and/or follow this link: 
Second, I wanted to elaborate on the error message you are seeing: 

2018/10/17 17:41:48 [error] 1322#0: *66 FastCGI sent in stderr: "PHP message: No publication status set for information object id: 3546" while reading response header from upstream, client: 192.168.12.105, server: _, request: "GET /index.php/bec-01-pdf-4 HTTP/1.1", upstream: "fastcgi://unix:/run/php7.0-fpm.atom.sock:", host: "192.168.XXX.YYY", referrer: "http://192.168.XXX.YYY/index.php/bec-3"

A missing publication status will definitely cause issues. It's possible that AtoM timed out before properly completing the creation of the information object (i.e. archival description) to which it was attaching your large PDF, and did not add a publication status, which introduces data corruption into your database. Fortunately, this is a fairly common form of data corruption, and one which we can try fixing manually - especially since this error message helpfully provides you with the information object ID. Please see this section of our new 2.5 troubleshooting documentation (it will be the same instructions for 2.4): 
If you continue to experience issues, we do have one suggestion to get a more verbose error message regarding the "MySQL server has gone away" error you encountered. See: 
Let us know how it goes!
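If more than one description is affected, the object IDs can be collected from the web server log before fixing them. The following is a minimal Python sketch under stated assumptions: the function name affected_object_ids is hypothetical, the regular expression simply matches the message text quoted in this thread, and the sample line is a shortened copy of the nginx error quoted above.

```python
import re

# Matches the PHP message quoted in this thread and captures the numeric ID.
PATTERN = re.compile(r"No publication status set for information object id: (\d+)")


def affected_object_ids(log_lines):
    """Return the sorted, de-duplicated object IDs mentioned in the log lines."""
    ids = set()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)


# Shortened sample lines based on the error quoted above.
sample = [
    '2018/10/17 17:08:58 [error] 1323#0: *50 FastCGI sent in stderr: '
    '"PHP message: No publication status set for information object id: 3546"',
    '2018/10/17 17:09:06 [error] 1323#0: *50 FastCGI sent in stderr: '
    '"PHP message: No publication status set for information object id: 3546"',
    '2018/10/17 17:10:00 [error] 1323#0: *51 some unrelated message',
]
print(affected_object_ids(sample))
```

In practice the lines would be read from the log file, for example /var/log/nginx/error.log in the setup described in this thread. Only objects that someone has tried to view will appear in the log, so this is not a substitute for the SQL check in the documentation.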

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

arth...@gmail.com

Oct 19, 2018, 11:59:01 AM
to ica-ato...@googlegroups.com
Hi Dan,
we investigated following your suggestions and solved the issue by changing the max_allowed_packet parameter in /etc/my.cnf.
We found that MariaDB 5.5 (the version used during the AtoM installation on RedHat) has a default max_allowed_packet=16M. This also explains why our 26MB file failed while the 8MB one succeeded.
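The arithmetic behind this explanation can be sketched in a few lines of Python. This is only an illustration: the function name exceeds_packet_limit is hypothetical, and the idea that the failing statement was roughly as large as the uploaded file is an assumption that the thread does not confirm.

```python
MIB = 1024 * 1024


def exceeds_packet_limit(payload_bytes, max_allowed_packet_bytes=16 * MIB):
    """Return True if a payload is larger than the server's packet limit."""
    # Assumption: the statement sent to the database grows with the file size.
    return payload_bytes > max_allowed_packet_bytes


print(exceeds_packet_limit(26 * MIB))  # the 26MB PDF against the 16M default
print(exceeds_packet_limit(8 * MIB))   # the reduced 8MB PDF
```

Under that assumption the 26MB file is over the 16M default and the 8MB file is under it, which matches the behaviour we observed.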

Thanks again for your quick and useful reply.

Arthy

Dan Gillean

Oct 19, 2018, 12:04:43 PM
to ICA-AtoM Users
Hi Arthy, 

I'm glad to hear you've figured it out! Thanks for letting us know, and for sharing your solution. I do still recommend that you run the SQL query in our documentation to check for some of the most common forms of data corruption. It's possible that there are in fact still descriptions without a publication status saved in your database that could cause issues down the line if unaddressed. See: 
Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


arth...@gmail.com

Oct 19, 2018, 1:04:36 PM
to AtoM Users
Great Dan!
We were able to detect 6 items with NULL publication status, we updated them and ... they came back to life.
Thanks again, bye.

Arthy