500 Internal Server Error - third time in a month


maria.pap...@embl.de

Jan 29, 2024, 4:34:51 AM
to ica-ato...@googlegroups.com

Dear colleagues,

 

Good morning everyone!

 

This is the third time in a month that the following problem has occurred in our AtoM installation. I appreciate that we now know how to deal with it, but my question is how we will prevent this problem from happening again and again and in such a short period of time.

Yours sincerely,

Maria

 

 

Maria Papanikolaou

Archives and Records Manager

EMBL Archives and Records Management,

Office for Scientific Information Management (OSIM),

 

European Molecular Biology Laboratory
Meyerhofstraße 1
69117 Heidelberg Germany
maria.papanikolaou @embl.de
+49 (0)6221 387-8719
www.embl.org
www.embl.org/archive

 


rakkitha samaraweera

Jan 29, 2024, 5:04:01 AM
to ica-ato...@googlegroups.com
Dear Maria,

This is an internal server error, which is generated by IIS or your virtual web server. First of all, please let me know: are you installing through a WAMP server or IIS?

This can happen when some services (such as PHP or the MySQL database) are not running properly.

Rakkitha.
National Archives Sri Lanka


maria.pap...@embl.de

Jan 29, 2024, 11:28:09 AM
to ica-ato...@googlegroups.com

Dear Rakkitha,

 

Thank you for your prompt reply and information.

 

The IT people fixed it in a minute. I will check when it happens again.

 

Thank you very much for your prompt reply.

 

Kind regards,

 

Maria


Dan Gillean

Jan 29, 2024, 12:01:43 PM
to ica-ato...@googlegroups.com
Hi Maria, 

To prevent 500 errors from recurring repeatedly, we need to first understand the root cause of the issue. You say that "we now know how to deal with it" - what have you been doing to alleviate the error thus far?

You are right that what to do first when a 500 error is encountered should be known to most members of this group by now - the question is asked almost daily (you can search and see all threads with the labels "500-error" and "FAQ" here in the forum), and my initial response is almost always the same, though I will give it here again before also suggesting some initial thoughts for long term resolution. 

Encountering 500 errors

The first step should be gathering as much information as you can. A 500 error is a general sort of "critical failure" error message - to diagnose the issue we need to know more specifics. The first place we generally recommend checking for that is the webserver error logs. If you have followed our recommended installation instructions and are using Nginx for your webserver, you can check the most recent error log entries with: 
In addition, it's helpful to know more about the specifics of your installation when recommending solutions, as this can affect what we suggest for next steps. Along with whatever you find in the webserver error logs, other information that is helpful to share in a forum post about a 500 error includes: 
  • What is the full AtoM version number of your installation, as found in Admin > Settings?
  • Did you follow exactly the recommended installation instructions for your version? If not, what changes have you made?
  • Are you using a custom theme plugin, and/or does your installation have any other local code customizations?
    • If not, are you using the older default Bootstrap 2 Dominion theme, or the new Bootstrap 5 Dominion theme included in 2.7 and later?
  • Does your AtoM installation meet the recommended minimum hardware requirements listed here? How much memory does your installation have?
  • Are there any important previous actions that might have caused this worth mentioning? For example, did you upgrade recently? Did someone attempt an import? When and why, to the best of your knowledge, did this error start happening? What was the triggering event before the error message?
  • What, if anything, have you tried so far to investigate and/or resolve the issue? And what happened?
Sharing this information alongside the webserver error log messages when reporting a 500 error allows us to give better suggestions that can help resolve the issue. 
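
As a rough illustration of the log check described above, here is a minimal Python sketch that prints the most recent entries of the webserver error log. The path /var/log/nginx/error.log is an assumption (a common default for Nginx on Ubuntu) and the helper name last_lines is made up for this example; adjust both to your own server. Reading the log may require elevated privileges.

```python
from collections import deque
from pathlib import Path

# Assumption: a common default Nginx error log location; adjust for your server.
NGINX_ERROR_LOG = Path("/var/log/nginx/error.log")


def last_lines(path, count=20):
    """Return the last `count` lines of a text file without keeping the whole file in memory."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    try:
        for line in last_lines(NGINX_ERROR_LOG):
            print(line)
    except OSError as error:
        # Missing file or insufficient permissions both end up here.
        print(f"Could not read {NGINX_ERROR_LOG}: {error}")
```

Copying the printed lines into a forum post, together with the answers to the questions above, gives others something concrete to work from.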

Diagnosing the problem

Ok, so we have checked the error logs and gathered our information - what next? How do we figure out and resolve the root problem?

The truth is, it depends on so many things included in the above info that I cannot give simple answers to these. However, depending on what is found, I can provide a few suggestions on next steps. 

atom-worker and job scheduler issues

You are likely dealing with a job scheduler issue if you look in the webserver error logs and see a message like: 

  • "No Gearman worker available that can handle the job."
Generally, you will need to restart the job scheduler, as well as reset the fail-counter. Here is a previous forum thread with more information: 
If this problem keeps recurring, this may indicate that your system needs more memory. In general, our recommended technical requirements suggest that for a small to medium-sized AtoM installation, you should ensure you have at least 7GB of memory. However, we have seen cases (such as this forum thread) where users have needed to increase the system memory to 10GB or more to resolve constant Gearman failures.

Elasticsearch errors

If you see a mention of Elasticsearch or Elastica in the webserver error message, then you are likely dealing with a search index problem. For example, here is a commonly reported Elasticsearch (ES) error message: 
  • Elasticsearch error: Elastica\Exception\Connection\HttpException
Elasticsearch is an indexing library used in AtoM to provide search and browse functionality. Since AtoM 2.5 until the current version (2.8), we have used Elasticsearch version 5.6 in AtoM. 

We have a command-line task that can gather more useful information to share about your search index and its current status: 
Sharing the output of this task is helpful if you know the issue is related to Elasticsearch. 

There are also a number of suggested next steps for resolving ES issues in this thread:
Once again, a lack of sufficient memory can be the cause of constantly recurring Elasticsearch issues. If nothing else has worked, try increasing your available system memory. 

i18n object does not exist errors

We sometimes see errors like the following reported: 
  • PHP message: The "i18n" object does not exist in the current context ...
Generally, we have found that this tends to be one of the following:
  1. Incorrect permissions
  2. Missing or empty configuration files
  3. Typos or other errors in the PHP pool
  4. Insufficient memory
See this forum thread for some general suggestions on next steps: 

Database corruption

This is one of the trickier ones to diagnose, as it can manifest in a number of different ways, and produce a number of different error messages, depending on where the data corruption exists. Some examples of the types of errors you might find in the logs include: 
  • Unknown record property "xxx" on "yyy"
  • Invalid culture supplied: "xxx" while reading response header from upstream
  • Couldn't find information object (id:xxxxx)
  • Unable to execute INSERT statement. [wrapped: SQLSTATE[23000]: Integrity constraint violation
  • etc...
Database corruption can happen in a number of ways, but we most commonly see it when a long-running process - especially something like a CSV import - is interrupted, or times out and fails before it is completed. It could also be a large deletion or move operation that times out, for example. In these cases, the system is still in the process of updating the database when the process is aborted, meaning that some rows in the database are left partially incomplete, and/or have applied the wrong information to the wrong database columns.
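
The message patterns quoted in the sections above can be collected into a small triage helper. This is only a sketch: the substrings are taken from the examples in this message, the category labels and the function name triage are invented for illustration, and a match is a hint about where to look first, not a diagnosis.

```python
# Substrings quoted in this thread, mapped to the section that discusses them.
# The category names are illustrative labels, not AtoM terminology.
KNOWN_PATTERNS = [
    ("No Gearman worker available", "job scheduler"),
    ("Elastica\\Exception", "Elasticsearch"),
    ("Elasticsearch error", "Elasticsearch"),
    ('"i18n" object does not exist', "i18n / configuration"),
    ("Unknown record property", "possible database corruption"),
    ("Invalid culture supplied", "possible database corruption"),
    ("Couldn't find information object", "possible database corruption"),
    ("Integrity constraint violation", "possible database corruption"),
]


def triage(log_line):
    """Return the first matching category for a log line, or 'unclassified'."""
    for needle, category in KNOWN_PATTERNS:
        if needle in log_line:
            return category
    return "unclassified"
```

For example, triage('No Gearman worker available that can handle the job.') returns "job scheduler", while a line that matches nothing comes back as "unclassified" and needs to be read by hand.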

In general, we strongly recommend that you make a database backup before proceeding, as resolving data corruption involves directly manipulating the MySQL database, and we want to make sure we can get back to where we started if anything goes wrong. See: 
After you have created a backup, our troubleshooting documentation has a suggested query that users can run to check for common forms of data corruption. It also includes suggestions for how to resolve the issue, depending on what you find using the query. See: 
Now, if you do not want to do the investigation yourself AND YOU HAVE MADE A BACKUP and wish to proceed at your own risk, then we have also previously shared an experimental script that automates the check for data corruption and will attempt to resolve any issues it encounters. We hope to include this as a command-line task in a future version of AtoM, but for now, it is just a script - however, others have used it successfully. For more information, see this thread: 

Problems following an upgrade

If you have just upgraded from a previous AtoM version to a new one and are now experiencing a number of 500 errors, the most common causes we have seen for this are:
  1. Did not read the upgrade instructions at all and attempted to just "upgrade in place" by loading a new AtoM tarball version on top of the old one
  2. Forgot to upgrade some dependencies (generally related to 1)
  3. Forgot to drop and recreate the database BEFORE loading the previous version's database dump
  4. Forgot to run the upgrade task AFTER loading the previous database dump

For issues 1 and 2

Our upgrading documentation is found here: 
In reading it, you will see that the steps are basically to first follow the installation instructions for AtoM for the new version, and install a whole new version alongside the old version. Then you copy over your data and run some tasks, after which you can delete/remove the old version. Many people will ignore this and simply attempt to drop a new version of AtoM on top of the old one, without reading the upgrade documentation or even checking to see what dependencies might have changed between versions. 

There are two types of AtoM releases - major and minor: 
  • Major: 2.5, 2.6, 2.7, 2.8, 2.9, etc
  • Minor: 2.8.1, 2.8.2, 2.8.3, etc
Minor releases generally only include bug fixes and security patches. More importantly, we try not to include any changes that require updates to the database schema in a minor version, such as new features (which might require new database tables or columns). We also try not to change the larger dependency versions (such as PHP, MySQL, Elasticsearch, Bootstrap, etc) in minor releases. 

Major releases are the ones in which we will include any new features (and therefore may also include changes to the underlying database schema), as well as where we will implement any major dependency changes - newer versions of Ubuntu, PHP, MySQL, Elasticsearch, or similar - or even full replacements for these (as we are considering for Elasticsearch in the 2.9 release).

There are also two different ways to install AtoM, described in our installation documentation
  • Option 1 in the docs is to install from the downloadable tarball we prepare and maintain on the AtoM website (here)
  • Option 2 is to install using our GitHub code repository (here)
Now, with that understanding: 

If you have installed following Option 2, then it is generally safe to "upgrade in place" for MINOR release versions, since there are generally no database schema changes included, nor any dependency changes. You would simply run a git pull --rebase to pull in the latest code, run the general maintenance tasks (such as build nested set, clear cache, restart PHP-FPM, and repopulate the search index), and be ready to go. This is the ONLY scenario in which we recommend "upgrading in place."
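
The rule above can be written down as a small decision helper. This is a sketch under stated assumptions: the function names are hypothetical, versions are plain dotted strings such as "2.8.1", and a release with no third component (or a third component of 0) is treated as the major release, matching the examples given earlier.

```python
def parse_version(version):
    """Split an AtoM version such as '2.8.1' into a tuple of integers."""
    parts = tuple(int(piece) for piece in version.strip().split("."))
    # Treat '2.8' as '2.8.0' so that both forms compare equal.
    return parts + (0,) * (3 - len(parts))


def release_type(version):
    """Per the examples above: 2.7 and 2.8 are major releases, 2.8.1 and 2.8.2 are minor."""
    return "major" if parse_version(version)[2] == 0 else "minor"


def can_upgrade_in_place(installed_from_git, current, target):
    """True only for a git-based install moving to a newer minor release of the same major line."""
    cur, new = parse_version(current), parse_version(target)
    same_line = cur[:2] == new[:2]
    return installed_from_git and same_line and new > cur
```

So can_upgrade_in_place(True, "2.8.0", "2.8.1") is True, while a tarball install, or a move from 2.7.3 to 2.8.0, returns False and should follow the full upgrade documentation.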

As such, for issues 1 and 2, affecting major version upgrade attempts: really the only fix is to read the docs and restart the upgrade process and do it as recommended this time. 

For issues 3 and 4

A good first check is to compare your database schema version against what is expected for the release. If you navigate to Admin > Settings and check the full version number, or run php symfony tools:get-version from the command line (docs for this task here), you will see that a full AtoM version number includes 2 numbers, like these examples: 
  • 2.7.3 - 192
  • 2.8.0 - 193
The first number is the AtoM version. The second number is the database schema version. Since 2.6, we have started always including the expected schema version for the current release in the Release notes we maintain on the wiki for formal Release announcements. See: 
If you check your full version number, and the database schema number does not match the one in the release notes, then this is likely the source of the issue - the current database schema does not match what is expected, which can cause many of the errors similar to what you would see for data corruption - SQL errors, invalid entries in fields that shouldn't support them, etc. The fix suggestion is essentially to go back to the upgrade process and repeat the steps from wherever a step might have been skipped - i.e. drop and re-create the database, reload your sqldump, run the upgrade task, and restart services / repopulate the search index. 
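
The comparison described above can be sketched in a few lines of Python. The lookup table holds only the two version/schema pairs quoted in this message; for any other release the expected schema number has to come from the release notes. The function names are invented for this example.

```python
import re

# Only the two pairs quoted in this thread; consult the release notes for others.
EXPECTED_SCHEMA = {"2.7.3": 192, "2.8.0": 193}


def parse_full_version(text):
    """Parse '2.7.3 - 192' (spaces optional) into ('2.7.3', 192)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)+)\s*-\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"Unrecognised version string: {text!r}")
    return match.group(1), int(match.group(2))


def schema_matches(text, expected=EXPECTED_SCHEMA):
    """Return True/False when the release is in the table, or None when it is not."""
    release, schema = parse_full_version(text)
    if release not in expected:
        return None
    return expected[release] == schema
```

A result of False (for example for "2.8.0 - 192") suggests that the upgrade task was not run after the database dump was loaded; None simply means the table above does not cover that release.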

See also this previous user forum thread: 

What about 504 errors, execution time error messages, etc?

For 504-timeout errors, and error messages like: 
  • "Error Connection Timed Out," 
  • Fatal error: Maximum execution time of 60 seconds exceeded in ...
  • Fatal error: Allowed memory size of xxxxxxxx bytes exhausted (tried to allocate yyyyyyy bytes) in ...
  • (Blank white screen in the UI that never resolves)
...these generally happen when a long-running process is terminated before it could complete. 

In some cases, such as digital object uploads done via the web browser to AtoM's user interface, this may be a limitation of the browser itself. Most browsers have a built-in timeout limit of about 1 minute, after which they terminate a connection so that the browser does not use up all the client's system resources trying to complete the request. In such cases, the best answer is to use the command-line for these long-running steps. See for example: 
The other option, if you are committed to using the user interface (or perhaps you don't have access to the command-line), is to simply break up what you are trying to import into smaller chunks, such as fewer digital objects at a time, and/or smaller CSV files (perhaps 1 per series, etc). Be sure to review the CSV import documentation carefully to understand how parent-child relationships work, and how to properly break up your CSV files so they will import to the right place. 
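
As a purely mechanical illustration of breaking a CSV into smaller files, the sketch below splits CSV text into fixed-size chunks and repeats the header row in each one. It is a naive split by row count: it knows nothing about parent-child relationships, so a child row can end up in a different chunk than its parent. Splitting per series, as suggested above, needs the hierarchy to be taken into account. The function name and the sample column names are only for illustration.

```python
import csv
import io


def split_csv(text, rows_per_chunk):
    """Split CSV text into chunks that each repeat the header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)
    chunks = []
    for start in range(0, len(rows), rows_per_chunk):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[start:start + rows_per_chunk])
        chunks.append(buffer.getvalue())
    return chunks


if __name__ == "__main__":
    sample = "legacyId,parentId,title\n1,,Fonds\n2,1,Series A\n3,1,Series B\n"
    for number, chunk in enumerate(split_csv(sample, 2), start=1):
        print(f"--- chunk {number} ---")
        print(chunk, end="")
```

Using the csv module rather than splitting on line breaks keeps quoted fields that contain newlines intact.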

If the timeouts are for back-end processes, and/or no amount of breaking up the content will help, then perhaps the issue is the PHP execution limits. 

During installation we create a PHP pool, and set a number of configuration options for that PHP pool, which set limits on things like maximum file size, maximum memory, etc. See: 
You can see in the block included in the link above that the PHP pool sets values like: 

php_admin_value[memory_limit] = 512M
php_admin_value[max_execution_time] = 120
php_admin_value[post_max_size] = 72M
php_admin_value[upload_max_filesize] = 64M
php_admin_value[max_file_uploads] = 10

These can also be set globally in your server, via a php.ini file. You can adjust the values in both places as needed - just be careful increasing them too much! See: 
Be sure to restart PHP-FPM after making any configuration changes. 
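
When adjusting these values it is easy to raise one limit and forget a related one. The following sketch parses the php_admin_value lines shown above and checks one well-known PHP constraint: post_max_size needs to be larger than upload_max_filesize. The function names are invented for this example, and the parser only understands the simple "php_admin_value[name] = value" form quoted above.

```python
import re

# The values quoted above from the PHP pool configuration.
POOL_SNIPPET = """\
php_admin_value[memory_limit] = 512M
php_admin_value[max_execution_time] = 120
php_admin_value[post_max_size] = 72M
php_admin_value[upload_max_filesize] = 64M
php_admin_value[max_file_uploads] = 10
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_pool_values(text):
    """Collect php_admin_value[...] settings into a dictionary of strings."""
    pattern = re.compile(r"^\s*php_admin_value\[(\w+)\]\s*=\s*(\S+)\s*$", re.MULTILINE)
    return dict(pattern.findall(text))


def to_bytes(value):
    """Convert PHP shorthand such as '64M' to bytes; plain numbers pass through."""
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


def upload_fits_in_post(settings):
    """Check that post_max_size is larger than upload_max_filesize."""
    return to_bytes(settings["post_max_size"]) > to_bytes(settings["upload_max_filesize"])


if __name__ == "__main__":
    settings = parse_pool_values(POOL_SNIPPET)
    print(settings)
    print("post_max_size > upload_max_filesize:", upload_fits_in_post(settings))
```

With the values quoted above the check passes (72M is larger than 64M); if you raise upload_max_filesize, raise post_max_size along with it.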

Other general troubleshooting tips and investigation suggestions

We have tried to include most of what is above in a single documentation page, which is a good starting point for any system problems. See: 
If you run into a problem and it does not seem related to any of the examples given above, from this documentation page, the suggestions are: 
And if none of that helps clarify the issue, then: 
  • Get your full AtoM version number
  • Answer the questions about your installation at the top of this message
  • Include any information found by Debug mode, the search:status task for checking your search index, or in the webserver error logs
  • Tell us what you have tried already, and what effect (if any) those attempts had on the problem
  • Post all of that here!
With all that information, we can usually provide some good suggestions on what to check or try next. 

I hope this helps! Please let us know what you find, and how it goes. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Maria Papanikolaou

Mar 3, 2024, 1:21:39 PM
to 'Dan Gillean' via AtoM Users

Dear Dan, 

Thank you for your reply and I apologise for getting back to you so late. 

Regarding the above problem, which happened for the fourth time last week, when I said "we now know how to deal with it", I meant that we know how to restore AtoM with a command, and the system comes back after 5 minutes. 

But we do not know what is causing the problem. 

We upgraded our system to version 2.7.3-192 last December, and since then we have had 4 crashes. This was not the case with the previous version (2.6), which was really stable. 

I will try to follow your guidance as suggested in the email below, in particular the steps you advised on "encountering 500 errors" and "problems after an upgrade". 

I assume that the root of the problem lies in the upgrade process.

With kind regards, 

Maria 

Dan Gillean

Mar 4, 2024, 8:14:13 AM
to ica-ato...@googlegroups.com
Hi Maria, 

There is nothing significant that has changed about the upgrade process in AtoM itself. I would not necessarily assume that AtoM itself is broken merely because you are dealing with repeated errors. We still need more information. 

I have outlined a number of processes in my last email - which are not only intended to restore AtoM to working functionality, but ALSO to help you determine the underlying cause of an issue. Similarly, if it is in fact an AtoM bug causing these issues, then our team needs to be able to reproduce the exact conditions described to be able to identify the problem and resolve it. However, so far you have not shared the information needed to be able to determine underlying causes and next steps. 

So: we know your full version number (2.7.3 - 192) - thanks! From the schema version I can at least determine that the upgrade task was run successfully... but there's still a lot we don't know. 

For example: 
  • What message do you see in the error logs when this error occurs?
  • Have you checked the logs every time? Can you confirm that it is the SAME error message every time you've encountered a 500 error?
  • What is this command that you use to "restore" AtoM? That alone might tell us a lot about the underlying issue
  • What is happening in the system when the error occurs? Is it always the same thing or different things? e.g. trying to create a record; performing a search; running an import, etc...
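
To check whether it really is the same message every time, the log entries can be grouped after stripping the parts that change from one entry to the next. The sketch below assumes Nginx-style error log lines that begin with a timestamp, level, process id and connection number; the regular expression and the function name are assumptions for this example. Real entries usually carry trailing client and request details as well, so further normalisation may be needed.

```python
import re
from collections import Counter

# Assumption: lines start with an Nginx-style prefix such as
# "2024/01/29 04:30:00 [error] 1234#1234: *56 "; adjust for other log formats.
PREFIX = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \[\w+\] \d+#\d+: (?:\*\d+ )?")


def message_counts(lines):
    """Count log messages after stripping timestamp, process and connection ids."""
    return Counter(PREFIX.sub("", line).strip() for line in lines if line.strip())


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = [
        "2024/01/29 04:30:00 [error] 1234#1234: *56 No Gearman worker available that can handle the job.",
        "2024/02/10 09:12:44 [error] 987#987: *3 No Gearman worker available that can handle the job.",
    ]
    for message, count in message_counts(sample).most_common():
        print(count, message)
```

A single message with a high count points to one recurring cause; several unrelated messages suggest more than one problem.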
And it would be very helpful if you could answer the other initial questions included in the "Encountering 500 errors" part of my last message: 

  • Did you follow exactly the recommended installation instructions for your version? If not, what changes have you made?
  • Are you using a custom theme plugin, and/or does your installation have any other local code customizations?
    • If not, are you using the older default Bootstrap 2 Dominion theme, or the new Bootstrap 5 Dominion theme included in 2.7 and later?
  • Does your AtoM installation meet the recommended minimum hardware requirements listed here? How much memory does your installation have?
  • Are there any important previous actions that might have caused this worth mentioning? For example, did you upgrade recently? Did someone attempt an import? When and why, to the best of your knowledge, did this error start happening? What was the triggering event before the error message?
  • What, if anything, have you tried so far to investigate and/or resolve the issue? And what happened?

Hopefully once we understand what exact error is occurring, then we can determine WHY it is occurring and how to resolve it going forward. Additionally, depending on what you find in the webserver error logs, my previous email may give you and your team some ideas on next steps you can try yourself. Good luck, and let us know what you find!

Cheers, 

Dan Gillean, MAS, MLIS

Roger Rutishauser

Apr 5, 2024, 8:28:55 AM
to AtoM Users
Hi,

Great compilation, Dan!
In Maria Papanikolaou's case, a different cause led to the 500 errors. For the sake of completeness, I'll report it here.
The error always appeared after the system was rebooted. It seems that when the MySQL instance was started by systemd, no IP address had yet been assigned by DHCP. 
I had to change the file /usr/lib/systemd/system/mysql.service like this:

Before:
[Unit]
Description=MySQL Community Server
After=network.target
[... etc...]

After:
[Unit]
Description=MySQL Community Server
After=network.target network-online.target
[... etc...]

Now the MySQL service waits until the network is online before it starts.
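
To verify this kind of change on another server, a small check like the following could be run against the text of the unit file. It is only a sketch: the function names are invented, and it looks at After= lines alone rather than at everything systemd takes into account when ordering services.

```python
def after_targets(unit_text):
    """Return every unit named on After= lines of a systemd unit file."""
    targets = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if line.startswith("After="):
            # After= accepts a space-separated list and may appear more than once.
            targets.extend(line[len("After="):].split())
    return targets


def waits_for_network_online(unit_text):
    """True when network-online.target is listed on an After= line."""
    return "network-online.target" in after_targets(unit_text)


if __name__ == "__main__":
    before = "[Unit]\nDescription=MySQL Community Server\nAfter=network.target\n"
    after = "[Unit]\nDescription=MySQL Community Server\nAfter=network.target network-online.target\n"
    print("before:", waits_for_network_online(before))
    print("after:", waits_for_network_online(after))
```

The two sample strings reproduce the "Before" and "After" versions of the [Unit] section shown above.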

Cheers, Roger

Dan Gillean

Apr 5, 2024, 11:49:54 AM
to ica-ato...@googlegroups.com
Hi Roger, 

I appreciate you updating the thread with this additional information! I will pass this on to the Maintainers - I don't personally know much about service configuration, but at first glance this looks like a good default that we should consider in our official documentation. 

Cheers, 

Dan Gillean, MAS, MLIS
