Production system is down with 500 Internal Server Error


John Coquitlam

Jul 30, 2021, 12:33:39 PM
to AtoM Users
Hi everyone,
Our production system is down, and the error log contains pages of this message: The "i18n" object does not exist in the current context.

We have tried rebooting our Ubuntu server.
We also tried restarting FPM and Apache and running symfony cc, but we still cannot access our site.

Any help appreciated.
Thanks,
John

Dan Gillean

Jul 30, 2021, 6:51:09 PM
to ICA-AtoM Users
Hi John, 

If you do a search in the forum for this error message, you'll find some older threads you can peruse for ideas, but I'll try to summarize some of the previous solutions I've seen here (this issue can be caused by a number of things, it seems). 

I should note first that we don't test or develop AtoM with Apache, so if the issue has to do with web server configuration, we will be limited in the suggestions we can offer. It would help to know what version of AtoM you have installed (the full version number listed in Admin > Settings, or via this command-line task, if the system will let you get it), as well as if you've made any other changes to the recommended requirements and installation instructions for your version. 

With that in mind, here's a roundup of general suggestions:

First, have you upgraded recently? If yes, did you ensure you followed all steps of the upgrade process, including dropping and recreating the database before loading your backup and running the upgrade task? Often if this step is missed (or the upgrade task is not run), there can be a mismatch between the application code and the database schema, leading to unexpected outcomes. 

Second, sometimes this error is caused by a mismatch in filesystem permissions in the AtoM directory. AtoM expects everything below the root installation directory to be owned by the www-data user. You can reapply the expected permissions with the following command: 
  • sudo chown -R www-data:www-data /usr/share/nginx/atom
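Before changing ownership recursively, it can help to see which files are actually affected. A minimal sketch, assuming the install path and the www-data user from the command above; list_not_owned is a hypothetical helper name, not an AtoM task:

```shell
#!/bin/sh
# Hypothetical helper: print every path under DIR that is not owned by OWNER.
# OWNER may be a user name or a numeric user ID.
# Usage: list_not_owned DIR OWNER
list_not_owned() {
    dir=$1
    owner=$2
    # find's "! -user" test matches entries whose owner differs from OWNER.
    find "$dir" ! -user "$owner" -print
}

# Example (assumed default path; may need sudo to read every directory):
#   list_not_owned /usr/share/nginx/atom www-data
```

An empty result suggests ownership is already as expected, in which case the permissions are probably not the cause.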
Next, how much memory does your system have available? I've seen this error be resolved before by increasing the available system memory. You might also want to check things such as the PHP execution limits, and the PHP pool parameters set up during installation. See: 
More importantly, this error has sometimes been caused by a typo or other error in the PHP pool configuration file. Assuming you're running 2.6.x, be sure to review /etc/php/7.2/fpm/pool.d/atom.conf against the sample pool configuration in the link above. You can also check the PHP-FPM configuration for syntax errors with: 
  • sudo php-fpm7.2 --test
Additionally, we are investigating some reports around memory leaks with the AtoM job scheduler, so in case it is holding onto the available memory, you can restart it with: 
  • sudo systemctl restart atom-worker
  • sudo systemctl reset-failed atom-worker
We also have some general suggestions on monitoring active processes with htop in our Troubleshooting documentation, here: 
Have you added a new language via the Language menu recently without reindexing, or imported content with a different culture value than the installation culture without first adding the language to the language menu and reindexing?

Running the search reindex task is worth trying, as it could resolve the first of these cases. If you think you've run into the second case and are able to access the language menu, add the relevant culture and reindex. To reindex: 
Finally, you can double-check the contents of the apps/qubit/config/settings.yml configuration file. It *should* look the same as what's found in this template file, other than any changes in default culture and timezone you might have made during the installation process. In one forum thread, a user reported that a timeout in the web installer led to this file ending up as a zero-length file, which caused the same error. He resolved it by removing the zero-length file, copying the template file (without the .tmpl extension), making any needed local changes, then restarting all services. In the upcoming 2.7 version of AtoM, we have finally fully removed the web installer and replaced it with a command-line installer to avoid these types of issues. 
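The zero-length-file recovery described above can be sketched roughly as follows. It assumes the .tmpl file sits beside settings.yml, as the thread implies; restore_settings_if_empty is a hypothetical helper, not part of AtoM:

```shell
#!/bin/sh
# Hypothetical helper: if SETTINGS is missing or zero-length, replace it with
# a copy of TEMPLATE. A non-empty SETTINGS file is left untouched.
# Usage: restore_settings_if_empty SETTINGS TEMPLATE
restore_settings_if_empty() {
    settings=$1
    template=$2
    # "-s" is true only when the file exists and is larger than zero bytes.
    if [ -s "$settings" ]; then
        echo "unchanged: $settings is not empty"
        return 0
    fi
    if [ ! -s "$template" ]; then
        echo "error: template $template is missing or empty" >&2
        return 1
    fi
    rm -f "$settings"
    cp "$template" "$settings"
    echo "restored: $settings copied from $template"
}

# Example, run from the AtoM root directory (paths assumed from the thread):
#   restore_settings_if_empty apps/qubit/config/settings.yml \
#       apps/qubit/config/settings.yml.tmpl
```

Any local changes to default culture or timezone would still need to be reapplied by hand, followed by a restart of all services.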

Hopefully one of these suggestions might point you in the right direction, so you're able to troubleshoot further! 

If you're still stuck, please let us know more information about your installation (application version, MySQL version, PHP version, etc), what you've tried from the list of suggestions above and what you found, and we can go from there. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



John Coquitlam

Aug 3, 2021, 5:22:16 PM
to AtoM Users
Thanks Dan,
We ended up restoring from a snapshot.
We have backed up the database and will use that for further investigation.
Will update everyone on our findings.
Thanks,
John

John Coquitlam

Aug 6, 2021, 1:37:30 PM
to AtoM Users
Hi everyone,
I moved the backup database and the upload and downloads directories to our soon-to-be production system running 2.6.1, and things appear to be running normally.
However, when I tried to log in, it gave me the error 'Sorry, you do not have permission to access that page'.
I even tried using the command line to create another admin user but got the same error.
There is no error in the MySQL error log.
Searching this forum, the only possible cause I found was the authentication method.
When I checked default-auth-orride.cnf, it contains 'default-authtication-plugin = mysql_native_password'.
Therefore, it should be OK.
Can anyone think of anything?

Thanks,
John

Dan Gillean

Aug 6, 2021, 3:37:05 PM
to ICA-AtoM Users
Hi John, 

Very strange. What version of MySQL, PHP, and AtoM are we working with here? I'm assuming 2.6.x and MySQL 8.0 at least, but it's good to check. 

A couple of ideas, in case the issue is MySQL and authentication related: 

First, I think you can also declare the preferred authentication method in the configuration file at /etc/mysql/conf.d/mysqld.cnf we recommend creating during installation. Try adding the following line, then restarting MySQL: 
  •  default-authentication-plugin=mysql_native_password
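Because a misspelled option name is easy to miss, a quick check of the file contents may help. This is only a sketch: has_native_auth is a hypothetical helper, and the pattern accepts hyphens or underscores in the option name, since MySQL option files allow both:

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE sets the default authentication plugin
# to mysql_native_password on an uncommented line.
# Usage: has_native_auth FILE
has_native_auth() {
    grep -Eq '^[[:space:]]*default[-_]authentication[-_]plugin[[:space:]]*=[[:space:]]*mysql_native_password[[:space:]]*$' "$1"
}

# Example (path taken from the message above):
#   has_native_auth /etc/mysql/conf.d/mysqld.cnf && echo "directive found"
```

The check only looks at spelling. The line also needs to sit under the [mysqld] section header for the server to read it.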
Additionally, it may be that the existing database users do not have their passwords stored via the old method, which could be causing the error. I found this query on StackExchange, which you could potentially modify to alter your existing MySQL users as needed. Keep in mind I haven't tested it myself, so proceed at your own risk, and consider making a backup first! 
  • ALTER USER 'root'@'localhost' IDENTIFIED BY 'password' PASSWORD EXPIRE NEVER;
  • ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '{NewPassword}';
Let us know if that helps! If not, I'll ask our team for further suggestions. 



John Coquitlam

Aug 6, 2021, 5:57:01 PM
to AtoM Users
Hi Dan,
I updated mysqld.cnf and restarted MySQL but still got the same error.
Our system consists of PHP 7.2, MySQL 8.0.22, and AtoM 2.6.1.
Attached are pictures of the error message and the contents of our updated mysqld.cnf.
Thanks,
John
atom_auth.jpg
mysqldcnf.jpg

Dan Gillean

Aug 9, 2021, 11:28:34 AM
to ICA-AtoM Users
Hi John, 

One of our developers looked at the thread, and doesn't think it's related to the SQL authentication. In that case, I don't think even the permission denied message would load. 

Is this the homepage where you're seeing this message after trying to log in, or a different page? I'm wondering if someone might have changed the default group permissions on either the Admin group, or the Authenticated group... but I also can't think of a setting available in those group permissions that would prevent access to static pages like the homepage, so I'm not sure. Are you able to use the main menu to navigate to any other pages, or do you get the same message for any page?

Another thought: are you able to change to the default Dominion theme? There's a chance that some customization in your custom theme is causing issues...



John Coquitlam

Aug 9, 2021, 12:45:59 PM
to AtoM Users
Hi Dan,
We are seeing this message immediately after entering the login information.
We also noticed that once this message shows up, we cannot go to another page regardless of where we click. Only after we go to '.../user/logout' can we reach the home page and other pages of the site.

Regarding using another theme, I don't think we can change the theme without logging in first.

Thanks,
John

Dan Gillean

Aug 9, 2021, 3:47:16 PM
to ICA-AtoM Users
Hi John, 

Right - sorry, thought I'd check just in case. 

Here are the theories I have after discussing this with our developers:
  • There's some change in your custom theme that has caused this issue, or else other custom code elsewhere causing the issue
  • One of the security.yml files in AtoM has been changed or overwritten - these files are used to set group permissions (here's an example for static pages)
    • Possibly the theme has something that is overwriting the default security.yml files?
  • Something has been changed in the Authenticated or Admin groups that is causing this issue - though since there's not a setting in the UI specifically for setting these, I'm not sure exactly how this proposal would work
That said, it's really hard to say without doing some investigation.

Perhaps more importantly: this is a post-mortem on a system that sounds like it is running fine after reloading an older snapshot. This may suggest that it has nothing to do with the above, though this fact alone doesn't tell us anything. So: what can you tell us about what has happened between the date of the snapshot and the appearance of the error that might have led to the initial problem? 

Additionally, is there any difference between the development environment where you are running these tests and your production environment? Might any differences be involved in this? What happens if you try to load a sqldump of the now-functional production system into this problematic installation - does this resolve the issue? If yes, then it suggests the problem is in the database somewhere, and if it doesn't then at least we can rule that out. 

I can possibly work with our team to figure out a SQL query to change the theme to the default, but I thought that I would first check on these possibilities - and also check in to see if solving a problem that's not currently affecting your production environment is worth the effort to test a theory at this time. In the meantime, let us know what you find! 



John Coquitlam

Aug 9, 2021, 4:36:37 PM
to AtoM Users
Hi Dan,
Let me clarify the situation from the beginning.
Last week our production system (version 2.4) got the 500 server error.
We restored a snapshot from a few days earlier, and it is now running fine.

I took a sqldump of that database and migrated it to our development system (version 2.6.1).
This system was tested and production ready other than the data migration.
If not for this login issue, it could have gone into production.
So right now, this system tested fine with the exception of the login issue.

I hope that clarifies it a bit more.

Thanks,
John

Dan Gillean

Aug 10, 2021, 10:11:04 AM
to ICA-AtoM Users
Hi John, 

Okay, thanks for the clarification and sorry for the confusion - I think I understand better now. 

I'm still not sure as to the best way to proceed in troubleshooting this - our team can't recall seeing this before. One more clarifying question about the 2.4 snapshot you are trying to upgrade and load into the 2.6.1 development site: am I correct in assuming that you created this sqldump AFTER you had restored from backup and your production system was working as expected?

I'm trying to determine if we can rule out any sources of the issue. If the current production database can be loaded fine (i.e. you were trying to load a dump that had the original 500 error), then that should suggest that the pre-recovery snapshot has a setting or some kind of corruption in the database causing the problem. If it also doesn't work, then the issue is either not in the DB but in the deployment - or else some code change is affecting settings that previously worked in 2.4. 

If you wanted to try creating a new sqldump and loading that, it might also be a good way of confirming that this isn't due to accidentally skipping one of the upgrade steps (such as dropping and recreating the database before loading your SQL dump, or running the upgrade task). It would also be useful to check what the database schema version is in your 2.4 production site (the second number shown next to the release number in Admin > Settings), to ensure that it's running the correct schema. Looking here in the stable/2.4.x code, I believe the schema version should be v156 for a 2.4.1 installation.

Additionally - when upgrading, have you performed the following steps for custom themes in the 2.6.x site?
I don't know how likely it is that the custom theme is the source of the login issue, unless your custom theme also contains template overrides (i.e. custom version of page templates in the theme plugin), but it's worth double-checking. 

Another thing you could try to narrow down the issue if you do want to rule out the theme as much as possible - switch your production site temporarily to the base Dominion theme, just long enough to create another sqldump, and then try loading and upgrading that into the development environment. That should help us determine if there is anything about the custom theme causing the issue. Additionally, you could double-check that there are no unexpected customizations to the Authenticated, and/or Administrator group permissions on the production site before creating your SQL dump, so we can also determine if the behavior is related to group permission customizations. If you do have customizations in the production site that you don't want to change, at least we can record what they are, to see if there's a change between 2.4 and 2.6.x that might have caused the resulting behavior of your customizations to change. 

Anyway, hope this helps advance your troubleshooting. Let us know what you find! 

