ElasticSearch reindexing is breaking

Brandon Uhlman

Jan 25, 2021, 3:37:40 PM
to ica-ato...@googlegroups.com
Hi, all.

Sorry in advance for this being so long.

For context, our regular workflow is that we complete regular data maintenance in a development AtoM instance, and periodically refresh our production instance with a database snapshot from the development instance. When I completed such a refresh last night, the ElasticSearch refresh (php symfony search:populate) failed. After recovering production to a backup of its previous state and reindexing again (which worked fine), I then tried to reindex the development instance in-place -- so I could rule out a problem with our snapshot process, even though it has worked flawlessly for us for more than a year.

The development reindexing failed with the same message as prod:

root@[hostname-redacted]:/var/www/atom-2.6.1# php symfony search:populate
Defining mapping QubitAip...

  name cannot be empty string

...and if one excludes AIP from the types to be reindexed, it will then try to index terms, get 500 entries through (that's batch_size in $ATOM_HOME/config/search.yml), and fail with a bunch of messages like this:

  index: /atom/QubitTerm/703727 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/703739 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/703852 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/704048 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/704223 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/693745 caused type[QubitTerm] missing [index: atom]  
  index: /atom/QubitTerm/708346 caused type[QubitTerm] missing [index: atom]  
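
For anyone triaging a long run of messages like these, a small script can group them by type and reason. This is only a sketch: the regular expression assumes every failure line has exactly the shape quoted above, and the name summarize_failures is made up for illustration, not part of AtoM or ElasticSearch.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted above, e.g.
#   index: /atom/QubitTerm/703727 caused type[QubitTerm] missing [index: atom]
# (assumption: all failure lines follow this layout)
FAILURE_RE = re.compile(
    r"index: /(?P<index>[^/\s]+)/(?P<type>[^/\s]+)/(?P<id>\d+) caused (?P<reason>.+?)\s*$"
)


def summarize_failures(lines):
    """Count failed documents per (type, reason) and collect their ids."""
    counts = Counter()
    ids = {}
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue  # not a failure line, skip it
        key = (match["type"], match["reason"])
        counts[key] += 1
        ids.setdefault(key, []).append(int(match["id"]))
    return counts, ids
```

If every failure shares a single reason, as it does here, that points to something wrong with the index or mapping as a whole rather than with individual records.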
                                                                               
This is similar to the problem Joel Marchand reported in https://groups.google.com/g/ica-atom-users/c/mhfPJ1BgzE8/m/AXH3ZnbcAQAJ. Like Joel, we are running Ubuntu 18.04, PHP 7.2, ElasticSearch 5.6.16 and MySQL 8.0.22 (on a separate server), but current with AtoM itself, running 2.6.1.

For a bit of additional context, when the initial full reindex fails, the corresponding detailed stack trace in ElasticSearch's logs looks like this, followed by the successful log entry from a copy of our database taken before we started experiencing the issue.

[2021-01-25T14:59:26,205][INFO ][o.e.c.m.MetaDataDeleteIndexService] [7p0NKF7] [atom/H32bdnvqRymYniTVjLwg4Q] deleting index
[2021-01-25T14:59:26,289][INFO ][o.e.c.m.MetaDataCreateIndexService] [7p0NKF7] [atom] creating index, cause [api], templates [], shards [4]/[1], mappings []
[2021-01-25T14:59:26,475][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [7p0NKF7] failed to put mappings on indices [[[atom/i_6nvJSuS3WqrMYBARXQFw]]], type [QubitAip]
java.lang.IllegalArgumentException: name cannot be empty string
        at org.elasticsearch.index.mapper.ObjectMapper.<init>(ObjectMapper.java:326) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.createMapper(ObjectMapper.java:160) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:152) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:95) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:143) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:95) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:143) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:95) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:143) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.DocumentMapper$Builder.<init>(DocumentMapper.java:69) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:111) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:91) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:644) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:264) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.ClusterService.executeTasks(ClusterService.java:634) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.ClusterService.calculateTaskOutputs(ClusterService.java:612) ~[elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:571) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.6.16.jar:5.6.16]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.6.16.jar:5.6.16]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
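
The exception is raised while ElasticSearch builds nested object mappers, which suggests that some object field in the submitted mapping has an empty string as its name. One way to look for that is to walk the mapping as a plain dictionary and report where an empty key appears. The sketch below assumes the mapping is available as a parsed JSON dict using the usual "properties" nesting; find_empty_names and the sample layout in the usage note are hypothetical, not AtoM's actual mapping.

```python
def find_empty_names(mapping, path=""):
    """Return the parent path of every field under "properties" whose name is ""."""
    bad = []
    properties = mapping.get("properties", {})
    for name, definition in properties.items():
        here = f"{path}.{name}" if path else name
        if name == "":
            # Record where the empty-named field lives.
            bad.append(path or "<root>")
        if isinstance(definition, dict):
            # Descend into nested object fields.
            bad.extend(find_empty_names(definition, here))
    return bad
```

For example, a mapping with one object per language code under an "i18n" field, where one of the codes is blank, would be reported as ["i18n"].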

I've tried all the suggested troubleshooting steps in Joel's thread.

Any other suggestions?

Brandon



Dan Gillean

Jan 26, 2021, 9:50:34 AM
to ICA-AtoM Users
Hi Brandon, 

If you followed the steps in Joel's thread exactly then you may have implemented some outdated SQL modes for 2.6 that could be causing the issues you're seeing. 

Essentially, we've found that the AIP table in particular (which is used to store technical metadata during DIP uploads from Archivematica) seems to throw these kinds of errors when STRICT_TRANS_TABLES is enabled as a SQL mode.  See the bottom half of this section, which specifies what sqlmodes should be used in 2.6.x: 
If you want to check or change the SQL modes of an installation, the following page might help:
If you're new to using the MySQL command prompt, we have basic instructions on how to access it, and how to find out your MySQL credentials if you don't know them, here:
Remember to restart MySQL afterwards. I also recommend that you run a few other maintenance tasks before reindexing, just to make sure everything is in a good state. I'd suggest that you clear the application cache, restart PHP-FPM, and rebuild the nested set.

Let us know how that goes and if it helps. If not, I will check in with our team and see if they have further suggestions. In the meantime, I'd say you could: 
Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Brandon Uhlman

Jan 26, 2021, 10:49:43 AM
to ICA-AtoM Users
Hi, Dan.

Thanks for your excellent suggestions.

We were/are not running with STRICT_TRANS_TABLES, and I had already completed the remaining troubleshooting steps listed.

~B



Brandon Uhlman

Jan 26, 2021, 12:51:55 PM
to ICA-AtoM Users
For anyone else who is running into this problem, I can report the cause (and the fix), at least in our instance.

Under Admin => Settings => I18n language settings, someone added a blank language by choosing no option from the dropdown field, and then clicking 'Add'. Removing the blank language entry was all that was required to allow reindexing to proceed again.


A future release might want to patch the UI so the blank language option can't be selected. But in the meantime, at least the issue is fixed.
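
Until such a patch exists, a guard of the kind suggested above could be run against the list of configured language codes before reindexing. This is a minimal sketch, not AtoM code: blank_language_entries is a hypothetical helper, and how the list of codes is obtained (for example, copied from the I18n languages settings page) is outside the sketch.

```python
def blank_language_entries(codes):
    """Return the positions of entries that are None, empty, or whitespace-only."""
    return [
        position
        for position, code in enumerate(codes)
        if code is None or str(code).strip() == ""
    ]
```

An empty result means no blank language was found; any positions returned point at entries worth removing in the settings page before running search:populate again.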

Thanks again, Dan, for the assistance.

Brandon


Dan Gillean

Jan 26, 2021, 5:26:17 PM
to ICA-AtoM Users
Hi Brandon, 

Thank you for posting an update to the thread about what you discovered and how you resolved the issue!

I've filed a bug ticket for the issue you describe here, so we can track it and hopefully address it in an upcoming release: 
Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

Joel Marchand

Feb 10, 2021, 1:24:55 PM
to AtoM Users
Hi Brandon and Dan,

YES!! You have found the solution to all my troubles with AtoM 2.5/CentOS 7 and AtoM 2.6/Ubuntu 16.04, after more than six months and dozens of hours of checks and retries.

In fact, the problem is always the same, and the solution is what you said:

"Under Admin => Settings => I18n language settings, someone added a blank language by choosing no option from the dropdown field, and then clicking 'Add'. Removing the blank language entry was all that was required to allow reindexing to proceed again."

Many many congratulations !!
 
  Joel Marchand