Indexing does not finish

ism...@gmail.com

Jul 3, 2023, 2:15:25 AM
to AtoM Users
Good morning, we have launched a full reindex of our AtoM instance using the command:

php symfony search:populate

The server has been running this command for several days and it doesn't look like it's going to finish any time soon.

We are not sure if it is finishing very large tasks or if it is blocked.

I am attaching a screenshot of the current processes (atom.png).
I would like to know what I can do to speed up the processes and if it is indeed working, why is it taking so long?

Thanks and regards.

José Raddaoui

Jul 3, 2023, 12:30:11 PM
to AtoM Users
Hi,

It depends on your dataset, AtoM version and resources. We have seen indexing times of multiple days in datasets of around a million records, especially before AtoM 2.6.x, where we added major improvements to that process. What AtoM version are you running and how big is your dataset? If it's below 2.6.x, upgrading will make a noticeable difference.

Best,
Radda.

ism...@gmail.com

Jul 4, 2023, 2:48:12 AM
to AtoM Users
Good morning, the AtoM version is 2.6.1 and the volume of digital files is 721 GB. Regards.

José Raddaoui

Jul 4, 2023, 12:07:02 PM
to AtoM Users
Hi,

For the indexing process, it's more about the number of records in the DB and their relations than the volume of digital files. If you are running that AtoM version, all I can think of is MySQL not returning data fast enough, or Elasticsearch not being able to index fast enough. If I recall correctly, Elasticsearch works pretty slowly under heavy memory load (especially if working with a swap partition), which seems to be the case from your screenshot, so increasing the total memory available in that system may help with the performance of the indexing.
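
One rough way to confirm whether the system is actually relying on swap is to compare `SwapTotal` and `SwapFree` in `/proc/meminfo`. The helper below is only a sketch for a Linux host; the function name `swap_used_kb` is made up for illustration and is not part of AtoM or Elasticsearch.

```shell
# Print the amount of swap currently in use, in kB.
# Reads /proc/meminfo-style text on stdin (hypothetical helper name).
swap_used_kb() {
  awk '/^SwapTotal:/ {total=$2} /^SwapFree:/ {free=$2} END {print total - free}'
}

# Usage sketch on a Linux host:
#   swap_used_kb < /proc/meminfo
#   free -h        # human-readable summary of RAM and swap
```

A value that stays well above zero while the indexing task runs would be consistent with the memory pressure described above.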

To check that everything is still going, you could use the search status task to see the number of records currently indexed. By default AtoM indexes in batches of 500, so it may take a while to see an update in the counts:

php symfony search:status
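
To make a stalled count easier to spot, the status task could be polled at intervals and the information object line logged with a timestamp. This is only a sketch: the function names and the 600-second default are arbitrary choices, and it assumes the status output contains a line of the form `- Information object: N/M` and that the loop is started from the AtoM root directory.

```shell
# Extract the "indexed/total" figure for information objects
# from search:status output read on stdin.
info_object_progress() {
  grep 'Information object:' | sed 's/.*: *//'
}

# Log the progress figure with a timestamp every $1 seconds (default 600).
# Hypothetical helper; stop it with Ctrl+C.
watch_index_progress() {
  interval="${1:-600}"
  while true; do
    printf '%s %s\n' "$(date '+%F %T')" "$(php symfony search:status | info_object_progress)"
    sleep "$interval"
  done
}
```

Calling `watch_index_progress 300` would print one line every five minutes; if the first number does not move across many lines, the task is probably stuck rather than slow.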

Best,
Radda.

ism...@gmail.com

Jul 12, 2023, 3:04:51 AM
to AtoM Users
Good morning, they have finally doubled our server's RAM to 16 GB. We have gone from a saturated server using 8 GB of 8 GB to using 3 GB of 16 GB.

Running the command:

php symfony search:status

It keeps giving me the same result:

Elasticsearch server information:
 - Version: 5.6.16
 - Host: localhost
 - Port: 9200
 - Index name: atom

Document indexing status:
 - Accession: 1/1
 - Actor: 473/473
 - Aip: 0/0
 - Function object: 12/12
 - Information object: 38375/404522
 - Repository: 82/82
 - Term: 557/557


The count of 38375 indexed information objects has not changed for at least 4 days.
Is it possible that a batch is being processed and the progress is not yet visible?
I should add that the number of processes shown in htop has decreased.
All the best.

ism...@gmail.com

Jul 13, 2023, 8:03:17 AM
to AtoM Users
Good afternoon, what would be the safest way to stop the current processes? I'm worried about interrupting a process in the middle of indexing and causing some problem. Thanks for everything and regards.

José Raddaoui

Jul 13, 2023, 2:15:09 PM
to AtoM Users
Hi,

I'm not sure how you are executing such a long-running task. If you were running the search:populate task in an open terminal you could press Ctrl+C to terminate it, but you may need to kill the process differently. Also, if you are not doing it already, I'd run that task without PHP's memory limit:

php -d memory_limit=-1 symfony search:populate

Best,
Radda.

ism...@gmail.com

Jul 16, 2023, 2:36:41 AM
to AtoM Users
Good morning José, the process is not in the foreground so I cannot stop it with Ctrl+C. If I look at the processes via htop I see about 60 launched by Elasticsearch, and I'm not sure which one I should kill. How can I find out? Regards.

José Raddaoui

Jul 17, 2023, 12:42:27 PM
to AtoM Users
Hi,

You should kill the PHP process that started the indexing (php symfony search:populate). If you are not using Elasticsearch for something else, I'd restart it too. You could also check Elasticsearch logs to see if there is still something going on in there, they are usually in `/var/log/elasticsearch/elasticsearch.log`.
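
As a rough sketch of those steps on a Linux host: the helper below filters a process listing for the indexing command and prints the matching PIDs. The function name `populate_pids` is hypothetical, and the restart line assumes Elasticsearch runs as a systemd service named `elasticsearch`, which may not match your installation.

```shell
# Print the PIDs of processes whose command line contains
# "symfony search:populate". Reads "PID COMMAND..." lines on stdin
# and skips the awk filter itself.
populate_pids() {
  awk '/symfony search:populate/ && !/awk/ {print $1}'
}

# Usage sketch (check the PID before killing anything):
#   ps -e -o pid= -o args= | populate_pids
#   kill <PID>                               # sends SIGTERM to the PHP task
#   sudo systemctl restart elasticsearch     # assumed service name
#   tail -n 50 /var/log/elasticsearch/elasticsearch.log
```

The many Elasticsearch entries shown in htop are most likely threads of a single Java process, so they should not need to be killed one by one; restarting the service covers them.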

Best,
Radda.

ism...@gmail.com

Jul 19, 2023, 12:15:03 PM
to AtoM Users
Good afternoon, I have finally restarted Elasticsearch and launched the command that you suggested: php -d memory_limit=-1 symfony search:populate. With the increased RAM it completed in less than an hour without problems. Many thanks for everything!

José Raddaoui

Jul 20, 2023, 7:46:46 AM
to AtoM Users
Great news, glad it worked.