Disable indexing during import

Alain Sahli

Aug 8, 2012, 8:50:54 AM
to alfresco-bulk-f...@googlegroups.com
Hi all!

I have about 30M nodes to import into Alfresco.

In each import test, the importer process was killed after 6-8 hours, and the database server (MS SQL 2008 R2) ran at an average load of 50%. This was mainly because the transaction indexing job used a lot of resources on the database server.

I had always set index.recovery.mode to AUTO. But after each failed import, Alfresco took so long to start (several days...) because of the index recovery that I tried starting it with index.recovery.mode set to NONE.

After that, I started a new import and everything ran faster; the database server never exceeded 15% load. The importer has now been running for 2 days at a speed of 54 nodes/sec. I noticed that Alfresco is no longer indexing during the import. This is fine for me, because I can update the database statistics and indexes after each import chunk and then start the next one. Once everything is in Alfresco, I will do a FULL index recovery.

My problem is that I don't understand why the index is not updated during the import. I need to know why, because I will have to do the same on the production server.

Here are some settings I changed for the import:
db.pool.max=675
hibernate.jdbc.fetch_size=150
lucene.maxAtomicTransformationTime=0
index.recovery.mode=NONE
system.usages.enable=false
index.tracking.disableInTransactionIndexing=true
audit.enable=false
hibernate.cache.use_second_level_cache=false

I applied almost everything described in the "Zero day config" document.

So now my question :D Can someone explain to me why the indexing stopped?

Thanks!
Alain

Alch3mi5t

Aug 9, 2012, 6:19:22 AM
to alfresco-bulk-f...@googlegroups.com
Hi,
that should be mainly because of the option you set:

index.tracking.disableInTransactionIndexing=true

This disables indexing during the import/upload process. It speeds up bulk imports but makes indexing asynchronous, so it is done when the system has resources to spare.
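
As a rough sketch of how the two phases discussed in this thread might look in the configuration. This assumes the overrides live in alfresco-global.properties, which the thread does not state, and that the properties behave as in the Lucene-based Alfresco releases of that period. The commented-out second half is an illustration, not a tested setup:

```properties
# Phase 1: bulk import (both values taken from the settings posted above)
index.tracking.disableInTransactionIndexing=true
index.recovery.mode=NONE

# Phase 2: after the last import chunk, restart with these instead.
# FULL is the rebuild the original poster plans to run.
# Assumption: in-transaction indexing is switched back on for normal operation.
#index.recovery.mode=FULL
#index.tracking.disableInTransactionIndexing=false
```

Once the rebuild has completed, switching index.recovery.mode back to a lighter mode such as AUTO should avoid repeating the full rebuild on every restart.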
Hope this was of some help.
Greetings!
Alen