zoie exception cannot overwrite index file


brij...@gmail.com

Jan 27, 2013, 7:02:55 AM
to zo...@googlegroups.com
Hi Everyone,

I am using bobo-browse with zoie (version 3.0.0) for real-time indexing. I am getting the error message "Cannot overwrite" and have created a JIRA ticket. Any help will be greatly appreciated.


Best Regards,
Brij

John Wang

Jan 27, 2013, 12:26:40 PM
to zo...@googlegroups.com
Can you check whether the directory has write permissions and/or there is enough disk space?

-John


bri...@gmail.com

Jan 27, 2013, 11:24:55 PM
to zo...@googlegroups.com
Hi John,

Thanks for your response. I really appreciate it.

Yes, I have checked that. The job creates the index the first time and works fine. We have a daily Quartz job to create the index; when the Quartz scheduler job runs the next time and tries to overwrite the index, it throws the exception below.
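
For context, the job itself is a plain Quartz job, roughly like this simplified sketch (IndexService and rebuildIndex() are hypothetical stand-ins for our actual classes):

import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

// Simplified stand-in for our daily indexing job; IndexService is hypothetical.
public class DailyIndexJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // Rebuilds the index from the database. The first run after startup
        // succeeds; later runs hit the "Cannot overwrite" exception.
        IndexService.getInstance().rebuildIndex();
    }
}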

Best Regards,
Brij

bri...@gmail.com

Jan 28, 2013, 12:34:53 AM
to zo...@googlegroups.com
Hi John,

I have checked and found that by default a Windows 7 directory is marked read-only; the files themselves are not read-only.

It is surprising that, although the directory is read-only, it allows files to be written while it is empty during the initial indexing job run; the next time, the files cannot be overwritten.

Best Regards,
Brij

bri...@gmail.com

Jan 28, 2013, 1:08:42 AM
to zo...@googlegroups.com
Disk space is also not an issue.

bri...@gmail.com

Jan 30, 2013, 8:27:34 AM
to zo...@googlegroups.com
Hi John,

Bobo and Zoie are already integrated into our system and running in production. I would really appreciate your help resolving this issue.

I tried to debug the source code, modified DiskSearchIndex.java and RAMSearchIndex.java, and passed create=true. It works, but it does not create the index for all the entities.

// Lucene 2.x-style constructor; create=true recreates the index, discarding existing files
IndexWriter idxWriter = new IndexWriter(_directory, analyzer, create, MaxFieldLength.UNLIMITED);

Thanks for your help.

Best Regards,

Brij

John Wang

Jan 30, 2013, 3:01:58 PM
to zo...@googlegroups.com
Hi Brij:

    Can you try the latest version of Zoie? We have not seen this before. I wonder if this is specific to the Windows environment.

    Also, we have moved bug tracking for zoie to senseidb.atlassian.net.

Thanks

-John


bri...@gmail.com

Jan 30, 2013, 5:00:14 PM
to zo...@googlegroups.com
Thanks, John, for your response. I really appreciate it.

Sure. I have currently upgraded from zoie 2.0 to zoie 3.0.0. Zoie 3.0.0 does not cause any compilation errors in our framework classes, apart from a few minor changes. However, when I tried zoie 3.5.0, I came across many compilation errors specific to the Lucene package. We are using Lucene 2.9.2 and have lucene-core.jar in our deployment.

Could you please help me find out which Lucene version is compatible with zoie 3.5.0? Is the Lucene jar bundled with the zoie 3.5 jar?

Thanks for the information. Yes, I have also created a ticket in the new tracker: https://senseidb.atlassian.net/browse/ZOIE-164

Best Regards,
Brij

John Wang

Jan 31, 2013, 1:37:51 AM
to zo...@googlegroups.com
I think 3.5.0 is a LinkedIn internal fork.

The latest official zoie release is 3.2.0 (see https://github.com/senseidb/zoie/tags), which depends on Lucene 3.5.0.

-John

bri...@gmail.com

Jan 31, 2013, 3:35:49 PM
to zo...@googlegroups.com
Thanks, John, for your help. We will upgrade to the new version.

I can see the codebase there. Could you please point me to the path to download the jars and the 3.2 source? I think it is https://github.com/senseidb/zoie/downloads, where I can see the zoie-solr, zoie-jms, and zoie-core jars for version 3.2.

Kindly correct me if you find any discrepancy.

Best Regards,
Brij

John Wang

Jan 31, 2013, 9:15:17 PM
to zo...@googlegroups.com
zoie-core is the jar you want, no?

If you want the source, just download the tag.

-John

Brijrajsinh Jhala

Feb 1, 2013, 11:49:12 AM
to zo...@googlegroups.com
Hi John,

The issue is related to the write.lock file. I have found that the following method of FSDirectory.java (in the Lucene package) throws the exception:

protected void ensureCanWrite(String name) throws IOException

It seems to be failing at file.delete() because the write.lock file is present in that index directory:

if (file.exists() && !file.delete()) // delete existing, if any
    throw new IOException("Cannot overwrite: " + file);

We have multiple threads calling the ZoieSystem.consume(...) method. We execute the query in a multi-threaded environment, which in turn creates DataEvent objects and passes them to consume(); essentially, we keep passing DataEvent objects to consume() until we finish our indexing process.

The next day the schedule starts again and creates the index from scratch.
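
Roughly, each worker thread does something like the following (a simplified sketch: MyRecord and the version handling are stand-ins for our actual classes, and the DataEvent constructor signature differs between zoie versions):

import java.util.ArrayList;
import java.util.List;
import proj.zoie.api.DataConsumer.DataEvent;

// Simplified sketch of our indexing loop. MyRecord is a stand-in for our
// entity type, and the DataEvent constructor shown is the zoie 3.x
// (data, version) form -- older releases take (long version, data) instead.
void indexBatch(List<MyRecord> queryResults) throws Exception {
    List<DataEvent<MyRecord>> events = new ArrayList<DataEvent<MyRecord>>();
    for (MyRecord record : queryResults) {
        events.add(new DataEvent<MyRecord>(record, String.valueOf(record.getVersion())));
    }
    _zoieSystem.consume(events); // thread-safe; zoie buffers and batches internally
}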

Any pointers will be greatly appreciated.

Best Wishes,
Brij

Brijrajsinh Jhala

Feb 1, 2013, 3:38:15 PM
to zo...@googlegroups.com
Further update: 

The first time the index Quartz job runs, it deletes the existing index files first; the IndexWriter then writes its files successfully and the run completes.

The next time the Quartz job runs, it cannot delete all of the index files. As the stack below shows, the IndexWriter acquires a lock on the index directory. IndexWriter.copySegmentAsIs then generates an identical file name to add to the index directory, and FSDirectory.ensureCanWrite(String name) tries to delete the existing file; the delete is not allowed, since the current IndexWriter has already placed the write.lock file in this directory.

IndexWriter.copySegmentAsIs(SegmentInfo, String, Map<String,String>, Set<String>) line: 3320
IndexWriter.addIndexes(Directory...) line: 3159
DiskSearchIndex<R>(BaseSearchIndex<R>).loadFromIndex(BaseSearchIndex<R>) line: 253
DiskLuceneIndexDataLoader<R>(LuceneIndexDataLoader<R>).loadFromIndex(RAMSearchIndex<R>) line: 251
DiskLuceneIndexDataLoader<R>.loadFromIndex(RAMSearchIndex<R>) line: 140
RealtimeIndexDataLoader<R,D>.processBatch() line: 182
BatchedIndexDataLoader$LoaderThread.run() line: 394


Thanks
Brij

John Wang

Feb 1, 2013, 6:08:12 PM
to zo...@googlegroups.com
Are you running this in multiple threads against the same zoie instance or multiple zoie instances?

If you are running multiple zoie instances against the same index location, it will not work.

-John

Brij

Feb 1, 2013, 6:35:05 PM
to zo...@googlegroups.com
Hi John,

I am running multiple threads against only one instance, i.e., the ZoieSystem object is initialized once and is a singleton.

Best regards,

Brij

Brijrajsinh Jhala

Feb 1, 2013, 8:24:06 PM
to zo...@googlegroups.com
Hi John,

I have also upgraded to Zoie 3.2 and Lucene 3.5.

A further update: the root cause of this issue in our system is as follows.

Every day when the Quartz scheduler index job runs, it first deletes all the index files, and Zoie then recreates the index from scratch based on the provided DataEvents.

This always works fine when Tomcat starts and the scheduler job runs for the first time. However, when the scheduler job runs the next day, it is not able to delete a few of the files in the index directory; hence we are facing the overwrite error.

I am not sure whether the way our indexing framework functions, or the way it creates the index, is correct. We do have a requirement to delete the index based on certain database changes.

Kindly provide your valuable inputs on this.

Best Regards,
Brij

John Wang

Feb 1, 2013, 9:22:22 PM
to zo...@googlegroups.com
Can you provide more details on why you are doing the delete?

Is it possible your delete job has not completed before your indexing job starts?

In case the delete fails, what do you think about the following changes to your delete logic (see the sketch below):

1) Rename the directory to a unique name, e.g. idx-YYYYMMDD...
2) Try to delete that directory on a best-effort basis.
3) Proceed with indexing against the current index directory, e.g. idx.

This way, indexing and deletion will never collide, and you can run a batch/cron job to clean up all the old idx-YYYY... directories.
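
Roughly like this (a sketch; the "/data/idx" path and class name are placeholders):

import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of the rename-then-delete idea above; paths are placeholders.
public class IndexDirCleaner {
    public static File prepareIndexDir() {
        File idxDir = new File("/data/idx");
        if (idxDir.exists()) {
            // 1) rename to a unique name so the new run never collides with it
            String stamp = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
            File retired = new File(idxDir.getParentFile(), "idx-" + stamp);
            if (idxDir.renameTo(retired)) {
                deleteRecursively(retired); // 2) best effort; a cron job sweeps leftovers
            }
        }
        idxDir.mkdirs(); // 3) indexing proceeds against a fresh "idx" directory
        return idxDir;
    }

    private static void deleteRecursively(File f) {
        File[] children = f.listFiles(); // null for plain files
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        f.delete(); // ignore failures; old idx-YYYY... dirs get cleaned up later
    }
}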

-John

Brijrajsinh Jhala

Feb 1, 2013, 10:07:32 PM
to zo...@googlegroups.com
Hi John,

A few entities get deleted in the system, so the facet search should not show them. We are not deleting these entities from the index at runtime, though we do real-time indexing for some entities; hence the ZoieSystem is initialized once and never shut down.

Now, coming to deleting files: we simply invoke a deleteIndexDir() method before the index cron job, which calls file.delete() on each file in the index directory. It deletes all the files before the index job runs for the very first time after the app server starts, and the index job then creates the index successfully.

The next day, before the job runs, deleteIndexDir() is not able to delete all the files (it deletes only a few). The index cron job then starts and creates the index files, but cannot overwrite the same files that deleteIndexDir() was unable to delete.

Does zoie or Lucene keep any file handles open once the index is created? We never shut down the ZoieSystem, so could that be the reason?

Regarding point 2: if we are not able to delete the files, will it allow us to delete the directory? The index directory is passed to the ZoieSystem, so creating the index in another directory would require some changes.

John Wang

Feb 2, 2013, 12:20:32 AM
to zo...@googlegroups.com
Ah, so your delete process runs while zoie is still running?

Yes, file handles can still be held and won't delete cleanly.

I would shut down the zoie instance, do the delete, and then start the process back up.
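
Something like this ordering (a rough sketch; createZoieSystem() and deleteIndexDir() are placeholders for however your app builds its singleton and deletes files):

// Rough sketch: shut down first so Lucene releases its file handles,
// then delete, then bring a fresh instance up.
public synchronized void rebuildFromScratch() throws Exception {
    _zoie.shutdown();                  // releases IndexWriter/reader handles
    deleteIndexDir(_idxDir);           // file.delete() can now succeed on Windows
    _zoie = createZoieSystem(_idxDir); // re-create the singleton
    _zoie.start();                     // resume consuming DataEvents
}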

-John

Brij

Feb 2, 2013, 9:05:12 AM
to zo...@googlegroups.com
Thanks, John. To be honest, it was working, and still works in a few deployments of our application, but some of them have started having this deletion issue.

Just out of curiosity: how does zoie ideally handle the index for deleted entities? I mean, how should we delete index entries for entities deleted from our application?

Best regards,

Brij

John Wang

Feb 2, 2013, 11:27:49 PM
to zo...@googlegroups.com
In the latest zoie release, there is purge functionality which can delete all docs.

This is what you would do:

Call ZoieSystem.setPurgeFilter() and pass in a Lucene Filter implementation. If you want to delete all docs, you would pass in new QueryWrapperFilter(new MatchAllDocsQuery()).

Be careful: the purge filter is run every time the memory index is flushed down to disk.

You can also implement your purge filter more cleverly, e.g. to match only docs older than a certain date.
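
For example (a sketch; the "timestamp" field name and the one-week cutoff are placeholders for whatever your docs actually carry):

import org.apache.lucene.search.Filter;
import org.apache.lucene.search.NumericRangeFilter;
import proj.zoie.impl.indexing.ZoieSystem;

// Sketch: purge docs older than one week, assuming each doc indexes a
// numeric "timestamp" field (field name and cutoff are placeholders).
void installPurgeFilter(ZoieSystem<?, ?> zoie) {
    long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;
    Filter purgeOld = NumericRangeFilter.newLongRange(
        "timestamp", // indexed NumericField
        null,        // no lower bound
        cutoff,      // upper bound: everything older than one week
        true, true); // bounds inclusive
    zoie.setPurgeFilter(purgeOld);
}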

-John

bri...@gmail.com

Feb 3, 2013, 10:25:40 PM
to zo...@googlegroups.com
Thanks, John. I tried the purge filter. It's true that we can delete docs older than a given date; however, we have real-time indexing too, and the purge filter always gets executed during the batch process when, as you mentioned, the memory index is flushed.

So during real-time indexing it could delete relevant docs. If there were a way to identify who invoked the filter, whether real-time indexing or the cron job, that would help us.

I have fixed the issue by shutting down zoie and then deleting the index files.

However, I am still looking for a way to delete index files while zoie is running. I explored purgeIndex, expunge, etc., but did not succeed.

Thanks a lot for your guidance and help. It's really appreciated, and I'm happy to see your quick responses.

Best Wishes,
Brij