I'm still having some stability problems in my Voldemort Servers. I have 2 questions regarding it:
1) EnvironmentFailureException in the Voldemort Server
[2012-08-10 12:33:21,330 voldemort.store.bdb.BdbStorageEngine] ERROR com.sleepycat.je.EnvironmentFailureException: (JE 4.1.17) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 4.1.17) /home/voldemort/voldemort/server/bin/../../stores-caronte/data/bdb/protobuf Tax fetchTarget of 0x2b06/0xc8c0e0 parent IN=589382 IN class=com.sleepycat.je.tree.BIN lastFullVersion=0x2b8e/0x9fdfaa parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
I'm using BDB 4.1.17 with some changes from Vinoth, and even though it's working much better, I still have problems from time to time. I'm going to upgrade to 4.1.21 since it seems to fix some of those problems. Are there any known problems with that version?
Also, from time to time I see the following trace:
[2012-08-10 12:33:21,331 voldemort.server.niosocket.AsyncRequestHandler] ERROR
java.lang.NullPointerException
    at voldemort.store.bdb.BdbStorageEngine.attemptCommit(BdbStorageEngine.java:415)
    at voldemort.store.bdb.BdbStorageEngine.delete(BdbStorageEngine.java:372)
    at voldemort.store.bdb.BdbStorageEngine.delete(BdbStorageEngine.java:68)
    at voldemort.store.logging.LoggingStore.delete(LoggingStore.java:90)
    at voldemort.store.rebalancing.RedirectingStore.delete(RedirectingStore.java:194)
    at voldemort.store.rebalancing.RedirectingStore.delete(RedirectingStore.java:60)
    at voldemort.store.invalidmetadata.InvalidMetadataCheckingStore.delete(InvalidMetadataCheckingStore.java:71)
    at voldemort.store.invalidmetadata.InvalidMetadataCheckingStore.delete(InvalidMetadataCheckingStore.java:41)
    at voldemort.store.DelegatingStore.delete(DelegatingStore.java:49)
    at voldemort.store.stats.StatTrackingStore.delete(StatTrackingStore.java:52)
    at voldemort.store.stats.StatTrackingStore.delete(StatTrackingStore.java:39)
    at voldemort.server.protocol.vold.VoldemortNativeRequestHandler.handleDelete(VoldemortNativeRequestHandler.java:366)
    at voldemort.server.protocol.vold.VoldemortNativeRequestHandler.handleRequest(VoldemortNativeRequestHandler.java:72)
    at voldemort.server.niosocket.AsyncRequestHandler.read(AsyncRequestHandler.java:120)
    at voldemort.utils.SelectorManagerWorker.run(SelectorManagerWorker.java:98)
    at voldemort.utils.SelectorManager.run(SelectorManager.java:194)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
I'll apply a patch to work around it, but that only treats the symptom, since the real problem is caused by BDB.
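To make "treating the symptom" concrete, here is a minimal sketch, assuming the NullPointerException comes from a transaction handle that is already null when the commit runs (the trace alone does not confirm this). `Txn` and `attemptCommit` below are hypothetical stand-ins for illustration, not Voldemort's or BDB JE's actual code.

```java
final class CommitGuard {

    /** Hypothetical stand-in for a storage transaction; not the BDB JE API. */
    interface Txn {
        void commit();
    }

    /**
     * Commits the transaction if there is one.
     * Returns false instead of throwing a NullPointerException when the
     * handle is missing. This hides the symptom only: whatever left the
     * handle null in the first place is still unexplained.
     */
    static boolean attemptCommit(Txn txn) {
        if (txn == null) {
            return false;
        }
        txn.commit();
        return true;
    }

    public static void main(String[] args) {
        // A missing handle is reported, not dereferenced.
        System.out.println(attemptCommit(null));
        // A real handle is committed as usual.
        System.out.println(attemptCommit(() -> { }));
    }
}
```

A guard like this keeps the request handler from failing on the delete path, but it would not repair an invalid environment underneath.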
2) Right now I have a cluster with 6 nodes. The nodes are not identical: 3 of them are newer, with more cores and memory. My question is: can I configure those servers with more nio selectors and more bdb cache memory? Could that cause problems synchronizing metadata between servers?
On Monday, September 24, 2012 11:21:22 AM UTC+2, ctasada wrote:
> Hi everyone,
> I'm still having some stability problems in my Voldemort Servers. I have 2 questions regarding it:
I have not seen these before. Well, EnvironmentFailureExceptions do happen if a disk goes bad and such, but not specifically for 4.1.17. Can you point to the version or branch you are running off? And 4.1.21 was basically made with some changes to the BDB5 preupgrade script, so I'm not sure what extra fixes are in there.
For 2), essentially, you will be throwing more resources at some machines. This might be okay in general, but make sure you don't have preferred_reads or something similar set, since if you block on a fast and a slow node, you will only see the performance of the slow node anyway.
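The point about blocking on a fast and a slow node can be sketched with a toy model. This is only an illustration under simplifying assumptions: requests go to all replicas in parallel, and the caller waits for a fixed number of responses. `QuorumLatency`, `observedLatencyMs`, and the latency numbers are made up for the example and are not part of Voldemort.

```java
import java.util.Arrays;

final class QuorumLatency {

    /**
     * Latency seen by a caller that queries every node in parallel and
     * blocks until `required` of them have answered: the required-th
     * smallest per-node latency.
     */
    static long observedLatencyMs(long[] nodeLatenciesMs, int required) {
        if (required < 1 || required > nodeLatenciesMs.length) {
            throw new IllegalArgumentException("required must be in 1..n");
        }
        long[] sorted = nodeLatenciesMs.clone();
        Arrays.sort(sorted);
        return sorted[required - 1];
    }

    public static void main(String[] args) {
        long[] latencies = {2, 9}; // one fast node, one slow node (illustrative values)
        // Waiting for one response: the fast node answers first.
        System.out.println(observedLatencyMs(latencies, 1)); // 2
        // Waiting for both: the slow node sets the pace.
        System.out.println(observedLatencyMs(latencies, 2)); // 9
    }
}
```

In this model, extra selectors or cache on the newer machines only help requests that do not also have to wait for one of the older machines.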
On Tuesday, October 2, 2012 1:43:45 PM UTC-7, ctasada wrote:
> Hi guys,
> No one?
On Tuesday, October 2, 2012 2:18:37 PM UTC-7, Vinoth Chandar wrote:
> Carlos,
> I have not seen these before. Well, EnvironmentFailureExceptions do happen if a disk goes bad and such, but not specifically for 4.1.17.
> Can you point to the version or branch you are running off? And 4.1.21 was basically made with some changes to the BDB5 preupgrade script, so I'm not sure what extra fixes are in there.
> For 2), essentially, you will be throwing more resources at some machines. This might be okay in general, but make sure you don't have preferred_reads or something similar set, since if you block on a fast and a slow node, you will only see the performance of the slow node anyway.
> Thanks
> Vinoth
Thanks for your answers. I'll double-check my configurations to make sure
that I don't have any bottleneck with the old hardware.
Regarding BDB 4.1.21 you're right, it only has some changes in the
preupgrade code, but 4.1.20 includes some other fixes regarding the "lock
files".
What do you mean by 'slapping' 4.1.17 onto voldemort 0.96? My local
changes include the library plus code changes. It has been working
fine for some time so far with my 0.91 modified version. I'm still testing
the migration to 0.96.
> is what we are testing now. So these NPEs should be taken care of. If you
> are simply slapping 4.1.17 or greater onto 0.96 voldemort, please don't.
Since you mentioned you are testing some of my code, I was wondering what exactly you are using. By "slapping" BDB 4.1.17, what I meant was: are you simply updating the BDB version on an existing Voldemort codebase? The most important change I have made is getting rid of the BDB sorted duplicates usage, which is necessary for any migration to a higher version. Otherwise, you will see disk growth from 4.0.92 due to the problems I outlined in the blog.
>> 4.1.20 includes some other fixes regarding the "lock files".
Point 1 in the change log addresses deferred write DBs, which I don't think relates to Voldemort anyway.
On Tuesday, October 2, 2012 3:01:20 PM UTC-7, ctasada wrote:
> Hi Vinoth,
> Thanks for your answers. I'll double-check my configurations to make sure that I don't have any bottleneck with the old hardware.
> Regarding BDB 4.1.21 you're right, it only has some changes in the preupgrade code, but 4.1.20 includes some other fixes regarding the "lock files".
> What do you mean by 'slapping' 4.1.17 onto voldemort 0.96? My local changes include the library plus code changes. It has been working fine for some time so far with my 0.91 modified version. I'm still testing the migration to 0.96.
> On Tue, Oct 2, 2012 at 11:26 PM, Vinoth Chandar <mail.vino...@gmail.com> wrote:
>> is what we are testing now. So these NPEs should be taken care of. If you are simply slapping 4.1.17 or greater onto 0.96 voldemort, please don't.