2 Questions regarding server stability
From: ctasada <ctas...@gmail.com>
Date: Mon, 24 Sep 2012 02:21:22 -0700 (PDT)
Local: Mon, Sep 24 2012 5:21 am
Subject: 2 Questions regarding server stability

Hi everyone,

I'm still having some stability problems with my Voldemort servers. I have
two questions about them:

1) EnvironmentFailureException in the Voldemort server
[2012-08-10 12:33:21,330 voldemort.store.bdb.BdbStorageEngine] ERROR
com.sleepycat.je.EnvironmentFailureException: (JE 4.1.17) Environment must
be closed, caused by: com.sleepycat.je.EnvironmentFailureException:
Environment invalid because of previous exception: (JE 4.1.17)
/home/voldemort/voldemort/server/bin/../../stores-caronte/data/bdb/protobuf Tax
fetchTarget of 0x2b06/0xc8c0e0 parent IN=589382 IN
class=com.sleepycat.je.tree.BIN lastFullVersion=0x2b8e/0x9fdfaa
parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is
likely invalid. Environment is invalid and must be closed.

I'm using BDB 4.1.17 with some changes from Vinoth, and although it's
working much better, I still have problems from time to time. I'm going to
upgrade to 4.1.21, since it seems to fix some of those problems. Are
there any known problems with that version?

Also, from time to time I see the following trace:

[2012-08-10 12:33:21,331 voldemort.server.niosocket.AsyncRequestHandler] ERROR
java.lang.NullPointerException
at voldemort.store.bdb.BdbStorageEngine.attemptCommit(BdbStorageEngine.java:415)
at voldemort.store.bdb.BdbStorageEngine.delete(BdbStorageEngine.java:372)
at voldemort.store.bdb.BdbStorageEngine.delete(BdbStorageEngine.java:68)
at voldemort.store.logging.LoggingStore.delete(LoggingStore.java:90)
at voldemort.store.rebalancing.RedirectingStore.delete(RedirectingStore.java:194)
at voldemort.store.rebalancing.RedirectingStore.delete(RedirectingStore.java:60)
at voldemort.store.invalidmetadata.InvalidMetadataCheckingStore.delete(InvalidMetadataCheckingStore.java:71)
at voldemort.store.invalidmetadata.InvalidMetadataCheckingStore.delete(InvalidMetadataCheckingStore.java:41)
at voldemort.store.DelegatingStore.delete(DelegatingStore.java:49)
at voldemort.store.stats.StatTrackingStore.delete(StatTrackingStore.java:52)
at voldemort.store.stats.StatTrackingStore.delete(StatTrackingStore.java:39)
at voldemort.server.protocol.vold.VoldemortNativeRequestHandler.handleDelete(VoldemortNativeRequestHandler.java:366)
at voldemort.server.protocol.vold.VoldemortNativeRequestHandler.handleRequest(VoldemortNativeRequestHandler.java:72)
at voldemort.server.niosocket.AsyncRequestHandler.read(AsyncRequestHandler.java:120)
at voldemort.utils.SelectorManagerWorker.run(SelectorManagerWorker.java:98)
at voldemort.utils.SelectorManager.run(SelectorManager.java:194)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

I'll apply a patch to solve it, but the NPE is really just a symptom, since
the real problem is caused by BDB.

2) Right now I have a cluster with 6 nodes. The nodes are not identical:
3 of them are newer, with more cores and memory. My question is: can I
configure those servers with more NIO selectors and more bdb.cache memory?
Could that cause problems synchronizing metadata between servers?

Thanks.

Regards,
Carlos.
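
As a rough illustration of question 2, the per-node settings could be derived from each machine's hardware. This is only a sketch: the property names `nio.connector.selectors` and `bdb.cache.size` are assumed to match the Voldemort version in use and should be checked against its configuration code, and the sizing rule (one selector per core, half of the heap for the BDB cache) and the hardware figures are hypothetical, not recommendations from this thread.

```python
def node_overrides(cores, heap_bytes, cache_fraction=0.5):
    """Return per-node property overrides sized to the machine.

    The sizing rule is illustrative only; tune it for a real cluster.
    """
    if cores < 1 or heap_bytes <= 0 or not 0 < cache_fraction < 1:
        raise ValueError("invalid hardware description")
    return {
        # Assumed property names; verify them for your Voldemort version.
        "nio.connector.selectors": str(cores),
        "bdb.cache.size": str(int(heap_bytes * cache_fraction)),
    }


def render(overrides):
    """Format the overrides as key=value lines for a properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(overrides.items()))


GIB = 1024 ** 3

# Hypothetical hardware for an older and a newer node.
old_node = node_overrides(cores=4, heap_bytes=4 * GIB)
new_node = node_overrides(cores=16, heap_bytes=16 * GIB)

print(render(old_node))
print(render(new_node))
```

Each node would get its own rendered lines in its local server properties file.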


 
From: ctasada <ctas...@gmail.com>
Date: Tue, 2 Oct 2012 13:43:45 -0700 (PDT)
Local: Tues, Oct 2 2012 4:43 pm
Subject: Re: 2 Questions regarding server stability

Hi guys,

No one?


 
From: Vinoth Chandar <mail.vinoth.chan...@gmail.com>
Date: Tue, 2 Oct 2012 14:18:37 -0700 (PDT)
Local: Tues, Oct 2 2012 5:18 pm
Subject: Re: 2 Questions regarding server stability

Carlos,

I have not seen these before. Well, EnvironmentFailureExceptions happen if
a disk goes bad and such, but not specifically with 4.1.17.
Can you point to the version or branch you are running off? And 4.1.21 was
basically made with some changes to the BDB5 preupgrade script, so I'm not
sure what extra fixes are in there.

For 2), essentially, you will be throwing more resources at some machines.
This might be okay in general, but make sure you don't have preferred_reads
or something, since if you block on a fast and a slow node, you will only
see the performance of the slow node anyway.

Thanks
Vinoth
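
The point about blocking on a fast and a slow node can be shown with a simplified model. The sketch below assumes requests are sent to all replicas in parallel and that a read returns once a given number of nodes have answered; the function name, the `reads_to_wait_for` parameter, and the latency figures are hypothetical and are not taken from Voldemort's code.

```python
def read_latency_ms(node_latencies_ms, reads_to_wait_for):
    """Latency of a read that blocks until `reads_to_wait_for` nodes answer.

    Requests are assumed to go out in parallel, so the read completes when
    the `reads_to_wait_for`-th fastest response arrives.
    """
    if not 1 <= reads_to_wait_for <= len(node_latencies_ms):
        raise ValueError("reads_to_wait_for must be between 1 and the node count")
    return sorted(node_latencies_ms)[reads_to_wait_for - 1]


# Hypothetical per-node latencies: one upgraded node and one older node.
fast_ms, slow_ms = 2.0, 9.0

# Waiting on both replicas: the slow node sets the pace.
both = read_latency_ms([fast_ms, slow_ms], reads_to_wait_for=2)

# Waiting on a single replica: the fast node can answer alone.
one = read_latency_ms([fast_ms, slow_ms], reads_to_wait_for=1)

print(both, one)
```

In this model, extra resources on the newer nodes only help requests that do not have to wait for an older node as well.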


 
From: Vinoth Chandar <mail.vinoth.chan...@gmail.com>
Date: Tue, 2 Oct 2012 14:26:13 -0700 (PDT)
Local: Tues, Oct 2 2012 5:26 pm
Subject: Re: 2 Questions regarding server stability

https://github.com/vinothchandar/voldemort/blob/pidscan/src/java/vold...

is what we are testing now. So these NPEs should be taken care of. If you
are simply slapping 4.1.17 or greater onto 0.96 voldemort, please don't.


 
From: Carlos Tasada <ctas...@gmail.com>
Date: Wed, 3 Oct 2012 00:01:18 +0200
Local: Tues, Oct 2 2012 6:01 pm
Subject: Re: [project-voldemort] Re: 2 Questions regarding server stability

Hi Vinoth,

Thanks for your answers. I'll double-check my configurations to make sure
that I don't have any bottleneck with the old hardware.

Regarding BDB 4.1.21, you're right: it only has some changes in the
preupgrade code, but 4.1.20 includes some other fixes regarding the "lock
files".

What do you mean by 'slapping' 4.1.17 onto Voldemort 0.96? My local
changes include both the library and code changes. It has been working
fine for some time with my modified 0.91 version. I'm still testing
the migration to 0.96.



 
From: Vinoth Chandar <mail.vinoth.chan...@gmail.com>
Date: Tue, 2 Oct 2012 18:24:29 -0700 (PDT)
Local: Tues, Oct 2 2012 9:24 pm
Subject: Re: [project-voldemort] Re: 2 Questions regarding server stability

Since you mentioned you are testing some of my code, I was wondering what
exactly you are using.
By "slapping" BDB 4.1.17, what I meant was: are you simply updating the BDB
version on an existing Voldemort codebase? The most important change I have
made is getting rid of BDB sorted duplicates usage, which is necessary for
any migration to a higher version. Otherwise, you will see disk growth from
4.0.92 due to the problems I outlined in the blog.

>> 4.1.20 includes some other fixes regarding the "lock files".

Point 1 in the change log addresses deferred write DBs, which I don't think
relates to Voldemort anyway.

We are testing
https://github.com/voldemort/voldemort/compare/master...release-096li8 and
if confirmed, we will release some conversion scripts so people can migrate
their data over.


 