Message from discussion Questions about production deployments
From: bhauer <bhs...@tsotech.com>
Date: Sun, 18 Sep 2011 07:52:47 -0700 (PDT)
Local: Sun, Sep 18 2011 10:52 am
Subject: Re: Questions about production deployments
Hi Sergio,

Thanks for the detailed reply.  Here are some additional thoughts
based on your feedback.

On Sep 2, 2:00 am, Sergio Bossa <sergio.bo...@gmail.com> wrote:

> That sounds good and necessary.
> I've found performance monitoring very important too: that's because
> Terrastore is memory-based, and performance of memory-based systems
> tends to degrade under one of the following circumstances:
> 1) Large memory blobs are faulted in, either from disk or network.
> 2) Memory saturates and full garbage collection kicks in.
> So first, always monitor your memory and garbage collector logs.
> Also, we set up several metrics to track execution times of most
> expensive Terrastore operations (i.e., range queries and map reduce
> processing), so to be alerted when performance degrades: we're using
> Nimrod for that (see https://github.com/sbtourist/nimrod).

Good idea.  I think that the load is going to be much lower than the
server's capacity such that I will likely avoid the memory contention
you've described for a long time.  But it's good to hear you have some
ideas for this that I can put to use if/when that becomes an issue.
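
As a rough illustration of the "track execution times and alert on degradation" idea quoted above, here is a minimal Python sketch. It is not Nimrod and does not use any Terrastore API; the names `timed`, `warn_after_s`, and `samples` are hypothetical, and the wrapped block stands in for whatever expensive call (a range query, say) is being measured.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("op_timing")


@contextmanager
def timed(name, warn_after_s, samples):
    """Time the wrapped block, record the duration, and warn if it is slow.

    name         -- label for the operation (hypothetical, e.g. "range_query")
    warn_after_s -- threshold in seconds above which a warning is logged
    samples      -- dict mapping operation name to a list of durations
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        samples.setdefault(name, []).append(elapsed)
        if elapsed > warn_after_s:
            log.warning("%s took %.3fs (threshold %.3fs)",
                        name, elapsed, warn_after_s)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    samples = {}
    # Placeholder for an expensive operation; replace with the real call.
    with timed("range_query", warn_after_s=0.05, samples=samples):
        time.sleep(0.1)
    print(len(samples["range_query"]), "sample(s) recorded")
```

The durations are kept in a plain dict so they can be summarized or shipped to whatever metrics system is in use; the threshold is deliberately per-call rather than global.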

> Apart from the number of jdb files, doesn't the db size decrease when
> you delete buckets and documents?

Each .jdb file (aside from the most recent) is pegged at 9,766 KB for
what appears to be eternity.  As I mentioned earlier, 00000000.jdb has
create/modified dates reaching back to the very start of my project.
Since then, dozens of .jdb files have been created, but none has been
deleted, even when I delete buckets.  During development, I have
deleted all buckets and created new buckets dozens of times.
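
To make that observation easier to repeat over time, a small script along these lines can list the .jdb files with their sizes and modification dates. This is only a sketch: the data directory path is an assumption (it depends on how the Terracotta server is configured), and `jdb_summary` and `total_kb` are hypothetical helper names.

```python
from datetime import datetime, timezone
from pathlib import Path


def jdb_summary(data_dir):
    """Return (name, size_in_kb, modified_utc) for each .jdb file, sorted by name."""
    rows = []
    for path in sorted(Path(data_dir).glob("*.jdb")):
        stat = path.stat()
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        rows.append((path.name, stat.st_size // 1024, modified))
    return rows


def total_kb(rows):
    """Sum the sizes (in KB) reported by jdb_summary."""
    return sum(size for _, size, _ in rows)


if __name__ == "__main__":
    # Hypothetical location; point this at the server's actual data directory.
    rows = jdb_summary("./terracotta-data")
    for name, size_kb, modified in rows:
        print(f"{name}  {size_kb:>8} KB  {modified:%Y-%m-%d %H:%M}")
    print(f"{len(rows)} files, {total_kb(rows)} KB total")
```

Running it before and after deleting buckets shows whether the file count or total size ever goes down.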

When you ask "doesn't the db size decrease," it occurs to me that you
may be referring to something other than the .jdb files.  Is there
another file or set of files I should be looking at?

Am I missing something that's very obvious to someone familiar with
Terracotta?  I feel like there has to be some way to tell Terracotta,
"Hey, you can trim out the unreferenced objects in those old .jdb
files."

> > I also have not yet found the time to exercise the Terracotta
> > clustering to observe its real-world behavior when the master goes
> > offline, the slave takes over, and then the master is restored.  By
> > contrast, in development, I have routinely started and stopped
> > multiple Terrastore servers so I know that those join and exit their
> > cluster fairly smoothly.

> The Terracotta master failover works pretty well ... unless you have a
> huge db :)

That shouldn't be a problem for me.  At least for now.  :)

> That's plenty of memory ... we have more modest machines :)
> Our masters run with 6 gigabytes, while servers run with 3 gigabytes.
> With such a setup, we handle several millions (compressed) documents
> and a database ranging from 10 to 20 gigabytes.

This gives me a great deal of confidence that I'm significantly over-
provisioned on memory.  Memory is so astonishingly cheap right now
that I see no reason to not buy gobs of it.  But if you're doing that
kind of load with 3 GB allocated to the servers, I am not concerned
about my application--at least from a memory standpoint.

That is, for the time being and for my application, my goal is to do
whatever I can to maximize reliability/durability.  Performance seems
to be no problem for now.

> Everything I said is referred to (unfortunately still not officially
> released) 0.8.2 :)
> It contains lots of fixes and enhancements, so I strongly suggest you
> use 0.8.2.

I just saw the 0.8.2 announcement!  Congratulations.  I'll be
switching over soon.

> The backup format for 0.8.1 is different from the 0.8.2 format, so you
> should write a backup tool by yourself (which should be fairly
> straightforward by the way) ... sorry for that.

No worries!

Do you suspect that the backup format will become stable over time,
maybe at 1.0?


 