MogileFS, would you do it over again?

Mark Imbriaco

Jan 29, 2010, 10:49:03 PM

to mog...@googlegroups.com

We've been using MogileFS for a while, but on a pretty small scale (1.5TB total, mostly empty). We are considering rolling it out in a bigger way (36.5M files, 50TB of data, growing 3-4TB a month). I'm wondering if those of you who are using it on a similar or larger scale would do it over again if you had to redesign your storage architecture today? If not, what would you do differently?

We've been generally happy with MogileFS so far, but since we're contemplating a much larger scale I wanted to see if I could get some opinions from folks who have more directly applicable experience.

Thanks,
-Mark

--
Mark Imbriaco
System Administrator
37signals

dormando

Jan 29, 2010, 11:15:27 PM

to mog...@googlegroups.com

I've had several MogileFS clusters that size and bigger... For a while I
was pretty unhappy with having many drives in a single host (12+) since
losing that host entirely can (and does) happen. In recent releases
replication has sped up enough that this isn't as huge a deal as it was in
the past. (though I think I added some tuning to the reaper job in
trunk...).

Having a caching layer in front (varnish/squid/whatever) can help a lot if
you end up with fewer large hosts. Also, path caching in some form is a
must for frequently accessed files (less so if they're already cached in
front, etc).
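
A minimal Python sketch of the path-caching idea, for illustration only. The
tracker lookup below is a hypothetical stand-in function rather than a real
MogileFS client call, an in-process dict stands in for memcached, and the TTL
is an assumed value to tune for your own workload.

```python
import time
from typing import Callable, Dict, List, Tuple


class PathCache:
    """Cache key -> storage URLs with a TTL (a dict standing in for memcached)."""

    def __init__(
        self,
        lookup: Callable[[str], List[str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # hypothetical wrapper around a tracker query
        self._ttl = ttl_seconds    # assumed value, not a recommendation
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}
        self.tracker_calls = 0     # counts how often the tracker was actually asked

    def get_paths(self, key: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: no tracker round trip
        self.tracker_calls += 1
        paths = self._lookup(key)
        self._entries[key] = (now + self._ttl, paths)
        return paths

    def invalidate(self, key: str) -> None:
        """Drop a key, e.g. after the file is deleted or overwritten."""
        self._entries.pop(key, None)


def fake_tracker_lookup(key: str) -> List[str]:
    # Hypothetical: pretend the tracker returned two replica URLs.
    return [
        f"http://storage1.example/dev1/{key}.fid",
        f"http://storage2.example/dev7/{key}.fid",
    ]


if __name__ == "__main__":
    cache = PathCache(fake_tracker_lookup, ttl_seconds=60)
    cache.get_paths("avatar:42")
    cache.get_paths("avatar:42")
    print("tracker calls:", cache.tracker_calls)
```

Cached paths can go stale when replication moves a file, so a reader would
typically fall back to a fresh lookup if every cached URL fails.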

MFS works pretty great once you get it up and running. Development pace
has been pretty good in the last six months, and we're continuing to
improve performance and stability.

If you haven't seen it yet, the Google Code site has some documentation
fleshed out now. Take a look and let us know if you still have particular
questions about a large install.

-Dormando

James Byers

Jan 30, 2010, 12:19:18 AM

to mog...@googlegroups.com

February marks our fourth year on MogileFS and we'd certainly do it
again. After we got past some early issues, it's been worry-free.
Details on our setup are at
http://code.google.com/p/mogilefs/wiki/Users -- we have more files and
less storage in comparison. Some observations:

- Finding the right ratio of hosts to disks is a balancing act: cost,
IO, and availability are all in play. We started off with 4-disk
hosts and have switched to 10-disk hosts. We've been lucky to never
permanently lose a host but have gone through a number of dead disks.
Replication has gotten better, but it's worth seeing the impact of a
big dead disk (or two, or ten) on your system.

- We've fiddled with the number of mogilefsd jobs (listener, delete,
replicate, reaper, monitor) many times now. Trial-and-error.
Consider the implications of losing multiple trackers and what
replication will do when a disk dies.

- We use two pairs of trackers, mostly for availability reasons. Only
one of each pair does delete / replicate / reaper. Seems like these
will last us a while especially in combination with memcached path
caching.

- We started serving files from mogstored but quickly moved to
lighttpd for gets. lighttpd is dead to us now; in the near future
we'll be reading and writing with nginx, and mogstored will just
handle stats. We did have problems with mogstored handling multi-GB
files, but this no longer seems to be a problem.

- MySQL master-master replication works well using the standard
auto-increment offset, write-to-one-master approach.
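
To illustrate why the auto-increment offset approach avoids key collisions
between two masters, here is a small Python simulation. It assumes the usual
MySQL settings auto_increment_increment=2 on both masters, with
auto_increment_offset=1 on one and 2 on the other; the actual values in the
setup described above are not stated, and the simulation is simplified to an
empty table.

```python
from itertools import islice
from typing import Iterator, List


def auto_increment_ids(offset: int, increment: int) -> Iterator[int]:
    """Yield ids as a master with the given offset/increment would assign them.

    Simplified model: starts from an empty table and ignores gaps from
    rolled-back inserts.
    """
    current = offset
    while True:
        yield current
        current += increment


def first_ids(offset: int, increment: int, count: int) -> List[int]:
    """Return the first `count` ids for one master."""
    return list(islice(auto_increment_ids(offset, increment), count))


if __name__ == "__main__":
    master_a = first_ids(offset=1, increment=2, count=5)  # odd ids
    master_b = first_ids(offset=2, increment=2, count=5)  # even ids
    print("master A:", master_a)
    print("master B:", master_b)
    print("overlap:", set(master_a) & set(master_b))
```

Writing to only one master at a time still matters: the offsets prevent
duplicate primary keys, but they do not resolve conflicting updates to the
same row.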

James
Wikispaces.com

Jared Klett

Feb 9, 2010, 2:28:56 PM

to mog...@googlegroups.com

hi Mark,

I've been very happy with MogileFS, so yes, I would do it the
same way if I were back there in late 2006.

Our clusters have grown to over half a petabyte in size, with
over a thousand disks and nearly 100 hosts. We typically do 10-12
devices per host.

cheers,

- Jared

--
Jared Klett
Co-founder/Chief Engineer, blip.tv
office: 917-546-6989 x2002
mobile: 646-526-8948
aol im: JaredAtWrok
http://blog.blip.tv
