[chironfs] Questions for production use


Daniel Werner

Nov 11, 2008, 12:30:59 PM
to ChironFS-Forum
Hi Luis and list,

I found out about ChironFS today and already must say that I feel
excited about trying it. In terms of storing media content on two
physically distant NFS file servers which have to be kept in sync,
your file system seems to do exactly what my company needs.

There are still some general questions remaining which I hope you can
help me with:

1. How do you pronounce "ChironFS"? This is the first thing to know so
I can start advocating towards my boss ;-)

2. The file servers I'd like to combine into a ChironFS tree face only
a small number of read and write requests during daily use. However,
once in a while there may be file uploads of up to 8 GB in size. Is
ChironFS prepared to handle reads and writes of this size?

3. Do you deem ChironFS ready for production use in mission-critical
environments? The file system would have to survive long-lasting
uptimes and cope with some potential temporary network outages.

I hope to set up a test environment soon, so thanks in advance for any
information!

Cheers,
--
Daniel

Alex

Nov 11, 2008, 1:03:39 PM
to chironfs-forum
Hi Daniel,

I think I can be of some help.

2008/11/11 Daniel Werner <daniel....@googlemail.com>:
>
> Hi Luis and list,
>
> I found out about ChironFS today and already must say that I feel
> excited about trying it. In terms of storing media content on two
> physically distant NFS file servers which have to be kept in sync,
> your file system seems to do exactly what my company needs.
>
> There are still some general questions remaining which I hope you can
> help me with:
>
> 1. How do you pronounce "ChironFS"? This is the first thing to know so
> I can start advocating towards my boss ;-)


Chiron {ky'-rahn}
The Greek 'chi' sounds almost exactly like 'key', as in 'Achilles'.


> 2. The file servers I'd like to combine into a ChironFS tree face only
> a small number of read and write requests during daily use. However,
> once in a while there may be file uploads of up to 8 GB in size. Is
> ChironFS prepared to handle reads and writes of this size?


For this, I'll wait for Luis Otavio... but in principle, ChironFS
doesn't deal with the files itself; it redirects operations to the real
filesystems in the replicas. Those filesystems will need to support
your demands for file size and responsiveness (in the case of a remote
replica, for example).


> 3. Do you deem ChironFS ready for production use in mission-critical
> environments? The file system would have to survive long-lasting
> uptimes and cope with some potential temporary network outages.



If you have a mission-critical backup solution running nicely... ;-)

Seriously, Luis shared some very impressive numbers from a
ChironFS site running on a *big* news portal in Brazil. My
guess is, if you carefully certify and test your solution design
(as you would with any 'mission-critical' solution), it will work.


> I hope to set up a test environment soon, so thanks in advance for any
> information!
>
> Cheers,
> --
> Daniel


best regards,
alexandre

--
Any technology distinguishable from magic is insufficiently advanced

Luis Furquim

Nov 13, 2008, 8:20:49 AM
to chironf...@googlegroups.com
Hello Daniel and list,

> I found out about ChironFS today and already must say that I feel
> excited about trying it. In terms of storing media content on two
> physically distant NFS file servers which have to be kept in sync,
> your file system seems to do exactly what my company needs.

Great!


> 1. How do you pronounce "ChironFS"? This is the first thing to know so
> I can start advocating towards my boss ;-)

Alexandre has explained the ChironFS pronunciation better than
I ever could!


> 2. The file servers I'd like to combine into a ChironFS tree face only
> a small number of read and write requests during daily use. However,
> once in a while there may be file uploads of up to 8 GB in size. Is
> ChironFS prepared to handle reads and writes of this size?

ChironFS uses 64-bit file pointers, so it can handle 8 GB files without
any problem. File size is not a problem anyway; massive parallel access
may be, in extreme cases. There is a big content provider here in Brazil
using ChironFS on many of its sites, and on one of them they had failures
when copying the entire site into the ChironFS tree using parallel copying
software on a machine with 8 cores and 32 GB of RAM. I provided them with
a version of ChironFS which runs in single-threaded mode, and the kickstart
ran well. After feeding the initial data (around tens of thousands of
files) they were instructed to switch back to the normal multi-threaded
version of ChironFS. I will publish a new version of ChironFS supporting
both threading modes and a new memory allocation method (I detected memory
conflicts with the FUSE library when the application layer used massive
threading) as soon as possible.
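The 64-bit file pointer point is easy to check on any filesystem: with
64-bit offsets you can seek well past the 4 GiB boundary that a 32-bit
pointer would impose. A minimal sketch in plain Python (run on a local
filesystem, not ChironFS itself; it assumes the filesystem supports
sparse files, so the 8 GiB file costs almost no real disk space):

```python
import os
import tempfile

OFFSET = 8 * 1024**3  # 8 GiB, well past the 4 GiB limit of 32-bit offsets

def write_past_4gib(path, offset=OFFSET):
    # Write a small marker at an 8 GiB offset. On a sparse-file-capable
    # filesystem the unwritten range takes no space on disk.
    with open(path, "wb") as f:
        f.seek(offset)
        f.write(b"end")
    # Read the marker back from the same offset.
    with open(path, "rb") as f:
        f.seek(offset)
        return os.path.getsize(path), f.read(3)

with tempfile.TemporaryDirectory() as d:
    size, marker = write_past_4gib(os.path.join(d, "big"))

print(size, marker)  # 8589934595 b'end'
```

If the same script succeeds on a ChironFS mount, the 8 GB uploads are
limited only by what the underlying replica filesystems support.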


> 3. Do you deem ChironFS ready for production use in mission-critical
> environments? The file system would have to survive long-lasting
> uptimes and cope with some potential temporary network outages.

As long as you do not have massive parallel access using the multi-threaded
version, I can't see any problem. If you need to run ChironFS in such an
environment, I advise you to wait for the next version, with the memory
issues solved. In any case, you may test it to see how it works for you.
Feel free to send me any bug reports; they are very useful for improving
ChironFS.

Network outages are not a problem; they are the reason why ChironFS
was born. It is designed to handle replica failures. Let's look at some
cases using your topology (2 distant NFS servers; I am assuming you're
not using any local copy):

1. If you temporarily lose access to 1 of the NFS servers and the other
   remains responsive and working, then:
   1.1 Any read access will make ChironFS elect one of the servers
       to serve the read;
       1.1.1 If ChironFS chooses the working server, everything works fine;
       1.1.2 If ChironFS chooses the unreachable server, it will detect
             the failure, report it in the log (if you used the --log
             option) and retry the read from the working server;
   1.2 Any write access must be done on all the replicas, so when ChironFS
       tries to write to the unreachable server, it will detect the
       failure, report it to the log *and disable* this replica. From then
       on, no read/write will be done from/to this replica, even after
       network access to it is restored. You will have to manually resync
       the failed replica before reenabling it in ChironFS;
2. If you temporarily lose access to both NFS servers, then your system
   will not work and the application will receive an I/O error, just as it
   would with any other filesystem. To avoid this, I suggest you use a
   local replica along with the other 2 distant NFS servers, so network
   outages will leave your replicas outdated, but your system will still
   be running. Using the local copy has the added advantage of improving
   read performance, because you can tell ChironFS to give read priority
   to the local copy.
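The failure semantics above (reads elect one replica and fall back on
error; a failed write disables the replica until it is manually resynced)
can be sketched as a toy model. This is purely an illustration of the
policy described in this thread, not ChironFS code; every name in it is
made up:

```python
import random

class ReplicaSet:
    """Toy model of the replication policy described above (illustrative only)."""

    def __init__(self, replicas):
        # replicas: name -> dict used as a simulated file store,
        # or None to simulate an unreachable server.
        self.replicas = dict(replicas)
        self.enabled = set(replicas)
        self.log = []

    def read(self, path):
        # Elect replicas in random order; on failure, log and try the next.
        for name in random.sample(sorted(self.enabled), len(self.enabled)):
            store = self.replicas[name]
            if store is None:  # simulated unreachable server
                self.log.append(f"read failed on {name}")
                continue
            return store.get(path)
        raise IOError("all replicas failed")

    def write(self, path, data):
        # Writes go to every enabled replica; a failing replica is
        # disabled and stays disabled until manually re-enabled.
        for name in sorted(self.enabled):
            store = self.replicas[name]
            if store is None:
                self.log.append(f"write failed on {name}: replica disabled")
                self.enabled.discard(name)
            else:
                store[path] = data
        if not self.enabled:
            raise IOError("all replicas failed")

rs = ReplicaSet({"nfs1": {}, "nfs2": None})  # nfs2 is "down"
rs.write("/a", b"hello")                     # write disables nfs2
print(sorted(rs.enabled))                    # ['nfs1']
print(rs.read("/a"))                         # b'hello'
```

Note the asymmetry the thread describes: a failed read only falls back,
while a failed write permanently disables the replica, since its contents
can no longer be trusted to match.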

Best Regards

--
Luis Otavio de Colla Furquim
Não alimente os pingos
Don't feed the tribbles
http://www.furquim.org/chironfs/

Daniel Werner

Nov 20, 2008, 7:18:28 AM
to chironf...@googlegroups.com
Hello Luis, Alex & list,

Thank you for your swift and comprehensive responses. I now have a use
case in mind which doesn't even involve adding any new machines.

Mounting both file servers on each and every client machine (mostly
on-demand streaming servers) would be the most straightforward option,
of course. This could be done on our Linux clients, but not on the
Windows 2003 Server clients we're bound to use for media streaming. So
using ChironFS for read operations is out of the question; a simple
NFS mount (through Services for Unix on Win2k3 machines) will have to
suffice. ChironFS' load balancing is not a big loss, since most of the
streaming data is cached at delivery anyway.

Writing to the filers is mostly done through two FTP servers, both
Linux machines, so I'd just have to link the remote NFS mounts
together via ChironFS on these machines and expose the resulting tree
through FTP.

Is it really that easy? :-)
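The setup described in the last two paragraphs can be sketched as a few
commands. All hostnames and paths here are made up, and the
`replica1=replica2 mountpoint` syntax is an assumption based on the
ChironFS manual rather than anything stated in this thread; only the
`--log` option is mentioned above, so check `chironfs --help` on your
version before relying on this:

```shell
# On each FTP server (Linux):

# 1. Mount both remote filers via NFS (hostnames/paths are hypothetical).
mount -t nfs filer1.example.com:/media /mnt/filer1
mount -t nfs filer2.example.com:/media /mnt/filer2

# 2. Join the two NFS mounts into one replicated tree with ChironFS,
#    logging replica failures (replica-list syntax assumed from the manual).
chironfs --log /var/log/chironfs.log /mnt/filer1=/mnt/filer2 /srv/ftp/media

# 3. Point the FTP server's root at /srv/ftp/media; every upload is then
#    written through ChironFS to both filers.
```

The streaming clients keep reading from a plain NFS mount of either
filer, so only the write path goes through ChironFS.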

Another, more complex idea I had before: on each of the two FreeBSD
file servers, we could mount the other filer via NFS. Stringing the
local storage and the NFS mount together via ChironFS would yield a
tree that writes to the local copy as well as to the redundant file
server. Reads would be satisfied by the local storage only.

There is a potential issue with this, however: clients writing to one
of the filers via NFS would effectively pass their requests through an
NFS -> ChironFS -> NFS chain. My colleagues seem to have had
universally bad experiences with re-exporting NFS mounts. Does the
ChironFS layer mitigate this in any way?

2008/11/13 Luis Furquim <luisf...@gmail.com>:


> 2. If you temporarily lose access to both NFS servers, then your system will
> not work and the application will receive an I/O error just as it would if
> using any other filesystem. To avoid it, I suggest you use a local replica
> along with the other 2 distant NFS servers, so network outages will make
> your replicas outdated, but your system will still be running.
> Using the local copy has the advantage of performance improvement in read operations
> because you can tell ChironFS to give read priority to the local copy.

I'm not sure if I understand this correctly. A complete local replica
on the client machines would require exactly as much storage as the
filer provides, and a means to continuously replicate data to the
local copy. Doing that would defeat the purpose of using a central
net-mounted storage, and use several TBs of disk space -- more than
any client machine has. Maybe I am missing something here.

Or are you referring to a file system based read cache? I did consider
accessing ChironFS through another layer of CacheFS which stores
cached copies of the most utilized files on the local disk.

In any case, I'll be grateful for any ideas and corrections you may
have regarding this setup!

Cheers,
--
Daniel
