Hi,
We use the following type of configuration when writing to HDFS files.
It is basically a buffer store with another buffer store as its primary.
We use two HDFS clusters on the same physical cluster to deal with the
problem of the namenode being a single point of failure.
With the replay_buffer=no configuration, scribe does not try to
transfer data from the secondary back to the primary when, say,
hdfs://dfsscribe3 comes back up after being down for a while.
So the behavior becomes: try writing to the first HDFS cluster; if
that fails, try writing to the second HDFS cluster; and if both fail,
buffer on local disk.
With this kind of setup you will have to set up the right copier
scripts to collect your data from the two logical clusters; there is a
sketch of one after the config below.
Hope that helps,
Gautam
port=1456
max_msg_per_second=1000000
check_interval=1
max_queue_size=100000000
num_thrift_server_threads=3
# DEFAULT
<store>
category=default
type=buffer
max_write_interval=1
retry_interval=120
buffer_send_rate=5
must_succeed=yes
     # Primary: nested buffer store spanning the two HDFS clusters
     <primary>
     type=buffer
     retry_interval=600
     # with no replay, data stays on whichever cluster received it
     replay_buffer=no
         <primary>
         type=file
         fs_type=hdfs
         file_path=hdfs://dfsscribe3:9000/user/scribe
         create_symlink=no
         use_hostname_sub_directory=yes
         base_filename=thisisoverwritten
         max_size=1000000000
         rotate_period=hourly
         add_newlines=1
         write_stats=no
         rotate_on_reopen=yes
         </primary>
         <secondary>
         type=file
         fs_type=hdfs
         file_path=hdfs://dfsscribe4:9000/user/scribe
         create_symlink=no
         use_hostname_sub_directory=yes
         base_filename=thisisoverwritten
         max_size=1000000000
         rotate_period=hourly
         add_newlines=1
         write_stats=no
         rotate_on_reopen=yes
         </secondary>
     </primary>
     # Secondary: local-disk buffer, used only when both HDFS clusters fail
     <secondary>
     type=file
     file_path=/mnt/d0/scribe
     base_filename=thisisoverwritten
     max_size=40000000
     </secondary>
</store>
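As for the copier: here is a minimal sketch in Java using the Hadoop
FileSystem API. The merged destination path (/user/scribe-merged), the
class name, and the lack of dedup handling are all placeholders you
would adapt to your own layout:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

// Collects scribe output from both logical clusters into one place.
// The cluster URIs match the config above; the merge target is made up.
public class ScribeCopier {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] sources = {
            "hdfs://dfsscribe3:9000/user/scribe",
            "hdfs://dfsscribe4:9000/user/scribe"
        };
        FileSystem dstFs = FileSystem.get(URI.create("hdfs://dfsscribe3:9000/"), conf);
        Path dst = new Path("/user/scribe-merged");

        for (String src : sources) {
            FileSystem srcFs = FileSystem.get(URI.create(src), conf);
            Path srcRoot = new Path(src);
            if (!srcFs.exists(srcRoot)) {
                continue; // that cluster may have received no data yet
            }
            for (FileStatus status : srcFs.listStatus(srcRoot)) {
                // deleteSource=false so a failed run can simply be rerun
                FileUtil.copy(srcFs, status.getPath(), dstFs, dst, false, conf);
            }
        }
    }
}

Since with replay_buffer=no a given hour's data can be split across the
two clusters, the copier has to read from both on every run.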
On 5/7/10 4:07 PM, Travis Crawford wrote:
> On Fri, May 7, 2010 at 8:09 AM, Wouter de Bie <pru...@gmail.com> wrote:
>> Hi all,
>>
>> We're currently having some problems when writing to HDFS if the
>> connection to the namenode becomes unavailable. hdfsWrite() always
>> returns the bytes written, even if it never actually wrote. The hdfs
>> client tries to reconnect and keeps retrying for 15 minutes (45
>> retries at 20s each). This is done in Client.java line 307:
>>
>> } catch (SocketTimeoutException toe) {
>>            /* The max number of retries is 45,
>>             * which amounts to 20s*45 = 15 minutes retries.
>>             */
>>            handleConnectionFailure(timeoutFailures++, 45, toe);
>> }
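(A side note on that retry loop: newer Hadoop builds expose the count
as the ipc.client.connect.max.retries.on.timeouts property; whether
your version honors it is worth verifying against its core-default.xml.
A minimal sketch of lowering it so the failure surfaces in seconds
rather than 15 minutes:

import org.apache.hadoop.conf.Configuration;

// Assumes a Hadoop build that reads this property; the hardcoded 45
// above predates it. Scribe writes through libhdfs, which loads its
// Configuration from the core-site.xml on the classpath, so in
// practice the property would go there rather than be set in code.
Configuration conf = new Configuration();
conf.setInt("ipc.client.connect.max.retries.on.timeouts", 3);

With the client giving up quickly, the buffer store sees the write
fail and can fall back to its secondary much sooner.)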
>>
>>
>> There is some code in hdfs.c that tries to catch an exception from the
>> java client, but it seems to never get that exception (or maybe only
>> after the 15 minutes of retries). This is in hdfs.c line 1005:
>>
>>        if (invokeMethod(env, NULL, &jExc, INSTANCE, jOutputStream,