Does anyone have any clue how I can configure scribe to act properly
in this scenario? Should I log to the machine hosting the NFS-mounted
drive as my primary, and have that machine run another scribe
instance that relays the messages to the NFS drive?
Thanks,
-Harish
port=1463
max_msg_per_second=2000000
check_interval=3
# DEFAULT - forward all messages to Scribe on port 1463
<store>
category=default
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=1
retry_interval=30
retry_interval_range=10
<primary>
type=file
fs_type=std
file_path=/data/var/log
base_filename=thisisoverwritten
rotate_period=daily
rotate_hour=0
rotate_minute=0
</primary>
<secondary>
type=file
fs_type=std
file_path=/var/log/scribe
base_filename=thisisoverwritten
max_size=3000000
</secondary>
</store>
What do the print messages from the shell you run scribed from tell you?
Best,
Gautam
1. The permissions should be fine - when I've tested the failover
before without NFS (for instance, setting a primary store to a
directory which doesn't exist), the messages get logged to the
secondary store directory. I believe I've also tested shutting down
NFS gracefully (will do so again tonight), and messages were logged
to the secondary store. The specific scenario in which messages don't
get logged to the secondary (and are lost forever) is when NFS goes
away unexpectedly because the hosting machine goes down.
2. The failure occurred over a span of about eight hours on a server
running Apache, so the log volume definitely would have exceeded
20480 bytes.
I'm using a script to run scribed in daemon mode, so I'm not sure how
to examine logged messages. Is it not advisable to run scribed in
this way?
Thanks,
-Harish
There is nothing wrong with running scribed in daemon mode. We use
LOG_OPER (defined in env_default.h) in a bunch of places in the source
code to provide messages that help with debugging when there is a
problem. These messages are sent to stderr. If you run scribed from a
shell you should see them printed. When daemonizing, look wherever
stderr is redirected to.
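
If the wrapper script doesn't redirect stderr anywhere, those messages
are probably being thrown away. A minimal sketch of a launcher that
appends stderr to a file is below. The function name start_logged and
the log path are made up for illustration, and the scribed invocation
in the comment just mirrors the "scribed <config file>" form from the
examples directory, so adjust it to however your script starts scribed.

```shell
#!/bin/sh
# Hypothetical launcher: run a command in the background with its stderr
# appended to a log file, then print the PID of the background process.
start_logged() {
    errlog="$1"
    shift
    # stdout is discarded; only stderr (where LOG_OPER output goes) is kept.
    "$@" >/dev/null 2>>"$errlog" &
    echo "$!"
}

# Example (paths are assumptions, adjust to your install):
#   start_logged /var/log/scribe/scribed.err ../src/scribed example1.conf
```

You can then tail the error file while reproducing the outage to see
what scribed reports about the primary store.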
Best,
Gautam
Here is a related question - when using scribe over NFS, should I have
the client mount the NFS drive in 'soft' failure mode? I'm not very
familiar with NFS and have left most things to their default values.
I'm now reading about hard vs. soft failure mode and it seems like I
should be using the 'soft' mode, but it would be great to get some
advice from someone more experienced than me.
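
For context on hard vs. soft: with a 'hard' mount (the usual default)
the NFS client retries I/O to an unreachable server indefinitely, so a
write blocks instead of returning an error. That would be consistent
with scribe never failing over when the NFS host crashes, though it is
only a likely explanation, not something confirmed from the scribe
side. A minimal sketch for checking whether the primary path is
hanging rather than failing is below. It assumes GNU coreutils
'timeout' and 'stat' are available; the function name check_store_path
and the 5 second limit are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory answers a stat() call
# within a time limit. A path on a hung hard-mounted NFS share usually
# blocks here instead of returning an error.
check_store_path() {
    dir="$1"
    limit="${2:-5}"   # seconds; arbitrary default chosen for this sketch

    # SIGKILL is used because a process blocked on NFS I/O may not
    # respond to the default SIGTERM.
    timeout -s KILL "$limit" stat "$dir" >/dev/null 2>&1
    rc=$?

    case "$rc" in
        0)       echo "responsive" ;;
        124|137) echo "hung" ;;    # timed out (137 = killed by SIGKILL)
        *)       echo "error" ;;   # stat failed quickly, e.g. missing directory
    esac
}

# The path below is the primary store's file_path from the config above.
check_store_path /data/var/log
```

If this reports "hung" during an outage, the write calls from scribed
are most likely blocked in the same way.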
You will see log messages, something like:
[~/Code/scribe/examples]$ ../src/scribed example1.conf
[Tue Feb 9 23:11:02 2010] "setrlimit error (setting max fd size)"
[Tue Feb 9 23:11:02 2010] "STATUS: STARTING"
[Tue Feb 9 23:11:02 2010] "STATUS: configuring"
[Tue Feb 9 23:11:02 2010] "got configuration data from file <example1.conf>"
[Tue Feb 9 23:11:02 2010] "CATEGORY : default"
[Tue Feb 9 23:11:02 2010] "Creating default store"
[Tue Feb 9 23:11:02 2010] "configured <1> stores"
[Tue Feb 9 23:11:02 2010] "STATUS: "
[Tue Feb 9 23:11:02 2010] "STATUS: ALIVE"
[Tue Feb 9 23:11:02 2010] "Starting scribe server on port 1463"
Thrift: Tue Feb 9 23:11:02 2010 libevent 2.0.3-alpha method kqueue
There will be more output when messages are received/forwarded.
Best,
Gautam
On 2/9/10 5:41 AM, harish wrote:
I'll try to restart scribed in a terminal when I have a moment and see
what gets printed out during an nfs failure - I don't have a great
development environment in which to test things though.
In the meantime, I went ahead and tested soft failure mode on the nfs
client and scribe is now behaving as expected (failing over to a local
drive when nfs goes down). While this fixes the problem, I am still
worried that I'm setting myself up for larger problems down the line -
would be great to hear from any scribe over nfs users out there with
suggestions on how best to configure scribe and an nfs client to work
well with each other during an nfs outage.
-Harish
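
For reference, a sketch of what a soft mount with explicit timeouts
might look like. The server name, export path, and option values are
placeholders, not recommendations, and the mount point is assumed to
be the primary store's file_path. 'soft', 'timeo' and 'retrans' are
standard NFS mount options described in nfs(5). That man page also
warns that soft mounts can cause silent data corruption in certain
cases, so this trades some data integrity for client responsiveness.
The script only prints the command instead of running it.

```shell
#!/bin/sh
# Placeholders for illustration only. The server and export are invented;
# the mount point is assumed to match file_path in the primary store.
NFS_SERVER="nfs-host.example.com"
NFS_EXPORT="/export/scribe"
MOUNT_POINT="/data/var/log"

# soft      : return an error to the application once retries are exhausted
# timeo=50  : wait 5 seconds (units are tenths of a second) before retrying
# retrans=3 : number of retries before the client gives up on the request
MOUNT_OPTS="soft,timeo=50,retrans=3"

build_mount_cmd() {
    echo "mount -t nfs -o $MOUNT_OPTS $NFS_SERVER:$NFS_EXPORT $MOUNT_POINT"
}

# Dry run: print the command so it can be reviewed before running it as root.
build_mount_cmd
```

Shorter timeo/retrans values make the failover to the secondary store
kick in sooner, at the cost of more spurious errors on a slow network.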