tail() that keeps track of position in file and restarts from that position

NerdyNick

Jan 14, 2011, 1:46:19 PM
to Flume Users
Is there another source like tail() that will pick up where it left
off if the agent happens to die? For example, the agent is tailing
Apache logs and dies partway through while Apache keeps writing. I
need to be able to bring the agent back online without missing any
events or resending events that were already sent.

Is this possible or would this need to be a new feature source?

--
Nick Verbeck - NerdyNick
----------------------------------------------------
NerdyNick.com
Coloco.ubuntu-rocks.org

Jonathan Hsieh

Jan 14, 2011, 2:08:05 PM
to NerdyNick, Flume Users
Nick,

This isn't available out-of-the-box currently and my first thought is that this is a new feature.  

My worry with this is what happens if the agent is down for a while and the file it was following has been moved or renamed. One approach may be to save off some checksum/offset data periodically. To recover, we could re-read the file but not resend the data if the checksum matches.
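
A minimal sketch of the checksum/offset idea, in Python rather than Flume's own code. The function names, the JSON state file, and the choice of SHA-256 are all assumptions made for illustration; nothing here is an existing Flume API. It saves the offset already sent together with a checksum of the bytes before it, and on restart resumes from that offset only if the file still starts with the same bytes.

```python
import hashlib
import json
import os


def save_checkpoint(log_path, offset, state_path):
    """Record the offset already sent and a checksum of the bytes before it."""
    with open(log_path, "rb") as f:
        digest = hashlib.sha256(f.read(offset)).hexdigest()
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"offset": offset, "sha256": digest}, f)
    # Rename into place so a crash never leaves a half-written state file.
    os.replace(tmp_path, state_path)


def resume_offset(log_path, state_path):
    """Return the offset to resume from, or 0 if the checkpoint does not apply."""
    try:
        with open(state_path) as f:
            state = json.load(f)
    except (FileNotFoundError, ValueError):
        return 0  # no usable checkpoint: start from the beginning
    with open(log_path, "rb") as f:
        prefix = f.read(state["offset"])  # re-read, but do not resend
    same_length = len(prefix) == state["offset"]
    if same_length and hashlib.sha256(prefix).hexdigest() == state["sha256"]:
        return state["offset"]
    return 0  # file was replaced or truncated: treat it as a new file
```

Hashing the whole prefix is expensive for large logs; a real implementation would more likely checksum a bounded window of bytes around the offset.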

Alternatively, there was another proposal that roughly boils down to the ability to change a sink without changing the source (and thus losing its state): https://issues.cloudera.org/browse/FLUME-463

Jon.
--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera

NerdyNick

Jan 14, 2011, 8:06:45 PM
to Jonathan Hsieh, Flume Users
I did come across the following issue,
https://issues.cloudera.org/browse/FLUME-457, which seems to propose
what I was thinking. I added some comments with ideas on how to
implement it, but basically all you really need is the inode ID and
the last read offset, and you can pick up where you left off. When
you first start up, you compare the saved inode ID to the inode of
the file name you're expecting. If they don't match, finish sending
the old file, then go get the new file you're supposed to be following.

The only downfall is what happens if logrotate rolls the log twice
while Flume is down: you wouldn't be able to get the log file in the
middle. In those cases you would need to intervene manually, but at
least it's a whole lot easier.
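
The inode-plus-offset recovery described above could be sketched roughly as follows. This is an illustration in Python, not the design from FLUME-457: `plan_recovery` and its parameters are hypothetical names, and the caller is assumed to have persisted the inode and offset before the agent died.

```python
import os


def plan_recovery(saved_inode, saved_offset, live_path, rotated_paths):
    """Return the (path, start_offset) pairs to read, in order."""
    # Same inode under the expected name: no rotation happened, just resume.
    if os.stat(live_path).st_ino == saved_inode:
        return [(live_path, saved_offset)]

    plan = []
    # The log was rotated: finish the old file first, if it can still be found.
    for path in rotated_paths:
        if os.stat(path).st_ino == saved_inode:
            plan.append((path, saved_offset))
            break
    # Then follow the new file under the expected name from its beginning.
    plan.append((live_path, 0))
    return plan
```

If the log was rotated twice while the agent was down, this plan skips the file in the middle, which is the case that would need manual intervention. Python exposes `st_ino` through `os.stat`; Java code at the time would have needed JNI or an external call to get the same value.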

Jonathan Hsieh

Jan 15, 2011, 4:40:40 AM
to NerdyNick, Flume Users
Nick,

The catch with the inode approach is that it may become platform/architecture/OS specific (either via JNI or some Unix-specific exec call). A big feature of the next release is Windows support!

Basically, if the inode approach is taken for Linux-like systems, I think I'm fine with it as long as it doesn't break the Windows version and doesn't make the Windows version worse.

Jon.

Otis

Jan 15, 2011, 5:01:36 AM
to Flume Users
Hi,

Shouldn't it be possible to configure Flume Agent to "go check those
other (older/rotated/compressed) files for data in xyz dir if the
current log file doesn't seem to be the one you were reading just
before you went down"?

Otis
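
One way to read this suggestion, sketched in Python under several assumptions: the rotated files sit in the same directory and match a glob pattern, modification time reflects rotation order, and the file being read before the crash is recognized by a saved inode number. The name `files_to_replay` is hypothetical, and compressed files are not handled, since compressing a log typically produces a new file with a different inode.

```python
import glob
import os


def files_to_replay(log_dir, pattern, saved_inode):
    """List the files to read, oldest first, starting at the one last being read."""
    candidates = sorted(
        glob.glob(os.path.join(log_dir, pattern)),
        key=os.path.getmtime,  # assumption: older mtime means rotated earlier
    )
    for i, path in enumerate(candidates):
        if os.stat(path).st_ino == saved_inode:
            # Resume this file at the saved offset, read the later ones in full.
            return candidates[i:]
    # The file being read is gone (deleted or compressed): let the caller decide.
    raise LookupError("no file in %s matches the saved inode" % log_dir)
```

Because every file newer than the matched one is returned, a log that was rotated more than once while the agent was down is still read in order.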
