
Imitating "tail -f"


Ivan Voras

Nov 21, 2009, 9:43:31 PM
I'm trying to simply imitate what "tail -f" does, i.e. read a file, wait
until it's appended to and process the new data, but apparently I'm
missing something.

The code is:

    import fcntl
    import os
    import select

    f = file(filename, "r", 1)
    f.seek(-1000, os.SEEK_END)
    ff = fcntl.fcntl(f.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(f.fileno(), fcntl.F_SETFL, ff | os.O_NONBLOCK)

    pe = select.poll()
    pe.register(f)
    while True:
        print repr(f.read())
        print pe.poll(1000)

The problem is: poll() always returns that the fd is ready (without
waiting), but read() always returns an empty string. Actually, it
doesn't matter if I turn O_NDELAY on or off. select() does the same.

Any advice?

exa...@twistedmatrix.com

Nov 21, 2009, 11:10:02 PM
to Ivan Voras, pytho...@python.org

select() and poll() don't support files (in the thing-on-a-filesystem
sense) at all. They just report the descriptor as readable and writable
all the time, regardless. epoll is stricter still: it refuses to
register a regular file (epoll_ctl fails with EPERM).

"tail -f" is implemented by sleeping a little bit and then reading to
see if there's anything new.

Jean-Paul

Matt Nordhoff

Nov 22, 2009, 1:32:27 AM
to Jason Sewall, pytho...@python.org
Jason Sewall wrote:
> FWIW, GNU tail on Linux uses inotify for tail -f:
>
> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tail.c
>
> The wikipedia page for inotify lists several python bindings:
>
> http://en.wikipedia.org/wiki/Inotify
>
> Not much help for non-Linux users, but there it is. Too bad, because
> inotify is pretty cool.
>
> Jason

Some other operating systems have similar facilities, e.g. FSEvents on OS X.
--
Matt Nordhoff

Wolodja Wentland

Nov 22, 2009, 4:11:44 AM
to pytho...@python.org
On Sun, Nov 22, 2009 at 03:43 +0100, Ivan Voras wrote:
> I'm trying to simply imitate what "tail -f" does, i.e. read a file, wait
> until it's appended to and process the new data, but apparently I'm
> missing something.
[..]
> Any advice?

Have a look at [1], which mimics "tail -f" perfectly. It comes from a
talk by David Beazley on generators which you can find at [2] and
[3].

Enjoy!

[1] http://www.dabeaz.com/generators/follow.py
[2] http://www.dabeaz.com/generators-uk/
[3] http://www.dabeaz.com/coroutines/

--
.''`. Wolodja Wentland <went...@cl.uni-heidelberg.de>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC


Paul Rudin

Nov 22, 2009, 7:20:45 AM
Matt Nordhoff <mnor...@mattnordhoff.com> writes:

Yeah, and there's a similar kind of thing in the Windows API.

A nice Python project would be a cross-platform solution that presented
a uniform API and just did the right thing behind the scenes on each OS.

(Incidentally, on Linux you need to watch out for the value of
/proc/sys/fs/inotify/max_user_watches: if you're using inotify in anger
it's easy to exceed the default set by a lot of distributions.)
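
A minimal sketch of checking that limit from Python before registering many watches. The helper name `inotify_max_user_watches` is made up for illustration, and the function simply returns None on systems where the /proc file does not exist (for example non-Linux platforms):

```python
from pathlib import Path

# Path named in the post above; only present on Linux kernels with inotify.
INOTIFY_MAX_WATCHES = Path("/proc/sys/fs/inotify/max_user_watches")


def inotify_max_user_watches(path=INOTIFY_MAX_WATCHES):
    """Return the per-user inotify watch limit, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a number: treat as "unknown".
        return None


if __name__ == "__main__":
    limit = inotify_max_user_watches()
    print("inotify watch limit:", "unknown" if limit is None else limit)
```

Comparing the returned value with the number of files or directories you intend to watch gives an early warning before inotify_add_watch() starts failing.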

Nobody

Nov 22, 2009, 3:35:07 PM
On Sun, 22 Nov 2009 03:43:31 +0100, Ivan Voras wrote:

> The problem is: poll() always returns that the fd is ready (without
> waiting), but read() always returns an empty string. Actually, it
> doesn't matter if I turn O_NDELAY on or off. select() does the same.

Regular files are always "ready" for read/write. read() might return EOF,
but it will never block (or fail with EAGAIN or EWOULDBLOCK).
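
The behaviour is easy to reproduce. The following is a small sketch (Python 3, POSIX only; the function name is invented for the demonstration) that positions a regular file at EOF and then asks select() whether it is readable. On Linux one would expect select() to report the file as ready immediately and read() to return an empty bytes object:

```python
import os
import select
import tempfile


def poll_regular_file_at_eof(timeout=1.0):
    """Return (is_ready, data) for a regular file whose offset is at EOF."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello\n")
        tmp.flush()
        with open(tmp.name, "rb") as f:
            f.seek(0, os.SEEK_END)  # nothing is left to read
            # select() does not wait for new data on a regular file.
            readable, _, _ = select.select([f], [], [], timeout)
            data = f.read()  # returns immediately with EOF
            return bool(readable), data


if __name__ == "__main__":
    print(poll_regular_file_at_eof())
```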

> Any advice?

The Linux version of "tail" uses the Linux-specific inotify_add_watch()
mechanism to block waiting for file-modification events.
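
As a rough illustration of that mechanism, here is a Linux-only sketch that calls the C library's inotify_init() and inotify_add_watch() through ctypes, since the Python standard library has no inotify wrapper. The function name `wait_for_modify` is hypothetical, and a real program would more likely use one of the third-party bindings mentioned earlier in the thread:

```python
import ctypes
import ctypes.util
import os
import select

IN_MODIFY = 0x00000002  # event mask value from <sys/inotify.h>


def wait_for_modify(path, timeout=None):
    """Block until `path` is modified; return False if `timeout` expires first."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(path), IN_MODIFY)
        if wd < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        # Unlike a regular file, the inotify descriptor really does block
        # in select() until an event is queued.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return False
        os.read(fd, 4096)  # drain the queued event record(s)
        return True
    finally:
        os.close(fd)
```

After wait_for_modify() returns True, the caller would read from its own file object to pick up whatever was appended.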

If you don't have access to inotify_add_watch(), you'll just have to keep
trying to read from the file, sleep()ing whenever you hit EOF so that you
don't tie up the system with a busy-wait.
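
A minimal Python 3 sketch of that read-then-sleep loop, written as a generator. The names `follow`, `poll_interval` and `max_idle_polls` are illustrative choices, not part of any library; `max_idle_polls` exists only so that a demonstration can terminate, and leaving it as None follows the file forever:

```python
import os
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield complete lines appended to `path`, sleeping whenever EOF is hit."""
    with open(path, "r") as f:
        f.seek(0, os.SEEK_END)  # start at the current end, like "tail -f"
        pending = ""            # holds a partially written line
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.readline()
            if not chunk:
                # EOF: sleep instead of busy-waiting, then try again.
                idle += 1
                time.sleep(poll_interval)
                continue
            idle = 0
            pending += chunk
            if pending.endswith("\n"):
                yield pending
                pending = ""
```

Typical use would be `for line in follow("/var/log/some.log"): ...`, where the path is of course just a placeholder. This sketch does not handle the file being truncated or rotated.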

Aahz

Nov 28, 2009, 11:29:00 PM
In article <mailman.801.1258871...@python.org>,

Matt Nordhoff <mnor...@mattnordhoff.com> wrote:
>Jason Sewall wrote:
>>
>> FWIW, GNU tail on Linux uses inotify for tail -f:
>
>Some other operating systems have similar facilities, e.g. FSEvents on OS X.

Having spent some time with FSEvents, I would not call it particularly
similar to inotify. FSEvents only works at the directory level. Someone
suggested pnotify the last time this subject came up, but I haven't had
time to try it out.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.

Paul Boddie

Nov 30, 2009, 6:15:18 AM
On 22 Nov, 05:10, exar...@twistedmatrix.com wrote:
>
> "tail -f" is implemented by sleeping a little bit and then reading to
> see if there's anything new.

This was the apparent assertion behind the "99 Bottles" concurrency
example:

http://wiki.python.org/moin/Concurrency/99Bottles

However, as I pointed out (and as others have pointed out here), a
realistic emulation of "tail -f" would actually involve handling
events from operating system mechanisms. Here's the exchange I had at
the time:

http://wiki.python.org/moin/Concurrency/99Bottles?action=diff&rev2=12&rev1=11

It can be very tricky to think up good examples of multiprocessing
(which is what the above page was presumably intended to investigate),
as opposed to concurrency (which can quite easily encompass responding
to events asynchronously in a single process).

Paul

P.S. What's Twisted's story on multiprocessing support? In my limited
experience, the bulk of the work in providing usable multiprocessing
solutions is in the communications handling, which is something
Twisted should do very well.

exa...@twistedmatrix.com

Nov 30, 2009, 9:36:38 AM
to Paul Boddie, pytho...@python.org

Twisted includes a primitive API for launching and controlling child
processes, reactor.spawnProcess. It also has several higher-level APIs
built on top of this aimed at making certain common tasks more
convenient. There is also a third-party project called Ampoule which
provides a process pool to which it is relatively straightforward to
send jobs and then collect their results.

Jean-Paul
