How to implement the Unix command 'tail' in Go


dlin

Apr 15, 2011, 1:26:01 AM
to golang-nuts
I'm trying to write the Unix command 'tail' in Go as an exercise.

But it stops working after it reads to the end of the file (EOF).


func file_reader(path string) {
	file, err := os.Open(path)
	if err != nil {
		log.Fatalln("Err:os.Open", err)
	}
	defer file.Close()

	if _, err = file.Seek(0, os.SEEK_END); err != nil {
		log.Fatalln("Err:file.Seek", err)
	}

	l := line.NewReader(bufio.NewReader(file), 80) // keep 10 lines
	for {
		line, isPrefix, err := l.ReadLine()
		if err != nil {
			if err == os.EOF {
				time.Sleep(1e9) // 1 s
				continue
			}
			log.Fatalln("ReadLine()", err)
		}
		if isPrefix {
			println(string(line))
		} else {
			println(string(line))
		}
	}
}

roger peppe

Apr 15, 2011, 4:25:21 AM
to dlin, golang-nuts
this is a good point. currently the error in bufio is "sticky" - i.e. if
you get an error once, no subsequent operation will succeed.

perhaps it could provide some way to reset the error,
although in your case (tail -f), i don't see that there's any particular
benefit in using bufio or the line reader - you could just
read from the end of the file directly.

peterGo

Apr 15, 2011, 9:19:09 AM
to golang-nuts
FYI

bufio: add ReadLine
http://code.google.com/p/go/source/detail?r=54b2790403aa

It matches encoding/line exactly and the tests are copied from there.
If we land this, then encoding/line will get marked as deprecated then
deleted in time.

Peter

roger peppe

Apr 15, 2011, 10:19:25 AM
to peterGo, golang-nuts
why do i get the impression that my posts here are vanishing?

ReadLine won't help this problem at all - the problem is that
once you've got an error from bufio.Reader (e.g. by reading to the
end of a file) you can't use that bufio.Reader to read any more.

dlin

Apr 15, 2011, 11:08:00 AM
to golang-nuts
Thanks Peter. I knew it was old code, and I tried 'gofix' to
automatically replace it with the new code, but gofix cannot do it.

But that is not the point of this problem. I need a way to
reset the error status after reading an EOF to work around it.

On Apr 15, 9:19 PM, peterGo <go.peter...@gmail.com> wrote:
> FYI
>
> bufio: add ReadLine
> http://code.google.com/p/go/source/detail?r=54b2790403aa

peterGo

Apr 15, 2011, 11:20:55 AM
to golang-nuts
roger,

On Apr 15, 10:19 am, roger peppe <rogpe...@gmail.com> wrote:
> why do i get the impression that my posts here are vanishing?

I don't know either! I see your posts.

> > On Apr 15, 1:26 am, dlin <dlin...@gmail.com> wrote:
> l := line.NewReader(bufio.NewReader(file), 80) // keep 10 lines

Since dlin's code was using the encoding/line package ReadLine
function, in my reply to his message, I pointed out that it's being
replaced by the new bufio.Readline function.

What has that got to do with your posts (not) vanishing?

Peter

On Apr 15, 10:19 am, roger peppe <rogpe...@gmail.com> wrote:
> why do i get the impression that my posts here are vanishing?
>
> ReadLine won't help this problem at all - the problem is that
> once you've got an error from bufio.Reader (e.g. by reading to the
> end of a file) you can't use that bufio.Reader to read any more.
>

chris dollin

Apr 15, 2011, 11:27:25 AM
to peterGo, golang-nuts
On 15 April 2011 16:20, peterGo <go.pe...@gmail.com> wrote:
> roger,
>
> On Apr 15, 10:19 am, roger peppe <rogpe...@gmail.com> wrote:
>> why do i get the impression that my posts here are vanishing?
>
> I don't know either! I see your posts.
>
>> > On Apr 15, 1:26 am, dlin <dlin...@gmail.com> wrote:
>>          l := line.NewReader(bufio.NewReader(file), 80) // keep 10 lines
>
> Since dlin's code was using the encoding/line package ReadLine
> function, in my reply to his message,  I pointed out that  it's being
> replaced by the new bufio.Readline function.
>
> What has that got to do with your posts (not) vanishing?

Because it doesn't matter where ReadLine lives: it doesn't solve
the problem, yet you replied as though it would. Roger wondered
where his post pointing out the irrelevance of ReadLine had
got lost as a result.

--
Chris "allusive" Dollin

dlin

Apr 15, 2011, 12:55:47 PM
to golang-nuts
I've solved this problem by adding this function to
$GOROOT/src/pkg/bufio/bufio.go.
But I need someone to commit the patch to the repository.

// ClearErr clears the stored error so the Reader can be used again.
// It's useful when implementing Unix 'tail -f'.
func (b *Reader) ClearErr() {
	b.err = nil
}


On Apr 15, 11:27 PM, chris dollin <ehog.he...@googlemail.com> wrote:

Russ Cox

Apr 15, 2011, 1:00:33 PM
to dlin, golang-nuts
I don't think this is the right fix.
bufio should just try again if the
known error is os.EOF.

Daniel Lin

Apr 15, 2011, 8:39:49 PM
to Russ Cox, golang-nuts
I agree; my dirty patch requires extra effort from the API user.

2011/4/16 Russ Cox <r...@google.com>

roger peppe

Apr 19, 2011, 2:50:34 AM
to Russ Cox, dlin, golang-nuts

i don't think that's right.

for other kinds of Reader, os.EOF is a permanent thing and
you don't want to call Read again after getting it.

moreover there are other kinds of errors that one might not wish to
persist - for example when using bufio over a network connection
with read timeouts.

ideally bufio could determine whether the error is temporary,
for instance by testing whether the error implements Temporary(),
but that can't work for os.EOF.

i think that providing ClearError might be the best solution here.

Morgaine

Apr 19, 2011, 4:02:15 AM
to roger peppe, Russ Cox, dlin, golang-nuts
Whether or not the API should contain a ClearError() or RetestEOF() or similar is an interesting question, but perhaps we should step back a moment and re-examine the requirement first before making an ad hoc fix.

The suggested solution is somewhat questionable because it "fixes" a perceived problem in a sleep-and-try again algorithm.  This is a terrible design pattern which, depending on the chosen sleep interval, results in either a long latency in response to file updates or else a high busy-wait CPU overhead owing to rapid polling.  The trade-off is terrible at both ends, and twice half-terrible in the middle.

A program written for production use would employ something like select() so that no CPU is used at all until the system wakes up the appropriate file descriptor and releases the blocked call.  Even if this is not used, I can't help but feel that goroutines ought to be providing us with a much cleaner abstraction to intermittent activity than reading until EOF and then changing the semantic of EOF.  It feels wrong.

While I appreciate that the purpose of the simple program was mainly tutorial, it may be highlighting an area that needs a little more thought rather than a quick fix, to capitalize on Go's strengths.


Morgaine.





Skip Tavakkolian

Apr 19, 2011, 4:05:31 AM
to dlin, golang-nuts
it could be argued that Reader should be renewed from file in this case.
like (untried):

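if err == os.EOF {
	// path is the name that was passed to os.Open; file is the open *os.File.
	oldstat, err := os.Stat(path)
	if err != nil {
		log.Fatalln("...")
	}
	newstat := oldstat
waitforchange:
	// Poll once a second until the file size changes.
	for newstat.Size == oldstat.Size {
		time.Sleep(1e9) // 1 s
		newstat, err = os.Stat(path)
		if err != nil {
			log.Fatalln("Stat")
		}
	}
	if newstat.Size < oldstat.Size {
		// The file was truncated: move to its new end and keep waiting.
		if _, err = file.Seek(0, os.SEEK_END); err != nil {
			log.Fatalln("Seek")
		}
		oldstat = newstat
		goto waitforchange
	}
	// The file grew: start over with a fresh Reader that has no stored error.
	l = line.NewReader(bufio.NewReader(file), 80) // keep 10 lines
}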

roger peppe

Apr 19, 2011, 5:21:39 AM
to Morgaine, Russ Cox, dlin, golang-nuts
On 19 April 2011 09:02, Morgaine <morgain...@googlemail.com> wrote:
> The suggested solution is somewhat questionable because it "fixes" a
> perceived problem in a sleep-and-try again algorithm.  This is a terrible
> design pattern which, depending on the chosen sleep interval, results in
> either a long latency in response to file updates or else a high busy-wait
> CPU overhead owing to rapid polling.  The trade-off is terrible at both
> ends, and twice half-terrible in the middle.
>
> A program written for production use would employ something like select() so
> that no CPU is used at all until the system wakes up the appropriate file
> descriptor and releases the blocked call.  Even if this is not used, I can't
> help but feel that goroutines ought to be providing us with a much cleaner
> abstraction to intermittent activity than reading until EOF and then
> changing the semantic of EOF.  It feels wrong.

as far as i'm aware, there's no universal substitute for sleep-and-try-again
when waiting for files to change. some platforms provide a notification
mechanism (e.g. os/inotify) but they're not portable and i doubt that
they work on network file systems.

for the particular problem of "implementing unix command 'tail'", i think
a hybrid approach would work best - use bufio to read lines, and then
read the file directly for the polling part (ignoring the fact that the tail
command reads backwards from the end of the file when possible, something
that bufio cannot do).

regardless of that, i still think that ClearError or support for Temporary
errors would be useful for the reasons i outlined above.

for the original poster: here's a version of tail.

// tail prints the last n lines of the file, and
// then polls waiting for more data to be written
// and printing it.
func tail(f *os.File, n int) {
	lines := make([]string, n) // circular buffer
	i := 0
	n = 0
	r := bufio.NewReader(f)
	w := bufio.NewWriter(os.Stdout)
	for {
		line, err := r.ReadString('\n')
		if len(line) > 0 {
			lines[i] = line
			i = (i + 1) % len(lines)
			if n < len(lines) {
				n++
			}
		}
		if err != nil {
			break
		}
	}
	for j := (i - n + len(lines)) % len(lines); n > 0; j, n = (j+1)%len(lines), n-1 {
		w.WriteString(lines[j])
	}
	w.Flush()
	buf := make([]byte, 8192)
	for {
		time.Sleep(1e9)
		for {
			n, err := f.Read(buf)
			if n > 0 {
				w.Write(buf[0:n])
			}
			if err != nil {
				break
			}
		}
		w.Flush()
		f.Seek(0, 2) // in case the file has been truncated.
	}
}

Morgaine

Apr 19, 2011, 10:46:06 AM
to roger peppe, Russ Cox, dlin, golang-nuts
Go has the powerful select statement for channels in its repertoire, which seems tailor-made for programming the kind of event-driven functionality required for "tail" using a native idiom.

Sleep-and-try-again is most definitely not idiomatic in a language that defines goroutines to allow blocked waiting at very low cost as a core goal.

Furthermore, polled solutions scale very poorly, because as the number of sources polled increases, you have to decrease the sleep time in order to keep the response latency within bounds.  This rapidly converges to a crunch point where the busy-wait consumes 100% CPU and each extra source that is added increases the response latency.  That's bad engineering.

The only reason the proposed solution seems viable in this example is because it is in response to an undemanding use case involving a single source.  Go needs to work well in more general scenarios though, highly concurrent and scalable ones for Internet services in particular, and polling won't get us there.

The choice of how EOF is handled is mostly just a matter of elegance, but the choice of polled versus event-driven has very negative consequences if you get it wrong.  I think it deserves highlighting.

While I desire effective EOF handling as much as anyone, I'm much more concerned that Go has strong support for scalability in the production environment, and that means strong concurrent event handling using Go's native idioms.


Morgaine.



Namegduf

Apr 19, 2011, 11:59:13 AM
to golan...@googlegroups.com
On Tue, 19 Apr 2011 15:46:06 +0100
Morgaine <morgain...@googlemail.com> wrote:

> Go has the powerful *select statement* for channels in its
> repertoire, which seems tailor-made for programming the kind of
> event-driven functionality required for "tail" using a native idiom.
>
> Sleep-and-try-again is most definitely not idiomatic in a language
> that defines goroutines to allow blocked waiting at very low cost as
> a core goal.

If the OS does not provide a source of event notifications, you cannot
use an event-driven approach. You cannot avoid polling. Using
os/inotify for notifications would be nice and idiomatic, but it is
unfortunately Linux-only. There is no portable equivalent.

The language can't change this; it could hide the polling but that
would be unpleasant and complex.

Morgaine

Apr 19, 2011, 4:34:00 PM
to Namegduf, golan...@googlegroups.com
You only need to poll on a platform that doesn't provide anything better.  Constraining all Go programmers to the level of efficiency of the weakest platform is not a recipe for wide success.

The language is intended to be portable, but that doesn't mean that every feature will be supported equally efficiently on all platforms.  That could never be the case anyway.  Platforms differ, machines differ, compilers differ, and so do system support libraries.  Each platform does the best it can.  And each of them evolves, there is no fixed point.

The question we have here is how to support event-driven operation where the platform allows it, while still keeping the language portable to less efficient platforms that don't yet support it.  Since Go already offers a select statement that provides the needed abstraction very nicely, this is not a huge leap.  It needn't involve differences in core runtime either, if a package can be written in C to present an event-based API in the form of channels for the select statement to use.

In addition, please keep in mind that while Linux provides inotify today and Windows/Mac may not, this does not mean that Windows/Mac won't provide a similar mechanism tomorrow.  The non-scalability of polling makes that very likely indeed, as the proprietary O/S manufacturers are regularly upgrading, and their customers don't like poor performance.

It's important to look ahead when designing a language, and the Go designers did exactly that when they added concurrency, multicore support, and channels after all.  And channels + Go's select statement are a great abstraction for handling events in a clean idiomatic way, so this seems like the right way to go.


Morgaine.




Jessta

Apr 19, 2011, 5:03:30 PM
to Morgaine, golan...@googlegroups.com
On Wed, Apr 20, 2011 at 6:34 AM, Morgaine
<morgain...@googlemail.com> wrote:
> You only need to poll on a platform that doesn't provide anything better.
> Constraining all Go programmers to the level of efficiency of the weakest
> platform is not a recipe for wide success.

Go implementations already use event-driven I/O operations to make I/O
more efficient under the hood. But it's just an
implementation detail and isn't relevant to the actual language.


> The language is intended to be portable, but that doesn't mean that every
> feature will be supported equally efficiently on all platforms.  That could
> never be the case anyway.  Platforms differ, machines differ, compilers
> differ, and so do system support libraries.  Each platform does the best it
> can.  And each of them evolves, there is no fixed point.

An implementation of 'tail' could use os/inotify to wait for
modification events on the file, but this program would only work on
systems that supported inotify. It depends on the specific
programmer's requirements; it's not a language issue.


- jessta
--
=====================
http://jessta.id.au

Morgaine

Apr 19, 2011, 10:14:02 PM
to Jessta, golan...@googlegroups.com
I certainly agree that it shouldn't affect the grammar of the language, and that's not required anyway because Go's select statement + channels provides a very nice API for waits that are blocked pending notification of a file update.

Back in August 2010, Russ wrote "I think inotify would belong in os/inotify and probably only be available on Linux."  That makes sense, but calling inotify(2) syscalls directly in a Go application program isn't idiomatic.  To make it more so, it would be nice to hide the inotify(2) calls in a package that provides a public channels-based interface so that an application could use idiomatic select statements on those channels instead.


Morgaine.

