reading long text without newline from stdin


vend...@gmail.com

Sep 9, 2016, 11:43:51 AM
to golang-nuts
Hi,

I was trying to read a long text from stdin that contains no newline. I tried several approaches: fmt.Scan, bufio.NewScanner, bufio's ReadLine, and ioutil.ReadAll. Sample code looks like this:
stackoverflow.com/questions/27196195/golang-read-from-pipe-reads-tons-of-data#answer-27196786
Whenever the input was even slightly longer than 4096 characters, the result was always the same: it was cut off at 4096 characters.

When I saved the text to a file and read it from there, every method worked. When I changed the spaces to newlines, every method worked from stdin too. So only this combination fails. It looks like a bug to me.

I wanted to use it for hackerrank.com exercises, where this is the typical input: one long line given through stdin, without a newline.

Can anyone help me please?

Ian Lance Taylor

Sep 9, 2016, 12:45:24 PM
to vend...@gmail.com, golang-nuts
Show us the exact code you are running, and tell us about the system
on which you are running it.

Ian

vend...@gmail.com

Sep 9, 2016, 3:37:11 PM
to golang-nuts, vend...@gmail.com


The code is almost the same as the example I linked; I only changed the log calls to fmt prints and the buffer size from 4*1024 to 8*1024. When I run it and enter a 4900-character string without a newline, it waits for more input. Even if I end the input with EOF (Ctrl+D) immediately, it prints: Bytes: 4096 Chunks: 1.

I'm on Arch Linux, 32-bit, with the latest Go:
$ go version
go version go1.7.1 linux/386

The code:

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

func main() {
    nBytes, nChunks := int64(0), int64(0)
    r := bufio.NewReader(os.Stdin)
    buf := make([]byte, 8*1024)
    for {
        // Read returns at most len(buf) bytes; short reads are normal.
        n, err := r.Read(buf)
        if n > 0 {
            // Count only chunks that actually carry data.
            nChunks++
            nBytes += int64(n)
            // process buf[:n]
        }
        if err == io.EOF {
            break
        }
        if err != nil {
            // Stop on a real error instead of looping forever.
            fmt.Println(err)
            break
        }
    }
    fmt.Println("Bytes:", nBytes, "Chunks:", nChunks)
}

Ian Lance Taylor

Sep 9, 2016, 3:51:05 PM
to vend...@gmail.com, golang-nuts
You are just typing the string on a terminal? I'm pretty sure that
you are running into the terminal input buffer size, which is 4096 on
GNU/Linux (look for N_TTY_BUF_SIZE in the kernel sources). That is
how many characters you can type on a terminal in canonical mode
before hitting a newline. This has nothing to do with Go. I expect
that you will see the same behavior with the cat program.

Ian

vend...@gmail.com

Sep 15, 2016, 7:15:56 AM
to golang-nuts, vend...@gmail.com

Hi,

Thanks for the clarification. I checked a couple of things, and now I'm pretty sure you're right. For exact proof I wanted to check N_TTY_BUF_SIZE, but I wasn't able to find it. Anyway, when I solved the same exercise in Python, it passed all test cases on hackerrank, including the long one. And when I ran the Python version on my local machine, it failed the same way as the Go version. So I think there is a bug in how hackerrank evaluates Go. I have already written to them about the possible bug.
Thanks again.
