Writing to a file slow

jord...@gmail.com

Jun 21, 2013, 3:11:05 PM
to golan...@googlegroups.com
Hi,

I'm evaluating Go as a substitute for Python for a task I need to do.
I'm reading from STDIN and saving the information to a file.
The following code takes around 3 secs to read 1 million lines from stdin and write them to a file using Go 1.1 (OS: Debian Linux):

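package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

func main() {
    f, err := os.Create("outputgo.txt")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    defer f.Close()

    reader := bufio.NewReader(os.Stdin)
    for {
        line, err := reader.ReadString('\n')
        // Write whatever was read, including a final line without '\n'.
        f.WriteString(line)
        if err != nil {
            if err != io.EOF {
                fmt.Fprintln(os.Stderr, err)
            }
            return
        }
    }
}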

In Python, it takes approx. 0.3 secs:

import sys

# Open for writing; each line read from stdin already ends with '\n'.
with open('outputpython.txt', 'w') as f:
    for line in sys.stdin:
        f.write(line)

Any explanation?

Thanks.

Rémy Oudompheng

Jun 21, 2013, 4:55:49 PM
to jord...@gmail.com, golan...@googlegroups.com
Maybe a few hints:
- do some profiling
- try the bufio.Scanner interface
- buffer the writing to file

Rémy.

2013/6/21, jord...@gmail.com <jord...@gmail.com>:
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>

Ian Lance Taylor

Jun 21, 2013, 4:57:23 PM
to jord...@gmail.com, golan...@googlegroups.com
On Fri, Jun 21, 2013 at 12:11 PM, <jord...@gmail.com> wrote:
>
> I'm evaluating Go as a substitute to Python for a task I need to do.
> I'm reading from STDIN and saving the information to a file.
> The following code takes around 3 secs to read 1million lines from stdin and
> writing them to a file using Go1.1 (OS Debian Linux)
>
> package main
> import "fmt"
> import "bufio"
> import "os"
>
> func main() {
> f, _ := os.Create("outputgo.txt")
> reader := bufio.NewReader(os.Stdin)
> for {
> line, err := reader.ReadString('\n')
> if err != nil {
> fmt.Println("%s", line)
> return
> }
> f.WriteString(line)
> }
> }

This code is going to do a lot of copying between []byte and string.
One thing to learn about Go is that when doing I/O, stick to []byte.

This is also, of course, an implausible way to copy a file, but
presumably you really do want to read lines for some reason rather
than simply reading buffers of data. To read lines in Go 1.1, you
should use bufio.NewScanner. See
http://golang.org/pkg/bufio/#example_Scanner_lines .

Ian

Rob Pike

Jun 21, 2013, 4:57:50 PM
to Rémy Oudompheng, jord...@gmail.com, golan...@googlegroups.com
It's almost all due to the lack of buffering, but ReadString is also
allocating unnecessarily.

What Rémy said.

-rob

Matthew Kane

Jun 21, 2013, 4:59:08 PM
to jord...@gmail.com, golang-nuts
You aren't buffering your output. Try wrapping f in a bufio.Writer.


--
matt kane
twitter: the_real_mkb / nynexrepublic
http://hydrogenproject.com

Rodrigo Kochenburger

Jun 21, 2013, 5:07:49 PM
to golan...@googlegroups.com
Also, how are you measuring the time?

jord...@gmail.com

Jun 21, 2013, 5:32:46 PM
to golan...@googlegroups.com
The buffering did the trick. Awesome! Thanks very much. 

For timing I was using the time command line utility. 

Rodrigo Kochenburger

Jun 21, 2013, 6:57:15 PM
to golan...@googlegroups.com
Not sure whether you were timing 'go run', which also compiles the program, or building with 'go build' and timing only the execution, but it's something to keep in mind :)