ioutil.ReadAll(resp.Body) blocks when downloading a SUPER LARGE file at a VERY SLOW download speed


Zhai Xiang

Nov 27, 2013, 5:15:50 AM
to golan...@googlegroups.com
Hi GoLangers,

I use Axel (which tries to accelerate the HTTP/FTP downloading process by using multiple connections for one file) under Linux http://axel.alioth.debian.org/
and tried porting it to WIN32 http://www.codeproject.com/Articles/335690/MultiThread-Download-Accelerator-Console
But here comes GoLang: it provides wide operating system support, goroutines, network I/O ... so I set up the GoAxel project https://github.com/xiangzhai/goaxel

For the HTTP protocol, the http package already provides HTTP client and server implementations in GoLang, but ioutil.ReadAll(resp.Body) will be blocked when downloading a SUPER LARGE file at a VERY SLOW download speed https://github.com/tuxcanfly/godown/blob/master/godown.go#L68
For example, set limit_rate 10k in the nginx conf, restart the service, then run @tuxcanfly's GoDown example with go run godown http://localhost/SUPER_LARGE_FILE; it will block when reading the Body io.ReadCloser.
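The throttling setup looks roughly like this in the nginx configuration (an illustrative location block, just to reproduce the slow-download case):

location / {
    # Throttle every response to 10 KB/s so the blocking is easy to observe.
    limit_rate 10k;
}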

So I wrote my own conn package to handle the net.Conn connection: Write the HTTP header to it, then Read from net.Conn in a for loop https://github.com/xiangzhai/goaxel/blob/master/conn/http.go#L129
Could someone please show me whether or not the http package is able to handle SUPER LARGE file downloads? Thanks a lot!

PS: GoLang is cool: less source code, but more features. It saved my keyboard; I do not need to type huge amounts of C/C++ source code any more :)

Jesse McNelis

Nov 27, 2013, 5:30:09 AM
to Zhai Xiang, golang-nuts
On Wed, Nov 27, 2013 at 9:15 PM, Zhai Xiang <xiang...@gmail.com> wrote:
For the HTTP protocol, the http package already provides HTTP client and server implementations in GoLang, but ioutil.ReadAll(resp.Body) will be blocked when downloading a SUPER LARGE file at a VERY SLOW download speed https://github.com/tuxcanfly/godown/blob/master/godown.go#L68

Don't use ioutil.ReadAll(); it will allocate space in memory for the whole of resp.Body. This will use a lot of memory for a large file, and if the file is large enough you might run out of memory.

If you want to handle large input then you shouldn't use ioutil.ReadAll(); you should stream resp.Body to disk by creating a file with os.Create() and using io.Copy(file, resp.Body).
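A minimal sketch of that approach (the download helper, its arguments, and the URL are illustrative):

package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

// download streams the response body straight to disk, so memory use
// stays constant regardless of how large the file is.
func download(url, path string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	// io.Copy reads and writes in fixed-size chunks; unlike
	// ioutil.ReadAll it never holds the whole body in memory.
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	if err := download("http://localhost/SUPER_LARGE_FILE", "output"); err != nil {
		log.Fatal(err)
	}
}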
 
--
=====================
http://jessta.id.au

Leslie Zhai

Nov 27, 2013, 8:13:26 PM
to Jesse McNelis, golang-nuts
Hi Jesse,

Thanks for your reply :)

I tried io.Copy(file, resp.Body); it might be blocked too when the download speed is limited by setting limit_rate 10k in nginx's configuration.
So I use net.Conn.Read(data []byte) in a for loop https://github.com/xiangzhai/goaxel/blob/master/conn/http.go#L129

Leslie

Dave Cheney

Nov 27, 2013, 8:16:26 PM
to Leslie Zhai, Jesse McNelis, golang-nuts
You need to re-read the contract for io.Reader; specifically, your code
may lose the last block of data read.

Leslie Zhai

Nov 27, 2013, 8:41:18 PM
to Dave Cheney, Jesse McNelis, golang-nuts
Hi Dave,

I use net.Conn.Read(data []byte) in a for loop, such as
for {
	data := make([]byte, buffer_size)
	n, err := http.conn.Read(data)
	if err != nil {
		return
	}
	f.WriteAt(data[:n], int64(http.offset))
	if http.Callback != nil {
		http.Callback(n)
	}
	http.offset += n
}
And I tested it with 3 connections, with and without a limited download speed; net.Conn.Read() in a for loop worked well :)

As for io.Copy(chunkFile, resp.Body): because it uses multiple connections for one file, it needs to create several chunk files (depending on the connection number), and when the download finishes, the chunks are combined into one output file https://github.com/xiangzhai/goaxel/blob/25e604de2e0c016bf8bd759f7e7bcd8468b9cf5e/goaxel.go#L119
If I io.Copy(outputFile, resp.Body) directly without chunk files, the goroutines might io.Copy into the output file at the wrong file position; for example, for a downloaded PNG image, the output file might be disordered.

Leslie

Dave Cheney

Nov 27, 2013, 8:45:17 PM
to Leslie Zhai, Jesse McNelis, golang-nuts
On Thu, Nov 28, 2013 at 12:41 PM, Leslie Zhai <xiang...@gmail.com> wrote:
> Hi Dave,
>
> I use net.Conn.Read(data []byte) in a for loop, such as
> for {
> 	data := make([]byte, buffer_size)
> 	n, err := http.conn.Read(data)
> 	if err != nil {
> 		return
> 	}
> 	f.WriteAt(data[:n], int64(http.offset))
> 	if http.Callback != nil {
> 		http.Callback(n)
> 	}
> 	http.offset += n
> }
> And I tested it with 3 connections, with and without a limited download
> speed; net.Conn.Read() in a for loop worked well :)
>
> As for io.Copy(chunkFile, resp.Body): because it uses multiple connections
> for one file, it needs to create several chunk files (depending on the
> connection number), and when the download finishes, the chunks are combined
> into one output file
> https://github.com/xiangzhai/goaxel/blob/25e604de2e0c016bf8bd759f7e7bcd8468b9cf5e/goaxel.go#L119
> If I io.Copy(outputFile, resp.Body) directly without chunk files, the
> goroutines might io.Copy into the output file at the wrong file position;
> for example, for a downloaded PNG image, the output file might be disordered.

Use an io.LimitedReader
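For instance, something along these lines (a hypothetical sketch assuming the io and os imports; downloadChunk and its parameters are illustrative names, and each connection is assumed to have opened the output file for itself):

// downloadChunk copies at most chunkSize bytes of body into f,
// starting at this connection's own offset.
func downloadChunk(f *os.File, body io.Reader, offset, chunkSize int64) error {
	// Position this connection's writes at its own region of the file.
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return err
	}
	// io.LimitReader reports EOF after chunkSize bytes, so io.Copy
	// cannot write past this chunk's region even if the server sends more.
	_, err := io.Copy(f, io.LimitReader(body, chunkSize))
	return err
}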

Nigel Tao

Nov 27, 2013, 9:54:04 PM
to Leslie Zhai, Dave Cheney, Jesse McNelis, golang-nuts
On Thu, Nov 28, 2013 at 12:41 PM, Leslie Zhai <xiang...@gmail.com> wrote:
> for {
> data := make([]byte, buffer_size)
> n, err := http.conn.Read(data)
> etc
> }

If you insist on doing it this way, at least re-use the buffer instead
of creating garbage on every loop iteration. Do this:

data := make([]byte, bufferSize)
for {
	n, err := http.conn.Read(data)
	etc
}

instead of

for {
	data := make([]byte, bufferSize)
	n, err := http.conn.Read(data)
	etc
}

but Dave Cheney is right, and you should read
http://golang.org/pkg/io/#Reader carefully.

I also think that you may be able to open the file multiple times, one
for each connection, and use a single File.Seek followed by an
io.Copy, instead of rolling your own with File.WriteAt.

Or if you insist on staging the download in chunk files, concatenating
the chunks into the destination file (your writeChunk function) can be
a series of io.Copy calls instead of rolling your own inner for loop.
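That concatenation step could look roughly like this (a hypothetical sketch assuming the io and os imports; combineChunks and chunkPaths are illustrative names):

// combineChunks appends each chunk file to dst in order.
func combineChunks(dst *os.File, chunkPaths []string) error {
	for _, p := range chunkPaths {
		chunk, err := os.Open(p)
		if err != nil {
			return err
		}
		// io.Copy streams the whole chunk into the destination;
		// no hand-rolled read/write loop is needed.
		_, err = io.Copy(dst, chunk)
		chunk.Close()
		if err != nil {
			return err
		}
	}
	return nil
}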

Leslie Zhai

Nov 27, 2013, 10:07:52 PM
to Nigel Tao, Dave Cheney, Jesse McNelis, golang-nuts
Hi Nigel,

Thanks for your advice; I reduced the number of byte-array allocations :) https://github.com/xiangzhai/goaxel/blob/master/conn/http.go#L128

I chose outputFile.WriteAt(data[:n], int64(http.offset)) now, writing the data directly at its offset in the output file; it does not need chunk files any more.

Leslie

Jesse McNelis

Nov 27, 2013, 10:16:18 PM
to Leslie Zhai, Nigel Tao, Dave Cheney, golang-nuts
On Thu, Nov 28, 2013 at 2:07 PM, Leslie Zhai <xiang...@gmail.com> wrote:
Hi Nigel,

Thanks for your advice; I reduced the number of byte-array allocations :) https://github.com/xiangzhai/goaxel/blob/master/conn/http.go#L128

I chose outputFile.WriteAt(data[:n], int64(http.offset)) now, writing the data directly at its offset in the output file; it does not need chunk files any more

n, err := http.conn.Read(data)
if err != nil {
  return
}
You're still not handling the Read() correctly.
If 'err' is io.EOF, 'n' could be more than zero, and you'd throw away the end of the data.
You should check the value of 'n' before looking at the error value.

You should read the docs for io.Reader, http://golang.org/pkg/io/#Reader
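Put together, the loop would look something like this (a sketch with the n-before-err handling applied, assuming the io, net, and os imports; copyAt and its parameters are illustrative):

func copyAt(conn net.Conn, f *os.File, offset int64) error {
	data := make([]byte, 32*1024)
	for {
		n, err := conn.Read(data)
		if n > 0 {
			// Use the n bytes first: a Read may return data
			// together with io.EOF.
			if _, werr := f.WriteAt(data[:n], offset); werr != nil {
				return werr
			}
			offset += int64(n)
		}
		if err == io.EOF {
			return nil // clean end of stream; the last block was kept
		}
		if err != nil {
			return err
		}
	}
}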

Leslie Zhai

Nov 27, 2013, 10:17:08 PM
to Dave Cheney, Jesse McNelis, golang-nuts
Hi Dave,

Thanks for your advice; it is helping me learn more of GoLang's APIs :)

I tried io.LimitReader in a for loop https://github.com/xiangzhai/goaxel/blob/master/test/http.go#L71
for {
	lr := io.LimitReader(resp.Body, buffer_size)
	n, err := io.ReadAtLeast(lr, data, int(buffer_size))
	if err != nil {
		return
	}
	f.WriteAt(data, int64(h.offset))
	if h.Callback != nil {
		h.Callback(n)
	}
	h.offset += n
}
But if buffer_size was 102400, io.LimitReader might block (I hope that is just my misunderstanding of io.LimitReader); after setting buffer_size to 10240, it worked...
And the output file differed from the original one; run the command diff -a output /usr/share/nginx/html/original in my Linux box. It is possible to copy test/http.go to conn/http.go and run the test with go run goaxel.go http://localhost/original

Leslie

Leslie Zhai

Nov 27, 2013, 10:30:49 PM
to Jesse McNelis, Nigel Tao, Dave Cheney, golang-nuts

rec...@gmail.com

Apr 14, 2014, 2:30:22 AM
to golan...@googlegroups.com
In my opinion, this is an io package bug rather than a usage issue.

On Wednesday, November 27, 2013 at 6:15:50 PM UTC+8, Leslie Zhai wrote:

Dave Cheney

Apr 15, 2014, 6:08:16 AM
to golan...@googlegroups.com, rec...@gmail.com
How is this a bug? We've explained that if you don't want to wait, you need to set up a timeout.
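For example (a hypothetical sketch assuming the log, net/http, and time imports; the five-minute value is illustrative, and the http.Client.Timeout field, added in Go 1.3, covers the whole exchange, including reading the response body):

client := &http.Client{
	// Timeout bounds the entire request, including reading the body,
	// so a stalled download eventually fails instead of blocking forever.
	Timeout: 5 * time.Minute,
}
resp, err := client.Get("http://localhost/SUPER_LARGE_FILE")
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close()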