Hi all,
I've hit an interesting problem, and I was a bit surprised that there isn't anything in the standard libs that could have solved it easily. It isn't too complicated to write, but it isn't trivial either. If by any chance it's already in the libs, please enlighten me :), otherwise would anyone be interested in including it?
The thing I was solving is fairly trivial: download a file from the internet, and stream-upload it somewhere else (Google Cloud Storage specifically, but it doesn't really matter). The naive solution is pretty straightforward: wire together the downloader's reader with the uploader's writer, and voila, magic... until you look at the network usage: x1 secs of download, then y1 secs of upload, then x2 secs of download, then y2 secs of upload. The two transfers alternate, never overlapping.
The problem is that the uploader will read some fixed amount of data, buffer it up and then start the upload. But while the upload is in progress, it doesn't read any more data from the reader, essentially pausing the download until the upload finishes. By that time the reader could have filled up the next buffer to send, but alas, it was blocked, so it didn't download anything.
Note that buffered readers/writers won't really solve this issue: even though they have buffers in place to store arriving data, those buffers cannot be filled and flushed simultaneously. As far as I can tell, the only way to solve this streaming problem is to have two concurrent goroutines, one reading and the other writing, with a data buffer in between.
If there is indeed no such thing, would it be worthwhile to add something like:
bufio.Copy(dst io.Writer, src io.Reader, buffer int) (written int64, err error)
which essentially does what io.Copy does, but starts up a separate writer goroutine and passes everything through a user-definable buffer. That way it could handle both data bursts and batching readers/writers.
Comments/feedback? :)
Cheers,
Peter