ByteString using N bytes from an InputStream?


V.B.

Aug 5, 2013, 6:23:59 PM
to prot...@googlegroups.com
Greetings all,
    We are using version 2.5. What is the most efficient way (i.e. single copy operation, no extra byte arrays) to construct a ByteString from a specific number of bytes in an InputStream? The various versions of ByteString.readFrom() drain the stream completely, which is not what we need; any data past N bytes should remain in the stream. The ByteString.readChunk() method looks like it will work if we simply give it N as the chunkSize parameter. Unfortunately, ByteString.readChunk() is declared private, so that method is not currently an option. Is there another option that I just haven't found in the source code yet?

(Thanks for taking the time to read this question.)

Feng Xiao

Aug 5, 2013, 7:32:49 PM
to V.B., Protocol Buffers
How about creating a wrapper InputStream that reads only N bytes from the original InputStream, and passing that wrapper to ByteString.readFrom()?
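
A minimal sketch of the wrapper idea suggested above, using only the JDK. The class name BoundedInputStream and its constructor are hypothetical and not part of protobuf. The wrapper keeps no buffer of its own: it only counts bytes and forwards each read to the underlying stream, capped at the remaining limit, so the wrapper by itself should not add a copy of the data.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of protobuf): exposes at most `limit` bytes
// of the underlying stream, then reports end-of-stream.
final class BoundedInputStream extends FilterInputStream {
    private long remaining; // bytes this wrapper may still hand out

    BoundedInputStream(InputStream in, long limit) {
        super(in);
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // limit reached: pretend the stream has ended
        }
        int b = in.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Read straight into the caller's array, never past the limit.
        int n = in.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        long skipped = in.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(in.available(), remaining);
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would need extra bookkeeping for the limit
    }

    @Override
    public void mark(int readLimit) {
        // not supported
    }

    @Override
    public void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }

    @Override
    public void close() {
        // Deliberately leave the underlying stream open for further reads.
    }
}
```

Assuming protobuf-java is on the classpath, the wrapper could then be passed as ByteString.readFrom(new BoundedInputStream(in, n)). readFrom() stops when the wrapper reports end-of-stream, and anything past N bytes stays in the original stream. Any copies made inside readFrom() itself are not affected by the wrapper.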
 


V.B.

Aug 6, 2013, 12:28:56 AM
to prot...@googlegroups.com, V.B.
Hi Feng Xiao! Thanks for the response.
    That's actually our backup plan. We were hoping to avoid it, though, since the wrappers would each contain an extra copy of the data internally. Our ideal case is for the data to get copied in a single step directly from an InputStream to a ByteString with no intermediate copies along the way.
Question: You would know best... Would the safety of ByteStrings be preserved if the readChunk() method were to be made public? If so, I'll open a feature request on the issue tracker.

V.B.

Aug 6, 2013, 12:31:47 AM
to prot...@googlegroups.com, V.B.
... Actually, I just now took a closer look at the readChunk() method. Even that method makes an internal copy, so it looks like readChunk() isn't what we are looking for after all. Hmmm.

Feng Xiao

Aug 6, 2013, 2:18:49 PM
to V.B., Protocol Buffers
It seems to me that readChunk() makes a redundant copy that can (and should) be eliminated. I don't understand why the wrapper would have to contain an extra copy, though. Isn't it just copying the data directly to the destination?
 



Oliver Jowett

Aug 6, 2013, 3:26:05 PM
to Feng Xiao, V.B., Protocol Buffers
Does ByteString need to worry about hostile InputStreams? The stream could retain a reference to the byte array it was given in the read() call.

Oliver
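
The concern raised above can be illustrated with plain JDK classes. In this sketch, LeakyInputStream and DefensiveCopyDemo are hypothetical names, and raw byte arrays stand in for ByteString's internal storage; it does not use or reproduce protobuf's actual code. It shows that a stream which keeps the array passed to read() can later change data that was wrapped without a copy, while a copy taken after the read is unaffected.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical "hostile" stream: it remembers the caller's buffer so that
// it can modify the contents after read() has returned.
final class LeakyInputStream extends InputStream {
    byte[] captured; // the last array a caller handed to read()

    @Override
    public int read() {
        return 'A';
    }

    @Override
    public int read(byte[] buf, int off, int len) {
        captured = buf; // keep a reference to the caller's array
        Arrays.fill(buf, off, off + len, (byte) 'A');
        return len;
    }
}

final class DefensiveCopyDemo {
    public static void main(String[] args) throws IOException {
        LeakyInputStream hostile = new LeakyInputStream();

        byte[] buf = new byte[4];
        int n = hostile.read(buf, 0, buf.length);

        byte[] shared = buf;                   // "zero-copy": reuse the read buffer
        byte[] copied = Arrays.copyOf(buf, n); // defensive copy after the read

        // Later, the stream mutates the array it captured.
        hostile.captured[0] = (byte) 'Z';

        System.out.println((char) shared[0]); // prints "Z": contents changed
        System.out.println((char) copied[0]); // prints "A": copy is unaffected
    }
}
```

An immutable type that wrapped the read buffer directly would be in the position of `shared` here, which is one plausible motivation for the extra copy discussed in this thread.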

Feng Xiao

Aug 6, 2013, 4:38:17 PM
to Oliver Jowett, V.B., Protocol Buffers
Hmm, I hadn't thought of that. That is probably the reason readChunk() makes that copy. Given that, there isn't any way to construct a ByteString from an InputStream with just one copy.
 
