I have written a sender-receiver and the receiver stops receiving any
data after 2735 bytes. The sender seems to be fine, because when
connecting with a telnet session, it sends all the bytes.
I have tried to send the data in 100 byte pieces and flush() afterwards,
to no avail.
Am I missing something?
TIA
Bart
--
Bart Blogt Beter: blog.friesoft.nl
Yes, showing us the code :-).
--
Knute Johnson
email s/nospam/knute2010/
:) Obviously.
This is the sender:
socket.sock.getOutputStream().write(chunk);
socket is my own class, sock inside it is a java.net.Socket, chunk is a
byte[].
This is the receiver:
int bytesleft = length;
int bytesread = 0;
while (bytesleft > 0 && bytesread > -1) {
bytesread = socket.sock.getInputStream().read(theChunk, length -
bytesleft, 1);
bytesleft -= bytesread;
}
theChunk is a byte[] of size 'length'
To elaborate on Knute's comment, I've used InputStream.read() in plenty
of places where it read more than 2735 bytes. Therefore, there must be
some mistake in your own code. Show us an SSCCE and we might be able to
help.
Just to add a bit to my previous post: This code isn't complete. We
need a complete example that reproduces the problem. Otherwise, not
much we can do.
... whose first `bytesread' elements will hold the
values from the *last* call to read(), if I'm not mistaken.
That is, if you try to read 1000 bytes and get them in two
chunks of 600 and 400 bytes each,
- The first read() deposits input bytes 0-599 in
theChunk[0] through theChunk[599],
- The next read() deposits input bytes 600-999 in
theChunk[0] through theChunk[399], wiping out
bytes 0-399,
- At the end, theChunk[] holds input bytes 600-999,
followed by input bytes 400-599, followed by zeroes
(or old garbage),
- And, as a special bonus, you have no way of knowing
how many bytes were received or where they were stored.
--
Eric Sosman
eso...@ieee-dot-org.invalid
>> theChunk is a byte[] of size 'length'
>
> ... whose first `bytesread' elements will hold the
> values from the *last* call to read(), if I'm not mistaken.
> That is, if you try to read 1000 bytes and get them in two
> chunks of 600 and 400 bytes each,
>
> - The first read() deposits input bytes 0-599 in
> theChunk[0] through theChunk[599],
>
> - The next read() deposits input bytes 600-999 in
> theChunk[0] through theChunk[399], wiping out
> bytes 0-399,
Not according to the documentation:
"...The first byte read is stored into element b[off], the next one into
b[off+1], and so on. The number of bytes read is, at most, equal to len.
Let k be the number of bytes actually read; these bytes will be stored
in elements b[off] through b[off+k-1], leaving elements b[off+k] through
b[off+len-1] unaffected.
In every case, elements b[0] through b[off] and elements b[off+len]
through b[b.length-1] are unaffected. ..."
>
> - At the end, theChunk[] holds input bytes 600-999,
> followed by input bytes 400-599, followed by zeroes
> (or old garbage),
>
> - And, as a special bonus, you have no way of knowing
> how many bytes were received or where they were stored.
Of course, read() returns that.
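
For what it's worth, the documented offset behaviour is easy to check with the same 600/400 split from the example above. This is only a sketch: the class and method names are made up for illustration, and a ByteArrayInputStream stands in for the socket stream so that the two reads return 600 and 400 bytes predictably.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class OffsetReadDemo {
    // Fills buf with two read() calls: the first asks for firstLen bytes at
    // offset 0, the second asks for the rest at the offset where the first
    // one stopped. Demo only: there is no end-of-stream handling here.
    static int[] readTwice(InputStream in, byte[] buf, int firstLen) throws IOException {
        int first = in.read(buf, 0, firstLen);
        int second = in.read(buf, first, buf.length - first);
        return new int[] { first, second };
    }

    public static void main(String[] args) throws IOException {
        byte[] src = new byte[1000];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] buf = new byte[1000];
        int[] counts = readTwice(new ByteArrayInputStream(src), buf, 600);
        System.out.println(counts[0] + " + " + counts[1] + " bytes read");
        // If the second read had overwritten buf[0..399], this would be false.
        System.out.println("buffer intact: " + Arrays.equals(src, buf));
    }
}
```

The second call writes to buf[600] onward, so the first 600 bytes are left alone, which is what the quoted documentation says.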
> Eric Sosman wrote:
>> - And, as a special bonus, you have no way of knowing
>> how many bytes were received or where they were stored.
>
> Of course, read() returns that.
I think Eric means that read() also returns -1 if it reaches
end-of-stream, which you don't test for. Therefore, bytesleft will be
off by -1 if you reach end of stream. But you have no way of knowing
that it did, so....
True, but my code echoes me the return value, which is 35 in the last
call. After that, the read() blocks.
According to the process monitor, the processes are waiting in the
futex_wait channel (whatever that is; seems to have something to do with
threading).
I posted the code as I have it now here:
http://www.friesoft.nl/software/MammothCopy-0.1.tar.gz
which is a NetBeans project.
I am sorry it is not short, but it is what I could do fast, and if
someone is willing to take a look at it, I will be grateful. The code
that seems to have problems is in MammothReceiver::ReceiveChunks (line 191).
I think Eric's post completely misconstrues the interface for
InputStream. The second argument to the read() method indicates the
location in the buffer to copy the new bytes, but Eric is claiming the
new bytes are always written to the start of the array. The OP's code
does properly calculate the correct offset into the array where to copy
the bytes, with "length - bytesleft" (which is essentially a back-handed
way of calculating how many bytes have been read so far; new bytes are
copied into the array just past the last byte read so far).
You are correct, however, that the OP's code is flawed in its use of
"bytesread". He should only be subtracting the number from "bytesleft"
if "bytesread" is positive. An easy way to do that is either to move
the call to read() and assignment to bytesread into the while loop's
condition expression, or simply to move the statement decrementing
bytesleft to precede the call to read() in the loop (a harmless 0
decrement happens the first time through in that approach).
Other than the off-by-one bug, the biggest issue with the loop is that
the OP is reading one byte at a time! Duh.
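
A minimal sketch of the loop with both points addressed: the count is only added when read() returns a non-negative value, and each call asks for everything still missing instead of one byte. ChunkReader and readChunk are hypothetical names, and throwing EOFException on a short stream is just one possible choice; the variable names follow the OP's snippet.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ChunkReader {
    // Reads exactly 'length' bytes from 'in', or throws if the stream ends first.
    static byte[] readChunk(InputStream in, int length) throws IOException {
        byte[] theChunk = new byte[length];
        int total = 0; // bytes received so far, also the next write offset
        while (total < length) {
            // Ask for all remaining bytes; read() may still return fewer.
            int bytesread = in.read(theChunk, total, length - total);
            if (bytesread < 0) {
                // -1 means end of stream: never fold it into the byte count.
                throw new EOFException("stream ended after " + total
                        + " of " + length + " bytes");
            }
            total += bytesread;
        }
        return theChunk;
    }
}
```

With the OP's classes this would be called as something like readChunk(socket.sock.getInputStream(), length), fetching the stream once rather than on every iteration.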
Still, nothing posted so far would explain why reading the data stops at
the magic number of 2735 bytes. The OP needs to post a code example
that proves that more bytes than that are being sent, as well as
reliably demonstrates the receiving code reading only the 2735 bytes.
Pete
Since you don't, apparently, send the receiver any indication of how many
bytes you intend to send it, how does 'length' get set to a sensible
value?
--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
> I have written a sender-receiver and the receiver stops receiving any
> data after 2735 bytes. The sender seems to be fine, because when
> connecting with a telnet session, it sends all the bytes.
I have been experimenting a little bit more and found that when sending
smaller pieces of data (I used 41 bytes now), AND finishing the exact
amount of bytes, the entire chunk gets sent correctly.
> You are correct, however, that the OP's code is flawed in its use of
> "bytesread". He should only be subtracting the number from "bytesleft"
> if "bytesread" is positive. An easy way to do that is either to move
> the call to read() and assignment to bytesread into the while loop's
> condition expression, or simply to move the statement decrementing
> bytesleft to precede the call to read() in the loop (a harmless 0
> decrement happens the first time through in that approach).
Yes, that's a good one, and I will implement the correct checking of the
-1 return value.
>
> Other than the off-by-one bug, the biggest issue with the loop is that
> the OP is reading one byte at a time! Duh.
Which is the result of my own debugging efforts ;).
> Since you don't, apparently, send the receiver any indication of how many
> bytes you intend to send it, how does 'length' get set to a sensible
> value?
I do send the receiver the size of the chunk, before sending the chunk
itself. The code is part of a protocol implementation.
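
Roughly, the size-then-chunk idea looks like the sketch below. This is not the actual MammothCopy protocol: the 4-byte big-endian length written by writeInt() and the names ChunkFraming, sendChunk and receiveChunk are assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ChunkFraming {
    // Writes the chunk length as a 4-byte int, then the chunk itself.
    static void sendChunk(OutputStream out, byte[] chunk) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(chunk.length);
        data.write(chunk);
        data.flush();
    }

    // Reads the 4-byte length, then exactly that many bytes.
    static byte[] receiveChunk(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("negative chunk length: " + length);
        }
        byte[] chunk = new byte[length];
        // readFully() loops internally and throws EOFException on a short stream.
        data.readFully(chunk);
        return chunk;
    }
}
```

Because the receiver learns the exact length first, it never asks read() for more bytes than the sender is going to write, which is the situation where a blocking read() simply hangs.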
The code is part of a large file copying program. See also
http://blog.friesoft.nl/2009/10/24/sending-really-big-files/
The point is not what YOU can do fast, but what will allow others to
help you fast. Don't waste other people's time. Create a proper SSCCE,
as requested.
Pete
Hm..
No idea why I got into this mess. When sending and receiving exactly
'length' bytes, it works fine. When 'length' is big (around 1MB) I get a
'connection reset' exception, so I guess I should chop the chunk up in
smaller pieces to send.
Anyway, I am on my way again.
That's why posting an SSCCE is a good idea. You don't forget rather
important details like that and may well see the flaw when you're making
the SSCCE. Another thing you haven't shown us or said you've checked: are
you sure the sender always reports the number of bytes it's going to send
correctly?
> No idea why I got into this mess. When sending and receiving exactly
> 'length' bytes, it works fine. When 'length' is big (around 1MB) I get a
> 'connection reset' exception, so I guess I should chop the chunk up in
> smaller pieces to send.
>
> Anyway, I am on my way again.
This is all so wrong, I don't know where to start. Have you considered
posting that SSCCE, so you can get some advice how to read and write
streams correctly, without doing silly things like breaking your data
into chunks less than 1MB for no obvious reason?
I took one of his previous replies to mean that he had considered a
SSCCE and decided he would rather that each of us individually distill
his code into a usable sample we can debug, rather than doing that work
himself.
In short, the answer to your question is probably "yes", he's
_considered_ it. But without the subsequent conclusion one might hope for.
Good thing I wrote "if I'm not mistaken," isn't it? ;-)
>> - And, as a special bonus, you have no way of knowing
>> how many bytes were received or where they were stored.
>
> Of course, read() returns that.
Not my day, was it? You don't have a direct indication of
how many bytes were read, but you *can* compute the count:
int totalbytes = length - bytesleft
+ (bytesread < 0 ? 1 : 0);
Definitely not my day ...
--
Eric Sosman
eso...@ieee-dot-org.invalid
That's called 'scp' or 'bittorrent'. Running that between two
NATted/firewalled boxes is called 'NAT traversal' or 'hole punching'.
There are techniques and tools for doing these things - google will help
you find them.
Or you could just use GBridge or similar:

http://www.gbridge.com/
tom
--
Hier gaan over het tij, de wind, de maan en wij.
> I took one of his previous replies to mean that he had considered a
> SSCCE and decided he would rather that each of us individually distill
> his code into a usable sample we can debug, rather than doing that work
> himself.
In that case, you misunderstood me.
I have considered the SSCCE and also decided that I wasn't going to make
it yesterday (perhaps I will make it later this week, or month). I
posted the entire code, because of the possibility that someone *would*
take the time or effort to look into it. If nobody does that, fine. I
also don't expect anybody who specifically asks for an SSCCE to look
into it.
> This is all so wrong, I don't know where to start. Have you considered
> posting that SSCCE, so you can get some advice how to read and write
> streams correctly, without doing silly things like breaking your data
> into chunks less than 1MB for no obvious reason?
I have considered it. And I might send it later this week, or month.
And I completely agree with you that this is not a correct way to fix
the problem. But the program is in alpha stage, it is a hobby and there
are other parts of the program that should be finished as well. If that
means to have some completely wrong code in the program, but it does
what it needs to do, so be it.
Too complex for a normal user. Not cross platform (or is there a decent
Windows scp implementation?) and doesn't work behind a firewall without
touching that firewall.
> or 'bittorrent'.
Too complex for a one-time sending of a file.
> Running that between two
> NATted/firewalled boxes is called 'NAT traversal' or 'hole punching'.
Too complex for normal users.
> There are techniques and tools for doing these things - google will help
> you find them.
No, they don't exist.
>
> Or you could just use GBridge or similar:
>
> http://www.gbridge.com/
Not cross platform.
Have you read my blog post?
> Tom Anderson wrote:
>> On Mon, 4 Jan 2010, Bart Friederichs wrote:
>>> The code is part of a large file copying program. See also
>>> http://blog.friesoft.nl/2009/10/24/sending-really-big-files/
>>
>> That's called 'scp'
>
> Too complex for a normal user.
>
Easy enough to wrap in a script (shell, Python, ....)
> Not cross platform (or is there a decent
> Windows scp implementation?)
>
PuTTY is a Windows implementation of both telnet and ssh.
openSSH offers a Java implementation that you can use to build your own
client and/or server.
> and doesn't work behind a firewall without
> touching that firewall.
>
That depends entirely on what direction you want to make the connection.
I connect out through an unmodified firewall all the time.
In any case ssh port forwarding to an sshd server is reasonably common
and should be safe enough if it is configured correctly. The same port is
used for both ssh sessions and scp/sftp file transfers
>> Running that between two
>> NATted/firewalled boxes is called 'NAT traversal' or 'hole punching'.
>
Same comments I made about firewalls apply to NAT routers.
> Too complex for normal users.
>
1) I'd expect that only sysadmins are authorised to kick holes
in firewalls. 'Normal users' should be actively prevented
from doing so.
2) if they can't type "scp filename user@host:destination/path"
or use an equivalent GUI wrapper they shouldn't be let near
a computer.
>> There are techniques and tools for doing these things - google will
>> help you find them.
>
> No, they don't exist.
>
Wrongo. I've just listed some.
The Cygwin project includes an openSSH implementation that just plain works
from a command shell. I've used it for years, often leveraging its
port-forwarding capabilities.
Tom Anderson wrote:
>>> There are techniques and tools for doing these things - google will
>>> help you find them.
Bart Friederichs wrote:
>> No, they don't exist.
Martin Gregorie wrote:
> Wrongo. I've just listed some.
--
Lew
> Tom Anderson wrote:
>> On Mon, 4 Jan 2010, Bart Friederichs wrote:
>>> The code is part of a large file copying program. See also
>>> http://blog.friesoft.nl/2009/10/24/sending-really-big-files/
>>
>> That's called 'scp'
>
> Too complex for a normal user. Not cross platform (or is there a decent
> Windows scp implementation?)
WinSCP, if you want all the widgety-clickety bits and pieces.
> and doesn't work behind a firewall without
> touching that firewall.
Nothing will, or at least nothing should. After all, that's rather the
point of a firewall in the first place.
If the person using the software isn't sufficiently au fait with the
security requirements, and implications, of opening the requisite holes
in their firewall then you have to question whether they should really be
doing that. Using a piece of software which bypasses the firewall,
without understanding what that software does or what the implications of
its actions are, is exceedingly risky. One prime example of this is
Windows - lots of people with no clue of how the Internet works, or how
to secure themselves from attack, relying on a piece of software
(Windows) to handle it all for them. Look how well that's worked out.
--
Nigel Wade
Yes. Good luck with your project, Bart. You'll need it, because i'm afraid
you're an idiot.
tom
--
Mpreg is short for Male Impregnation and I cannot get enough. -- D
> Martin Gregorie wrote:
>> PuTTY is a Windows implementation of both telnet and ssh.
>>
>> openSSH offers a Java implementation that you can use to build your own
>> client and/or server.
>>
> ...
>
> The Cygwin project includes an openSSH implementation that just plain
> works from a command shell. I've used it for years, often leveraging
> its port-forwarding capabilities.
>
I thought I'd better mention openSSH's Java package because the OP seems
to think his potential users are incapable of using a CLI scp command. He
can, at least, use the package to build them a GUI.
You want something entirely free, that must be cross-platform
and that is very easy for the user.
You also apparently want two lambda (average) users to be able to bypass
any kind of firewall.
I'm sorry but if scp is "too complicated" what exactly makes
you think installing Java would be easier for a "normal user"?
(hint: Java ain't installed on all Windows machines nor on all
Linux machines).
Your best bet to bypass firewall is to have one end create an
HTTP server and the other connect to it. Amazon S3 manages
daily terabytes of files transfer using HTTP requests.
If temporarily storing 10 GB on Amazon S3 wasn't an issue
(is it really at $0.15 / GB per month, especially
if your 10 GB stays there only one or two days?) then I'd
simply have one side push to Amazon S3 and the other pull
from it. Or to any service using Amazon S3 as storage (I
think if you invite enough people you can get up to 5 GB
for free lifetime at DropBox, they use S3 behind the scene).
In any case solving this by making two very dumb users on
the two sides (apparently they're very dumb) using some
self-reinvented broken Java wheel like you plan to doesn't
seem like that great a plan.
If it was a recurrent problem I'd set up a Webapp that'd
serve as a frontend to S3 and make users enter the domain
name of the site where your webapp is hosted in their
browser.
Even dumb users should be able to figure that out.
The problem is that it takes at least some skill to
do and, given the level of confusion shown by your posts
here and on your blog I doubt it's within your reach ;)
out is not the problem. in is. I assume both sides are behind a home-use
router/firewall. Isn't it strange, that in the day of 20MBits/s home
lines, it is not easily possible to send a file from computer A to computer
B without a third party? People still use email to send photos.
> In any case ssh port forwarding to an sshd server is reasonably common
> and should be safe enough if it is configured correctly. The same port is
> used for both ssh sessions and scp/sftp file transfers
Okay, consider two home computers, both behind a NATed router (plain
broadband setup), one guy is a filmmaker with a Mac, the other guy is a
cameraman with a Windows XP PC. They have only a working knowledge of
the computer, and also don't want to know anything more. The 'command
line' you speak of, reminds them of DOS and they certainly will not want
to use that.
Both sides have to be able to be listener or connector (from the TCP
view), and both sides should be able to send or receive. The file is
huge (no limit, but typical is around 10GB), so being able to continue
the transfer is a must.
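
To make the "continue the transfer" requirement concrete, here is a rough sketch of one way to resume: the receiver reports how many bytes of the file it already has, and the sender seeks to that position and streams the rest. The handshake that carries the offset is not shown and is assumed to exist; ResumableSend and sendFrom are made-up names, not part of the program as posted.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

class ResumableSend {
    // Sends the file's bytes from 'offset' to the end of the file and
    // returns how many bytes were written to 'out'.
    static long sendFrom(RandomAccessFile file, long offset, OutputStream out)
            throws IOException {
        file.seek(offset); // skip what the receiver already has
        byte[] buf = new byte[64 * 1024];
        long sent = 0;
        int n;
        while ((n = file.read(buf)) != -1) {
            out.write(buf, 0, n);
            sent += n;
        }
        out.flush();
        return sent;
    }
}
```

On the receiving side the matching step would be to open the partial file for appending and use its current length as the offset. A real implementation would probably also want a checksum to confirm that the partial file is a true prefix of the original.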
>
>>> Running that between two
>>> NATted/firewalled boxes is called 'NAT traversal' or 'hole punching'.
> Same comments I made about firewalls apply to NAT routers.
>
>> Too complex for normal users.
>>
> 1) I'd expect that only sysadmins are authorised to kick holes
> in firewalls. 'Normal users' should be actively prevented
> from doing so.
UPnP describes a protocol to do just that. Whether or not it is allowed
in the router is the only bottleneck I have right now.
> 2) if they can't type "scp filename user@host:destination/path"
> or use an equivalent GUI wrapper they shouldn't be let near
> a computer.
They will be able to use an equivalent GUI wrapper (my program).
>>> There are techniques and tools for doing these things - google will
>>> help you find them.
>> No, they don't exist.
>>
> Wrongo. I've just listed some.
Yes, and I listed reasons why they are not suitable for what *I* want to
accomplish.
Well, pretty clear if you ask me.
>
> You also apparently want two lambdas users to be able to bypass
> any kind of firewall.
Aren't bittorrent clients doing that all the time with UPnP?
>
> I'm sorry but if scp is "too complicated" what exactly makes
> you think installing Java would be easier for a "normal user"?
> (hint: Java ain't installed on all Windows machine nor on all
> Linux machines).
I understand, but installing it is fairly easy.
>
> Your best bet to bypass firewall is to have one end create an
> HTTP server and the other connect to it. Amazon S3 manages
> daily terabytes of files transfer using HTTP requests.
>
> If temporarily storing 10 GB on Amazon S3 wasn't an issue
> (is it really at $0.15 cents / GB per month, especially
> if your 10 GB stays there only one or two days?) then I'd
> simply have one side push to Amazon S3 and the other pull
> from it. Or to any service using Amazon S3 as storage (I
> think if you invite enough people you can get up to 5 GB
> for free lifetime at DropBox, they use S3 behind the scene).
I don't want a third party. Why is that so hard to understand? It is
insane that we have broadband connections and cannot transfer files from
A to B in a straightforward way.
I have thought of using the third-party way, in fact, that is what he is
using now. And yes, that works, and yes that costs money. I just think
it is weird to pay money for something that is available already
(bandwidth that is).
> In any case solving this by making two very dumb users on
> the two sides (apparently they're very dumb) using some
> self-reinvented broken Java wheel like you plan to doesn't
> seem like that great a plan.
The project can still fail miserably if the whole thing doesn't work.
But if it does, it might make life for this guy a little easier.
> The problem is that it takes at least some skill to
> do and seen the level of confusion shown by your posts
> here and on your blog I doubt it's within your reach ;)
I guess I am completely missing the point here. What is so confused
about the blog post? I have stated some requirements, and all you do is
change these to fix the problem. That's not fixing the problem.
I agree with you on the sending/receiving bug I have. That should be
fixed well. But for now, I'd rather have a working proof-of-concept with a
known bug, than correct code.
> Martin Gregorie wrote:
>> On Tue, 05 Jan 2010 20:06:20 +0100, Bart Friederichs wrote:
>>> and doesn't work behind a firewall without touching that firewall.
>>>
>> That depends entirely on what direction you want to make the
>> connection. I connect out through an unmodified firewall all the time.
>
> out is not the problem. in is. I assume both sides are behind a home-use
> router/firewall. Isn't it strange, that in the day of 20MBits/s home
> lines, it is not easily possible to send a file from computer A to computer
> B without a third party? People still use email to send photos.
>
It's not strange at all in the hostile Internet environment of attacks by
bot herders, etc. What you're complaining about has nothing to do with
the technology.
>> In any case ssh port forwarding to an sshd server is reasonably common
>> and should be safe enough if it is configured correctly. The same port
>> is used for both ssh sessions and scp/sftp file transfers
>
> Okay, consider two home computers, both behind a NATed router (plain
> broadband setup), one guy is a filmmaker with a MAC, the other guy is a
> cameraman with a Windows XP PC. They have only a working knowledge of
> the computer, and also don't want to know anything more. The 'command
> line' you speak of, reminds them of DOS and they certainly will not want
> to use that.
>
You entirely missed my point, which was to use the openSSH connection
library behind a GUI of your own devising.
> Both sides have to be able to be listener or connector (from the TCP
> view), and both sides should be able to send or receive. The file is
> huge (no limit, but typical is around 10GB), so being able to continue
> the transfer is a must.
>
Since ssh is a fairly secure protocol, it's reasonable to forward ssh
connections from the firewall to an ssh server, but IMO it would be
extremely foolish to do that with a less secure protocol. End-users will
still have to configure their router to forward port 22 to the host
running sshd. That's a one-off task, and if you want the end-user
systems to accept incoming connections you've no choice over that.
Since you have no control over the type of firewall or NAT router the
end-user has installed, the only straightforward way to avoid port
forwarding is to run a store-and-forward relay server.