I'd be very surprised if the latency control had any significant
negative effect (which apparently it didn't :)). After all, the whole
*point* of it is to keep latency low, and if it takes 57 seconds to do
165 turnarounds, that's very high latency.
The "problem" with SNMP is that it has a very large number of
synchronous requests, which is about the worst possible case for any
network protocol. Fundamentally, it's just badly designed and never
should have been done that way, but oh well.
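To see why that hurts, here's a minimal sketch of the pattern (plain
Python, with a made-up agent address standing in for a real SNMP
responder, nothing sshuttle-specific): every request blocks until its
reply arrives, so a walk of N OIDs costs roughly N round trips no
matter how much bandwidth you have.

  import socket, time

  AGENT = ('192.168.3.3', 161)  # hypothetical responder standing in for an SNMP agent

  def timed_walk(n):
      s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      s.settimeout(5)
      start = time.time()
      for i in range(n):
          s.sendto(('get-next %d' % i).encode(), AGENT)  # one synchronous request...
          s.recvfrom(2048)                               # ...and we block for its reply
      elapsed = time.time() - start
      print('%d turnarounds in %.1fs, %.0fms each' % (n, elapsed, 1000 * elapsed / n))

  timed_walk(165)  # 165 turnarounds at ~345ms each is the observed ~57s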
In any case, 57s/165 ≈ 345ms per turnaround. That's not *crazily*
high, but if you're normally getting much much faster times, that
means your ping times must be much lower than 345ms.
So... what are your ping times from your sshuttle client machine to
the ssh server? What are the ping times from the ssh server to the
snmp server? Is the combination anywhere near 345ms?
It's certainly possible that sshuttle is delaying things somehow due
to some sort of accident, particularly if you have other sessions
(file transfers?) going through the tunnel at once. Is that a
possibility?
Finally, we might be suffering from the Nagle algorithm in TCP, which
we have little control over since the TCP session is managed by ssh.
Stupid idea: can you try adding, say, 500 or 1500 bytes to every UDP
packet we send through the tunnel and stripping it off at the receive
side? It sounds stupid, but this should tell us if there are any
nagle-related issues.
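Roughly like this, say (hypothetical pad/strip helpers, not the actual
sshuttle send path): length-prefix the real payload, pad on the way
into the tunnel, strip on the way out.

  import struct

  PAD = 1500  # try 500 and 1500, per the experiment above

  def pad_packet(payload):
      # Length-prefix the real payload so the receiver can strip the padding.
      return struct.pack('!H', len(payload)) + payload + b'\x00' * PAD

  def strip_packet(data):
      (reallen,) = struct.unpack('!H', data[:2])
      return data[2:2 + reallen]

  assert strip_packet(pad_packet(b'snmp get-next-request')) == b'snmp get-next-request'

If bigger packets make the stalls disappear, that points pretty
strongly at Nagle (or some other small-packet coalescing) in the ssh
TCP session.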
...I guess I should really review those UDP patches :)
Sorry I've been so busy lately, I'll have to do another burst of
catching up sometime soon. Maybe even this weekend.
Have fun,
Avery
It is always possible there is a performance bug in my code too. I can
think of some minor things that could be optimized (e.g. the two round
trips required on initial connection), but nothing that should affect
this significantly, as the same connection is used for every packet.
It is also possible that some part of my code, while technically
correct as written, doesn't perform as well as I hoped under Python.
> ...I guess I should really review those UDP patches :)
So far I haven't submitted the UDP patches (although the code is
written), only the IPv6 patches. UDP would be the next patch after the
ones I have already submitted.
--
Brian May <br...@microcomaustralia.com.au>
Yup. A strace or tcpdump might help to track this down. At least
snmpwalk is a pretty easy use case (I think you can just apt-get
install an SNMP server to test with on Debian).
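Something like this should give comparable numbers (assuming Debian's
snmpd/snmp packages and the default 'public' community); run it once
directly and once through the tunnel:

  import subprocess, time

  start = time.time()
  subprocess.check_call(['snmpwalk', '-v2c', '-c', 'public', 'localhost'])
  print('walk took %.1fs' % (time.time() - start))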
> It is also possible that some part of my code, while technically
> correct, as written, doesn't perform as well as I hoped under python.
This is conceivable, but pretty unlikely - 345ms per packet would be
completely shocking unless we're grinding through each byte one by one
or something awful like that.
>> ...I guess I should really review those UDP patches :)
>
> So far I haven't submitted the UDP patches (although the code is
> written), only the IPv6 patches. UDP would be the next patch after the
> ones I have already submitted.
Ah. Well, in that case, I should at least just review my overflowing inbox :)
Have fun,
Avery
This seems to affect Peter a lot more than me, I think. I get slow
times, but I expect that, as this isn't a fast connection and, as
discussed, SNMP isn't very efficient. Then again, maybe a fast
connection is a contributing factor?
A tshark capture shows 2574 SNMP packets on the sshuttle client end
and 2542 SNMP packets at the server end for a single walk operation,
i.e. 32 packets lost.
The first one lost seems to be a request to the server:
327 5.417118 128.250.103.238 -> 192.168.3.3 SNMP get-next-request
1.3.6.1.2.1.4.24.4.1.3.192.168.3.0.0.255.255.255.0.0.0.0.0
This is retransmitted and works on the 2nd attempt:
328 6.418173 128.250.103.238 -> 192.168.3.3 SNMP get-next-request
1.3.6.1.2.1.4.24.4.1.3.192.168.3.0.0.255.255.255.0.0.0.0.0
In fact, all the missing packets seem to be outgoing packets, similar
to the one above. For me the retransmission always succeeds on the 2nd
attempt.
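For the record, this is roughly how the two captures can be diffed
(assuming tshark text output in the two-line format shown above; the
exact columns may vary):

  import collections, sys

  def count_requests(path):
      counts = collections.Counter()
      lines = open(path).read().splitlines()
      for i, line in enumerate(lines):
          # summary line, e.g. "327 5.417118 a.b.c.d -> w.x.y.z SNMP get-next-request"
          if 'get-next-request' in line and i + 1 < len(lines):
              counts[lines[i + 1].strip()] += 1  # the OID is on the following line
      return counts

  client = count_requests(sys.argv[1])  # tshark export from the client end
  server = count_requests(sys.argv[2])  # tshark export from the server end
  for oid, n in sorted(client.items()):
      missing = n - server.get(oid, 0)
      if missing > 0:
          print('%d copies lost: %s' % (missing, oid))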
So the question is, where is this data being lost?
The last thing in the sshuttle logs before the pause indicates a
successful two-way transfer, e.g.:
c : Waiting: 7 r=[3, 5, 7, 8, 11, 12, 15] w=[] x=[] (fullness=8934/0)
c : Ready: 7 r=[8] w=[] x=[]
c : Accept UDP using socket_ext recvmsg.
c : Accept UDP: ('128.250.103.238', 58452) -> ('192.168.3.3', 161).
c : > channel=2 cmd=UDP_DATA len=64 (fullness=8934)
c : Remaining DNS requests: 0
c : Remaining UDP channels: 1
c : mux wrote: 72/72
c : Waiting: 7 r=[3, 5, 7, 8, 11, 12, 15] w=[] x=[] (fullness=8998/0)
s: mux wrote: 76/76
s: Waiting: 2 r=[4, 6] w=[] x=[] (fullness=9969/0)
s: Ready: 2 r=[4] w=[] x=[]
s: < channel=2 cmd=UDP_DATA len=64
s: Incoming UDP request channel=2, cmd=16909
s: is incoming UDP data. '192.168.3.3' 161.
s: UDP: sending to '192.168.3.3' port 161
s: Waiting: 2 r=[4, 6] w=[] x=[] (fullness=9969/0)
s: Ready: 2 r=[6] w=[] x=[]
s: UDP response: 54 bytes
s: > channel=2 cmd=UDP_DATA len=70 (fullness=9969)
c : Ready: 7 r=[15] w=[] x=[]
c : < channel=2 cmd=UDP_DATA len=70
c : doing send from ('192.168.3.3', 161) to ('128.250.103.238', 58452)
c : Waiting: 7 r=[3, 5, 7, 8, 11, 12, 15] w=[] x=[] (fullness=8998/0)
s: mux wrote: 78/78
s: Waiting: 2 r=[4, 6] w=[] x=[] (fullness=10039/0)
[ pause ]
So this can't be the botched transfer, because the botched transfer
never got any response, not even at the server end.
Which in turn means our select(...) call never detected the packet, or
the client would have printed a "Ready" after the "Waiting".
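(For context, the Waiting/Ready pair comes from a select() loop shaped
roughly like this; a simplified sketch, not the actual ssnet code.)

  import select

  def loop_once(rfds, wfds, xfds):
      print('Waiting: r=%r w=%r x=%r' % (rfds, wfds, xfds))
      r, w, x = select.select(rfds, wfds, xfds)  # blocks until something is ready
      print('Ready: r=%r w=%r x=%r' % (r, w, x))
      # ...dispatch to whichever sockets are ready...

A "Waiting" with no matching "Ready" means select() never returned for
that socket, i.e. the kernel never marked it readable.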
The very next log entries also show another successful two-way
transfer.
Is it possible we have encountered an obscure kernel bug here?
--
Brian May <br...@microcomaustralia.com.au>