On Thu, Dec 20, 2012 at 10:16 PM, Greg Young <
gregor...@gmail.com> wrote:
> it was a while ago i do remember connections were explicitly closed
> though they probably did 30 or so commands to simulate a user stream
> of activity ( so yes some piggy backing / streaming) , it was not to
> a web server but a WCF service using http binding . Command data
> which was 50% was only 100-200 bytes ( on the wire with Ip header )
> with similar responses. Which meant 50% of the traffic took about 200
> * 100K = 20Mbytes of data on the up and down ( or 160 Mbit) . Most
> get queries were similar single records not huge though a few were 3K
> , and 4-600 Mbit total is certainly possible on a full Duplex 1G
> connection .
>
> MTUs.... sending 100k packets with 20 bytes each in them does not = 2mb
MTUs are the maximum size before fragmenting. 20-byte packets are not
possible: the IP header is 20 bytes and TCP adds another 20, I think.
100K packets with 20 bytes of data each would be, worst case,
100K * 60 = 6 MBytes, or 48 Mbit, with some piggybacking of control
traffic for ACKs etc. If you send 20-byte messages to a socket,
however, Nagle will add them to a packet until the MTU is reached.
The packets I mentioned were that size on the wire (though when
running they were probably batched within the 30-transaction stream).
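To make the overhead arithmetic concrete, a quick sketch (assuming minimum 20-byte IPv4 + 20-byte TCP headers, and ignoring Ethernet framing and ACK traffic):

```python
# Worst case: 100K packets, each carrying only 20 bytes of payload.
PACKETS = 100_000
PAYLOAD = 20          # application bytes per packet
IP_HEADER = 20        # minimum IPv4 header
TCP_HEADER = 20       # minimum TCP header (no options)

wire_bytes = PACKETS * (PAYLOAD + IP_HEADER + TCP_HEADER)
print(wire_bytes)                    # 6000000 bytes on the wire
print(wire_bytes * 8 / 1_000_000)    # 48.0 Mbit
```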
That said, Nagle assumes the messages are sent over a socket. With HTTP
1.0, however, you sometimes get this pattern:
Open socket
Send 1 packet
Close socket
For large packets this is not really a throughput issue but a latency
issue (and a test for the socket management layer), as each socket
has a connection cost. For small packets it can be very costly.
You can use Connection: keep-alive in HTTP 1.1 to prevent this, and
it's one of the things you will find in asmx web services. It's crucial
so that the connections from a client get "pipelined". (And I think
the 30 messages and an explicit close is realistic.)
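On the Nagle side, most socket APIs let you opt out per connection when latency matters more than coalescing small writes; a minimal sketch (the `TCP_NODELAY` option is standard, the trade-off is more small packets on the wire):

```python
import socket

# Create a TCP socket and disable Nagle's algorithm, so small writes
# go out immediately instead of being coalesced up to the MTU.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Non-zero means Nagle is now off for this socket.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)
sock.close()
```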
>
> Its quite easy to get very high throughput with small commands by putting
> many in a packet. The problem is this is measuring something very different.
It can be similar. Many commands are quite small, and it was OK for
that project (we were very careful with our packets; besides 300
very active desktops we had 3K vehicle clients whose GSM connection
got downgraded to 9600 baud!). I agree, though, that with most modern
code talking to an HTTP server it can be quite different: XML
namespaces, GUIDs and big strings, serialized complex entities, and
you cop the cookie plus the request and response headers. In that
project we just signed the packets with a small 9-digit hash, so no
cookie was needed.
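The small-tag signing idea could look roughly like this (a sketch, not what shipped: the `SECRET` key and the HMAC-SHA256 choice are illustrative; the point is truncating a MAC to a 9-digit decimal tag instead of carrying a cookie):

```python
import hashlib
import hmac

SECRET = b"shared-device-key"   # illustrative; a per-client key in practice

def sign(payload: bytes) -> int:
    """Return a 9-digit tag derived from an HMAC over the packet payload."""
    digest = hmac.new(SECRET, payload, hashlib.sha256).digest()
    # Take the first 8 bytes as an integer and reduce to 9 decimal digits.
    return int.from_bytes(digest[:8], "big") % 1_000_000_000

def verify(payload: bytes, tag: int) -> bool:
    """Constant-time comparison of the recomputed tag against the received one."""
    return hmac.compare_digest(str(sign(payload)), str(tag))

tag = sign(b"command-bytes")
print(verify(b"command-bytes", tag))   # True
print(verify(b"tampered", tag))
```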
>
> Benchmarking such systems is hard! As an example on my local machine the ES
> writes > 80MB/s if you stream 1MB messages to it. In practice though events
> are much smaller than this (and fsync + smaller writes ends up killing you).
Agree, it's hard. The test I mentioned was quick and dirty; I had a
requirement for 3K/sec, so I spent a few days to see what I could
get.
The really nice thing about an event store, though, is that it's just
an array of events. Speaking of which, if you had an async API with a
timeout, couldn't you batch up all the requests from a machine (in the
client API)? You don't return to the caller until the timeout has
elapsed, or success or failure, similar to a file op. I suppose the
sync users/processors would suffer from the extra latency though.
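The client-side batching idea could be sketched like this (hypothetical names throughout; `flush` stands in for whatever actually writes a batch to the store, and callers block until their batch is written or fails):

```python
import asyncio

class Batcher:
    """Collect submitted events and write them in one batch when either
    the timeout elapses or the batch reaches max_batch."""

    def __init__(self, flush, max_batch=1000, timeout=0.1):
        self._flush = flush          # async callable taking a list of events
        self._max_batch = max_batch
        self._timeout = timeout
        self._pending = []           # (event, future) pairs
        self._timer = None

    async def submit(self, event):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self._pending.append((event, fut))
        if self._timer is None:      # first event of a new batch starts the clock
            self._timer = loop.call_later(self._timeout, self._fire)
        if len(self._pending) >= self._max_batch:
            self._fire()             # full batch: don't wait for the timeout
        return await fut             # resolves when this batch is written

    def _fire(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if batch:
            asyncio.ensure_future(self._write(batch))

    async def _write(self, batch):
        try:
            await self._flush([event for event, _ in batch])
            for _, fut in batch:
                fut.set_result(True)
        except Exception as exc:
            for _, fut in batch:
                fut.set_exception(exc)

# Tiny demo: 5 submits, batches of at most 3, 50 ms timeout.
async def demo():
    written = []
    async def flush(events):
        written.append(list(events))   # stand-in for a real batched write
    b = Batcher(flush, max_batch=3, timeout=0.05)
    await asyncio.gather(*(b.submit(i) for i in range(5)))
    return written

batches = asyncio.run(demo())
print(batches)   # first batch fills to 3; the remainder flushes on timeout
```

The failure path mirrors the success path: one failed write rejects every caller in that batch, which matches the "assume success, call back on failure" style mentioned below.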
BTW, have you had GC pause issues, or is all the heavy lifting native?
> The same with our HTTP stuff. Pipelining + combining puts many commands/MTU
> (can do the same on response). This is not however a normal operation.
>
> With all such systems though I find getting the processing fast is never
> where the time is spent, its getting the IO to be fast.
True, which is why I have a persist-as-rarely-as-possible policy.
But note that if you have all async IO (which an LMAX-style
architecture does have, even if single-threaded), then you have much
greater opportunity for batching IO. With full async it means very
little to throughput whether the disk returns in 3 ms or 100 ms
(which would kill sequential workers), but there is much greater
scope for IO batching: if you're trying to get 10K transactions per
second, 100 ms would be a 1K batch. This works especially nicely with
commands which assume success and call back on failure, though you
would need to sort out the aggregates by type for some types of
storage.
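The 1K-batch figure is just arrival rate times write latency (a back-of-the-envelope sketch, assuming a steady 10K transactions/sec):

```python
TARGET_TPS = 10_000          # required transactions per second
DISK_LATENCY_S = 0.100       # a slow 100 ms write round-trip

# With fully async IO, every request that arrives while one write is
# in flight can be folded into the next batch:
batch_size = int(TARGET_TPS * DISK_LATENCY_S)
print(batch_size)                  # 1000 transactions per physical write

# Physical write ops per second actually needed:
print(TARGET_TPS / batch_size)     # 10.0 instead of 10000
```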
Ben