Tested on Windows 2000, XP and 2003; the behaviour looks like a BUG. I used an AF_INET SOCK_STREAM socket in nonblocking mode with WSAEventSelect().
Any call to send() or WSASend() in nonblocking mode finishes with one of the following results:
1. Succeeds, and all the data is accepted for sending, when socket buffer space is available [documented].
1a. Succeeds, and all the data is accepted for sending, even when the data size is larger than the socket send buffer size (SO_SNDBUF), for example a 64 MB send [not documented; the data should be only partially accepted].
2. Fails with WSAEWOULDBLOCK [documented; send() must be called again later].
3. Fails with WSAENOBUFS (for example on a 128 MB send) [not documented; the data should be partially accepted].
But it never succeeds with only part of the data accepted [not documented, see 3]!
This problem can be worked around by sending no more than 64 KB per call to send() or WSASend(). See also MSDN KB201213, but it applies to Windows 95/98/NT4 and to blocking sockets, and one of its resolutions is "Use the socket in nonblocking or asynchronous mode."!
You can also see the topic "async send() fails on very large data blocks with nonblocking socket, appears to be treated as blocking (XP)" in "comp.os.ms-windows.programmer.tools.winsock".
Is it a bug? Maybe somebody has more information.
A 64 KB limit gives bad performance in most cases. A better workaround is a limit one byte smaller than the SO_SNDBUF value. See MSDN KB823764: "Slow Performance Occurs When You Copy Data to a TCP Server by Using a Windows Sockets API Program".
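A minimal sketch of the chunked send loop discussed above, in C. It is not taken from the thread: send_fn, chunk_limit, send_chunked and the counting_sink stand-in are hypothetical names, and the callback replaces the real send()/WSASend() call so that the chunking logic can be followed in isolation. The SO_SNDBUF value is assumed to have been read beforehand (for example with getsockopt()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for send()/WSASend(): returns the number of bytes
   accepted (> 0), or a value <= 0 on any error, "would block" included. */
typedef long (*send_fn)(void *ctx, const char *buf, size_t len);

/* Per-call limit proposed in the thread: one byte below SO_SNDBUF. */
size_t chunk_limit(size_t sndbuf)
{
    assert(sndbuf > 1);
    return sndbuf - 1;
}

/* Offers `len` bytes in pieces no larger than chunk_limit(sndbuf).
   Returns how many bytes were accepted; stops at the first failure so the
   caller can wait (e.g. for FD_WRITE) and resume from that offset. */
size_t send_chunked(send_fn fn, void *ctx, const char *buf, size_t len,
                    size_t sndbuf)
{
    size_t limit = chunk_limit(sndbuf);
    size_t done = 0;

    while (done < len) {
        size_t want = len - done;
        long n;

        if (want > limit)
            want = limit;
        n = fn(ctx, buf + done, want);
        if (n <= 0)
            break;
        done += (size_t)n;
    }
    return done;
}

/* Demo sink (an assumption, not a real socket): accepts everything and
   records the call count and the largest request it saw. */
struct counting_sink {
    size_t calls;
    size_t max_len;
};

long counting_sink_send(void *ctx, const char *buf, size_t len)
{
    struct counting_sink *s = (struct counting_sink *)ctx;

    (void)buf;
    s->calls++;
    if (len > s->max_len)
        s->max_len = len;
    return (long)len;
}
```

With an 8192-byte SO_SNDBUF, a 100000-byte payload passed through counting_sink_send goes out as 13 calls, none larger than 8191 bytes. In real code the callback would wrap send() and return a non-positive value on WSAEWOULDBLOCK.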
It's an artifact of the WinSock implementation. You have to live with it. Most applications use small send blocks, on the order of 2K (or even less), so there's no reason you can't do the same.
BTW, you'll get WSAEWOULDBLOCK once the send buffer is full all the same. WSAENOBUFS means you sent more data than can be accommodated in kernel buffers - e.g. out of kernel memory. Note that you are killing the performance of all other applications in this (abusive) manner, since you are tying up kernel memory.
-- ===================================== Alexander Nickolov Microsoft MVP [VC], MCSD email: agnicko...@mvps.org MVP VC FAQ: http://www.mvps.org/vcfaq =====================================
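One way to act on the two error codes described above, sketched in C. The numeric values are the ones winsock2.h defines for WSAEWOULDBLOCK and WSAENOBUFS; everything else (the send_action enum, classify_send_error, shrink_chunk, and the halve-on-WSAENOBUFS policy) is an assumption made for illustration, not documented Winsock behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Defined here only when winsock2.h has not been included. */
#ifndef WSAEWOULDBLOCK
#define WSAEWOULDBLOCK 10035
#endif
#ifndef WSAENOBUFS
#define WSAENOBUFS 10055
#endif

enum send_action {
    SEND_WAIT_WRITABLE,  /* send buffer full: wait for FD_WRITE, then retry */
    SEND_RETRY_SMALLER,  /* kernel buffers exhausted: retry with less data */
    SEND_GIVE_UP         /* any other error: report it to the caller */
};

/* Maps the error code obtained after a failed send() to a next step. */
enum send_action classify_send_error(int wsa_error)
{
    switch (wsa_error) {
    case WSAEWOULDBLOCK: return SEND_WAIT_WRITABLE;
    case WSAENOBUFS:     return SEND_RETRY_SMALLER;
    default:             return SEND_GIVE_UP;
    }
}

/* Hypothetical back-off: halve the request, never going below min_chunk. */
size_t shrink_chunk(size_t chunk, size_t min_chunk)
{
    size_t half = chunk / 2;

    assert(min_chunk > 0);
    return half < min_chunk ? min_chunk : half;
}
```

A caller would feed the value of WSAGetLastError() into classify_send_error() after send() returns SOCKET_ERROR, and apply shrink_chunk() only in the SEND_RETRY_SMALLER case.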
The more memory you have, the more non-pageable memory is available for socket buffers, so you'll get different results; try it on a 64-bit machine with a few TB and see the difference. JFYI, the socket buffer is 16K on W2K and newer and 8K before (NT and 9x), so the TCP stack does its best to divide and send the data, but only up to some limits :) Arkady
Maybe you know where I can find some information from Microsoft about this artifact? I can't find anything about it in MSDN.
The help for send() and WSASend() says:
"... On nonblocking stream oriented sockets, the number of bytes written can be between 1 and the requested length, depending on buffer availability on both client and server computers. The select, WSAAsyncSelect or WSAEventSelect functions can be used to determine when it is possible to send more data."
If "transport buffers" means SO_SNDBUF and SO_RCVBUF, this help is misleading.
Arkady Frenkel wrote:
> JFYI, socket buffer is 16K from W2K and newer and 8K before (NT and 9x),
> so TCP stack do the best to divide and send the data but up to some limits :)
In my opinion, the current behavior (no partial sends) can help minimize the number of switches between kernel mode and user mode. But in my current application design, I use the same thread for sending and receiving data on the same socket. So if I try to send a huge data block, I can't receive any data while the send call is in progress (Winsock allocates a huge kernel buffer and copies the data being sent into it, which can take a relatively long time). In this case, using small data blocks gives better application behavior. But it is very important to use blocks of SO_SNDBUF - 1 bytes; a fixed block size limit gives poor network performance.
> Isn't overlapped I/O a good choice? You would need to preserve sent data
> during send operation, but no internal buffer space would be required nor
> waiting for WSASend to complete.
Or, if not using overlapped I/O (which has its traps and pitfalls), why not use select() to know when it is "safe" to send/read?
> Isn't overlapped I/O a good choice? You would need to preserve sent data
> during send operation, but no internal buffer space would be required nor
> waiting for WSASend to complete.
Yes, I agree, overlapped I/O is the best choice for performance reasons. But right now I need to optimize existing code that works with nonblocking sockets.
> Or if not using overlapped I/O (which has its traps and pitfalls) why
> not use select() to know where it is "safe" to send/read
I am already using nonblocking mode with WSAEventSelect(); it is more flexible than select().
> If "transport buffers" is SO_SNDBUF and SO_RCVBUF - this help is
> misleading.
It's not - it's the number of available kernel buffers, e.g. all the available kernel memory can be tied up in a single send.
You won't find this documented per se anywhere - we've found it out by experimenting with different send size patterns. Note that the behavior does not contradict the documentation as it is sufficiently vague.
-- Alexander Nickolov
If you need to use your own buffers rather than the system's, you can do that with zero-buffering (AFAIK that is a Linux term, but it works on Windows too): you set the size of the Winsock buffer to zero, so the system uses your buffer and does not allocate one of its own for the operation. Arkady
Setting the socket send buffer (SO_SNDBUF) to zero is a bad idea in nonblocking mode, because it changes the behaviour of send(): the call doesn't return until the send has really completed (send() works as in blocking mode).
Some interesting information from MSDN article "Windows Sockets 2.0: Write Scalable Winsock Apps Using Completion Ports": "...the data gets copied by AFD.SYS to its internal buffers (up to the SO_SNDBUF setting)..." (see "Who Manages the Buffers?"). http://msdn.microsoft.com/msdnmag/issues/1000/winsock/default.aspx
As I see it, send() must not return WSAENOBUFS regardless of whether the socket is blocking or nonblocking. See also MSDN KB201213, "BUG: Send() Fails with Error WSAENOBUFS Over Blocking Socket". On a nonblocking socket, send() can return WSAEWOULDBLOCK if buffer space is unavailable, or return a number of sent bytes between 1 and the requested length. Other errors such as WSAENETDOWN can also be returned, but not WSAENOBUFS. So the current WinSock API behavior is a BUG.
Also, in my opinion, the current nonblocking socket behavior can be unexpected, and Microsoft should document it better and less ambiguously.
I can't argue with your sentiments, I feel much the same here :). However, you are giving here a link to an article, not the WinSock documentation. Don't confuse magazines with documentation.
-- Alexander Nickolov
> 1a. Succeeded and all data is processed to sending when data size is
> more than socket send buffer size (SO_SNDBUF), for example - send 64Mb
> [Not documented, data must be partially processed].
How did you test it? If you're sending data to localhost, I guess it is fine: regardless of the buffer size, send() is able to deliver all the data immediately (I think I saw it written somewhere that 'local' sockets are implemented specially to achieve better performance).
I have tested it between two hosts on a LAN (100 Mbps Fast Ethernet), using the following pairs: 1. Windows XP SP2 + Windows 2003 Server; 2. Windows XP SP2 + Windows XP SP1; 3. Windows XP SP1 + Windows 2003 Server; 4. Windows XP SP2 + Windows 2000 SP4.
As I see it, the behavior of send() must not depend on whether localhost is used. It must work as stated in the documentation in any case.
Yes, but the article was written by Microsoft developers (the Microsoft Windows 2000 Networking group) and published by Microsoft. ;) Alexander, thank you for your help. :)
Some additional, and probably the most reliable, information from MSDN about SO_SNDBUF and send() behaviour.
See article "INFO: Design Issues - Sending Small Data Segments Over TCP w/Winsock" in MSDN Knowledge Base:
"If necessary, Winsock can buffer significantly more than the SO_SNDBUF buffer size."
"Winsock uses the following rules to indicate a send completion to the application (depending on how the send is invoked, the completion notification could be the function returning from a blocking call, signaling an event or calling a notification function, and so forth):
- If the socket is still within SO_SNDBUF quota, Winsock copies the data from the application send and indicates the send completion to the application.
- If the socket is beyond SO_SNDBUF quota and there is only one previously buffered send still in the stack kernel buffer, Winsock copies the data from the application send and indicates the send completion to the application.
- If the socket is beyond SO_SNDBUF quota and there is more than one previously buffered send in the stack kernel buffer, Winsock copies the data from the application send. Winsock does not indicate the send completion to the application until the stack completes enough sends to put the socket back within SO_SNDBUF quota or only one outstanding send condition."
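The three rules quoted above can be modelled as a small C predicate. This is only a sketch of the KB text, not of the actual AFD.SYS logic: send_completes_immediately and its parameters are hypothetical, and treating "within quota" as buffered_bytes < sndbuf is an assumption, since the article does not define the boundary.

```c
#include <assert.h>
#include <stddef.h>

/* buffered_bytes: bytes from earlier sends still held in the kernel buffer.
   pending_sends:  number of earlier sends still held there.
   sndbuf:         the socket's SO_SNDBUF value.
   Returns 1 if, per the quoted rules, Winsock copies the new data and
   indicates completion right away; 0 if completion is deferred. */
int send_completes_immediately(size_t buffered_bytes, unsigned pending_sends,
                               size_t sndbuf)
{
    /* Buffered bytes can only come from at least one earlier send. */
    assert(pending_sends > 0 || buffered_bytes == 0);

    if (buffered_bytes < sndbuf)   /* rule 1: still within quota */
        return 1;
    if (pending_sends <= 1)        /* rule 2: beyond quota, one buffered send */
        return 1;
    return 0;                      /* rule 3: beyond quota, several sends */
}
```

Under this model the size of the new send never enters the decision, which would be consistent with observation 1a: a 64 MB send on an idle socket falls under the first rule and is accepted whole.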