Understanding Send-Q and Recv-Q by netstat

Christophe Lohr

Dec 4, 2009, 11:55:38 AM
Hello,
I do not understand the relationship that may exist between the socket
option SO_RCVBUF / SO_SNDBUF and values Recv-Q/Send-Q given by netstat.

Consider the following example, consisting of a TCP client (sending
zeros) and a lazy TCP server (which does not consume received data).

$ socat -ddd OPEN:/dev/zero TCP:localhost:8003,sndbuf=2000,rcvbuf=2000
2009/12/04 17:30:21 socat[2880] E write(4, 0x954a758, 8192): Broken pipe


$ socat -ddd TCP-LISTEN:8003,reuseaddr,sndbuf=2000,rcvbuf=2000 EXEC:'sleep 30'
2009/12/04 17:30:21 socat[2879] E write(3, 0x89c2008, 3008): Broken pipe

So I would expect the Recv-Q/Send-Q values to match the configured
SO_RCVBUF / SO_SNDBUF values, shouldn't they? Yet that is not the case. Why?


$ netstat -tpn
Proto Recv-Q Send-Q Local Address     Foreign Address   State        PID/Program name
tcp     3008      0 127.0.0.1:8003    127.0.0.1:54629   ESTABLISHED  2879/socat
tcp        0   2176 127.0.0.1:54629   127.0.0.1:8003    ESTABLISHED  2880/socat

Note that the server reports 3008 bytes, which is consistent
with Recv-Q.
But the client reports sending 8192 bytes, which does not correspond to
Send-Q (2176).
Moreover, none of these values is consistent with the SO_RCVBUF/SO_SNDBUF
settings.


Could someone explain?
Best regards

Rick Jones

Dec 4, 2009, 12:40:48 PM
Christophe Lohr <christo...@enst-bretagne.fr> wrote:
> Hello,
> I do not understand the relationship that may exist between the socket
> option SO_RCVBUF / SO_SNDBUF and values Recv-Q/Send-Q given by netstat.

SO_RCVBUF and SO_SNDBUF are, ostensibly, the limits to how much can be
queued to the socket. Recv-Q and Send-Q are how much is actually
there.

Recv-Q will be that data which has not yet been pulled from the socket
buffer by the application.

Send-Q will be that data which the sending application has given to
the transport, but has yet to be ACKnowledged by the receiving TCP.
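
The same two counters can also be read from inside a program. The following is a minimal sketch, assuming Linux: the SIOCINQ ioctl reports the bytes not yet read from a TCP socket, and SIOCOUTQ reports (as far as I understand the implementation) the bytes queued for sending that the peer has not yet acknowledged, which should be comparable to what netstat prints. The helper name queue_counts and the loopback setup are illustrative and not taken from the thread.

```c
#include <arpa/inet.h>
#include <linux/sockios.h>  /* SIOCINQ, SIOCOUTQ: Linux-specific */
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative helper (not from the thread): a loopback client writes `len`
 * zero bytes to a server socket that is accepted but never read.  On success
 * returns 0, storing the kernel's unread count for the server side in
 * *recv_q and the unacknowledged count for the client side in *send_q. */
int queue_counts(int len, int *recv_q, int *send_q)
{
    char buf[4096] = {0};
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    struct pollfd p;
    int lfd, cfd, sfd = -1, rc = -1;

    if (len < 1 || len > (int)sizeof buf)
        return -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(lfd, 1) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) != 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) != 0)
        goto out;

    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto out;

    if (write(cfd, buf, (size_t)len) != len)
        goto out;

    /* Wait until the data has arrived, but deliberately never read it. */
    p.fd = sfd;
    p.events = POLLIN;
    if (poll(&p, 1, 2000) != 1)
        goto out;

    if (ioctl(sfd, SIOCINQ, recv_q) == 0 && ioctl(cfd, SIOCOUTQ, send_q) == 0)
        rc = 0;
out:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

Called as queue_counts(1000, &r, &s), I would expect r to come back as 1000, since nothing has read those bytes. The value of s depends on whether the ACK has already been processed when the ioctl runs, so it may well be 0.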

> $ socat -ddd OPEN:/dev/zero TCP:localhost:8003,sndbuf=2000,rcvbuf=2000
> 2009/12/04 17:30:21 socat[2880] E write(4, 0x954a758, 8192): Broken pipe

If the sndbuf and rcvbuf settings correspond to setsockopt() calls for
SO_SNDBUF and SO_RCVBUF respectively, it is important to keep in mind
that Linux, unlike virtually every other *nix I've encountered, very
much considers the setsockopt() call a "suggestion" rather than a
"demand." It will set the actual socket buffer size to something
else, which one can see with a subsequent getsockopt() call (eg what
netperf does). It adds-in space for overhead (overheads of the
buffers that get queued to the socket buffers IIRC)

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

David Schwartz

Dec 4, 2009, 2:36:05 PM
On Dec 4, 8:55 am, Christophe Lohr <christophe.l...@enst-bretagne.fr>
wrote:

> Could someone explain?

You are making the naive assumption that the send queue and the
receive queue are set in units of application-level data bytes. They
are not.

DS

Rick Jones

Dec 4, 2009, 3:48:51 PM
David Schwartz <dav...@webmaster.com> wrote:
> You are making the naive assumption that the send queue and the
> receive queue are set in units of application-level data bytes. They
> are not.

perhaps it is out of date, but the netstat manpage on my linux system
has this to say about the Q's:

   Recv-Q
       The count of bytes not copied by the user program connected
       to this socket.

   Send-Q
       The count of bytes not acknowledged by the remote host.

Which certainly sounds like application-level bytes to me.

rick jones
--
a wide gulf separates "what if" from "if only"

David Schwartz

Dec 5, 2009, 3:23:05 AM
On Dec 4, 12:48 pm, Rick Jones <rick.jon...@hp.com> wrote:

> perhaps it is out of date, but the netstat manpage on my linux system
> has this to say about the Q's:
>
>    Recv-Q
>        The count of bytes not copied by the user program connected
>        to this socket.
>
>    Send-Q
>        The count of bytes not acknowledged by the remote host.
>
> Which certainly sounds like application-level bytes to me.

I was talking about how the queue sizes are set, not how they are
measured. This description is TCP-specific, but the SO_SNDBUF/
SO_RCVBUF socket options are protocol-neutral.

Think about UDP. Where do you think the source IP and port are stored
if not in the receive queue?

DS
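
One way to see that a receive buffer limit is not counted purely in payload bytes is to flood an unread UDP socket with one-byte datagrams and count how many the kernel keeps. This is a minimal sketch under stated assumptions: loopback networking is available, and the helper name queued_datagrams is hypothetical. How much overhead each datagram is charged is an implementation detail that varies between systems.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative helper (not from the thread): requests `request` bytes of
 * SO_RCVBUF on a loopback UDP socket, sends `attempts` one-byte datagrams to
 * it without reading, then drains the socket and returns how many datagrams
 * had actually been queued (-1 on error).  *effective receives the buffer
 * size that getsockopt() reports after the request. */
int queued_datagrams(int request, int attempts, int *effective)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr, olen = sizeof *effective;
    char byte = 'x';
    int rx, tx, queued = -1;

    rx = socket(AF_INET, SOCK_DGRAM, 0);
    tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        goto out;

    if (setsockopt(rx, SOL_SOCKET, SO_RCVBUF, &request, sizeof request) != 0 ||
        getsockopt(rx, SOL_SOCKET, SO_RCVBUF, effective, &olen) != 0)
        goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) != 0)
        goto out;

    /* Datagrams that do not fit in the receive queue are silently dropped. */
    for (int i = 0; i < attempts; i++)
        sendto(tx, &byte, 1, 0, (struct sockaddr *)&addr, sizeof addr);

    /* Count what was kept, one datagram per successful recv(). */
    queued = 0;
    while (recv(rx, &byte, 1, MSG_DONTWAIT) == 1)
        queued++;
out:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return queued;
}
```

With queued_datagrams(4096, 20000, &eff) on a Linux machine, I would expect the returned count to be far smaller than eff, because each queued datagram is charged for its bookkeeping as well as its single payload byte. The exact figure depends on the kernel.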

David Schwartz

Dec 5, 2009, 3:29:41 AM
Oh, and before you say set and read socket buffer sizes are the same,
try this program:

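#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>   /* close() */

int main(void)
{
    int fd, size;
    socklen_t len;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    /* Ask for a 16384-byte receive buffer... */
    size = 16384;
    len = sizeof(size);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, len) != 0) return -2;

    /* ...then read back what the kernel actually configured. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) != 0) return -3;
    printf("recvbuf=%d bytes\n", size);

    close(fd);
    return 0;
}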

It does not fail, but the output is *NOT* 16,384.

DS

Christophe Lohr

Dec 7, 2009, 7:10:26 AM
So, there are two points:
- the socket buffer size read back is not the same as the size that was set
  (it is 2 times larger)
- Send-Q and Recv-Q may contain headers (but in TCP they contain user
  data only, don't they?)


So let's play with an ad-hoc client/server
(see attached files)

$ ./lazyServerTCP 8003
SO_RCVBUF=4000
SO_SNDBUF=4000

$ ./clientTCP localhost 8003
SO_RCVBUF=4000
SO_SNDBUF=4000
80715


$ netstat -tpn
Proto Recv-Q Send-Q Local Address     Foreign Address   State        PID/Program name
tcp    65930      0 127.0.0.1:8003    127.0.0.1:47969   ESTABLISHED  16937/lazyServerTCP
tcp        0  14784 127.0.0.1:47969   127.0.0.1:8003    ESTABLISHED  16939/clientTCP


I see that Recv-Q plus Send-Q (65930 + 14784 = 80714) matches, to within one byte, the amount of user data sent (80715).

However I can't figure out why Recv-Q is 16 times larger than the actual
socket buffer...?

Regards.
--

clientTCP.c
lazyServerTCP.c

David Schwartz

Dec 7, 2009, 8:35:00 AM
On Dec 7, 4:10 am, Christophe Lohr <christophe.l...@enst-bretagne.fr>
wrote:

> However I can't figure out why Recv-Q is 16 times larger than the actual
> socket buffer...?

Why should one have anything to do with the other? You're measuring
two completely different things and wondering why the measurements are
different -- well why shouldn't they be? The receive buffer is
something you set at socket level. The Recv-Q is something at TCP level.
The receive queue size may affect how fast the receive buffer can
grow, but it won't directly affect its ultimate maximum size.

DS

Christophe Lohr

Dec 7, 2009, 10:34:20 AM
David Schwartz wrote:

>
>> However I can't figure out why Recv-Q is 16 times larger than the actual
>> socket buffer...?
>
> Why should one have anything to do with the other? You're measuring
> two completely different things and wondering why the measurements are
> different -- well why shouldn't they be?

Let's play with wireshark. Do a "Follow TCP stream".
The amount of user data sent before the TCP window becomes zero is equal
to Recv-Q

So, I understand the definition of Recv-Q:


The count of bytes not copied by the user program connected to
this socket.

> The receive buffer is
> something you set at socket level. The Recv-Q is something at TCP level.

Well, but netstat gives a Recv-Q per socket... (a TCP socket in my case)

> The receive queue size may affect how fast the receive buffer can
> grow, but it won't directly affect its ultimate maximum size.

... sorry, I don't understand what socket receive/send buffers are :-(

Is this another level of buffering?
What is in it?
Does it come before or after Recv-Q (along the data flow within the socket)?

Regards
Thank you for your explanations and your patience

Rick Jones

Dec 7, 2009, 2:10:08 PM
David Schwartz <dav...@webmaster.com> wrote:
> Why should one have anything to do with the other?

Perhaps this is decades of BSD-based precedent interacting with
Linux's desire to be different?

rick jones
--
oxymoron n, Hummer H2 with California Save Our Coasts and Oceans plates

Rick Jones

Dec 7, 2009, 2:13:33 PM
David Schwartz <dav...@webmaster.com> wrote:

> On Dec 4, 12:48 pm, Rick Jones <rick.jon...@hp.com> wrote:

> > perhaps it is out of date, but the netstat manpage on my linux system
> > has this to say about the Q's:
> >
> >    Recv-Q
> >        The count of bytes not copied by the user program connected
> >        to this socket.
> >
> >    Send-Q
> >        The count of bytes not acknowledged by the remote host.
> >
> > Which certainly sounds like application-level bytes to me.

> I was talking about how the queue sizes are set, not how they are
> measured. This description is TCP-specific, but the SO_SNDBUF/
> SO_RCVBUF socket options are protocol-neutral.

> Think about UDP. Where do you think the source IP and port are stored
> if not in the receive queue?

Off to the side, with the rest of the meta-data :) However, that may
not be the case under Linux, which relates to how it likes to
effectively double (up to a limit) what one requests in a setsockopt()
call.

rick jones
--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Christophe Lohr

Dec 8, 2009, 5:36:13 AM
Christophe Lohr wrote:

> ... sorry, I don't understand what socket receive/send buffers are :-(
>
> Is this another level of buffering?
> What is in it?
> Does it come before or after Recv-Q (along the data flow within the socket)?

According to "A User's Guide to TCP Windows", there is a direct
relationship between TCP window and SO_RCVBUF
http://www.ncsa.illinois.edu/People/vwelch/net_perf/tcp_windows.html

so... I'm lost...

What are socket buffers?

Regards.

David Schwartz

Dec 8, 2009, 9:33:46 AM
On Dec 8, 2:36 am, Christophe Lohr <christophe.l...@enst-bretagne.fr>
wrote:

> What are socket buffers?

It's implementation-defined, and you will have to get into the
horribly gory details of the implementation to understand it.

Why do you care? Is this for curiosity? Because if you think you need
to know it to do something at application level, you're doing it
wrong. An application should never try to "teach TCP" its protocol.

DS

Christophe Lohr

Dec 8, 2009, 9:42:00 AM
David Schwartz wrote:

>
>> What are socket buffers?
>
> It's implementation-defined, and you will have to get into the
> horribly gory details of the implementation to understand it.
>
> Why do you care? Is this for curiosity?

Yes, that's it.
I'm just looking for high-level information

Regards

David Schwartz

Dec 8, 2009, 10:26:32 AM
On Dec 8, 6:42 am, Christophe Lohr <christophe.l...@enst-bretagne.fr>
wrote:

> > Why do you care? Is this for curiosity?


>
> Yes, that's it.
> I'm just looking for high-level information

Unfortunately, the high-level information is that the underlying
network protocol provides you an ability to set something it calls a
"send buffer" and something it calls a "receive buffer" in units of
bytes. And that's it.

DS

Christophe Lohr

Dec 8, 2009, 10:39:06 AM
David Schwartz wrote:

> Unfortunately, the high-level information is that the underlying
> network protocol provides you an ability to set something it calls a
> "send buffer" and something it calls a "receive buffer" in units of
> bytes. And that's it.

ok

Thank you for your patience
Regards

navi...@gmail.com

Mar 26, 2020, 4:19:02 AM
My app team was having a performance issue during a 46 minute window to be precise, the entire day was otherwise fine. During that specific 46 minute window, the netstat data shows:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
..
tcp 129 0 10.50.240.99:1521 0.0.0.0:* LISTEN -

Now, for the entire 46 minute window, I see that the RECV-Q is stuck at 129 and then it somehow goes down. Also, 129 is the maximum here when I check the stats for the whole day. Outside this 46 minute window, the RECV-Q counter is a two-digit number, or at any rate lower than 129.

Can you please advise on:
- What is this 129 number? Is it some TCP queue limit?
- Is this an issue with the app (they are using an app server with connection pooling, maybe WebSphere or Tomcat)? If it is an app issue, how can I confirm that? Are the app server connections getting broken, with the connection pool spawning new connections?

Sorry, I am not a network expert but hoping you guys can help or advise here.

thanks
N

Jorgen Grahn

Mar 26, 2020, 7:41:37 AM
On Thu, 2020-03-26, navi...@gmail.com wrote:
> On Tuesday, December 8, 2009 at 9:39:06 AM UTC-6, Christophe Lohr wrote:
>> David Schwartz wrote:
>> > Unfortunately, the high-level information is that the underlying
>> > network protocol provides you an ability to set something it calls a
>> > "send buffer" and something it calls a "receive buffer" in units of
>> > bytes. And that's it.
>>
>> ok
>>
>> Thank you for your patience
>> Regards
>
> My app team was having a performance issue during a 46 minute window
> to be precise, the entire day was otherwise fine. During that
> specific 46 minute window, the netstat data shows:
>
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
> ..
> tcp 129 0 10.50.240.99:1521 0.0.0.0:* LISTEN -
>
> Now, for the entire 46 minute window, I see that the RECV-Q is stuck
> at 129 and then it somehow goes down. Also, 129 is the maximum here
> when I check the stats for the whole day. Outside this 46 minute
> window, the RECV-Q counter is a two-digit number, or at any rate
> lower than 129.
>
> Can you please advise on:
> - What is this 129 number? Is it some TCP queue limit?

The Linux netstat(8) man page:

    Recv-Q
        Established: The count of bytes not copied by the user
        program connected to this socket. Listening: Since Kernel
        2.6.18 this column contains the current syn backlog.

Yours is a listening socket, so it should be the number of clients
that have connected to tcp/1521 but for which the server hasn't yet
called accept().

You can experiment with this by sending SIGSTOP to an otherwise
working server (like netcat) and watch the figures when you let
clients try to connect.
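
The experiment can also be approximated in a single process. The following is a minimal sketch, assuming loopback networking: a listener that never calls accept(), and clients that connect one after another until a handshake stalls. The helper name connections_before_stall and the wait time are illustrative. On the Linux kernels I am aware of, the accept queue admits one connection more than the backlog passed to listen(), so a listener created with a backlog of 128 would plateau at 129. Whether that is what happened on the system in question is an assumption the thread does not confirm.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_CLIENTS 64

/* Illustrative helper (not from the thread): listens on a loopback port with
 * the given backlog, never calls accept(), and returns how many of up to
 * `attempts` client connections complete before one stalls for `wait_ms`
 * milliseconds (-1 on error). */
int connections_before_stall(int backlog, int attempts, int wait_ms)
{
    int clients[MAX_CLIENTS];
    int opened = 0, completed = 0;
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    if (attempts > MAX_CLIENTS)
        attempts = MAX_CLIENTS;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(lfd, backlog) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) != 0) {
        close(lfd);
        return -1;
    }

    while (opened < attempts) {
        struct pollfd p;
        int err = 0;
        socklen_t elen = sizeof err;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0)
            break;
        clients[opened++] = fd;  /* keep it open so the queue stays occupied */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
            if (errno != EINPROGRESS)
                break;
            /* Handshake pending: wait for it to finish, or give up. */
            p.fd = fd;
            p.events = POLLOUT;
            if (poll(&p, 1, wait_ms) != 1)
                break;           /* stalled: the listener's queue is full */
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen) != 0 || err != 0)
                break;
        }
        completed++;
    }

    for (int i = 0; i < opened; i++)
        close(clients[i]);
    close(lfd);
    return completed;
}
```

For example, connections_before_stall(3, 16, 300) should return a small number well below 16. While the clients are still open, netstat -tln on the same machine should show the listener's Recv-Q sitting at that count.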

> - is this an issue with app (they are using app server/connection
> pooling using websphere/tomcat server maybe). If it is app issue
> how to confirm that ? is the app server connections getting broken
> and connection pooling spawns new connections?

Don't know anything about those technologies; sorry. They still have
to obey the TCP rules, though.

> Sorry, I am not a network expert but hoping you guys can help or advise here.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Tauno Voipio

Mar 26, 2020, 8:13:50 AM
This may be a guess:

TCP/1521 is a common port used for SQL database access.
Do you have an Oracle server at the affected computer?

It seems that the server on the port is not able to respond
as fast as there are requests coming in (a webstore overloaded?).

--

-TV

Carlos E.R.

Mar 26, 2020, 8:56:09 AM
On 26/03/2020 09.18, navi...@gmail.com wrote:
> On Tuesday, December 8, 2009 at 9:39:06 AM UTC-6, Christophe Lohr wrote:
.........................******

>> David Schwartz wrote:
>>> Unfortunately, the high-level information is that the underlying
>>> network protocol provides you an ability to set something it calls a
>>> "send buffer" and something it calls a "receive buffer" in units of
>>> bytes. And that's it.
>>
>> ok
>>
>> Thank you for your patience
>> Regards
>
> My app team was having a performance issue during a 46 minute window to be precise, the entire day was otherwise fine. During that specific 46 minute window, the netstat data shows:

Replying to a post from 2009? Really?


--
Cheers, Carlos.

navi...@gmail.com

Mar 26, 2020, 10:05:50 AM
YES :)

Because I found the discussions here to be very knowledgeable.

"Do you have an Oracle server at the affected computer?"

Yes, this is an Oracle database having SCAN Listener on port 21. SCAN listener has the efficiency to manage/handle thousands of connections PER SEC (depending on cpu/other resources). During this time there is NO shortage of resources - cpu,mem etc.

So not sure, if this is really an issue with SCAN or the App connections requested by users are getting broken (somehow, don't know how to prove this??) and if due to the nature of web/app connection pooling, every user with such broken connection, keeps getting a NEW connection (connection storm).


thanks
N

Carlos E.R.

Mar 26, 2020, 5:08:09 PM
On 26/03/2020 15.05, navi...@gmail.com wrote:
> On Thursday, March 26, 2020 at 7:56:09 AM UTC-5, Carlos E.R. wrote:
>> On 26/03/2020 09.18, navi...@gmail.com wrote:
>>> On Tuesday, December 8, 2009 at 9:39:06 AM UTC-6, Christophe Lohr wrote:
>> .........................******
>>


>> Replying to a post from 2009? Really?
>>
>>
>> --
>> Cheers, Carlos.
>
> "Replying to a post from 2009? Really?"
>
> YES :)
>
> Because I found the discussions here to be very knowledgeable.

So, just start a new thread.



--
Cheers, Carlos.

Grant Taylor

Mar 26, 2020, 5:59:51 PM
On 3/26/20 8:05 AM, navi...@gmail.com wrote:
> "Replying to a post from 2009? Really?"
>
> YES :)

Please do not partake in thread necromancy.

> Because I found the discussions here to be very knowledgeable.

So post a new message to the newsgroup.

> Yes, this is an Oracle database having SCAN Listener on port 21.

Did you mean 1521? Or are you doing something with FTP?

> SCAN listener has the efficiency to manage/handle thousands of
> connections PER SEC (depending on cpu/other resources).

Presuming that everything is tuned properly, yes.

> During this time there is NO shortage of resources - cpu,mem etc.

Please be more specific. CPU and memory are two of many potential
restrictions. Disk, processes, file handles, connections, the number of
ports available / waiting to close also come to mind.

Do you have the full netstat output?

I /think/ that the Single Client Access Name actually functions as a
redirect to other workers in the RAC cluster. Do you have similar
information for them?

> So not sure, if this is really an issue with SCAN or the App
> connections requested by users are getting broken (somehow, don't know
> how to prove this??) and if due to the nature of web/app connection
> pooling, every user with such broken connection, keeps getting a NEW
> connection (connection storm).

Do you know if the connections are being /broken/? As in they fully
establish, start to be used, and then break? Or is there a chance that
they aren't being established, timing out, and retrying?

The full output from netstat, preferably from all the nodes in the
cluster, would be helpful.

Are you part of the DBA team? Or are you a Systems Administrator? Or
something else?

Are there any other load / performance logs for the cluster &
applications therein? At a minimum for shortly before, during, and
shortly after the problem? Ideally, more historical metrics for
baseline and comparison thereto.

Do the network administrators have any NetFlow / sFlow data that might help?

Are the clients in the same network (L2 broadcast domain) or a different
one and passing through a router?



--
Grant. . . .
unix || die

navi...@gmail.com

Mar 27, 2020, 1:15:51 PM
I am not active in forum discussions. Believe it or not, this may be just my 3rd or 4th post. Anyway, I get the point: I should have started a new thread. I will keep that in mind next time.

"Did you mean 1521"
>> Yes

I don't have the full netstat output, just some snippets. But every time the RECV-Q is at 129, I also see a "socket overflowed" message. E.g.:

zzz <03/24/2020 09:11:12> subcount: 720
..
tcp 129 0 10.49.240.32:1521 0.0.0.0:* LISTEN -

zzz <03/24/2020 08:09:45> subcount: 0
3374410 times the listen queue of a socket overflowed
zzz <03/24/2020 09:12:08> subcount: 720
3733139 times the listen queue of a socket overflowed

--> Increase of 358729

Looks like a login storm to me. I confirmed that the somaxconn parameter is set to 1024 already.

cat /etc/sysctl.conf|grep somax
<hostname>: net.core.somaxconn=1024

cat /proc/sys/net/core/somaxconn
<hostname>: 1024

" Please be more specific. CPU and memory are two of many potential
restrictions. Disk, processes, file handles, connections, the number of
ports available / waiting to close also come to mind."
>> I will need to check on this.

"Do you know if the connections are being /broken/? As in they fully
establish, start to be used, and then break? Or is there a chance that
they aren't being established, timing out, and retrying?"
>>
I don't have TCP expertise like the folks here, but I think it's an issue with new connections when a connection storm happens. If RECV-Q is at 129, then there is some kind of SYN backlog (from what I have been reading) as bursts of new connections keep coming in.

I am still not clear why 129 ? Why not > 129 i.e. 130, 121 etc.

"Are you part of the DBA team? Or are you a Systems Administrator? Or
something else?"
>>
I am a DBA with interest in networking.

"Do the network administrators have any NetFlow sFlow data that might help?"
>>
Network folks were called in while the issue was happening in real time, and they said everything looked good. They checked every hop to the router, switch, etc. and found no delay. I am guessing that was expected because there is no slowness in the network path. The issue is that RECV-Q is 129, so new connections are backlogged and have to wait in the queue. I am not sure which parameter (if not somaxconn) is responsible for RECV-Q being stuck at 129.

"Are the clients in the same network (L2 broadcast domain) or a different
one and passing through a router?"
>>
Not sure, I will have to check on this.

thanks
N



Grant Taylor

Mar 27, 2020, 3:24:02 PM
On 3/27/20 11:15 AM, navi...@gmail.com wrote:
> I am not active in forum discussions. Believe it or not, this may
> be just my 3rd or 4th post. Anyway, I get the point: I should have
> started a new thread. I will keep that in mind next time.

Thank you.

> I don't have the full netstat output, just some snippets. But every
> time the RECV-Q is at 129, I also see a "socket overflowed" message. E.g.:
>
> zzz <03/24/2020 09:11:12> subcount: 720
> ...
> tcp 129 0 10.49.240.32:1521 0.0.0.0:* LISTEN -
>
> zzz <03/24/2020 08:09:45> subcount: 0
> 3374410 times the listen queue of a socket overflowed
> zzz <03/24/2020 09:12:08> subcount: 720
> 3733139 times the listen queue of a socket overflowed
>
> --> Increase of 358729

So, you are running out of a resource. The question is to identify
which resource, and how to adjust it.

> Looks like a login storm to me. I confirmed that the somaxconn
> parameter is set to 1024 already.
>
> cat /etc/sysctl.conf|grep somax
> <hostname>: net.core.somaxconn=1024
>
> cat /proc/sys/net/core/somaxconn
> <hostname>: 1024

Try skimming the ip-sysctl documentation for other entries related to
somaxconn.

https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
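
One detail from the listen(2) manual page may matter here: the backlog a program passes to listen() is silently capped at net.core.somaxconn, but raising somaxconn does not raise a smaller value that the program asked for itself. The following is a minimal sketch of that rule, with hypothetical helper names. The 128 in the usage note is an assumed figure, not something the thread establishes about the Oracle listener.

```c
#include <stdio.h>

/* Reads net.core.somaxconn from procfs (Linux); returns -1 if unavailable. */
long read_somaxconn(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/net/core/somaxconn", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Queue limit that results from listen(fd, requested): the smaller of what
 * the application requested and what the system-wide cap allows. */
long effective_backlog(long requested, long somaxconn)
{
    return requested < somaxconn ? requested : somaxconn;
}
```

With somaxconn at 1024, effective_backlog(128, 1024) is still 128, so a listener that asks for 128 would be unaffected by the sysctl. In that case the limit to look at would be whatever queue size the listener itself is configured to request.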

I'm wondering if there is a chance that you tipped over 129, SYN
cookies got enabled, and clients didn't handle that gracefully. Maybe some
weird interaction.

> I don't have TCP expertise like the folks here, but I think it's an
> issue with new connections when a connection storm happens. If RECV-Q
> is at 129, then there is some kind of SYN backlog (from what I have
> been reading) as bursts of new connections keep coming in.

I'm inclined to agree with you. At least until there is evidence to the
contrary.

> I am still not clear why 129 ? Why not > 129 i.e. 130, 121 etc.

129 is really close to (one off from) 128, which is a power of 2.
Remember that many things in computers are based on powers of 2.

> I am a DBA with interest in networking.

Fair enough.

> Network folks were called in while the issue was happening in real time,
> and they said everything looked good. They checked every hop to the router,
> switch, etc. and found no delay. I am guessing that was expected because there
> is no slowness in the network path. The issue is that RECV-Q is 129, so new
> connections are backlogged and have to wait in the queue. I am not sure which
> parameter (if not somaxconn) is responsible for RECV-Q being stuck at 129.

Did the networking team indicate if they have any NetFlow / sFlow / etc.
data for the network? I'm wondering if the load was higher than normal,
but not high enough to cause a problem for the network.

Read: Can network monitoring information function as a stand-in for
system load comparison?

> Not sure, I will have to check on this.

ACK

> thanks

You're welcome.

Grant Taylor

Mar 27, 2020, 3:34:45 PM
On 3/27/20 11:15 AM, navi...@gmail.com wrote:
> Network folks were called in while the issue was happening in real
> time, and they said everything looked good.

Ask the networking folks, particularly the DNS administrators, if they
had any problems around the problem time frame.

I'd also ask around if any other systems / teams had any problems at the
time. Email & web servers / administrators are likely more susceptible
than other systems.

There is a chance that this is a secondary problem. Meaning that
something that the RAC cluster depends on had a problem that
subsequently caused the RAC system itself to have a problem.

I know that there was a DNS issue within the last week (I don't remember
exactly when) and there is a chance that it could have impacted reverse DNS.

This could have impacted RAC if it tried to do reverse DNS lookups on
connections.

Consider if you will:
· Connections normally take 1–9 seconds to fully connect.
· Reverse DNS starts to fail causing a 30 second delay.
· Connections now take 31–39 seconds to fully connect.

That's a potential connection time increase of between roughly 4× and 31×.
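
The arithmetic behind those factors, as a minimal sketch (the function name slowdown is just for illustration): a fixed 30 second delay hurts the fastest connections the most in relative terms.

```c
/* Ratio of connection setup time with an added fixed delay to the setup
 * time without it. */
double slowdown(double base_seconds, double added_seconds)
{
    return (base_seconds + added_seconds) / base_seconds;
}
```

Here slowdown(1.0, 30.0) is 31 and slowdown(9.0, 30.0) is about 4.3. If new attempts keep arriving at the same rate, the number of attempts in progress at any moment grows by roughly the same factor, which is how a queue that normally stays short could fill up.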

/That/ is the type of thing that can cause normally well behaved systems
to fall over.

Carlos E.R.

Mar 27, 2020, 4:56:07 PM
On 27/03/2020 18.15, navi...@gmail.com wrote:
> On Thursday, March 26, 2020 at 4:59:51 PM UTC-5, Grant Taylor wrote:
>> On 3/26/20 8:05 AM, navi...@gmail.com wrote:

...

> I am not active in forum discussions. Believe it or not, this may be just my 3rd or 4th post. Anyway, I get the point: I should have started a new thread. I will keep that in mind next time.

Thank you. Just for your illustration ;-), this is not a web forum,
though. This is Usenet, an ancient text mode method of communication
which Google displays as if it were a web forum.

--
Cheers, Carlos.

Grant Taylor

Mar 27, 2020, 5:52:29 PM
On 3/27/20 2:52 PM, Carlos E.R. wrote:
> This is Usenet, an ancient text mode method of communication

Point of order: The age is immaterial to its value.

> which Google displays as if it were a web forum.

Google groups is just the latest in a long line of things that Usenet is
gatewayed to.

· mailing lists
· email
· private BBS boards
· public BBS boards (exchanged through BBS network)

Carlos E.R.

Mar 27, 2020, 6:24:07 PM
On 27/03/2020 22.52, Grant Taylor wrote:
> On 3/27/20 2:52 PM, Carlos E.R. wrote:
>> This is Usenet, an ancient text mode method of communication
>
> Point of order: The age is immaterial to its value.

I know :-)

>> which Google displays as if it were a web forum.
>
> Google groups is just the latest in a long line of things that Usenet is
> gatewayed to.
>
>  · mailing lists
>  · email
>  · private BBS boards
>  · public BBS boards (exchanged through BBS network)

Fidonet


--
Cheers, Carlos.

Grant Taylor

Mar 27, 2020, 6:59:16 PM
On 3/27/20 4:23 PM, Carlos E.R. wrote:
> Fidonet

That's one of the technologies I was referring to with "BBS network".

I believe there were other FidoNet Technology Networks besides FidoNet
as well as other non-FidoNet protocols.

I recently heard that some (predominantly Unix based) BBSs used UUCP.

Ted Heise

Mar 28, 2020, 8:44:07 AM
On Fri, 27 Mar 2020 16:59:14 -0600,
Grant Taylor <gta...@tnetconsulting.net> wrote:
> On 3/27/20 4:23 PM, Carlos E.R. wrote:
> > Fidonet
>
> That's one of the technologies I was referring to with "BBS
> network".

Yup. Knew what you meant here.

--
Ted Heise <the...@panix.com> West Lafayette, IN, USA

Carlos E.R.

Mar 28, 2020, 5:28:07 PM
On 27/03/2020 23.59, Grant Taylor wrote:
> On 3/27/20 4:23 PM, Carlos E.R. wrote:
>> Fidonet
>
> That's one of the technologies I was referring to with "BBS network".
>
> I believe there were other FidoNet Technology Networks besides FidoNet
> as well as other non-FidoNet protocols.

Over here, I did not see any other.

>
> I recently heard that some (predominantly Unix based) BBSs used UUCP.
>
>
>


--
Cheers, Carlos.

Grant Taylor

Mar 28, 2020, 5:43:40 PM
On 3/28/20 3:24 PM, Carlos E.R. wrote:
> Over here, I did not see any other.

I've seen / heard tell of other FTNs. I think they were small
communities of boards that didn't link to the larger FidoNet (proper).

Much the same way it's possible to have multiple private NNTP servers
without actually connecting to Usenet.

I don't know how common it was.

Carlos E.R.

Mar 29, 2020, 12:56:09 AM
I knew of separate networks that used Fidonet protocols and software,
but not connected to it. But I did not know, in my country, of different
protocols to form a BBS network - except perhaps to connect a very small
number of BBS, like two or three that had some areas in common.

Also I heard, this century, that Fidonet software was used and developed
in Russia, at least for some years.

--
Cheers, Carlos.

David W. Hodgins

Mar 29, 2020, 1:29:08 AM
On Sun, 29 Mar 2020 00:53:30 -0400, Carlos E.R. <robin_...@es.invalid> wrote:

> I knew of separate networks that used Fidonet protocols and software,
> but not connected to it. But I did not know, in my country, of different
> protocols to form a BBS network - except perhaps to connect a very small
> number of BBS, like two or three that had some areas in common.

The three bbs systems I was using starting back in the mid 80's were
https://en.wikipedia.org/wiki/FidoNet
https://en.wikipedia.org/wiki/RelayNet
and cannet, which I can no longer find any info about. It was a Toronto BBS
(canrem.com) that developed its own network software, and was one of the first
BBS systems to also connect to the internet for usenet, archie, etc. Canrem.com
was my first exposure to those networks.

> Also I heard, this century, that Fidonet software was used and developed
> in Russia, at least for some years.

It was developed in North America, and then spread worldwide.

Regards, Dave Hodgins

--
Change dwho...@nomail.afraid.org to davidw...@teksavvy.com for
email replies.

Carlos E.R.

Mar 29, 2020, 9:16:07 AM
On 29/03/2020 07.28, David W. Hodgins wrote:
> On Sun, 29 Mar 2020 00:53:30 -0400, Carlos E.R.
> <robin_...@es.invalid> wrote:
>
>> I knew of separate networks that used Fidonet protocols and software,
>> but not connected to it. But I did not know, in my country, of different
>> protocols to form a BBS network - except perhaps to connect a very small
>> number of BBS, like two or three that had some areas in common.
>
> The three bbs systems I was using starting back in the mid 80's were
> https://en.wikipedia.org/wiki/FidoNet
> https://en.wikipedia.org/wiki/RelayNet
> and cannet, which I can no longer find any info about. It was a Toronto
> BBS (canrem.com) that developed its own network software, and was one
> of the first BBS systems to also connect to the internet for usenet,
> archie, etc. Canrem.com was my first exposure to those networks.

Right. But not over here, it was rare.

>
>> Also I heard, this century, that Fidonet software was used and developed
>> in Russia, at least for some years.
>
> It was developed in North America, and then spread worldwide.

I know. But at some point usage dwindled and stopped, and development
was abandoned. And, this century, development was taken over in Russia.
At least, it happened to the software I used.

Example:

<https://en.wikipedia.org/wiki/Golded>

«By 2007 GoldED+ was the most popular cross-platform message editor in
Russian-speaking FidoNet.»

If you look on github, you see Russian names.


--
Cheers, Carlos.

Grant Taylor

Mar 29, 2020, 12:53:39 PM
On 3/28/20 10:53 PM, Carlos E.R. wrote:
> I knew of separate networks that used Fidonet protocols and software,
> but not connected to it. But I did not know, in my country, of different
> protocols to form a BBS network …

Agreed.

> … except perhaps to connect a very small number of BBS, like two or
> three that had some areas in common.

I suspect that type of thing happened in a number of places on a very small
scale. There needs to be a good reason to not re-use FTN if both boards
support it. The only reason that I can think of is if one of the boards
does not support FTN and something else must be used between them.

> Also I heard, this century, that Fidonet software was used and developed
> in Russia, at least for some years.

I heard it described this way: Russians are the predominant remaining users of
FidoNet (proper) and / or FTNs. The subtext was that it's a
communications mechanism that survives that doesn't /depend/ on the
Internet. Since the Russians are the predominant users, they are the
logical place for most support to come from, ergo development.

Carlos E.R.

Mar 30, 2020, 5:04:09 PM
Yes.

Maybe it is more resilient to eavesdropping, unless they hack one of the
nodes. Maybe they have added cryptography to it :-?


--
Cheers, Carlos.

Grant Taylor

Mar 30, 2020, 8:46:36 PM
On 3/30/20 3:00 PM, Carlos E.R. wrote:
> Maybe it is more resilient to eavesdropping, unless they hack one of the
> nodes. Maybe they have added cryptography to it :-?

I haven't heard any indication that cryptography was part of it. But my
ignorance of such doesn't preclude it from possibly existing.

I speculate that point-to-point dial up connections are more problematic
to snoop on than Internet traffic.

I'm sure that it can be done, but someone has to want to do it, whereas
snooping on Internet traffic is old hat.

Carlos E.R.

Apr 1, 2020, 5:24:09 AM
Oh, snooping on telephone is easy if you have the power to do it.

If the network is digital (I know that parts of the old Soviet Union were
still analog in the year 2000, but I have no idea about now), it is easy.
Basically, you just tell a processor to copy a stream of incoming bytes
to another position in the memory map besides the one of the destination
phone. No one outside finds out, there are no clicks to hear, and the
copy is bit perfect.

Now, once you have the bit stream you need a digital signal processor to
analyze the bit stream to recreate the equivalent modem signal and
decode the original digital stream - without the recourse to negotiate
anything; the stream is unidirectional, listen only. I'm sure any
intelligence agency or police body has that capability.

The only real tricky part is having the power to snoop.

If not, then the alternative is a physical wire tap at the line, but
this might be detected. Maybe not if it is high impedance.


Not that I have ever done it :-D

--
Cheers, Carlos.

Grant Taylor

Apr 1, 2020, 12:03:14 PM
On 3/31/20 9:52 PM, Carlos E.R. wrote:
> Oh, snooping on telephone is easy if you have the power to do it.

I agree with everything that you've said.

My comment is that the access required to do what you have described is
significantly higher and there are correspondingly fewer people that can
do that, especially compared with people doing the functional
equivalent at the IP layer.

Carlos E.R.

Apr 1, 2020, 4:28:09 PM
Yes, that is so.

--
Cheers, Carlos.