Python: How are multiple server processes listening on the same port/fd?


Amit Saha

Sep 4, 2017, 8:58:40 AM
to grpc.io
Hey all,

This is the relevant part of my server:

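import time
from concurrent import futures

import grpc

# users_service and UsersService are defined elsewhere in the application.
_ONE_DAY_IN_SECONDS = 60 * 60 * 24

def serve():
  server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
  users_service.add_UsersServicer_to_server(UsersService(), server)
  res = server.add_insecure_port('127.0.0.1:50051')
  server.start()
  try:
    # start() does not block, so keep the main thread alive.
    while True:
      time.sleep(_ONE_DAY_IN_SECONDS)
  except KeyboardInterrupt:
    server.stop(0)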


On OS X, I mistakenly started the server twice in different terminal sessions and expected the second start to fail (address already bound, etc.). However, it didn't error out, and this is what I see via lsof:

$ sudo lsof -iTCP:50051

COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
Python  12120 amit    4u  IPv6 0xffbffb1742aa0f0f      0t0  TCP localhost:50051 (LISTEN)
Python  12157 amit    4u  IPv6 0xffbffb1742aa144f      0t0  TCP localhost:50051 (LISTEN)

Curiously enough, they both have the same FD. 

(I verified it with a Flask application to see if it was something Python-specific, and I do see the expected error message when I try to start a second instance of it.)

Thanks for any hints in advance.

Best Wishes,
Amit.

Ken Payson

Sep 4, 2017, 4:44:35 PM
to Amit Saha, grpc.io
gRPC Python sets the SO_REUSEADDR option on server sockets, which allows multiple servers to bind to the same port.

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/7bef9f00-c9fa-4cdc-91c7-c3555c177c9a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Amit Saha

Sep 4, 2017, 7:03:00 PM
to Ken Payson, grpc.io
On Tue, 5 Sep 2017 at 6:44 am, Ken Payson <kpa...@google.com> wrote:
> gRPC Python sets the SO_REUSEADDR option on server sockets, which allows multiple servers to bind to the same port.

Thanks. Is there any reason why this is set to be the default behavior?





Amit Saha

Sep 5, 2017, 9:08:00 AM
to Ken Payson, grpc.io
On Tue, Sep 5, 2017 at 9:02 AM Amit Saha <amits...@gmail.com> wrote:
> Thanks. Is there any reason why this is set to be the default behavior?

Searching around, I can see that this *may* be desired behavior, so gRPC has made a pragmatic choice. However, the option seems most useful when an existing socket is in the TIME_WAIT state and we want a new server process to bind to the same address and port. That leaves me with two questions:

1. This is not the case here: both of my servers are in the LISTEN state.

2. Next, considering (1), does it not introduce a race condition when more than one process is listening on the same address and port? (Let's say that, for whatever reason, a server process is already running and we start another one unaware, since we don't get an error.)

Will appreciate any insights/pointers.

Thanks,
Amit.

Ken Payson

Sep 5, 2017, 3:40:02 PM
to Amit Saha, grpc.io
On Tue, Sep 5, 2017 at 6:07 AM, Amit Saha <amits...@gmail.com> wrote:
> Searching around, I can see that this *may* be desired behavior and hence gRPC has made a pragmatic choice. However, it seems to be most useful in a scenario where an existing socket is in the TIME_WAIT state and we want a new server process to bind to the same addr/port. However, two questions:
>
> 1. This is not the case here - both of my servers are in LISTEN

I think you are referring to the SO_REUSEPORT option. The SO_REUSEADDR is different, and is intended for having multiple processes bind to the same port. One advantage of this is that you can scale by having multiple processes serving requests.

> 2. Next, considering (1), does it not introduce a race condition when we have more than one process listening on the same socket? (let's say for whatever reason, a server process is already running and we have started another unaware, since we don't get an error).

If you want to avoid this behavior, you can disable it by passing a server option:
grpc.server(thread_pool, options=(('grpc.so_reuseport', 0),))

Amit Saha

Sep 6, 2017, 4:26:33 AM
to Ken Payson, grpc.io
On Wed, 6 Sep 2017 at 5:40 am, Ken Payson <kpa...@google.com> wrote:
> I think you are referring to the SO_REUSEPORT option. The SO_REUSEADDR is different, and is intended for having multiple processes bind to the same port. One advantage of this is that you can scale by having multiple processes serving requests.

Sorry, but everything I have read suggests that the behavior you describe belongs to SO_REUSEPORT, not SO_REUSEADDR. I will definitely look into it more, but if you have a handy reference you can share, that would be great.



>> 2. Next, considering (1), does it not introduce a race condition when we have more than one process listening on the same socket? (let's say for whatever reason, a server process is already running and we have started another unaware, since we don't get an error).
>
> If you want to avoid this behavior, you can disable it by passing a server option:
> grpc.server(thread_pool, options=(('grpc.so_reuseport', 0),))

Thanks.



Amit Saha

Sep 6, 2017, 10:32:58 PM
to Ken Payson, grpc.io
On Wed, Sep 6, 2017 at 6:26 PM Amit Saha <amits...@gmail.com> wrote:
> Sorry, but whatever I read seems to suggest the behavior you mention for SO_REUSEPORT and not SO_REUSEADDR. I will definitely look more, but if you have a handy reference you can share, that will be great.

I switched to Linux for my experiments this time. Let's consider the server below:

import socket
import os

def start_server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('localhost', 5555))
    sock.listen(0)

    while True:
        connection, address = sock.accept()
        buf = connection.recv(64)
        if len(buf) > 0:
            print(os.getpid())


if __name__ == '__main__':
    start_server()

Start the first instance:

$ lsof -i TCP:5555
COMMAND   PID  USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
python  10973 asaha    3u  IPv4 11332922      0t0  TCP localhost:5555 (LISTEN)


If I try to start a second instance of the server, I get:

Traceback (most recent call last):
  File "server.py", line 19, in <module>
    start_server()
  File "server.py", line 7, in start_server
    sock.bind(('localhost', 5555))
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use

Now if I change the server as follows to use SO_REUSEPORT:

import socket
import os


def start_server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(('localhost', 5555))
    sock.listen(0)

    while True:
        connection, address = sock.accept()
        buf = connection.recv(64)
        if len(buf) > 0:
            print(os.getpid())


if __name__ == '__main__':
    start_server()


I can start two server processes and I see both instances serving client requests. So that tells me that SO_REUSEADDR doesn't allow a second process to LISTEN on a port when another one already is, whereas SO_REUSEPORT does.
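
For anyone who wants to reproduce the comparison without two terminals, here is a minimal Python 3 sketch (not the code from the experiment above). It binds two listening sockets to the same port inside one process, with the same option set on both, and reports whether the second bind succeeded. The helper name `second_bind_succeeds` is made up for illustration, and the SO_REUSEPORT case assumes a platform that defines `socket.SO_REUSEPORT` (for example Linux 3.9+ or OS X).

```python
import errno
import socket


def second_bind_succeeds(option):
    """Bind two listening sockets to one port with `option` set on both.

    Returns True if the second bind succeeds, False if it fails with
    EADDRINUSE. Any other error is re-raised.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.setsockopt(socket.SOL_SOCKET, option, 1)
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]

        second.setsockopt(socket.SOL_SOCKET, option, 1)
        try:
            second.bind(("127.0.0.1", port))
            second.listen(1)
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE:
                raise
            return False
        return True
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("SO_REUSEADDR:", second_bind_succeeds(socket.SO_REUSEADDR))
    if hasattr(socket, "SO_REUSEPORT"):
        print("SO_REUSEPORT:", second_bind_succeeds(socket.SO_REUSEPORT))
```

On Linux I would expect the SO_REUSEADDR case to report a failed second bind and the SO_REUSEPORT case to report success, which matches the two-process experiment.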

Now, let's get back to my original gRPC server. When I try to start a second instance of the server, I get this on Linux:

E0907 12:28:57.205046525   16071 server_chttp2.c:53]         {"created":"@1504751337.205028841","description":"No address added out of total 1 resolved","file":"src/core/ext/transport/chttp2/server/chttp2_server.c","file_line":260,"referenced_errors":[{"created":"@1504751337.205026361","description":"Unable to configure socket","fd":3,"file":"src/core/lib/iomgr/tcp_server_utils_posix_common.c","file_line":215,"referenced_errors":[{"created":"@1504751337.205023798","description":"OS Error","errno":98,"file":"src/core/lib/iomgr/tcp_server_utils_posix_common.c","file_line":188,"os_error":"Address already in use","syscall":"bind"}]}]}

Much better, this is exactly what I expected. So, this tells me that the behaviour of SO_REUSEADDR is "different" on OS X?

FWIW, I found https://github.com/veithen/knetstat useful to be able to see the socket options set.
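
knetstat inspects sockets from the kernel side, including those of other processes. For a socket that your own code created, a rough equivalent is `getsockopt`. The following is a minimal Python 3 sketch; the helper name `reuse_flags` is hypothetical, and it cannot see the sockets that gRPC creates internally.

```python
import socket


def reuse_flags(sock):
    """Return which reuse-related options are currently set on `sock`."""
    flags = {}
    for name in ("SO_REUSEADDR", "SO_REUSEPORT"):
        # SO_REUSEPORT is not defined on every platform, so look it up lazily.
        option = getattr(socket, name, None)
        if option is not None:
            flags[name] = bool(sock.getsockopt(socket.SOL_SOCKET, option))
    return flags


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        print(reuse_flags(sock))
```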


Amit Saha

Sep 6, 2017, 10:34:43 PM
to Ken Payson, grpc.io
That is, only for gRPC's Python server (since I *did* get the error with my server file above on OS X).

Amit Saha

Oct 7, 2017, 8:13:43 AM
to grpc.io

Ken Payson

Oct 10, 2017, 1:58:00 PM
to grpc.io
The reason why the SO_REUSEPORT behavior is seen on OS X but not on Linux has to do with the binary packaging of gRPC Python.

gRPC Python binary packages for Linux are built against a "manylinux" platform that aims to be compatible with most Linux distributions. As a result, they are compiled without SO_REUSEPORT support.

gRPC Python binary packages for OS X are compiled with SO_REUSEPORT support.

If you need SO_REUSEPORT support for gRPC Python on Linux, you can build from source. If you would like to disable the behavior on OS X, you can use the following server option:

grpc.server(thread_pool, options=(('grpc.so_reuseport', 0),))
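
If the concern is starting a second server by accident, another guard is to check for an existing listener before starting the server. Below is a minimal sketch, assuming Python 3 and only the standard library. `port_is_listening` is a made-up helper, not part of gRPC, and the check is inherently racy (another process could bind between the probe and the start), so it only protects against mistakes and is no substitute for disabling the option.

```python
import socket


def port_is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0
```

A server script could call `port_is_listening('127.0.0.1', 50051)` first and exit with an error message if it returns True.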