#test1.py:-----------
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = 'localhost'
port = 5052 #server port
s.connect((host, port))
print s.getsockname()
response = []
while 1:
    piece = s.recv(1024)
    if piece == '':
        break
    response.append(piece)
#test3.py:----------------
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = 'localhost'
port = 5052 #server port
s.connect((host, port))
print s.getsockname()
response = []
while 1:
    piece = s.recv(1024)
    if piece == '':
        break
    response.append(piece)
and this basic server:
#test2.py:--------------
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = ''
port = 5052
s.bind((host, port))
s.listen(5)
while 1:
    newsock, client_addr = s.accept()
    print "original socket:", s.getsockname()
    print "new socket, self:", newsock.getsockname()
    print "new socket, peer:", newsock.getpeername()
    print
I started the server, and then I started the clients one by one. I
expected both clients to hang since they don't get notified that the
server is done sending data, and I expected the server output to show
that accept() created two new sockets. But this is the output I got
from the server:
original socket: ('0.0.0.0', 5052)
new socket, self: ('127.0.0.1', 5052)
new socket, peer: ('127.0.0.1', 50816)
original socket: ('0.0.0.0', 5052)
new socket, self: ('127.0.0.1', 5052)
new socket, peer: ('127.0.0.1', 50818)
The first client I started generated this output:
('127.0.0.1', 50816)
And when I ran the second client, the first client disconnected, and
the second client produced this output:
('127.0.0.1', 50818)
and then the second client hung. I expected the server output to be
something like this:
original socket: ('127.0.0.1', 5052)
new socket, self: ('127.0.0.1', 5053)
new socket, peer: ('127.0.0.1', 50816)
original socket: ('0.0.0.0', 5052)
new socket, self: ('127.0.0.1', 5054)
new socket, peer: ('127.0.0.1', 50818)
And I expected both clients to hang. Can someone explain how accept()
works?
I guess (but I did not try it) that the problem is not accept(), which
should work as you expect, but the fact that on the second connection
your code throws away the first connection by reusing the same
variables without storing the previous values. This could make the
Python garbage collector free the socket object created for the first
connection, thereby closing the connection.
If I'm right, your program should work as you expect if you, for
instance, collect the sockets returned by accept() in a list.
Ciao
----
FB
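FB's guess can be checked in isolation. The following is only a sketch, written for Python 3 on CPython (an assumption; the thread's code is Python 2). It relies on CPython's reference counting freeing the socket object as soon as the last reference goes away, which other interpreters do not promise. The function name and the use of port 0 (let the kernel pick a free port) are illustrative choices, not part of the original programs.

```python
import socket


def dropped_reference_closes_connection():
    """Return what the client reads after the server drops its only
    reference to the accepted socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: the kernel picks a free port
    listener.listen(5)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)              # fail rather than hang if the guess is wrong
    client.connect(listener.getsockname())

    newsock, client_addr = listener.accept()
    # Drop the only reference, as rebinding `newsock` on the next loop
    # iteration would. On CPython the object is freed at once, and freeing
    # it closes the underlying descriptor.
    del newsock

    data = client.recv(1024)          # an empty result means the peer closed
    client.close()
    listener.close()
    return data


if __name__ == "__main__":
    print(repr(dropped_reference_closes_connection()))
```

An empty result from recv() is the same condition the original clients test for before breaking out of their loop, which would explain why the first client exited when the second one connected.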
The question I'm really trying to answer is: if a client connects to a
host at a specific port, but the server changes the port when it
creates a new socket with accept(), how does data sent by the client
arrive at the correct port? Won't the client be sending data to the
original port e.g. port 5052 in the client code above?
Yes, you are right about that. This code prevents the first client
from disconnecting:
newsocks = []
client_addys = []
while 1:
    newsock, client_addr = s.accept()
    newsocks.append(newsock)
    client_addys.append(client_addr)
    print "original socket:", s.getsockname()
    print "new socket, self:", newsock.getsockname()
    print "new socket, peer:", newsock.getpeername()
    print
If I change the clients to this:
import socket
import time
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = 'localhost'
port = 5052 #server port
print s.getsockname() #<------------NEW LINE
s.connect((host, port))
print s.getsockname()
response = []
while 1:
    piece = s.recv(1024)
    if piece == '':
        break
    response.append(piece)
Then I get output from the clients like this:
('0.0.0.0', 0)
('127.0.0.1', 51439)
('0.0.0.0', 0)
('127.0.0.1', 51440)
The server port 5052 (i.e. the one used in connect()) is not listed
there. That output indicates that the client socket is initially
created with some placeholder values, i.e. 0.0.0.0, 0. Then accept()
apparently sends a message back to the client that in effect says,
"Hey, in the future send me data on port 51439." Then the client
fills in that port along with the ip address in its socket object.
Thereafter, any data sent using that socket is sent to that port and
that ip address.
I'm not an expert, and have never used TCP/IP below the socket
abstraction level, but I imagine that after accept(), the client side
of the connection is somehow 'rewired' to the new socket created on
the server side.
Anyhow, this is not Python-related, since the socket C library behaves
in exactly the same way.
Ciao
-----
FB
Implicit in that description is that the client must get assigned a
port number before the call to connect(), or maybe the call to
connect() assigns a port number to the client. In any case, the
server has to receive both the client's ip address and port number as
part of the client's request for a connection in order for the server
to know where to send the response. The server then uses that ip
address and port number to send back a message to the client that
tells the client what port number future communications need to be
sent to.
> The question I'm really trying to answer is: if a client connects to a
> host at a specific port, but the server changes the port when it
> creates a new socket with accept(), how does data sent by the client
> arrive at the correct port? Won't the client be sending data to the
> original port e.g. port 5052 in the client code above?
The answer is that the server *doesn't* change its port. As you
could see in the output of your server, the socket that accept()
returned also had local port 5052. Each *client* will however
get a unique local port at *its* end.
A TCP connection is identified by a four-tuple:
( localaddr, localport, remoteaddr, remoteport )
Note that what is local and what is remote is relative to which
process you are looking from. If the four-tuple for a specific
TCP connection is ( 127.0.0.1, 5052, 127.0.0.1, 50816 ) in your
server, it will be ( 127.0.0.1, 50816, 127.0.0.1, 5052 ) in the
client for the very same TCP connection.
Since your client hasn't bound its socket to a specific port, the
kernel will choose a local port for you when you do a connect().
The chosen port will be more or less random, but it will make
sure that the four-tuple identifying the TCP connection will be
unique.
--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"There are many causes worth dying for, but ! bellman @ lysator.liu.se
none worth killing for." -- Gandhi ! Make Love -- Nicht Wahr!
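The mirrored four-tuples Thomas describes can be observed directly. This is a small sketch for Python 3 (an assumption; the thread uses Python 2) that runs both ends in one process. The function name is made up for illustration, and port 0 is used so the kernel picks a free listening port instead of the 5052 from the thread.

```python
import socket


def connection_tuples():
    """Return the (localaddr, localport, remoteaddr, remoteport) tuple
    as seen from the server end and from the client end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The client never calls bind(); the kernel picks its local port here.
    client.connect(listener.getsockname())
    conn, _ = listener.accept()

    server_view = conn.getsockname() + conn.getpeername()
    client_view = client.getsockname() + client.getpeername()

    for s in (conn, client, listener):
        s.close()
    return server_view, client_view


if __name__ == "__main__":
    server_view, client_view = connection_tuples()
    print("server:", server_view)
    print("client:", client_view)
```

The two tuples should hold the same four values with the local and remote halves swapped.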
You seem to be describing what I see:
----server output-----
original socket: ('0.0.0.0', 5053)
new socket, self: ('127.0.0.1', 5053)
new socket, peer: ('127.0.0.1', 49302)
original socket: ('0.0.0.0', 5053)
new socket, self: ('127.0.0.1', 5053)
new socket, peer: ('127.0.0.1', 49303)
---client1 output-----
('0.0.0.0', 0)
('127.0.0.1', 49302)
---client2 output-----
('0.0.0.0', 0)
('127.0.0.1', 49303)
But your claim that the server doesn't change its port flies in the
face of every description I've read about TCP connections and
accept(). The articles and books I've read all claim that the server
port 5053 is a 'listening' port only. Thereafter, when a client sends
a request for a connection to the listening port, the accept() call on
the server creates a new socket for communication between the client
and server, and then the server goes back to listening on the original
socket. Here are two sources for that claim:
Socket Programming How To:
http://www.amk.ca/python/howto/sockets/
Tutorial on Network Programming with Python:
http://heather.cs.ucdavis.edu/~matloff/Python/PyNet.pdf
In either case, there are still some things about the output that
don't make sense to me. Why does the server initially report that its
ip address is 0.0.0.0:
original socket: ('0.0.0.0', 5053)
I would expect the reported ip address to be '127.0.0.1'. Also, since
a socket is uniquely identified by an ip address and port number, then
the ('0.0.0.0', 5053) socket is not the same as this socket:
new socket, self: ('127.0.0.1', 5053)
> But your claim that the server doesn't change its port flies in the
> face of every description I've read about TCP connections and
> accept().
Then the descriptions are wrong.
> The articles and books I've read all claim that the server
> port 5053 is a 'listening' port only.
Not true.
> Thereafter, when a client sends a request for a connection to
> the listening port, the accept() call on the server creates a
> new socket for communication between the client and server,
True. But, it doesn't change the local port number.
Both the listening socket and the connected socket are using
local port number 5053.
> and then the server goes back to listening on the original
> socket.
That's true.
> I would expect the reported ip address to be '127.0.0.1'.
> Also, since a socket is uniquely identified by an ip address
> and port number,
It isn't.
1) You seem to be conflating sockets and TCP connections. A
socket is a kernel-space data structure used to provide a
user-space API to the network stack. In user-space it's
identified by an integer index into a per-process table of
file-like-objects. That socket may or may not have a TCP
connection associated with it. It may or may not be bound
to an IP address and/or port. It is not uniquely identified
by an IP address and port number.
2) A TCP connection is a _different_ thing (though it also
corresponds to a kernel-space data structure), and as Thomas
said, it is uniquely identified by a four-tuple:
(localaddr, localport, remoteaddr, remoteport)
[Technically, it's probably a 5-tuple with the above
elements along with a 'connection type' element, but since
we're only discussing TCP in this thread, we can ignore the
connection type axis and only consider the 4-axis space of
TCP connections.]
When a second client connects to the server on port 5053,
the first two elements in the tuple will be the same. One
or both of the last two elements will be different.
> then the ('0.0.0.0', 5053) socket is not the same as this
> socket:
>
> new socket, self: ('127.0.0.1', 5053)
Referring to sockets using that notation doesn't really make
sense. There can be more than one socket associated with the
local address ('127.0.0.1', 5053) or to any other ip/port tuple
you'd like to pick.
--
Grant Edwards grante Yow! YOU PICKED KARL
at MALDEN'S NOSE!!
visi.com
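Grant's distinction between a socket (a descriptor in the process) and the address/port it is bound to can be illustrated with a short sketch. It assumes Python 3 and uses a made-up function name. It compares the listening socket with the socket returned by accept(): same local port, different descriptors.

```python
import socket


def listener_and_accepted():
    """Compare the listening socket with the socket returned by accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: the kernel picks a free port
    listener.listen(5)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()

    result = {
        "listener_port": listener.getsockname()[1],
        "accepted_port": conn.getsockname()[1],
        "listener_fd": listener.fileno(),   # index into the per-process table
        "accepted_fd": conn.fileno(),
    }
    for s in (conn, client, listener):
        s.close()
    return result


if __name__ == "__main__":
    print(listener_and_accepted())
```

Two distinct descriptors reporting the same local port is exactly why an (address, port) pair cannot serve as the identity of a socket.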
Because you called "bind" with None (or '' ?) as its first argument; that
means: "listen on any available interface"
> I would expect the reported ip address to be '127.0.0.1'. Also, since
> a socket is uniquely identified by an ip address and port number, then
> the ('0.0.0.0', 5053) socket is not the same as this socket:
>
> new socket, self: ('127.0.0.1', 5053)
You got this *after* a connection was made, coming from your own PC.
127.0.0.1 is your "local" IP; the name "localhost" should resolve to that
number. If you have a LAN, try running the client on another PC. Or
connect to Internet and run the "netstat" command to see the connected
pairs.
--
Gabriel Genellina
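The difference between the wildcard address on the listening socket and the concrete address on the connected socket can be reproduced with a sketch like the one below. It assumes Python 3 and an IPv4 loopback interface at 127.0.0.1; the function name is hypothetical.

```python
import socket


def wildcard_addresses():
    """Return the local IP reported by a wildcard-bound listening socket
    and by a socket accepted from it over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("", 0))            # '' is the wildcard: not tied to one interface
    listener.listen(5)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    conn, _ = listener.accept()

    listening_ip = listener.getsockname()[0]
    connected_ip = conn.getsockname()[0]   # the address the client actually reached

    for s in (conn, client, listener):
        s.close()
    return listening_ip, connected_ip


if __name__ == "__main__":
    print(wildcard_addresses())
```

The listening socket has no single address to report, so it shows the wildcard; the accepted socket belongs to one specific connection and reports the address that connection arrived on.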
> But your claim that the server doesn't change its port flies in the
> face of every description I've read about TCP connections and
> accept(). The articles and books I've read all claim that the server
> port 5053 is a 'listening' port only. Thereafter, when a client sends
> a request for a connection to the listening port, the accept() call on
> the server creates a new socket for communication between the client
> and server, and then the server goes back to listening on the original
> socket.
You're confusing "port" and "socket".
A port is an external thing. It exists in the minds and hearts of packets
on the network, and in the RFCs which define the TCP protocol (UDP too, but
let's keep this simple).
A socket is an internal thing. It is a programming abstraction. Sockets
can exist that aren't bound to ports, and several different sockets can be
bound to the same port. Just like there can be multiple file descriptors
which are connected to a given file on disk.
The TCP protocol defines a state machine which determines what packets
should be sent in response when certain kinds of packets get received. The
protocol doesn't say how this state machine should be implemented (or even
demand that it be implemented at all). It only requires that a TCP host
behave in a way which the state machine defines.
In reality, whatever operating system you're running on almost certainly
implements in the kernel a state machine as described by TCP. That state
machine has two sides. On the outside is the network interface, which
receives and transmits packets. On the inside is the socket interface to
user-mode applications. The socket is just the API by which a user program
interacts with the kernel to get it to do the desired things on the network
interface(s).
Now, what the articles and books say is that there is a listening SOCKET.
And when you accept a connection on that socket (i.e. a TCP three-way
handshake is consummated on the network), the way the socket API deals with
that is to generate a NEW socket (via the accept system call). There
really isn't any physical object that either socket represents. They're
both just programming abstractions.
Does that help?
> On Mon, 25 Feb 2008 20:03:02 -0200, 7stud <bbxx78...@yahoo.com>
> wrote:
> > On Feb 25, 10:56 am, Thomas Bellman <bell...@lysator.liu.se> wrote:
> >> 7stud <bbxx789_0...@yahoo.com> wrote:
>
> > In either case, there are still some things about the output that
> > don't make sense to me. Why does the server initially report that its
> > ip address is 0.0.0.0:
> >
> > original socket: ('0.0.0.0', 5053)
>
> Because you called "bind" with None (or '' ?) as its first argument; that
> means: "listen on any available interface"
It really means, "Listen on ALL available interfaces".
The server disambiguates the packets when it demultiplexes the
connection packet streams by using the remote endpoint to differentiate
between packets that are part of different connections. TCP guarantees
that no two ephemeral port connections from the same client will use the
same port.
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
> TCP guarantees
> that no two ephemeral port connections from the same client will use the
> same port.
Where "client" is defined as "IP Address". You could certainly have a
remote machine that has multiple IP addresses using the same remote port
number on different IP addresses for simultaneous connections to the same
local port.
Correct.
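To see the point about ephemeral ports on a single client address, one can open several connections from the same IP to the same server port and compare the local ports the kernel hands out. This is only a sketch for Python 3; the function name and the choice of three connections are arbitrary.

```python
import socket


def client_ports(n=3):
    """Open n connections from one address to one server port and
    return the local port of each client socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    clients, accepted = [], []
    for _ in range(n):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect(listener.getsockname())
        clients.append(c)
        # Keep a reference to each accepted socket so the connection stays open.
        accepted.append(listener.accept()[0])

    ports = [c.getsockname()[1] for c in clients]

    for s in clients + accepted + [listener]:
        s.close()
    return ports


if __name__ == "__main__":
    print(client_ports())
```

Because the client address and the server endpoint are the same for every connection here, the client port is the only element left to tell the connections apart, so the ports come out distinct.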
---
When you surf the Web, say to http://www.google.com, your Web browser
is a client. The program you contact at Google is a server. When a
server is run, it sets up business at a certain port, say 80 in the
Web case. It then waits for clients to contact it. When a client does
so, the server will usually assign a new port, say 56399, specifically
for communication with that client, and then resume watching port 80
for new requests.
---
http://heather.cs.ucdavis.edu/~matloff/Python/PyNet.pdf
If two sockets are bound to the same host and port on the server, how
does data sent by the client get routed? Can both sockets recv() the
data?
> When you surf the Web, say to http://www.google.com, your Web browser
> is a client. The program you contact at Google is a server. When a
> server is run, it sets up business at a certain port, say 80 in the
> Web case. It then waits for clients to contact it. When a client does
> so, the server will usually assign a new port, say 56399, specifically
> for communication with that client, and then resume watching port 80
> for new requests.
Actually the client is the one that allocates a new port. All
connections to a server remain on the same port, the one it listens
on:
>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> s.bind(('127.0.0.1', 10000))
>>> s.listen(1)
>>> s.accept()
# now, connect to port 10000 from elsewhere
(<socket._socketobject object at 0xb7adf6f4>, ('127.0.0.1', 36345))
>>> s1 = _[0]
>>> s1
<socket._socketobject object at 0xb7adf6f4>
>>> s1.getsockname()
('127.0.0.1', 10000) # note the same port, not a different one
7stud wrote:
>
> If two sockets are bound to the same host and port on the server, how
> does data sent by the client get routed? Can both sockets recv() the
> data?
I have learned a lot of stuff I did not know before from this thread,
so I think I can answer that.
There must be a layer of software that listens at the port. When it
receives a packet, it can tell which client sent the packet (host and
port number). It uses that to look up which socket is handling that
particular connection, and passes it as input to that socket.
Therefore each socket only receives its own input, and is not aware of
anything received for any other connection.
Not a technical explanation, but I think it describes what happens.
Frank Millman
Note that the condition I mentioned earlier (with the caveat added by
Roy) ensures that while addr1 and addr2 might be the same, or p1 and p2
might be the same, they can *never* be the same together: if the TCP
layer at addr1 allocates port p1 to one client process, when another
client process asks for an ephemeral port TCP guarantees that it won't
be given p1, because that is already logged as in use by another process.
So, in Python terms that represents a guarantee that
(addr1, p1) != (addr2, p2)
and consequently (addr1, p1, addrS, pS) != (addr2, p2, addrS, pS)
Now, when a packet arrives at the server system addressed to the server
endpoint, the TCP layer (which maintains a record of *both* endpoints
for each connection) simply looks at the incoming address and port
number to determine which process, of the potentially many using (addrS,
pS), it needs to be delivered to.
If this isn't enough then you should really take this problem to a
TCP/IP group. It's pretty basic, and until you understand it you will
never make sense of TCP/IP client/server communications.
http://holdenweb.com/linuxworld/NetProg.pdf
might help, but I don't guarantee it.
regards
Steve
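The demultiplexing Steve describes can be watched from user space. In this sketch (Python 3 assumed; all names are illustrative) two clients connect to the same server port and send different data, and each accepted socket is looked up by the remote endpoint it reports. The kernel does the real work; the dictionary only mirrors the idea of keying connections by the remote address and port.

```python
import socket


def recv_exactly(sock, n):
    """Read n bytes, since one recv() may return less than was sent."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf


def demultiplex():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    messages = [b"from client one", b"from client two"]
    clients = []
    for _ in messages:
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect(listener.getsockname())
        clients.append(c)

    # Index each accepted socket by the remote (address, port) it talks to.
    accepted = {}
    for _ in clients:
        conn, peer = listener.accept()
        conn.settimeout(5)
        accepted[peer] = conn

    for c, msg in zip(clients, messages):
        c.sendall(msg)

    # A client's local endpoint is the matching server socket's remote endpoint.
    received = [recv_exactly(accepted[c.getsockname()], len(m))
                for c, m in zip(clients, messages)]

    for s in clients + list(accepted.values()) + [listener]:
        s.close()
    return messages, received


if __name__ == "__main__":
    print(demultiplex())
```

Both accepted sockets share the server's address and port, yet each one should see only the bytes sent by its own client.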
(Hey, I know you! ;) )
Right.
7stud, what you seem to be missing, and what I'm not sure if anyone has
clarified for you (I have only skimmed the thread), is that in TCP,
connections are uniquely identified by a /pair/ of sockets (where
"socket" here means an address/port tuple, not a file descriptor). It is
fine for many, many connections, using the same local port and IP
address, so long as the other end has either a different IP address _or_
a different port. There is no issue with lots of processes sharing the
same socket for various separate connections, because the /pair/ of
sockets is what identifies them. See the "Multiplexing" portion of
section 1.5 of the TCP spec (http://www.ietf.org/rfc/rfc0793.txt).
Reading some of what you've written elsewhere on this thread, you seem
to be confusing this address/port stuff with what accept() returns. This
is hardly surprising, as unfortunately, both things are called
"sockets": the former is called a socket in the various RFCs, the latter
is called a socket in documentation for the Berkeley sockets and similar
APIs. What accept() returns is a new file descriptor, but the local
address-and-port associated with this new thing is still the very same
ones that were used for listen(). All the incoming packets are still
directed at port 80 (say) of the local server by the remote client.
It's probably worth mentioning at this point that, while what I said
about many different processes all using the same local address/port
combination is true, in implementations of the standard Berkeley sockets
API the only way you'd _arrive_ at that situation is for all of those
connections with the same local address/port combination to have come
from the same listen() call (ignoring mild exceptions that involve one
server finishing up connections while another accepts new ones). That is
because, while one process has a socket
descriptor bound to a particular address/port, no other process is
allowed to bind to that combination. However, for listening sockets,
that one process is allowed to accept many connections on the same
address/port. It can handle all those connections itself, or it can fork
new processes, or it can pass these connected sockets down to
already-forked processes. But all of them continue to be bound to the
same local address-and-port.
Note that, if the server's port were to change arbitrarily for every
successful call to accept(), it would make it much more difficult to
filter and/or analyze TCP traffic. If you're running, say, tcpdump, the
knowledge that all the packets on a connection that was originally
directed at port 80 of google.com, will continue to go to port 80 at
google.com (even though there are many, many, many other connections out
there on the web from other machines that are all also directed at port
80 of google.com), is crucial to knowing which packets to watch for
while you're looking at the traffic.
--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
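Micah's remark that a second bind to an address/port already in use is refused can be tried with the sketch below. It assumes Python 3 and default socket options; options such as SO_REUSEADDR or SO_REUSEPORT can change the outcome on some systems, so treat the result as platform-dependent. The function name is made up.

```python
import socket


def second_bind_error():
    """Try to bind a second socket to an address/port that a listening
    socket already holds. Return the errno of the failure, or None if
    the operating system allowed the second bind."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))      # port 0: the kernel picks a free port
    first.listen(5)
    addr = first.getsockname()

    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(addr)             # same address, same port
    except OSError as exc:
        return exc.errno
    finally:
        second.close()
        first.close()
    return None


if __name__ == "__main__":
    print(second_bind_error())
```

On common systems with default options the call fails with an "address already in use" error, while any number of accepted connections can still share that same local address and port.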
> 7stud, what you seem to be missing, and what I'm not sure if anyone has
> clarified for you (I have only skimmed the thread), is that in TCP,
> connections are uniquely identified by a /pair/ of sockets (where
> "socket" here means an address/port tuple, not a file descriptor).
Using the word "socket" as a name for an address/port tuple is
precisely what's causing all the confusion. An address/port
tuple is simply not a socket from a python/Unix/C point of
view, and a socket is not an address/port tuple.
> It is fine for many, many connections, using the same local
> port and IP address, so long as the other end has either a
> different IP address _or_ a different port. There is no issue
> with lots of processes sharing the same socket for various
> separate connections, because the /pair/ of sockets is what
> identifies them. See the "Multiplexing" portion of section 1.5
> of the TCP spec (http://www.ietf.org/rfc/rfc0793.txt).
Exactly.
> Reading some of what you've written elsewhere on this thread,
> you seem to be confusing this address/port stuff with what
> accept() returns. This is hardly surprising, as unfortunately,
> both things are called "sockets": the former is called a
> socket in the various RFCs,
I must admit I wasn't familiar with that usage (or had forgotten
it).
--
Grant Edwards grante Yow! Look DEEP into the
at OPENINGS!! Do you see any
visi.com ELVES or EDSELS ... or a
HIGHBALL?? ...
> If two sockets are bound to the same host and port on the server, how
> does data sent by the client get routed? Can both sockets recv() the
> data?
Undefined.
You certainly won't find the answer in the RFCs which define the protocol
because sockets aren't part of the protocol.
Unfortunately, you won't find the answer in the Socket API documentation
either because the socket API documentation is pretty vague about most
stuff.
One possible answer is that the operating system won't let you bind two
sockets to the same (address, port) pair. But, another possibility is that
it will. And even if it won't, consider the case of a process which forks;
the child inherits the already bound socket from the parent.
So, either way, you're left with the question, what happens with two
sockets both bound to the same (address, port) pair? For the sake of
simplicity, I'm assuming UDP, so there's no connection 4-tuple to worry
about. The answer is, again, undefined. One reasonable answer is that
packets received by the operating system are doled out round-robin to all
the sockets bound to that port. Another is that they're duplicated and
delivered to all sockets. Anything is possible.
But, as other posters have said, this really isn't a Python question. This
is a networking API question. Python just gives you a very thin layer on
top of whatever the operating system gives you, and lets all the details of
the OS implementation quirks shine through.
> ---
> When you surf the Web, say to http://www.google.com, your Web browser
> is a client. The program you contact at Google is a server. When a
> server is run, it sets up business at a certain port, say 80 in the
> Web case. It then waits for clients to contact it. When a client does
> so, the server will usually assign a new port, say 56399, specifically
> for communication with that client, and then resume watching port 80
> for new requests.
> ---
> http://heather.cs.ucdavis.edu/~matloff/Python/PyNet.pdf
You should *not* trust all you find on the Net...
--
Gabriel Genellina
FWIW, the word was used to mean the address/port tuple (RFC 793) before
there was ever a python/Unix/C concept of "socket".
And I totally agree that it's confusing; but I submit that IETF has a
stronger claim over the term than Unix/C/Python, which could have just
stuck with "network descriptor" or some such. ;)
--
Didn't give it a thorough read, but I did see a section about the server
setting up a new socket, called a "connection socket".
Which isn't incorrect, but proves Grant's point rather well, that the
confusion is due to the overloaded term "socket". In that context, it's
speaking quite clearly of the "Python/C/Unix" concept of a "socket", and
not (as some other texts do) of the address/port combination.
To reiterate for 7stud, accepting a new "connection socket" does _not_
change the address or port from the originally bound "for-listening" socket.
I could claim I was innocently unaware of that usage, though I
have read the RFCs, so I'll go with Steve Martin's classic
excuse: "I forgot."
> And I totally agree that it's confusing; but I submit that
> IETF has a stronger claim over the term than Unix/C/Python,
> which could have just stuck with "network descriptor" or some
> such. ;)
They probably had to come up with a system call name that was
uniquely identified by six characters or something like that.
--
Grant Edwards grante Yow! Does someone from
at PEORIA have a SHORTER
visi.com ATTENTION span than me?