贾连晨
Apr 17, 2024, 9:02:53 AM
to Pantheon
Hello,
I'm trying to use the server's multiple CPU cores to run tests in parallel so that they finish faster. However, when I run the tests, they fail with the error "Failed to connect to tunnel server after 5 tries, exiting..", which suggests that the tunnel server and tunnel client cannot establish a connection.
To analyze the problem, I added some output statements in the code. In `tunnelservershell.cc`, before line 135 `send_wrapper_only_datagram( listening_socket, (uint64_t) -2 );`, I added `cerr<<"return "<<listening_socket.local_address().ip()<<" "<<listening_socket.local_address().port()<<endl;`. In `tunnelclientshell.cc`, at line 131 `send_wrapper_only_datagram( server_socket, (uint64_t) -1 );`, I added `cerr<<"send:"<<server_socket.local_address().ip()<<" "<<server_socket.local_address().port()<<endl;`.
With these statements in place, the output is as follows:
```
send:100.64.0.2 41649
Tunnelserver got connection from tunnelclient
return 100.64.0.1 57822
Tunnelclient got connection from tunnelserver at 100.64.0.1 41649
Tunnel is connected
[tsm] tunnel 2 python /home/jlc/disk/pantheon/src/wrappers/cubic.py receiver 45667 None
Tunnelclient received no response from tunnelserver, retrying 1/5
send:100.64.0.2 48302
Tunnelclient received no response from tunnelserver, retrying 2/5
send:100.64.0.2 48302
Tunnelclient received no response from tunnelserver, retrying 3/5
send:100.64.0.2 48302
[tcm] tunnel 1 python /home/jlc/disk/pantheon/src/wrappers/cubic.py sender 100.64.0.3 44859 None 1
[tcm] tunnel 2 python /home/jlc/disk/pantheon/src/wrappers/cubic.py sender 100.64.0.3 45667 None 2
Tunnelclient received no response from tunnelserver, retrying 4/5
send:100.64.0.2 48302
Tunnelclient received no response from tunnelserver, retrying 5/5
send:100.64.0.2 48302
Failed to connect to tunnel server after 5 tries, exiting..
Tunnel connection timeout
[tcm] tunnel 1 mm-tunnelclient $MAHIMAHI_BASE 36048 100.64.0.4 100.64.0.3 --ingress-log=/home/jlc/disk/pantheon/tmp/cubic_acklink_run1_flow1_uidcffa5765-98d2-4b87-a57d-ee6e05676611.log.ingress --egress-log=/home/jlc/disk/pantheon/tmp/cubic_datalink_run1_flow1_uidcffa5765-98d2-4b87-a57d-ee6e05676611.log.egress
[tcm] tunnel 1 readline
Tunnelclient listening for server on port 46617
send:100.64.0.2 46617
Tunnelclient received no response from tunnelserver, retrying 1/5
send:100.64.0.2 46617
Tunnelclient received no response from tunnelserver, retrying 2/5
send:100.64.0.2 46617
Tunnelclient received no response from tunnelserver, retrying 3/5
send:100.64.0.2 46617
Tunnelclient received no response from tunnelserver, retrying 4/5
send:100.64.0.2 46617
Tunnelclient received no response from tunnelserver, retrying 5/5
send:100.64.0.2 46617
Failed to connect to tunnel server after 5 tries, exiting..
Tunnel connection timeout
[tcm] tunnel 1 mm-tunnelclient $MAHIMAHI_BASE 36048 100.64.0.4 100.64.0.3 --ingress-log=/home/jlc/disk/pantheon/tmp/cubic_acklink_run1_flow1_uidcffa5765-98d2-4b87-a57d-ee6e05676611.log.ingress --egress-log=/home/jlc/disk/pantheon/tmp/cubic_datalink_run1_flow1_uidcffa5765-98d2-4b87-a57d-ee6e05676611.log.egress
[tcm] tunnel 1 readline
Tunnelclient listening for server on port 58701
send:100.64.0.2 58701
Tunnelclient received no response from tunnelserver, retrying 1/5
send:100.64.0.2 58701
Tunnelclient received no response from tunnelserver, retrying 2/5
send:100.64.0.2 58701
Tunnelclient received no response from tunnelserver, retrying 3/5
send:100.64.0.2 58701
```
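It looks like there is a port number error: the server prints its local port as 57822, but the client reports the tunnelserver at 100.64.0.1 41649, which is the same port the client itself sent from. I have attached the complete log and the `test_mp.py` file that reproduces this problem.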
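To make the port comparison easier to repeat on the full log, here is a minimal sketch of a parser for the debug output shown above. It assumes the exact line formats of my added `cerr` statements (`send:<ip> <port>` and `return <ip> <port>`) and of the "Tunnelclient got connection from tunnelserver at ..." message; the function names `summarize_ports` and `find_unmatched_reported_ports` are hypothetical and not part of Pantheon or mahimahi.

```python
import re

# Patterns for the three kinds of lines that carry a port number.
# These formats are assumed from the log excerpt in this post.
SEND_RE = re.compile(r"^send:(\S+) (\d+)$")
RETURN_RE = re.compile(r"^return (\S+) (\d+)$")
GOT_RE = re.compile(
    r"^Tunnelclient got connection from tunnelserver at (\S+) (\d+)$"
)


def summarize_ports(log_text):
    """Collect the ports printed by the client, the server, and the
    'got connection' message, in the order they appear in the log."""
    summary = {"client_ports": [], "server_ports": [], "reported_ports": []}
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        for pattern, key in (
            (SEND_RE, "client_ports"),
            (RETURN_RE, "server_ports"),
            (GOT_RE, "reported_ports"),
        ):
            match = pattern.match(line)
            if match:
                summary[key].append(int(match.group(2)))
                break
    return summary


def find_unmatched_reported_ports(summary):
    """Return ports the client reported for the server that the server
    never printed as its own local port."""
    known = set(summary["server_ports"])
    return [p for p in summary["reported_ports"] if p not in known]


if __name__ == "__main__":
    sample = """send:100.64.0.2 41649
Tunnelserver got connection from tunnelclient
return 100.64.0.1 57822
Tunnelclient got connection from tunnelserver at 100.64.0.1 41649
Tunnel is connected
"""
    result = summarize_ports(sample)
    print(result)
    print(find_unmatched_reported_ports(result))
```

Running it on the first four lines of the log flags port 41649, which the client reports for the server even though the server printed 57822. This only points at lines worth inspecting; it does not by itself show where the wrong port comes from.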