Bruce,
Thanks for your suggestions.
I've done more experiments and I think you are right that this is a
result of virtualization. Here's a recent research paper that discusses
this issue: http://www.ece.rice.edu/~pjv/mclock.pdf. It notes that none
of the current hypervisors try to make reservations, guarantees, or
limits on I/O.
I played around with ttcp and wrote a simple UDP client/server to test
this theory; it clearly shows that EC2 has high packet loss with very
wide variability (see Experiment 1 below).
Then I re-ran the original 6-PE test on three non-virtual nodes in a
lightly loaded network. With three machines I was able to achieve a
throughput of 10,000 messages/sec with 0.7% packet loss, and no packet
loss at 3,400 messages/sec.
All of this makes me think that you may be correct that TCP/IP is more
appropriate for virtual environments.
I've answered your other questions below.
Frank
===============================
Experiment 1: UDP packet loss comparison
* client sends 10,000 1 KB blocks of test data as fast as it can
* server listens for UDP requests, compares with the expected count, and
prints the error rate
Test 1:
* two large instances on amazon EC2, us-east-1b region
* repeatedly run tests in sequence
* nothing else running on these machines
Results:
* very inconsistent packet loss
./udp.py --hostname
ec2-72-44-54-70.compute-1.amazonaws.com -c client
--bufsize=1000 --count 10000 --delay 0
Expected 10000, got 4557, error rate 0.5443
Expected 10000, got 7588, error rate 0.2412
Expected 10000, got 7515, error rate 0.2485
Expected 10000, got 7104, error rate 0.2896
Expected 10000, got 592, error rate 0.9408
Expected 10000, got 5502, error rate 0.4498
Expected 10000, got 7727, error rate 0.2273
Expected 10000, got 7142, error rate 0.2858
Expected 10000, got 5796, error rate 0.4204
Expected 10000, got 5541, error rate 0.4459
Expected 10000, got 5810, error rate 0.419
Expected 10000, got 7314, error rate 0.2686
Test 2:
* two non-virtual Intel dual-core 64-bit 3 GHz "developer" machines
* quiet network
* low CPU utilization
* repeatedly run tests in sequence
Results:
* very little loss
Expected 10000, got 9999, error rate 0.0001
finished, count: 10000
Expected 10000, got 9999, error rate 0.0001
finished, count: 10000
finished, count: 10000
finished, count: 10000
finished, count: 10000
finished, count: 10000
Test 3:
* two dual-core AMD 2 GHz machines, circa 2008
* busy network
* server with low CPU utilization (~99% free)
* repeatedly run tests in sequence
Results:
* very consistent packet loss, ~17%
./udp.py -c server --hostname=10.19.136.80
listening on host 10.19.136.80, port 5077
Expected 10000, got 8152, error rate 0.1848
Expected 10000, got 8330, error rate 0.167
Expected 10000, got 8303, error rate 0.1697
Expected 10000, got 8310, error rate 0.169
Expected 10000, got 8308, error rate 0.1692
Expected 10000, got 8287, error rate 0.1713
Expected 10000, got 8285, error rate 0.1715
Expected 10000, got 8348, error rate 0.1652
Expected 10000, got 8070, error rate 0.193
Expected 10000, got 8119, error rate 0.1881
===============================
Answers to questions:
> - Was that single machine test run on an EC2 instance or a dedicated
> box?
A: The single-machine test ran on a dual-core Dell laptop running
WinXP, hosting a VirtualBox Ubuntu guest OS.
> - Try running the command "netstat -su" on each instance before the
> test and again after the test. The difference between the two "packet
> receive errors" counts will indicate your UDP packet loss due to full
> UDP buffer.
A: I did that and saw packet loss (see NETSTAT Test 1).
> - In your logs, check for the strings "lll_dr" and "pec_dr" anywhere
> in any of the logs (the two will never be on the same line). These
> lines have the count of messages that were successfully received by
> the listener thread but could not be used because some internal queue
> filled up. Or you could just send us your logs.
A: Yes, there was lost data (see NETSTAT Test 1).
> - The queue sizes would be interesting. Look for the strings "lll_qsz"
> and "pec_qsz" in the logs (again, the two will not be on the same
> line).
========================================================
NETSTAT Test 1:
* 1 EC2 instance
* 3,400 messages/sec
* 100,000 total messages
* queue size 8000
Before test: ---------------------------------
[ec2-user@ip-10-114-234-81 bin]$ netstat -su
IcmpMsg:
InType3: 8
OutType3: 2
Udp:
698702 packets received
2 packets to unknown port received.
13248 packet receive errors
757342 packets sent
RcvbufErrors: 13248
UdpLite:
IpExt:
InOctets: 348825372
OutOctets: 370021990
TEST RESULTS
PE 1 93843
PE 2 90368
PE 3 87267
PE 4 36025
PE 5 35032
PE 6 33880
TOTAL 376415, 15% loss
After test: ---------------------------------
[ec2-user@ip-10-114-234-81 bin]$ netstat -su
IcmpMsg:
InType3: 8
OutType3: 2
Udp:
1083037 packets received
2 packets to unknown port received.
21674 packet receive errors
1150104 packets sent
RcvbufErrors: 21674
UdpLite:
IpExt:
InOctets: 542270871
OutOctets: 564602787
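Diffing the two snapshots gives the loss attributable to full UDP
receive buffers during the test; this little sketch just subtracts the
counters quoted above:

```python
# Counters copied from the two "netstat -su" snapshots above
before = {"packets received": 698702,
          "packet receive errors": 13248,
          "packets sent": 757342}
after = {"packets received": 1083037,
         "packet receive errors": 21674,
         "packets sent": 1150104}

# Per-counter change over the test run
delta = {k: after[k] - before[k] for k in before}
for k in sorted(delta):
    print("%s: %d" % (k, delta[k]))
```

The receive-error delta (21674 - 13248 = 8426) is the number of
datagrams dropped because the UDP receive buffer was full.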
Error stats:
grep pec_dr ../s4_core/logs/s4_core/*
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:29,177 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_dr = 7890
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:29,177 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_dr = 7890
grep pec_qsz ../s4_core/logs/s4_core/*
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:29,179 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz = 6622
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:29,185 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 482
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:59,168 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz = 127
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:59,169 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 2958
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:13:29,166 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 0
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:29,179 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz = 6622
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:29,185 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 482
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:59,168 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz = 127
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:59,169 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 2958
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:13:29,166 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:pec_qsz_w = 0
grep lll_qsz ../s4_core/logs/s4_core/*
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:29,186 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:lll_qsz = 16
../s4_core/logs/s4_core/s4_core.log:2011-01-25 02:12:59,170 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:lll_qsz = 1
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:29,186 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:lll_qsz = 16
../s4_core/logs/s4_core/s4_core.mon:2011-01-25 02:12:59,170 monitor
INFO (Log4jMonitor.java:57) S4::S4CoreMetrics:lll_qsz = 1