I know your problem; see my inline reply.
On Wed, Dec 12, 2012 at 6:36 PM, Michael Chang <
mi...@tellapart.com> wrote:
> Hey Josiah.
>
> The %CPU looks okay.
>
> Through iftop, it looks like we're using about 25Mbps of network bandwidth.
> From the output of 'free', we have about 4GB of free memory (buffer+cache)
> It doesn't look like our cpus are being taxed. Is %iowait high here? We're
> making RDB snapshots every 5 minutes for each process here
>
> 02:15:01 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
> 02:25:01 AM  all   4.71   0.03     1.78    14.42    0.14  78.92
> 02:25:01 AM    0   0.74   0.13     1.29    32.34    0.41  65.09
> 02:25:01 AM    1   3.14   0.00     1.07     0.17    0.04  95.58
> 02:25:01 AM    2   6.84   0.00     2.22    11.88    0.06  79.01
> 02:25:01 AM    3   8.11   0.00     2.53    13.31    0.06  75.98
>
> We're running this on a m1.xlarge instance on ec2, with writing RDB files to
> an EBS volume, which I realize could be a point of contention. (how big
> though?)
EBS is remote data storage. Any data you write to your EBS disk gets
written over the network. Think of it like NFS. So any time your MySQL
reads or writes, it is doing a network read or write. Any time Redis
dumps to an RDB file, the data gets written over the network. If any of
your slaves loses sync and reconnects, a dump occurs (over the network),
which then gets read back over the network to sync to the slave
(unless you've got enough spare memory for the dump to stay in the page
cache, in which case the read is served from memory).
Check your free memory, buffers, and cache.
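If you'd rather script that check than eyeball 'free', here is a minimal
Python sketch. It assumes a Linux box where /proc/meminfo is readable;
the function names are mine, purely for illustration. It adds up free
memory, buffers, and page cache so you can compare the total against the
size of your RDB file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_plus_cache_kb(fields):
    """Free memory plus buffers and page cache, in kB."""
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling free_plus_cache_kb(read_meminfo()) on the Redis host gives a
rough upper bound on how much of a dump could stay cached in memory.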
My recommendations:
1. Use an Amazon RDS instance with replication to multiple
availability zones. Let them take care of maintaining MySQL (also add
daily snapshots, just in case).
2. Switch Redis to an instance-store-backed EC2 VM. Use one of the
filesystem watch applications to notice when a dump occurs and have it
trigger an upload to S3 (explicitly limit the upload bandwidth so that
it doesn't saturate your network). Remember to create a machine image
of the instance once you've gotten it set up right.
These two changes will not only improve your reliability and
resilience to EC2 outages, but also make your network hiccups go away.
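To make recommendation 2 concrete, here is a rough Python sketch, not a
finished tool. It polls the dump file's mtime instead of using a real
filesystem watcher (a simpler stand-in for the watch applications I
mentioned), and it caps the read rate so the upload can't saturate the
link. The function names and the `send` callback are hypothetical;
plug in whatever S3 client you use for the actual upload.

```python
import os
import time


def wait_for_new_dump(path, last_mtime, poll_seconds=1.0, timeout=None):
    """Poll until `path` has an mtime newer than `last_mtime`.

    Redis writes a snapshot to a temp file and renames it into place,
    so a new mtime on the final path indicates a completed dump.
    Returns the new mtime, or None if `timeout` seconds elapse first.
    """
    deadline = None if timeout is None else time.time() + timeout
    while True:
        try:
            mtime = os.stat(path).st_mtime
        except OSError:
            mtime = None  # file not there yet
        if mtime is not None and (last_mtime is None or mtime > last_mtime):
            return mtime
        if deadline is not None and time.time() >= deadline:
            return None
        time.sleep(poll_seconds)


def throttled_chunks(fileobj, max_bytes_per_sec, chunk_size=64 * 1024,
                     sleep=time.sleep, clock=time.time):
    """Yield chunks of `fileobj`, sleeping so the average rate stays under the cap."""
    start = clock()
    sent = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
        sent += len(chunk)
        # Earliest time at which `sent` bytes are allowed to have gone out.
        delay = start + sent / float(max_bytes_per_sec) - clock()
        if delay > 0:
            sleep(delay)


def upload_dumps_forever(path, send, max_bytes_per_sec):
    """Watch `path` and pass each new dump to `send(chunk)` at a capped rate.

    `send` is a hypothetical callback standing in for your S3 upload.
    """
    last_mtime = None
    while True:
        last_mtime = wait_for_new_dump(path, last_mtime)
        with open(path, "rb") as f:
            for chunk in throttled_chunks(f, max_bytes_per_sec):
                send(chunk)
```

Pick max_bytes_per_sec well below what the instance can push, and make
sure an upload finishes inside your 5-minute snapshot interval;
otherwise the next dump lands while the previous one is still going out.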
- Josiah
>
https://groups.google.com/d/msg/redis-db/-/RZtpVRcW8NoJ.