FreeBSD on Amazon AWS EC2: long-standing performance problems


Gunther Schadow

Feb 5, 2021, 9:53:56 AM
to freebsd-p...@freebsd.org
Hi, I've been with FreeBSD since 386BSD 0.0. Always tried to run
everything on it. I saw us lose the epic race against Linux over the
stupid BSDI lawsuit. But now I'm afraid I am witnessing FreeBSD fade
completely from relevance in the marketplace, as its performance on
AWS EC2 (and, judging by the chatter, on other "cloud" platforms)
falls far behind that of Linux. Not by a few % points, but
by factors if not an order of magnitude!

The motto "the power to serve" meant that FreeBSD was the most solid
and consistently performing system for heavy multi-tasking network
and disk operation. A single thread was allowed to do better on another
OS without us feeling shame, but you could rely on FreeBSD being your
best choice for overall server performance.

The world has changed. We used to run servers on bare metal in a cage
in a physical data center. I did that. A year or two of instability with
the FreeBSD drivers for new beefy hardware didn't scare me off.

Now the cost and flexibility calculations have moved the market
away from bare metal to the "cloud" service providers: Amazon AWS
(>38% market share), Azure (19% market share), and many others. I
still remember searching for "hosting" providers who would
offer FreeBSD (or any BSD) as an option, and how hard they were to
find. On Amazon AWS we have the FreeBSD image ready to launch; that is good.

But the problem is, its disk (and network?) performance is bad (to
horrible), and that is really sad and embarrassing. It leaves FreeBSD
beaten far behind and, for realistic operations, impossible to use, despite
being so much better organized than Linux. I have put significant
investment into a flexible, scalable FreeBSD image, only to find now that I
just cannot justify using FreeBSD when Linux out of the box is several
times faster.

There have been a few problem reports about this over many years, and
they all end the same way: either no response, or a defensive response
("your measurements are invalid"), with the person reporting the problem
eventually walking away with no solution. Disinterest. I can link to
those instances. Examples:

https://lists.freebsd.org/pipermail/freebsd-performance/2009-February/003677.html
https://forums.freebsd.org/threads/aws-disk-i-o-performance-xbd-vs-nvd.74751/
https://forums.freebsd.org/threads/aws-ec2-ena-poor-network-performance-low-pps.77093/#post-492744
https://forums.freebsd.org/threads/poor-php-and-python-performances.72427/
https://forums.freebsd.org/threads/freebsd-was-once-the-power-to-server-but-in-an-aws-world-we-have-fallen-way-waaay-behind-and-there-seems-no-interest-to-fix-it.78738/page-2

My intention is not to rant, vent, or proselytize for Linux (I hate Linux),
but to ask: what is wrong with FreeBSD, and how can it be fixed? Why does
it seem nobody is interested in getting the dismal AWS EC2 performance
resolved? This looks to me like a vicious cycle: FreeBSD on AWS is
bad, so nobody will use it for any real work, and because nobody uses it
there is no interest in making it work well. It has to be a lack of
interest, not a lack of access to the AWS EC2 hardware.

What can be done? I am trying to run a company, so I cannot justify
shooting in the dark much longer. If I weren't the boss myself,
my boss would long since have told me to quit this nonsense and use Linux.
If I saw interest, I could justify holding out just a little longer. But
I don't see any encouraging feedback. Is there anyone at all among the
FreeBSD devs, or in FreeBSD.org as an organization, interested in actually
being competitive in the AWS EC2 space (and other virtualization "clouds")?
If so, how many? How can this be fixed? How can I help? I cannot justify
spending much more of my own time on it, but I could help by making
resources available, or by paying someone who has both a sense of great
urgency to redeem FreeBSD and the know-how to make it happen.

regards,
-Gunther



claudiu vasadi

Feb 5, 2021, 10:42:38 AM
to Gunther Schadow, freebsd-p...@freebsd.org
FreeBSD got killed a long time ago. Thank the leadership, or lack
thereof, for it. I was in your shoes and eventually had no choice but to
ditch it altogether. Want pf? OpenBSD. Want something else? Go
Linux :shrug:
Several on the forum (old and new) feel the same and had to make the
same decision in the end..... it's just sad.

Gordon Bergling

Feb 5, 2021, 1:17:12 PM
to Gunther Schadow, freebsd-p...@freebsd.org
Sorry for top-posting.

Can you back up your feelings with numbers?

--Gordon

Gunther Schadow

Feb 5, 2021, 3:45:56 PM
to freebsd-p...@freebsd.org
Gordon Bergling wrote:
> Can you back up your feelings with numbers?

Yes, as I said:

>> Not by a few % points, but by factors if not an order of magnitude!

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253261

Do this:

dd if=/dev/zero of=/dev/nvd2 bs=100M status=progress

and you will see it writing at a "whopping" 70 MB/s.

That used to be good, but it no longer is. Compare Amazon Linux, which
does the same thing at 300 MB/s.
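
For anyone reproducing this, FreeBSD's diskinfo(8) has built-in
micro-benchmarks that should land in the same ballpark without dd (the
device name here is from my setup):

   diskinfo -t /dev/nvd2    # simple seek-and-transfer benchmark
   diskinfo -i /dev/nvd2    # asynchronous IOPS benchmark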

Now, when you put a file system on it, ZFS or UFS, the performance
instantly gets better:

newfs /dev/nvd2
mount /dev/nvd2 /mnt
dd if=/dev/zero of=/mnt/test bs=100M status=progress

now that works at about 250 MB/s. Decent. So, problem solved?

No! It turns out that if I create a PostgreSQL database on this setup,
there are again massive delays on read and write, and throughput drops
to even worse than 70 MB/s. Creating one index takes 10 times as long as
the same operation on the Linux system.
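
For what it's worth, the usual first aid for PostgreSQL on ZFS is to match
the dataset record size to the database's 8 kB page; a sketch, with pool
and dataset names that are just placeholders:

   zfs create tank/pgdata
   zfs set recordsize=8k tank/pgdata     # match PostgreSQL's 8 kB page size
   zfs set compression=lz4 tank/pgdata   # cheap CPU, fewer bytes to EBS
   zfs set atime=off tank/pgdata         # skip access-time updates

That alone should not account for a 10x difference, but it removes one
variable from the comparison.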

PS: no need to point out that Linux uses the buffer cache for direct
writes to the device and BSD doesn't. Those effects make no difference
when you write (or read) more than the buffer cache size (e.g., a few GB).
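
For example, a run like this takes the cache out of the comparison on
either OS (the 16 GB count is just an assumption; pick anything well above
the instance's RAM):

   dd if=/dev/zero of=/mnt/test bs=1M count=16384  # write ~16 GB, beyond RAM
   sync                                            # flush anything still buffered
   dd if=/mnt/test of=/dev/null bs=1M              # read back; the cache holds only a fraction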

Jin Guojun[VFF]

Feb 5, 2021, 6:00:23 PM
to freebsd-p...@freebsd.org
On 2021-02-05 12:45, Gunther Schadow wrote:
> Gordon Bergling wrote:
>> Can you back up your feelings with numbers?
>
> Yes, as I said:
>
>>> Not by a few % points, but by factors if not an order of magnitude!
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253261
>
> Do this:
>
> dd if=/dev/zero of=/dev/nvd2 bs=100M status=progress
>
> and you will see it writing at a "whopping" 70 MB/s.
>
> That used to be good, but it no longer is. Compare Amazon Linux, which
> does the same thing at 300 MB/s.
>
> Now, when you put a file system on it, ZFS or UFS, the performance
> instantly gets better:
>
> newfs /dev/nvd2
> mount /dev/nvd2 /mnt
> dd if=/dev/zero of=/mnt/test bs=100M status=progress
>
> now that works at about 250 MB/s. Decent. So, problem solved?

It is not clear whether this compares apples to apples.

What disk drives and CPUs are on the FreeBSD instance, and what disk
drive(s) and CPU(s) are on the Linux one?

Knowing the drive brands and models will tell us the approximate disk
throughput. Agreed, 70 MB/s is slow for modern disks, but your information
gives no clue as to why it could be slow.

Can this setup get 250 MB/s on FreeBSD 11.4? Or 300 MB/s with Ubuntu 16.04
on the same hardware?

-Jin

Sean Chittenden

Feb 5, 2021, 6:14:46 PM
to Jin Guojun[VFF], freebsd-p...@freebsd.org
To be clear, this is a known issue that needs attention: it is not a
benchmarking setup problem. Network throughput has a similar problem and
needs similar attention. Cloud is not a fringe server workload. -sc

Gunther Schadow

Feb 5, 2021, 7:10:16 PM
to freebsd-p...@freebsd.org
Hi Sean and Brendan

On 2/5/2021 6:14 PM, Sean Chittenden wrote:

> To be clear, this is a known issue that needs attention: it is not a
> benchmarking setup problem. Network throughput has a similar problem and
> needs similar attention. Cloud is not a fringe server workload. -sc

I am so glad you're saying that, because I was afraid I'd have to argue
again and again to make people see there is a problem.

But I come here anyway with some numbers:

This is on the Amazon Linux system which I fell back to in desperation:
launched it, added ZFS, compiled and set up PostgreSQL 13.2 (the newest at
the time), and am now running

pg_dumpall -h db-old |dd bs=10M status=progress |psql -d postgres

The data tables came across at 32 MB/s, which is about the same speed I
got on FreeBSD. I assume that might be network-bound.
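
An aside: if the wire rather than the disks were the limit, the directory
dump format could parallelize both ends (pg_dumpall itself cannot run in
parallel, only per-database pg_dump can); a sketch with illustrative job
counts and names:

   pg_dump -h db-old -Fd -j 4 -f /mnt/dump mydb   # parallel dump, 4 workers
   pg_restore -j 4 -d mydb /mnt/dump              # parallel restore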

Now it's regenerating the indexes and I see the single EBS gp3 volume on
fire (Linux iostat -x output):

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme1n1 0.00 7.00 781.00 824.00 99508.50 76675.00 219.54 1.98 1.66 1.75 1.57 0.41 66.40
nvme1n1 0.00 3.00 1740.00 456.00 222425.00 33909.50 233.46 4.43 2.42 2.29 2.91 0.46 100.80
nvme1n1 0.00 0.00 1876.00 159.00 239867.00 16580.00 252.04 3.56 2.20 2.09 3.47 0.49 100.00
nvme1n1 0.00 0.00 1883.00 151.00 240728.00 15668.00 252.11 3.49 2.15 2.10 2.83 0.49 100.00
nvme1n1 0.00 0.00 1884.00 152.00 240593.50 15688.00 251.75 3.54 2.19 2.13 3.00 0.49 100.00
nvme1n1 0.00 1.00 1617.00 431.00 206680.00 50047.50 250.71 4.50 2.63 2.49 3.13 0.48 98.40
nvme1n1 0.00 1.00 1631.00 583.00 208331.50 47909.00 231.47 4.75 2.54 2.49 2.66 0.45 100.00
nvme1n1 0.00 0.00 1892.00 148.00 241128.50 15440.00 251.54 3.20 2.01 1.96 2.73 0.49 100.00

I don't have the FreeBSD numbers saved now, but with all the simultaneous
reading and writing activity going on, the system got down to just about
40 MB/s read and 40 MB/s write, if lucky.

There is heavy reading of the base tables, then sorting in temporary space
(read/write), then writes to the WAL and to the index. This stuff would
bring FreeBSD to its knees.
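
(For the record, the FreeBSD-side equivalents of the iostat output above
would be something like:

   iostat -x -w 1 nvd2   # extended per-device statistics, 1-second intervals
   gstat -p              # live GEOM view: queue length, latency, %busy

so queue depth, latency, and %busy can be compared column for column.)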

Here on Linux I'm not even trying to be smart. With the FreeBSD attempt I
had already split the different read and write streams onto different EBS
disks, each with its own ZFS pool. That helped a little bit, but not much.
On this Linux box I didn't even bother, and it's going decently fast. There
is still some 20% iowait, which could probably be optimized by doing what I
did for FreeBSD (separate devices), or I might try to make a single ZFS
volume striped over 5 smaller 100 GB drives rather than a single 500 GB
drive.
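
Striping with ZFS needs no RAID layer underneath; a pool built from several
plain disks stripes across them by default. A sketch, with the FreeBSD
device names assumed:

   # five 100 GB EBS volumes as top-level vdevs; ZFS stripes across them
   zpool create pgpool nvd2 nvd3 nvd4 nvd5 nvd6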

This was just to provide some numbers that I have here.

But I absolutely maintain that this has got to be a well-known problem and
that it's not about the finer subtleties of how to make a valid benchmark
comparison.

regards,
-Gunther

Brendan Gregg

Feb 5, 2021, 7:23:10 PM
to Gunther Schadow, freebsd-p...@freebsd.org
G'Day Gunther,

I can at least share some experiences from the other side, and how they
can apply to FreeBSD. I think the better question is how many full-time
staff work on EC2 FreeBSD performance, and how to create more such roles.

Large companies have performance engineering teams to reduce cost, as
do latency-sensitive companies of any size. I'd estimate there are
well over 100 staff with the title "performance engineer" who work on
Linux, most of them working on Linux as part of another product.

There is a mentality with these large companies, whether it makes
sense or not, to run their own datacenters. So most of the performance
engineers on Linux are looking at bare metal performance and not EC2.

Fortunately for Linux, there are a few of us who do work on EC2. I
work at Netflix on the streaming side, and a colleague (Amer) and I
are performance engineers who work on Linux EC2 a lot (among
other things). Other teams work on Linux EC2 performance from time
to time (e.g., BaseOS and Titus). If you were to add up all our
collective time, it probably works out to 2 full-time engineers
working on Linux EC2 performance. Our focus is LTS releases, so we
aren't finding issues every day (we likely would if we were looking at
mainline), but we do find some and get them fixed.

So who is the Netflix of EC2 FreeBSD? What large company with a
performance team (or large enough to create one) runs, or might
consider running, FreeBSD on EC2? I'd identify the company and its
management, and help them with a proposal and job description for a
performance engineer. My Systems Performance book (2nd Ed) documents
methodologies that are applicable to BSD (and includes various
mentions of BSD), and could help a new engineer get started.

There are some who do work on FreeBSD EC2 sometimes, e.g.:

https://twitter.com/cperciva/status/1211125881264934917

Work like this is great, but performance work is endless and needs
full-time attention. So I wouldn't say the problem is a lack of
interest, but a lack of full-time roles. Who can create them?

(Yes, Netflix does create full-time roles for FreeBSD bare metal
performance on the OCA team. But their focus is bare metal.)

As for the low-level performance issues: EC2 has been switching to the
Nitro hypervisor, and I imagine there may be driver work needed to make
sure FreeBSD is using it correctly. I summarized the switch here:

http://www.brendangregg.com/blog/2017-11-29/aws-ec2-virtualization-2017.html
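
A quick sanity check on the FreeBSD side, assuming a Nitro-based instance
type, is whether storage and networking attach as NVMe and ENA at all:

   dmesg | egrep -i 'nvme|nvd|ena'   # Nitro exposes EBS as NVMe, network as ENA
   pciconf -lv | grep -B3 -i amazon  # PCI devices with Amazon as the vendor

If those show nothing, the instance is still on the older Xen device path.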

Brendan

Mark Saad

Feb 6, 2021, 5:06:00 AM
to Sean Chittenden, freebsd-p...@freebsd.org

> On Feb 5, 2021, at 6:14 PM, Sean Chittenden <se...@freebsd.org> wrote:
>
> To be clear, this is a known issue that needs attention: it is not a
> benchmarking setup problem.

So, silly question: how does FreeBSD do on Google GCP, Microsoft Azure, or
DigitalOcean? Do they all suffer the same issue, or is this just an Amazon
issue?

Also, just a bit of advice: contrary to popular belief, Amazon does not
actually sell magic beans.


---
Mark Saad | none...@longcount.org

Łukasz Wąsikowski

Feb 6, 2021, 7:07:37 AM
to Mark Saad, Sean Chittenden, freebsd-p...@freebsd.org
On 2021-02-06 at 00:35, Mark Saad wrote:

> Also just a bit of advice; Contrary to popular belief Amazon does not actually sell magic beans .

AWS has 32% market share. I don't know if their beans are magic or not,
but this is the biggest cloud today. If FreeBSD wants to be on this
train, it has to perform at least as well as the competitors. It's simply
a matter of survival for this project.

In 2007 about 90% of my machines were running FreeBSD (30-40 servers).
Now I run FreeBSD on 9 boxes, and 130 are running Linux. And I'm not the
only one out there who was forced to make this transition.

--
Best regards,
Łukasz Wąsikowski

Mark Saad

Feb 6, 2021, 1:09:50 PM
to Łukasz Wąsikowski, Sean Chittenden, freebsd-p...@freebsd.org

On Feb 6, 2021, at 7:07 AM, Łukasz Wąsikowski <luk...@wasikowski.net> wrote:
>
All,
So what I was getting at is: do we have good data on what the issue is? Can
we make a new page on the FreeBSD wiki to track what works and what
doesn't? Does one exist?

To be clear, we should check whether the issue is something AWS is doing
with their Xen platform or their KVM/QEMU one, or whether it is common to
all. Also, does the same issue appear on Google's and Microsoft's
platforms? That would at least put some bounds on the problem and on what
fixes, if any, may exist.
There are some commercial FreeBSD products running on AWS. Maybe those
vendors know some things that can help?

Thoughts?

---
Mark Saad | none...@longcount.org

Gunther Schadow

Feb 10, 2021, 11:06:08 AM
to freebsd-p...@freebsd.org
I think we have gotten enough feedback now from other professionals to
suggest that it would do the FreeBSD project good to acknowledge the issue
and create some sort of work-in-progress / project statement, perhaps a
wiki, where people can flock to look for workarounds and current status,
just so they don't feel so lost and lonely, wondering if anybody cares at
all. Such a meeting point could at least be something of a self-help group,
for emotional support and positive thinking =8).

I have a slight sign of hope also: in my latest Amazon Linux db server
deployment the performance is only half of what I got in my previous db
server deployment. Go figure. It's a fresh launch and I can't figure out
what else I had optimized on the previous Amazon Linux box. So, while not
improving anything, it at least closes the inequality between Linux and
FreeBSD in the socialistic way ;)

Now I am trying the FreeBSD install again with the same disk setup. One
thing I clearly know is that it makes a huge difference whether you have
one large EBS device with a single partition / file system, or the same
large EBS device with many partitions and file systems, separating tables,
indexes, temporary sort space, etc., so that there is less random-access
contention on a single file system.

Why that is so important, I actually wonder, since it's not the underlying
EBS that matters so much. In the bare-metal world, the approach would be
separate disks, or striping (RAID-0), to get more spindles involved instead
of having disks seek. But none of that should matter much any more with SSD
drives (which EBS gp2 and gp3 are!).

Fortunately Amazon now also has gp3 volumes, which support 3000 I/O
transactions per second (IOPS) and 250 (up to 1000) MB/s of throughput
despite being as small as 4 GB. So instead of a 1 TB gp2 volume with many
partitions, I am now making myself over 20 separate smaller gp3 volumes;
the advantage is that I can resize each of them individually without
colliding with a neighboring partition.
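
Creating and resizing these is easy to script with the AWS CLI; the zone,
sizes, and volume ID below are placeholders:

   aws ec2 create-volume --volume-type gp3 --size 100 \
       --iops 3000 --throughput 250 --availability-zone us-east-1a
   # later, grow one volume without touching its neighbors:
   aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 200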

Tomorrow I should have better comparison numbers for my database on Linux
and FreeBSD configured the exact same way, and they might be closer now.
But sadly only in the socialist way.

regards,
-Gunther

Sean Chittenden

Feb 10, 2021, 11:48:12 AM
to freebsd-p...@freebsd.org
Expect to see something about this in this year's Community Survey, and the
Core Team will do something with that information.

Cloud has been an increasingly important workload for FreeBSD users. Or at
least, our community is moving a fair number of workloads to virtualized
metal, and running on the cloud needs attention as much as or more than
some of our other subsystems. Many of the people who have the skills to
jump in and diagnose and fix this type of problem have been sucked up by
other professional commitments and aren't available. Which is to say, we
need to increase the depth of our bench for performance-related work.

But if someone's reading this thinking, "well, I'll wait until Core....",
don't wait! Jump in.

If people are looking for a single starting point, I'd suggest trying to
conjure up an iflib-backed Ethernet driver and reducing the administrative
friction of spinning up one of our images in the cloud. "Secure by
default" or conservative image defaults aren't doing anyone any favors in
the cloud era.

Or begin porting the io_uring kernel interface and work down the stack
into the driver layer.

$0.02. -sc

Gunther Schadow

Feb 13, 2021, 8:31:44 PM
to freebsd-p...@freebsd.org
On 2/10/2021 11:47 AM, Sean Chittenden wrote:

> Expect to see something about this in this year's Community Survey, and the
> Core Team will do something with that information.
>
> Cloud has been an increasingly important workload for FreeBSD users.

Thank you Sean,

I think it would really be good to have this kind of wiki or visible point
of discussion for FreeBSD on AWS. Because there isn't just bad news; there
is also good news.

I have a very expensive reporting job on my PostgreSQL database, and I
have 3 servers now: an older Amazon Linux setup which I did a few years
ago, and another from even earlier. I thought my earlier Amazon Linux
server was a lot faster than the FreeBSD one I have set up now.

But then in my desperation I set up a new Amazon Linux box where I even
got ZFS installed. But lo and behold, when I ran the report generation, I
found that FreeBSD worked through it in 12 hours while the new Linux system
took 21 hours. So there is now clarity that while some read and write
speeds may look better on Linux, in something as complex as my database the
result can be better on FreeBSD.

There may still be ways to improve the Linux setup, perhaps even more than
the FreeBSD one, but I don't see anything obvious. The good news is that I
can now go forward with my FreeBSD solution, which is so much nicer to
manage.

PS: there is still tremendous slowness in places. Especially the AWS CLI
(Python) setup is unbearably, ridiculously slow; so slow, in fact, that I
instead worked on learning how to sign my own requests and issue those AWS
API operations from my Java workaround.

regards,
-Gunther