Crude sizing guidelines


Ned Batchelder

Sep 11, 2015, 1:53:01 PM
to opene...@googlegroups.com
We often get asked by new adopters of Open edX about how to size their deployment.  This is a very complicated topic: choosing the number and size of machines to host Open edX can depend on lots of subtle factors, including the complexity of the courses.

But I'd like to throw out some simple starting figures based on our experience using Amazon Web Services.  I hope others will contribute more data, and we can refine these guidelines.

The basic question we are trying to answer is: how many AWS instances should I allocate to support my expected use?

The key information you'll provide is: how many simultaneous students you expect on your site during peak times.  This depends on the total number of enrolled students and how many courses you run.  It also depends on how widely spread around the world your students are, whether there are reasons they would all do coursework at the same time, and so on.  All of this boils down to your estimate about how many students will be on the site at the same time during your site's busiest time.

We looked at some of our own deployments to get a sense of how much machine supports how many users.  There are some considerations that make a comparison of edx.org to a new Open edX installation difficult, but we've tried to account for those factors.

Amazon offers many different instance types.  http://www.ec2instances.info/ has a great comparison.

My rough guideline is: an LMS worker will support about 75 simultaneous active users.  To be on the safe side, give each LMS worker 1 GB of RAM.  When choosing an AWS instance type, RAM will be the bottleneck, not CPU.  Use your active simultaneous user estimate to decide how many workers you need.  Choose an AWS instance type: m2.2xlarge is a good choice.  Since each worker needs 1 GB, divide the number of workers by the RAM per instance (30 GB for m2.2xlarge) to determine the number of instances you need.

Some good devops principles to consider:
  • Never have only one machine, so allocate at least two instances.
  • Start larger than you think you need; you can scale it back once you see the real system under real load.
As an example, suppose you estimate that you will have 800 active simultaneous users at your busiest time. 800/75 --> 10.7, so you will need 11 workers, and therefore 11 GB of RAM.  This fits easily within an m2.2xlarge instance.
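To make the arithmetic concrete, here is a small Python sketch of the guideline. The constants come from this thread (not an official edX sizing formula), and the function name is my own; the minimum of two instances reflects the "never have only one machine" principle above.

```python
import math

# Rough figures from this thread -- not an official edX formula.
USERS_PER_WORKER = 75   # simultaneous active users one LMS worker handles
RAM_PER_WORKER_GB = 1   # safe RAM allowance per worker

def size_deployment(peak_concurrent_users, instance_ram_gb=30):
    """Estimate (workers, RAM in GB, instances) for a given peak load.

    instance_ram_gb defaults to the ~30 GB cited for m2.2xlarge.
    """
    workers = math.ceil(peak_concurrent_users / USERS_PER_WORKER)
    ram_needed_gb = workers * RAM_PER_WORKER_GB
    # Round up to whole instances, but never run fewer than two machines.
    instances = max(2, math.ceil(ram_needed_gb / instance_ram_gb))
    return workers, ram_needed_gb, instances

# The 800-user example from the post: 11 workers and 11 GB of RAM fit
# in a single m2.2xlarge, but we still allocate two for redundancy.
print(size_deployment(800))  # (11, 11, 2)
```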

That covers the web workers.  You also need to configure at least a database machine.  We'll add to these guidelines in the future.

Do people have thoughts on this guideline? Is your experience radically different? Do you have a better way of describing this?

--Ned.

Pierre Mailhot

Sep 11, 2015, 2:19:09 PM
to Open edX operations
Thanks for sharing this Ned.

That looks just about right. Let me explain.

When we ran our own load testing earlier this year on different sizes of EC2 instances, we saw a ratio of about 100 simultaneous users per CPU.

We currently run a front-end server with an m3.xlarge instance. So that's 4 vCPU and 15 GB of RAM. Like you suggested, we decided to start larger than what we needed. And judging from your numbers, we're definitely on the safe side.

Changhan Ryu

Sep 14, 2015, 7:46:50 AM
to Open edX operations
Thanks, Ned,
For our pilot course, we expect approximately 200 active users. Based on your calculation, 200/75 gives roughly 3 to 4 workers, which means 3 to 4 GB of RAM. So a t2.medium looks good. What do you think?

Fred Smith

Sep 15, 2015, 11:22:14 AM
to opene...@googlegroups.com
Changhan,

't2' instances use a shared CPU with other instances, and provide throttled performance after bursting, which means they're not suitable for sustained workloads.  See https://aws.amazon.com/ec2/instance-types/t2/.  You're likely to end up using all your CPU or network credits, which will result in requests taking a long time to fulfill or timing out.  As a result, I wouldn't recommend using these types of instances for production, synchronous workloads, unless you know that your systems will be idle for a significant period every day.

A better choice, for about the same price per month ($48/month vs. $37/month) is the m3.medium.  It has slightly less RAM, which means you'll be able to run 3 workers instead of 4, but it provides guaranteed resources.

Run your systems using m3 series instances for a little while. If your CloudWatch CPU-utilization graphs are very peaky, or if there are clearly defined peak times during the day, you might save money by migrating to a t2 series instance.
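The "peaky graph" test above can be approximated programmatically. The sketch below is a hypothetical heuristic over average CPU-utilization samples, such as those CloudWatch reports for the AWS/EC2 CPUUtilization metric; the threshold values are illustrative assumptions, not AWS guidance.

```python
def looks_peaky(cpu_samples, busy_threshold=40.0, max_busy_fraction=0.25):
    """Heuristic: if CPU exceeds busy_threshold in only a small fraction
    of samples, the load is 'peaky' and a burstable t2 instance may be
    able to bank enough CPU credits between peaks.

    cpu_samples: average CPU utilization percentages (e.g. hourly
    averages pulled from CloudWatch). Thresholds are illustrative.
    """
    busy = sum(1 for s in cpu_samples if s >= busy_threshold)
    return busy / len(cpu_samples) <= max_busy_fraction

# One day of hourly samples: quiet most of the day, a short midday spike.
day = [5] * 10 + [70, 80, 75] + [5] * 11
print(looks_peaky(day))  # True: only 3 of 24 hours are busy
```

A flat, sustained load (the case Fred warns about) fails the check, which is the signal to stay on fixed-performance m3 instances.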

Also, as a best practice, I'd recommend running at least two instances behind an ELB for production workloads.  This significantly escalates the cost of your implementation, since you need to offload the database to RDS and session storage to ElastiCache, but it will improve the reliability of your service.

-Fred


Joel Barciauskas

Sep 15, 2015, 1:03:29 PM
to opene...@googlegroups.com
Does anyone have thoughts or experience with database sizing? What's the base disk space required and how quickly have people seen disk grow over time, based on number of courses and students?

Also, are people running read replicas for MySQL and replica sets for Mongo? How many? How much RAM and CPU is required for those?

I'd like to get together reference production architectures for a few common sizes.


--
Joel Barciauskas
edX | Engineering Manager, Open edX
141 Portland Street, 8th floor
Cambridge, MA 02139
jo...@edx.org

Changhan Ryu

Sep 16, 2015, 9:51:06 PM
to Open edX operations
Thanks Fred. 
That will be a good reference for choosing an instance type.


Changhan

Nate Aune

Oct 4, 2015, 2:11:54 PM
to Open edX operations
I would also like to get some guidelines for database sizing. The wiki page "Hosting edX in production" has a nice summary of the AWS infrastructure that edX.org uses (as of 2015-02-18):

But the MongoDB database sizes are not mentioned (my understanding is that edX is moving away from using Compose.io to host MongoDB, and instead moving towards hosting those on AWS?). 

What does that MongoDB cluster look like with replica sets? (Maybe I'll find out all this and more at John Eskew's upcoming MongoDB Boston meetup talk ;) http://www.meetup.com/Boston-MongoDB-User-Group/events/223993132/
  
The MySQL server is mentioned as a multi-AZ deployment of size db.m2.4xlarge, but according to the RDS pricing page, the m2 family is no longer available and has been replaced by m3. Would the closest equivalent be db.m3.2xlarge or db.r3.4xlarge?

Also missing from this description are the servers required for memcached (I'm assuming edX uses AWS ElastiCache for that?). Is the edX DevOps team considering using AWS's latest Elasticsearch offering? https://aws.amazon.com/elasticsearch-service/

thanks,
Nate

jo...@edx.org

Oct 26, 2015, 11:08:47 AM
to Open edX operations
Bertrand responded to our site survey on behalf of IONISx, which has a pretty significant amount of traffic, and agreed to have their numbers shared with the community. 

For a peak of 600 concurrent users, they have the following dedicated servers:

2 LMS (2x6GB RAM)
1 Studio
1 Workers
1 Mongo (modulestore)
1 MySQL (32GB RAM)

If you'd like to contribute and help us provide better guidance, please fill out our survey as well!