Thanks for the tip on Hugo. It will help partially but not completely. And my mistake, I meant a droplet, not bare metal.
Some more notes on the site architecture might make the problem clearer.
We currently use a Python backend (Flask) to handle all requests. Texts and dictionaries are mostly static content, but the proofing interface needs read-write support, and so do small features in the library (e.g. user settings and bookmarks). For redundancy, we run two Python web workers through gunicorn. They share some memory via copy-on-write after forking, but I suspect each worker still holds redundant copies of imported libraries (sqlalchemy, etc.) along with other side data.
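One partial mitigation for the redundant-import cost, assuming gunicorn's standard fork model: preload the app in the master process so workers inherit already-imported modules via copy-on-write instead of re-importing them. A hypothetical config sketch (not our actual config):

```python
# gunicorn.conf.py -- hypothetical sketch
# preload_app imports the Flask app (and sqlalchemy etc.) once in the master
# before forking, so workers share those read-only pages via copy-on-write.
preload_app = True
workers = 2
```

The trade-off is that preloading breaks gunicorn's zero-downtime reload of application code, and pages still get duplicated once a worker writes to them.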
Ambuda also handles a variety of async tasks for things like running OCR, splitting PDFs into separate images, and calling other APIs. These touch the database and likewise cause imports of sqlalchemy and other libraries.
In total Ambuda runs around 5-6 Python processes, each of which pays the ~40 MB baseline cost of the Python interpreter, additional cost for imports and side data, and normal memory consumption for handling requests and async tasks. This is not efficient.
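For anyone who wants to check these numbers on their own box, here's a small Linux-only sketch that reads a process's resident set size from /proc; point it at real gunicorn/Celery PIDs to tally actual usage:

```python
# Rough per-process memory check (Linux only): read VmRSS from /proc.
import os

def rss_mb(pid: int) -> float:
    """Resident set size of a process in MB, parsed from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # value is in kB
    return 0.0

print(f"this process: {rss_mb(os.getpid()):.1f} MB")
```

Note that RSS double-counts pages shared between forked workers; tools like smem (which reports PSS) give a fairer per-process figure.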
So while texts could be served as static files, this doesn't fix the core problem, which is the resource-heavy proofing setup that needs async workers to process PDFs, run OCR, tokenize, run reports, etc.
Some options I am mulling over:
- spend some time on removing or optimizing heavy imports and data.
- switch to an async Python framework, which would let us reduce the number of Celery processes by moving more async tasks into the web backend itself.
- move off of Python to something like Go. I like Rust much more than Go but I value quick iteration and am wary of Rust's compilation times.
- throw money at the problem and get a better droplet.
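To make the async-framework option concrete, here's a hypothetical sketch (function names invented) of running an I/O-bound task, like an OCR API call, as an asyncio task inside the web process instead of in a dedicated Celery worker:

```python
# Hypothetical sketch: an I/O-bound job handled in-process with asyncio,
# avoiding a separate Celery worker (and its ~40 MB interpreter cost).
import asyncio

async def run_ocr(page_id: int) -> str:
    # Stand-in for a real OCR/API call; assumed I/O-bound, so it yields
    # to the event loop instead of blocking a worker process.
    await asyncio.sleep(0)
    return f"ocr result for page {page_id}"

async def main():
    # Scheduled in the same process as request handling; no extra interpreter.
    task = asyncio.create_task(run_ocr(42))
    print(await task)

asyncio.run(main())
```

This only helps for I/O-bound work; CPU-heavy jobs like tokenizing or PDF splitting would still block the event loop and likely want a separate process.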
Arun