Does having a large custom DB table degrade overall performance?


barbara schendel

May 19, 2025, 1:51:32 PM
to Minneapolis St. Paul WordPress User Group
Hi all!

We have some custom functionality for my client's site, which effectively turns it into a web app. The site features audio lessons which people can play, and when they do, for paid members it keeps track of their play history, which they can review on a "My Journey" page where we list all of their recent history. They can also bookmark and "like" lessons, and coming later this year it will also allow them to create journal entries to record notes about the lessons they do. (Journal entries will behave a little bit like a comment except they will only be visible to the user, and they will reference individual lessons.)

We have around 850 users and around 150 lessons, and most users tend to do 1-3 lessons per week. (Lessons are designed to be repeated, so running out is not an issue. But they are an hour long, so the most anyone would do is maybe one per day.) Since this accumulated data will eventually start to get large, we are storing this information in a custom database table.

Now here is the thing -- our developer discouraged us from keeping infinite histories (urging us to limit it to only 6 months or a year), saying that having a large database table (even a custom table) could slow down the site. He is extremely knowledgeable, but somehow in this case, I just don't believe it. If this were true then sites like Amazon would slow to a crawl. 
The only way that I could see this being the case is on the "my journey" page when we display the play history and journal entries. So for that reason we will paginate to only show 15 per page, just like you'd do for any other WP post. 

So I want to get your opinions: Could having a large custom database table degrade site performance? (and if so, why/how/which operations?) Is there anything special you would do to optimize performance & storage over the long term? 

Thank you for your two cents!!
~barbara~



David Arago

May 19, 2025, 1:58:33 PM
to mpls-stpaul-wordpress
Hi,

It could, and I would say at some point it will. Did you consider partitioning the database by some key (date, region)? We had a large database and had to purge old records because it was affecting performance.

--
You received this message because you are subscribed to the Google Groups "Minneapolis St. Paul WordPress User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mpls-stpaul-word...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/mpls-stpaul-wordpress/7a223b12-9498-4935-b739-7f4287f2db3an%40googlegroups.com.

Toby C

May 20, 2025, 6:38:35 AM
to mpls-stpau...@googlegroups.com
How big is the database right now?  Just curious, because if it's relatively small at present, then it might not be an issue for many years...

But yeah, clearly wp.com has figured this out. I remember talking to jjj a decade or so ago about the challenges wp.com had with solving the problem... The solution he described was very much out of my skill set, so I just stood there nodding like a bobblehead doll, and all I can say about the migration is that jjj's breath was not an issue.

Toby


Patrick Lewis

May 20, 2025, 9:36:49 AM
to Minneapolis St. Paul WordPress User Group
It's not the size of the database in bytes or the number of records, but the database design and the queries that access it. This can't be judged without a look under the hood at both the database and code. 

Performance on large tables often degrades when the data is accessed by queries against non-indexed (or improperly indexed) columns. Looking at the selection and ordering criteria of a query gives a good first clue about what to consider indexing.
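To make that concrete, here is a minimal sketch (SQLite standing in for MySQL, and a hypothetical play-history table, so the names are made up) of how an index chosen to match a query's selection and ordering criteria changes the query plan:

```python
import sqlite3

# Hypothetical play-history table; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE play_history (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    lesson_id INTEGER,
    played_at TEXT)""")

query = "SELECT * FROM play_history WHERE user_id = ? ORDER BY played_at DESC"

# Before indexing: the planner has to scan the whole table.
plan_before = con.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()[0][3]

# Index matching the selection (user_id) and ordering (played_at) criteria.
con.execute("CREATE INDEX idx_user_played ON play_history (user_id, played_at)")
plan_after = con.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()[0][3]

print(plan_before)  # SCAN play_history
print(plan_after)   # SEARCH play_history USING INDEX idx_user_played ...
```

The same idea applies in MySQL via `EXPLAIN SELECT ...`: you want to see the query hitting an index rather than doing a full table scan.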

The comparison to Amazon is flawed. Amazon throws tons of resources at its site. This is probably not true of your site. You can be sure that Amazon moves data to less accessible storage as it ages.

You should create some benchmarks to help judge how performance is affected over time (i.e., how long does it take to perform various operations?).
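A benchmark along those lines could be as simple as this sketch (SQLite with synthetic data, purely for illustration; the table name and row counts are made up):

```python
import sqlite3
import time

# Synthetic data: ~50k play-history rows spread across 850 users.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE history (user_id INTEGER, played_at TEXT)")
con.executemany("INSERT INTO history VALUES (?, ?)",
                [(i % 850, f"2025-{(i % 12) + 1:02d}-01") for i in range(50_000)])

def bench(sql, args, runs=20):
    """Average wall-clock time of one query over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        con.execute(sql, args).fetchall()
    return (time.perf_counter() - start) / runs

before = bench("SELECT * FROM history WHERE user_id = ?", (42,))
con.execute("CREATE INDEX idx_user ON history (user_id)")
after = bench("SELECT * FROM history WHERE user_id = ?", (42,))
print(f"no index: {before * 1000:.2f} ms, indexed: {after * 1000:.2f} ms")
```

Re-running a benchmark like this as the table grows tells you whether performance is actually degrading, instead of guessing.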

Did your developer leave open the possibility of moving some data into separate tables that would still be accessible? 

Do you know what MySQL database engine is in use? It is probably InnoDB, but if it is MyISAM, consider updating it to InnoDB. Again, without seeing the database or code this advice is somewhat suspect.

Finally, I would recommend looking at the queries that access this custom table to judge them for their efficiency.

Toby C

May 20, 2025, 10:44:17 AM
to mpls-stpau...@googlegroups.com
Great thoughts, Pat.

Another thing to consider is that shared hosts usually have a database size limit.  

Toby




Barbara Schendel-Kent

May 20, 2025, 1:31:37 PM
to mpls-stpau...@googlegroups.com
Thanks a lot for the good responses guys. I am not sure what DB engine they use, but the site is hosted at Flywheel, so we should be able to increase resources pretty easily if needed. The database has not been built yet (we are still in the planning stages which is why I am asking all these questions) but I have been researching and I believe that I have come to the same conclusion as you Patrick, in that indexing, partitioning and possibly archiving will be keys to maintaining good performance. And with that in mind, from what I can tell there seems to be no reason to think this should be a problem as long as we are wise in our design and making sure the partitions align well with the queries we plan to make. 

I don't have much experience (at all) with database design, so I am wondering, taking the Journal Entries for example... we will need to query it to find entries that match the userID (again 850 or so), lessonID (150 or so), and date.  So how would you recommend partitioning the journal-entry database to optimize for this? Should we partition by lessonID? Or perhaps by range of UserIDs? Basically I am wondering, is it better to have more partitions where each one has fewer rows, OR fewer partitions with more rows? I'm guessing the former, so that the query does not have as much to search through, is that assumption correct?

Thank you again!

Nick Ciske

May 20, 2025, 3:39:22 PM
to mpls-stpau...@googlegroups.com
I believe FlyWheel uses MySQL (vs., say, MariaDB). MySQL supports multiple storage engines, of which InnoDB is the best-performing option for most relational data.

Which storage engine is used depends on how the table is created. If your DB has any MyISAM tables, you'll want to convert those to InnoDB. This is step 0 and should be done on any WP install.

Proper indexing is step 1. See this article for more info.

Indexes are most efficient on numbers and timestamps (not strings), so you’ll want to make sure the date is a true DATE field (not a char or varchar). IDs are simple to index as they are just numbers.
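A small illustration of why the date should be a true, sortable date rather than a free-form string (SQLite here, with hypothetical column names; MySQL's DATE type gives you the sortable behavior automatically):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Two hypothetical columns: an ISO-8601 date (what a real DATE column
# behaves like) vs. a free-form varchar-style string.
con.execute("CREATE TABLE entries (iso_date TEXT, messy_date TEXT)")
con.executemany("INSERT INTO entries VALUES (?, ?)", [
    ("2025-05-19", "05/19/2025"),
    ("2025-11-02", "11/02/2025"),
    ("2024-12-31", "12/31/2024"),
])

# A range query on the sortable column returns exactly the 2025 entries...
good = con.execute(
    "SELECT iso_date FROM entries WHERE iso_date >= '2025-01-01'").fetchall()

# ...but the same comparison on the free-form column compares lexically by
# month/day first, silently including the 2024 row.
bad = con.execute(
    "SELECT messy_date FROM entries WHERE messy_date >= '01/01/2025'").fetchall()

print(good)  # [('2025-05-19',), ('2025-11-02',)]
print(bad)   # all three rows, including 2024
```

With a proper DATE column, both indexes and range queries work the way you expect.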

Don’t forget about indexes on core WP tables too (these may be in place already, but do verify it; it’s a huge boost, as those tables get queried on basically every request).

Increasing resources (i.e. a dedicated DB instance) is generally step 2-?. Most web databases never scale past this point.

Adding a caching layer (e.g. Redis) can hold frequently accessed data and offload frequent queries from the MySQL server — these may not be your custom queries, but fewer queries for, say, options means more capacity for your custom ones.
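The caching Nick describes is the classic cache-aside pattern; here is a toy sketch with a plain dict standing in for Redis and a made-up loader function in place of the real MySQL query:

```python
import time

cache = {}        # stands in for Redis
CACHE_TTL = 300   # seconds before a cached value is considered stale

def load_option_from_db(name):
    # Placeholder for the real (slow) database query.
    return f"value-of-{name}"

def get_option(name):
    """Cache-aside: serve from cache when fresh, else query and store."""
    hit = cache.get(name)
    if hit is not None and time.time() - hit[1] < CACHE_TTL:
        return hit[0]            # cache hit: no database round-trip
    value = load_option_from_db(name)
    cache[name] = (value, time.time())
    return value

get_option("siteurl")            # first call queries the "database"
print("siteurl" in cache)        # True: the cache is now populated
get_option("siteurl")            # second call is served from memory
```

In WordPress this is usually done with a persistent object-cache plugin rather than hand-rolled code, but the mechanism is the same.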

Adding a read replica can be considered — this is a second server that mirrors the first, and you only read from it. This also gives you some redundancy: it can fail over to become the main server, and a new read replica can be spun up.

When all that starts to have performance issues… then you can consider partitioning, but that’s unlikely to become an issue anytime soon. Partitioning by user ID (since that is always known) would be a good idea, as you can easily partition into 10 tables based on the last digit of the user ID (which should distribute records fairly evenly across tables). HyperDB is a tool that can assist with this. If you need more than 10 well-indexed tables and a read replica, you have much bigger problems...
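The last-digit routing is simple to sketch (the table-name prefix here is hypothetical):

```python
def journal_table(user_id: int) -> str:
    """Route a user's journal rows to one of 10 tables by the user ID's
    last digit, so records spread roughly evenly across the tables."""
    return f"wp_journal_{user_id % 10}"

print(journal_table(7))    # wp_journal_7
print(journal_table(842))  # wp_journal_2
```

Because the user ID is known on every request, the application always knows exactly which table to query, and no query ever has to touch the other nine.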

Here’s an article that basically says this all again ;-)

---

Oh, and of course there’s the option to use a different DB host altogether (e.g. PlanetScale) or a different DB engine (e.g. DynamoDB, which can be free) and query it via a PHP SDK or an API … and let someone else worry about scaling it.

_________________________
Nick Ciske
CTO/CISO | LuminFire

Barbara Schendel-Kent

May 21, 2025, 11:51:14 AM
to mpls-stpau...@googlegroups.com
Nick, this info was great and just what I was looking for. Lots to read and discuss with our dev. Thank you very much! 
