Big look-up table question...

Paul Madley

unread,

Aug 26, 2011, 8:12:25 AM8/26/11

to ph...@googlegroups.com

Hi,

Not *quite* sure if this list is intended for asking this kind of question, but heh you're all experts :o)

We're going to have a list of something like 3 million codes, each with an ID to the item in question, the date it was used, and a few other things. Currently all in one (MySQL) table called... "codes"

Now, can anybody see a practical advantage/disadvantage to actually having 10 (for example) tables, with an equalisish number of records. As in, a table for codes starting A&B, C&D, etc etc.

Basically; I wonder if the overhead of PHP deciding which table to query is greater than looking up the records in just one large table. Any thoughts?

Cheers.

P.

Paul Beckwith

unread,

Aug 26, 2011, 8:19:07 AM8/26/11

to ph...@googlegroups.com

Hi Paul,

If you split the records up into different tables this will help you when running queries on the tables as they will be smaller.

As for disadvantages I think it really depends on what you're going to be doing with the data...

Hope this helps,

Paul Beckwith

Director | iBek Systems Ltd

www.ibeksystems.co.uk

Office: +44 (0)1253 283924 | Mobile: +44 (0)7708555807

Facebook: iBek Systems Ltd | Twitter: @ibecsystems

--

You received this message because you are subscribed to the Google

Groups "PHPNW" group.

Post to list: ph...@googlegroups.com

Unsubscribe: phpnw+un...@googlegroups.com

Archive: http://groups.google.com/group/phpnw?hl=en

PHPNW Website: http://phpnw.org.uk/

twitter: http://twitter.com/PHPNW

Events: http://upcoming.yahoo.com/group/4709/

LinkedIn: http://www.linkedin.com/e/gis/112906/3FCE41597A1B

Facebook: http://www.facebook.com/group.php?gid=17897252075

IRC: #phpnw (irc.freenode.net)

Paul Waring

unread,

Aug 26, 2011, 8:30:49 AM8/26/11

to ph...@googlegroups.com

On 26/08/11 13:12, Paul Madley wrote:
> Not *quite* sure if this list is intended for asking this kind of question, but heh you're all experts :o)
>
> We're going to have a list of something like 3 million codes, each with an ID to the item in question, the date it was used, and a few other things. Currently all in one (MySQL) table called... "codes"
>
> Now, can anybody see a practical advantage/disadvantage to actually having 10 (for example) tables, with an equalisish number of records. As in, a table for codes starting A&B, C&D, etc etc.

3m rows is not that many in modern databases, MySQL should be able to
find and return records reasonably quickly if you're not running it on
an overloaded server. If your queries are already too slow, experiment
with adding indexes to the columns you search on (i.e. whatever you use
in your WHERE clause) to see if that speeds things up. If it's a table
which you only read data from, you might want to look at keeping the
whole thing in memory - there are various options for doing this in MySQL.

The major disadvantage of splitting up the table is that you now have 10
tables to maintain - what happens if you need to change the schema at a
later date?

--
Paul Waring
http://www.phpdeveloper.org.uk

Kieran Barnes

unread,

Aug 26, 2011, 8:36:17 AM8/26/11

to ph...@googlegroups.com, Paul Waring

I agree with Paul Waring, 3 million records isn't a big deal really.
Your fieldset sounds quite small so if its read only, store it in
memory. And as Paul suggests, index your "where" column.

Modern servers with correctly configured fast disks, ie SAS or SCSI not
SATA and direct attached or large bandwidth iSCSI attached.

Theres much more to think about other than the query when your data is
getting large.

--
Kieran Barnes

David Goodwin

unread,

Aug 26, 2011, 8:57:31 AM8/26/11

to ph...@googlegroups.com

If it's slow, turn on MySQL's slow query log and have a look.
Check the output from 'explain select blah blah blah from codes blah blah blah'
Create additional indexes as necessary

Run http://mysqltuner.pl on the server - if you're using innodb then a major speedup can be had from increasing the default size of the innodb cache (which will reduce IO etc on the server) as it defaults to about 8mb. - running mysqltuner.pl will tell you lots of things you can change with MySQL.

David.

Paul Madley

unread,

Aug 26, 2011, 10:08:26 AM8/26/11

to ph...@googlegroups.com

Thanks for all the advice folks. I generated 3,000,000 test (but valid) records and then write a quick script to look up a random 1000 in one go and it took > 0.002 seconds on my local machine, even with the rand() overhead.

So yep, cheers folks I think this is acceptable :o)

On 26 Aug 2011, at 14:01, David Goodwin wrote:

>
> On 26 Aug 2011, at 13:12, Paul Madley wrote:
>

> If it's slow, turn on MySQL's slow query log and have a look.
> Check the output from 'explain select blah blah blah from codes blah blah blah'
> Create additional indexes as necessary
>
> Run http://mysqltuner.pl on the server - if you're using innodb then a major speedup can be had from increasing the default size of the innodb cache (which will reduce IO etc on the server) as it defaults to about 8mb. - running mysqltuner.pl will tell you lots of things you can change with MySQL.
>
> David.
>
>

Ian Horst

unread,

Aug 27, 2011, 7:14:17 AM8/27/11

to ph...@googlegroups.com

Hi Paul,

Nowadays a few millions rows are not really that much as people already
mentioned. We do handle millions and millions of raws.

Splitting table in multiple ones is not a good idea. Better consider to
use MySQL partitioning if you use MySQL 5.1 and up. You can take into
consideration the most used SQL queries to partition your table. And
performance boost is significant. Please read through
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html to get an idea
what magic you can do.

Ian

Reply all

Reply to author

Forward