does reddit partition dbs?
 
From: Yan Chunlu <springri...@gmail.com>
Date: Tue, 18 Sep 2012 13:56:29 +0800
Local: Tues, Sep 18 2012 1:56 am
Subject: does reddit partition dbs?

I have been using reddit to build a site for a while, and the tables have
become very large. One of the relation tables just reached about 50 million
records (the latest id is "55453433"); there are too many slow queries and
the system is not stable.

I guess reddit is using londiste for replication, and it seems that reddit
currently only segregates things and relations into different dbs.

Maybe a single reddit table has already reached billions of records;
does postgresql still work fine at that size?


 
From: Keith Mitchell <kemit...@reddit.com>
Date: Tue, 18 Sep 2012 09:30:12 -0700
Local: Tues, Sep 18 2012 12:30 pm
Subject: Re: [reddit-dev] does reddit partition dbs?

If you look at the example.ini file, you'll see lines for main_db,
comment_db, comment2_db, etc.

By default (for small installations / developer workspaces) they all point
to the same database. You can modify them to point at different databases.
The settings in that area of the .ini file describe a bit about what ends
up where.
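
As a rough illustration of the idea above, here is a minimal sketch of several `*_db` settings that all resolve to one database host by default and can be repointed individually. Only the key names `main_db`, `comment_db` and `comment2_db` come from this thread; the value format (a bare host name), the host names, and the helper functions are assumptions made for the example, and the real example.ini may use a different layout.

```python
import configparser

# Hypothetical, simplified fragment. Only the key names come from the thread;
# the bare-host value format is an assumption for illustration.
DEFAULT_INI = """
[DEFAULT]
main_db = db-host-1
comment_db = db-host-1
comment2_db = db-host-1
"""

# Same keys, but the comment tables are pointed at a second (made-up) host.
SPLIT_INI = """
[DEFAULT]
main_db = db-host-1
comment_db = db-host-2
comment2_db = db-host-2
"""


def load_db_hosts(ini_text):
    """Return a {setting_name: host} mapping for every key ending in _db."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {key: value for key, value in parser.defaults().items()
            if key.endswith("_db")}


def distinct_hosts(db_hosts):
    """Return the sorted list of distinct hosts the settings point at."""
    return sorted(set(db_hosts.values()))


print(distinct_hosts(load_db_hosts(DEFAULT_INI)))  # one host for everything
print(distinct_hosts(load_db_hosts(SPLIT_INI)))    # comments moved elsewhere
```

The point of the sketch is only that the application looks up a logical name and the config decides which machine that name lands on, so moving a group of tables is a config change rather than a code change.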


 
From: David King <dk...@ketralnis.com>
Date: Tue, 18 Sep 2012 14:30:40 -0700
Local: Tues, Sep 18 2012 5:30 pm
Subject: Re: [reddit-dev] does reddit partition dbs?

> I have been using reddit to build a site for a while, and the tables have become very large. One of the relation tables just reached about 50 million records (the latest id is "55453433"); there are too many slow queries and the system is not stable.

I wouldn't expect 50 million records to be in the performance-breaking range for postgres, even with reddit's schema. What kinds of queries are you doing that are slow?

> I guess reddit is using londiste for replication, and it seems that reddit currently only segregates things and relations into different dbs.

I really doubt that you're at the point where you need more than one DB machine; I'd try to fix the one you have.

> Maybe a single reddit table has already reached billions of records; does postgresql still work fine at that size?

Despite what you've read on the internet, reddit hasn't had just a single table for many years.

 
From: Yan Chunlu <springri...@gmail.com>
Date: Wed, 19 Sep 2012 15:40:34 +0800
Local: Wed, Sep 19 2012 3:40 am
Subject: Re: [reddit-dev] does reddit partition dbs?

Yeah, I am aware that there can be many db engines, and that db_manager
can balance the load across those engines, which could split reads and
writes on tables.
But I did not find any code related to db partitioning, such as consistent
hashing on ids or something like what instagram is doing:
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids...
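
For readers unfamiliar with the term, below is a minimal sketch of what "consistent hashing on ids" could look like. It is not taken from reddit's code (as said above, no such partitioning code was found there), and it is not the scheme from the linked post either. The class name `HashRing`, the shard names `db0` to `db3`, and the `vnodes` parameter are all made up for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping ids to shard names (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring at `vnodes` pseudo-random positions
        # so that load is spread more evenly across nodes.
        self._ring = sorted(
            (self._token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(key):
        # Hash the key to a 64-bit integer position on the ring.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # Walk clockwise to the first node position after the key's token,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._tokens, self._token(key))
        return self._ring[idx % len(self._ring)][1]


three = HashRing(["db0", "db1", "db2"])
four = HashRing(["db0", "db1", "db2", "db3"])

# When a shard is added, only the ids that now belong to the new shard move.
ids = range(10000)
moved = [i for i in ids if three.node_for(i) != four.node_for(i)]
print(f"{len(moved)} of {len(ids)} ids moved, all to db3:",
      all(four.node_for(i) == "db3" for i in moved))
```

Compared with a plain `id % number_of_shards`, where changing the shard count remaps most ids, a ring like this only relocates the ids that fall into the new shard's ranges, which is why it tends to come up in discussions of repartitioning.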

@david, thanks for the tip about data size and performance. Those slow
queries are probably my fault; I will try to debug and fix them.
I'm just curious about planning ahead: when do I need to split things and
relations onto different machines, and when should I split a single table,
do sharding, etc.?

thanks!



 
From: David King <dk...@ketralnis.com>
Date: Thu, 20 Sep 2012 19:19:13 -0700
Local: Thurs, Sep 20 2012 10:19 pm
Subject: Re: [reddit-dev] does reddit partition dbs?

> Yeah, I am aware that there can be many db engines, and that db_manager can balance the load across those engines, which could split reads and writes on tables.
> But I did not find any code related to db partitioning, such as consistent hashing on ids or something like what instagram is doing:
> http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids...

Not in postgres, but the datatypes in Cassandra do this automatically.

 