difference between scaling and sharding


Aziz Nait Oumghar

Apr 29, 2011, 10:15:37 AM
to nosql-di...@googlegroups.com
Hi,

I'm currently working with a MongoDB database, and I'd like to know the difference between horizontal scaling and sharding.

thnx

Dwight Merriman

Apr 29, 2011, 10:31:21 AM
to nosql-di...@googlegroups.com
Horizontal scaling means increasing a system's capacity (or speed, etc.) by adding more servers.
Vertical scaling means getting a bigger box.

Sharding is one particular approach to horizontal scaling; there are others. In fact, some systems scale out horizontally quite easily and without much complexity: web application servers, for example.

--
You received this message because you are subscribed to the Google Groups "NOSQL" group.
To post to this group, send email to nosql-di...@googlegroups.com.
To unsubscribe from this group, send email to nosql-discussi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/nosql-discussion?hl=en.

Georges DICK

Apr 29, 2011, 10:39:07 AM
to nosql-di...@googlegroups.com
Horizontal scaling means splitting tables across different servers (e.g. one row per server), but with only one instance of the same data. Sharding is basically the same, but allows multiple instances of the same data to be spread across multiple servers.

--
Cordialement,

Georges DICK
email geo...@monaco.net
http://georgesdick.com

Ricky Ho

Apr 29, 2011, 12:29:16 PM
to nosql-di...@googlegroups.com
I would use the term "sharding" for what you called "horizontal scaling", and I would use the term "replication" for what you called "sharding". And I would reserve the term "horizontal scaling" for the general scheme that enables infinite growth.

I know there is no strict definition of these terms, by the way.

Rgds,
Ricky

Angel Java Lopez

Apr 29, 2011, 12:41:38 PM
to nosql-di...@googlegroups.com
Following Ricky Ho:

horizontal scaling: a broad term, not necessarily related to data.
replication: (every?) data/row A resides in server S1 and also in server S2.
sharding: data/row A resides in server A and another data/row B resides in server B, chosen by some "assign-a-server-for-this-row" algorithm.

Many sharding implementations also have replicas: data/row A resides in servers A1, A2, ...; data/row B resides in servers B1, B2, ... But I guess these are orthogonal concepts.
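
To make the "assign-a-server-for-this-row" idea concrete, here is a minimal Python sketch. It assumes a simple hash-modulo scheme, which is only one of several possible algorithms; the server names and the functions `assign_server` and `assign_replicas` are hypothetical and not taken from any particular database.

```python
import hashlib
from typing import List, Sequence

# Hypothetical server names, for illustration only.
SERVERS = ["server-a", "server-b", "server-c"]


def _key_hash(row_key: str) -> int:
    """Stable integer hash of a row key (same value on every run)."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


def assign_server(row_key: str, servers: Sequence[str] = SERVERS) -> str:
    """Sharding: each row lives on exactly one server, chosen by hash modulo."""
    return servers[_key_hash(row_key) % len(servers)]


def assign_replicas(row_key: str, servers: Sequence[str] = SERVERS,
                    copies: int = 2) -> List[str]:
    """Sharding plus replication: the row lives on `copies` consecutive servers."""
    first = _key_hash(row_key) % len(servers)
    return [servers[(first + i) % len(servers)] for i in range(copies)]


if __name__ == "__main__":
    for key in ("row-A", "row-B", "row-C"):
        print(key, "->", assign_server(key), "| replicas:", assign_replicas(key))
```

The two functions are independent on purpose: where a row goes (sharding) and how many copies exist (replication) can be chosen separately. Note that plain modulo hashing moves most rows when the server count changes, which is why real systems often use range-based or consistent-hashing schemes instead.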

Angel "Java" Lopez
http://www.ajlopez
http://twitter.com/ajlopez

Aziz Nait Oumghar

Apr 29, 2011, 12:43:58 PM
to nosql-di...@googlegroups.com

Thanks for your answers.

In MongoDB, I think sharding means splitting a collection of documents across different nodes (servers), but I don't know of any examples of horizontal scaling in MongoDB. Can someone give me an example? (Replication, I think, is not an example of horizontal scaling.)

Jeremiah Peschka

Apr 29, 2011, 12:45:55 PM
to nosql-di...@googlegroups.com
Both replication and sharding are examples of horizontal scaling

Replication is used to scale reads - you have more copies of your data so you can read from more locations.
Sharding is used to scale writes - there are more locations available to write data.
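
A toy Python sketch of that contrast, using in-memory dictionaries as stand-ins for servers. The class names `ReplicatedStore` and `ShardedStore` are made up for illustration, and the synchronous copy-on-write is a simplifying assumption rather than how any real database replicates.

```python
import itertools
import zlib


class ReplicatedStore:
    """One primary takes every write; reads rotate across all copies."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin over primary + replicas spreads the read load.
        self._readers = itertools.cycle([self.primary] + self.replicas)

    def write(self, key, value):
        # Every node must apply every write, so write capacity does not grow.
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        return next(self._readers).get(key)


class ShardedStore:
    """Each key belongs to exactly one shard, so writes are spread out."""

    def __init__(self, shard_count=3):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return self.shards[zlib.crc32(key.encode("utf-8")) % len(self.shards)]

    def write(self, key, value):
        self._shard_for(key)[key] = value

    def read(self, key):
        return self._shard_for(key).get(key)
```

In the replicated store a write touches every node while a read touches one; in the sharded store both touch a single node, which is where the extra write capacity comes from.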
---
Jeremiah Peschka
Founder, Brent Ozar PLF
SQL MVP, MCITP: DBA, Database Developer


Aziz Nait Oumghar

Apr 29, 2011, 12:46:03 PM
to nosql-di...@googlegroups.com

Ah, sorry, I didn't see your answer, Angel. I think I understand now.

thanks

Aziz Nait Oumghar

Apr 29, 2011, 12:51:13 PM
to nosql-di...@googlegroups.com

perfect !!!

sasikala karthi

Apr 29, 2011, 11:37:16 AM
to nosql-di...@googlegroups.com
MongoDB also supports auto-sharding.

Daniel Smith

Apr 29, 2011, 12:42:31 PM
to nosql-di...@googlegroups.com
I always think of sharding as spreading data out in a hashed way, and scaling as using more powerful machines and/or replication + load balancing.


--
Daniel Smith - Sonoma County, California
http://daniel.org/resume

Jim Peters

Apr 29, 2011, 1:15:02 PM
to nosql-di...@googlegroups.com
Some people use both sharding and replication. Sharding to scale writes and shard replication for reliability and to scale reads.

--
Jim Peters
+1-415-608-0851 (Cell)
+1-416-466-9790 (Home)
+1-415-508-8651 (Google Voice)

Angel Java Lopez

Apr 29, 2011, 1:26:49 PM
to nosql-di...@googlegroups.com
I want to add:

Replication can be used for failover. If some nodes are dead or under maintenance, the data can still be read. A replication implementation could use a "preferred" server for reads but fall back to others if the primary server fails or is under heavy load (though I don't know whether this approach is widely used).

Someone told me that sharding can be used to scale reads, too: you can run a complex "select" in parallel across the shards.
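
A rough Python sketch of that parallel "select" (often called scatter-gather), assuming each shard can evaluate the filter on its own rows. The sample rows and the names `query_shard` and `scatter_gather` are invented for illustration; in a real system each call would be a network request to a separate server.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: three shards, each holding a disjoint set of rows.
SHARDS = [
    [{"user": "ann", "age": 31}, {"user": "bob", "age": 17}],
    [{"user": "cy", "age": 45}],
    [{"user": "di", "age": 22}, {"user": "ed", "age": 15}],
]


def query_shard(rows, predicate):
    """Run the filter against a single shard's rows."""
    return [row for row in rows if predicate(row)]


def scatter_gather(shards, predicate):
    """Send the same query to every shard in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, predicate), shards)
        # pool.map yields results in shard order, so the merge is deterministic.
        return [row for partial in partials for row in partial]


adults = scatter_gather(SHARDS, lambda row: row["age"] >= 18)
print([row["user"] for row in adults])  # prints ['ann', 'cy', 'di']
```

Each shard scans only its own slice of the data, so the elapsed time is roughly that of the slowest shard rather than the sum of all of them.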

I wrote this in general terms, not only about NoSQL.

Jeremiah Peschka

Apr 29, 2011, 2:01:35 PM
to nosql-di...@googlegroups.com
I hesitate to refer to replication as any kind of failover, availability, or DR.

Replica sets in MongoDB (and replication as implemented in Dynamo-style systems) provide automatic failover, but that's because of implementation specifics, not because of replication in general.

MongoDB holds elections among replica set members to choose a master server, and Dynamo systems are effectively masterless.

Plain old replication (think MySQL replication) just pushes writes from a publisher to a subscriber.

Dapeng Li

Apr 29, 2011, 2:37:21 PM
to nosql-di...@googlegroups.com
Also following up on replication:

Horizontal scaling is a general term about performance. IMHO, performance means the response time for a request. High scalability thus means getting a response within an acceptable amount of time, no matter how big the underlying database is or how high the concurrency is at that moment.

Replication and sharding are about data.

Replication: the goal is to keep the two instances in a replication pair holding the same copy of the data. (The word "same" is not exactly right here: we can decide to skip the updates to certain tables and databases on the slave. But the essential idea is to keep the data in sync between the two instances.) Replication can be used to distribute the load of requests for the same copy of the data.

Sharding: the different instances among the shards have different copies of data.  This is the primary difference from replication, in my opinion.  Sharding can be used to distribute the load on the requests to different copies of data. 

Both replication and sharding can achieve horizontal scaling, solving different scaling problems, though. 

Dapeng Li