Break Your Database: The Merits of Database Sharding

Author: Eric Russo | October 27, 2013

Database sharding is a data-partitioning scheme that spreads a database's data across multiple servers.

Also known as “shared-nothing” database partitioning, the concept was popularized by Google and has gained popularity among enterprises. Other organizations that have adopted sharding include Amazon, Skype, YouTube, Facebook, and Wikipedia.

Sharding, simply put, divides a database into parts, or shards. Each of these shards can be hosted on a different server. The main benefit of this technique is a boost in performance, because storage and access are distributed across machines rather than concentrated on one. Some IT professionals refer to it as horizontal scaling.

As Amazon explains in a patent recently issued for its approach:

“In relational database management systems, data is organized into tables containing rows and columns. Each row corresponds to an instance of a data item, and each column corresponds to an attribute for the data item. Sharding produces partitions by rows instead of columns. Through partitioning, the data in a single table may be spread among potentially many different physical data stores, thereby improving scalability.”
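To make the row-versus-column distinction concrete, here is a minimal sketch in Python. The `customers` table, its columns, and the split by `id` are illustrative assumptions, not details from the patent.

```python
# Hypothetical table: each dict is a row, each key is a column.
customers = [
    {"id": 1, "name": "Ada", "country": "UK"},
    {"id": 2, "name": "Grace", "country": "US"},
    {"id": 3, "name": "Linus", "country": "FI"},
    {"id": 4, "name": "Margaret", "country": "US"},
]


def partition_by_rows(rows, shard_count):
    """Horizontal partitioning (sharding): every shard keeps all columns
    but only a subset of the rows."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[row["id"] % shard_count].append(row)
    return shards


def partition_by_columns(rows, column_groups):
    """Vertical partitioning, for contrast: every partition keeps all rows
    but only a subset of the columns (plus the id needed to rejoin them)."""
    return [
        [{col: row[col] for col in ("id", *group)} for row in rows]
        for group in column_groups
    ]


row_shards = partition_by_rows(customers, 2)
column_parts = partition_by_columns(customers, [("name",), ("country",)])
```

In `row_shards`, each shard holds complete customer records, so each could live in its own data store. In `column_parts`, every partition still holds one entry per customer, which is why splitting by columns alone does not reduce the row count any single store must handle.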

The task of the database professional or programmer implementing such a strategy is to establish rules that state explicitly which machine will store each piece of data.
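One common form such a rule can take is hashing a record's key and mapping the result to a server. The sketch below is only an illustration of that idea: the host names, the `shard_for` function, and the choice of SHA-256 with a modulo are assumptions, not something the article or any particular product prescribes.

```python
import hashlib

# Hypothetical shard hosts; a real deployment would load these from config.
SHARD_HOSTS = [
    "shard0.db.example.com",
    "shard1.db.example.com",
    "shard2.db.example.com",
    "shard3.db.example.com",
]


def shard_for(key: str, hosts=SHARD_HOSTS) -> str:
    """Return the host that stores the record identified by `key`.

    The rule: hash the key with a stable hash and take the result modulo
    the number of shards. Python's built-in hash() is avoided because it
    is randomized per process for strings, so two application servers
    could disagree about where a record lives.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[index]


# The same key always routes to the same host.
host = shard_for("customer:1001")
```

A modulo rule is simple, but changing the number of shards remaps most keys. Range-based rules, lookup tables, and consistent hashing are alternatives that trade simplicity for easier rebalancing.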

Suppose a database is split into four shards. As dbShards, a blog by software development firm CodeFutures, explains, each shard could be a separate MySQL instance hosted on its own server. If each shard has a limit of 1,000 connections and 800 concurrent transactions, then, because queries are distributed across the shards, the system as a whole can process roughly four times as many concurrent requests as a single server could.
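The arithmetic behind the four-shard example can be sketched as follows. The figures come from the example itself. The `cluster_capacity` helper is hypothetical, and it assumes load is spread evenly and that no query needs more than one shard, so the result is an upper bound rather than a guarantee.

```python
SHARDS = 4
CONNECTIONS_PER_SHARD = 1_000
CONCURRENT_TXNS_PER_SHARD = 800


def cluster_capacity(shards: int, per_shard_limit: int) -> int:
    """Upper bound on aggregate capacity, assuming evenly spread load
    and queries that each touch exactly one shard."""
    return shards * per_shard_limit


total_connections = cluster_capacity(SHARDS, CONNECTIONS_PER_SHARD)
total_txns = cluster_capacity(SHARDS, CONCURRENT_TXNS_PER_SHARD)
print(total_connections, total_txns)  # 4000 3200
```

Queries that must gather rows from several shards consume capacity on each shard they touch, so real-world gains are usually smaller than this ceiling.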

Theo Schlossnagle, president and chief executive officer of OmniTI Computer Consulting, says the approach isn’t new. He describes sharding this way:

“Sharding is the act of creating shards. Somehow, somewhere somebody decided that what they were doing was so cool that they had to make up a new term for what people have been doing for many, many years. It is partitioning… [S]ometimes that partitioning is proper federation. You don’t need a cool name to effectively accomplish what’s been around for a long time. More so, you don’t need a name that implies you broke something irreparably.”

Whatever you choose to call it, this shared-nothing approach to database partitioning may offer benefits worth investigating for your organization.

Image by Paul Hammond.
