Scalability and elasticity are the trendiest words in the database arena these days – everybody scales, and everybody claims that only they scale the right way :) .
There are several ways to scale a database. When evaluating a database solution, ask yourself how it scales and see if it scales in a way that would be optimal for the needs of your application.
Before we can choose the most suitable scaling solution, there are a couple of things to consider:
MySQL Scalability Issues – What is it that we want to scale in a database?
- Capacity – The most natural scaling scenario, where data has grown and the disk/memory/attached storage is too small.
- Throughput – Application usage has grown, resulting in a decline in performance. Throughput is influenced by latency & concurrency, which means you can either improve the response time for every task (latency) , do a lot of tasks in parallel (concurrency), or use a combination of both (remember that one influences the other, and as concurrency increases, it affects latency as well.)Scaling throughput can be accomplished either for a specific type of transaction – such as scaling for read operations only – or for all types of transactions and operations: reads, writes and everything in between.
What we are willing to undergo during the scale event?
The Scale can be:
- Offline – This scale event requires stopping the applications running against the database until we bring back the newly scaled (up or down) database. This scale event can take anywhere from minutes to hours. In some cases, it may also require a re-portioning event or something of the sort.To maintain service during the time that we need to take the database offline, we often back up the data and restore it elsewhere to maintain service from a temporary location. Also, once the scaled instance is once again operational, you’ll need to sync all your data with the data in your backup.
- Online – The database continues to serve the application during this scale event. At most, the application might suffer some performance degradation during the scaling time, but overall service is maintained.Since your database records could change during the scale event, any solution that offers online scalability needs to be able to ensure data consistency.
So what types of scales are there?
Scaling Up and Down
Scaling up and down are the most common and ‘brute’ ways to scale:
Need more capacity or need better performance? Then buy a bigger, faster, meaner machine. The cloud infrastructure makes this approach very easy from the operational standpoint: you simply spawn the next available VM class and voila!
This approach can be achieved through a Do-It-Yourself operation of purchasing and installing additional virtual resources, or by subscribing to a database service that provides the ability to allocate additional instances on demand.
Advantages of scaling up or down:
- It’s simple – This type of scale is usually fairly simple to deploy, and is supported by most of the databases used today (so that once more processors and cores are identified, the database software is able to take advantage of the additional resources.)
- No code changes to the application – it is still runs against a single (larger) machine.
- Scales both throughput and capacity
Disadvantages of scaling up or down:
- Cost – High-end hardware is usually costly when deploying physical servers as well as when provisioning virtualized resources on the cloud. Moreover, if considering non-standard equipment (ultra-fast networks, PCI SSD storage etc.), the cost for such an infrastructure will be so expensive, making it unfeasible for most organizations. In addition, most of the cloud infrastructure uses commodity hardware, so the more high-end configurations and hardware are usually not available at all.
- Limited MySQL scalability – At the end of the day, you can’t outgrow the capabilities of a single machine. In the cloud infrastructure, those capabilities are also restricted to the hardware offers by your chosen cloud provider.
- Scaling up or down is usually done offline – This is the case because the hardware needs to be changed. To avoid service downtime, some kind of master/slave architecture needs to be established and upgraded one by one.
Scaling Out and In
This approach is more complex to achieve because it requires the database to be designed to support scale-out capabilities. There are various ways to scale-out; each solution carries its own advantages, disadvantages, tradeoffs, and architectural implications. We’ll address two popular architectures: read replicas and sharding.
Distributed read replicas:
This is the most common way to scale out. In this configuration, you have a single master server where all writes are performed and then distributed to all the read replicas. Read operations can be done against the master or the read replicas. The distribution of write operations can be synchronous, where the write operation will be blocking until it reaches all the read replicas (ensuring data consistency), or it can be asynchronous, where the write operation is unaffected, resulting in data inconsistency since you may not be reading the most up-to-date data.
Advantages of scaling out and in using read replicas:
- Simple – relatively easy to implement compared to other distribution approaches
- Scales beyond the limitations of a single machine
- Usually is done online (but the client is not automatically updated on the added/removed replicas)
Disadvantages of scaling out and in using read replicas:
- Scaling is only for throughput and read operation (sometimes at the expense of write operations).
- Application changes are usually required to perform the reads and writes in the right place and to add or remove nodes.
The read replicas approach is used by MySQL replication, Amazon RDS, Zimory, and more. It is usually suitable for 1st generation web applications where the application is mostly read (90/10, 80/20 – scalability is needed mostly for read operations). For next-generation web apps that are user and content-driven, this approach is less suitable.
With the expansion of social applications and social/sharing features, today’s applications are characterized with a much higher ratio of write operations. Some applications even demonstrate a ratio of 70/30 for write operations.
Sharding:
Sharding is another way to resolve MySQL scalability issues. It usually means splitting up the data by some logic derived from the application. This can be done by selecting a key in the data and splitting the data by hashing that key and having some distribution logic. It can also be done by identifying the application needs and setting different tables or different data sets in different databases (splitting the North-America sales data from the EMEA sales data, etc.)
This approach is simple from the database standpoint, but is very complex from the application standpoint since the application needs to be modified to deal with the data being scattered into the different shards. Moreover, combining data from different shards can be very complex and involves development in the application (you can’t just run a simple JOIN.)
Advantages of scaling out and in using sharding:
- Scales beyond the limitations of a single machine
- Scales both read and write operations (but makes some operations impossible to achieve in the database)
- Scales both throughput and capacity
Disadvantages of scaling out and in using sharding:
- Complex and requires application changes
- Scaling is usually offline and requires a re-partitioning event – and may require application changes.
Today, there are some solutions that introduce auto-sharding (Scalebase, Dbshards). This approach makes sharding more similar to shared-nothing partitioning, thus taking the sting out of some sharding complexities. However, it still requires application awareness and could prove to be a limiting factor if you needed to update your app or migrate to a different database solution.