Options for scaling out SQL Server applications to boost workloads
Scaling out a database to meet the needs of a heavy processing workload can be a challenge. Here are details on the SQL Server scalability methods available to ease the process.
One of the big challenges for SQL Server admins has always been scaling databases to accommodate heavier data processing workloads. Complicating matters, Microsoft provides many different SQL Server scalability options, but they aren't all appropriate for every situation.
First, there's the basic question of whether to scale up or scale out a SQL Server architecture. The former involves moving a database to a bigger server with more processors, memory and data storage; the latter extends SQL Server applications across multiple servers, which spreads out data processing workloads and offers potential data redundancy and availability benefits.
In this case, we'll focus on the various ways to scale out SQL Server. Before deciding which scalability method to use, database administrators must consider factors such as the frequency of data updates, whether data can be partitioned between different databases and their ability to modify the applications that run on SQL Server. With the answers in hand, they can create a scale-out plan that meets their specific needs.
To help find these answers, here are basic details on some of the commonly used SQL Server scale-out approaches, condensed from the technical information posted by Microsoft.
Scalable shared databases. A database doesn't have to be physically located on a database server. By instead placing it on a storage-area network (SAN), the data can be accessed simultaneously by multiple SQL Server instances running on different servers. In essence, each instance connected to the SAN works from the same database copy.
This scalability method can typically be used without any modifications to SQL Server applications. A big disadvantage, though, is that it's suitable only for workloads with static data. Scalable shared databases work extremely well in read-intensive applications that involve complex queries -- a data warehouse, for example.
However, scalable shared databases aren't suited to applications with a mix of read/write operations. A lock is placed on a SQL Server database when data is written to it. If data must be written to a shared database, all but one SQL Server instance will be momentarily detached from it until the write operation is complete. As a result, write operations take a significant toll on database performance.
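As a rough sketch of the setup -- with a hypothetical database name and file paths, and assuming the SAN volume is already presented to every server -- the build server marks the database read-only and detaches it, and each reporting instance then attaches the same files from the shared volume:

    -- On the build server: freeze the database and release the files.
    ALTER DATABASE SalesDW SET READ_ONLY;
    EXEC sp_detach_db @dbname = N'SalesDW';

    -- On each reporting instance: attach the same files from the shared SAN volume.
    CREATE DATABASE SalesDW
        ON (FILENAME = N'X:\SharedLUN\SalesDW.mdf'),
           (FILENAME = N'X:\SharedLUN\SalesDW_log.ldf')
        FOR ATTACH;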
Distributed partitioned views. Unlike scalable shared databases, this technique for scaling out SQL Server workloads works well for data that is frequently updated, making it a good fit for online transaction processing (OLTP) applications. The potential disadvantage is that the data must lend itself to partitioning. It's also likely that the database applications will need modifications to synchronize processing operations across all of the partitions.
The concept behind the distributed partitioned views approach is simple: The data in a large database is broken down into several smaller distributed databases. For example, a database containing invoices might be partitioned by year. Invoices from 2016 could be placed into one database, while invoices from 2017 would go into another. Updates and queries can then be run against the database that contains the relevant data.
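A minimal Transact-SQL sketch of the invoice example, assuming a hypothetical SalesDB database and a linked server named SERVER2016 that hosts the other member table. The CHECK constraint on the partitioning column is what lets SQL Server send a query or update only to the member table that can hold the requested rows:

    -- On the server that holds the 2017 data: local member table with a CHECK
    -- constraint on the partitioning column, which must be part of the primary key.
    CREATE TABLE dbo.Invoices2017 (
        InvoiceID    int      NOT NULL,
        InvoiceYear  smallint NOT NULL CHECK (InvoiceYear = 2017),
        CustomerID   int      NOT NULL,
        Amount       money    NOT NULL,
        CONSTRAINT PK_Invoices2017 PRIMARY KEY (InvoiceID, InvoiceYear)
    );

    -- The distributed partitioned view unions the local table with the remote one.
    CREATE VIEW dbo.Invoices AS
        SELECT InvoiceID, InvoiceYear, CustomerID, Amount FROM dbo.Invoices2017
        UNION ALL
        SELECT InvoiceID, InvoiceYear, CustomerID, Amount
        FROM SERVER2016.SalesDB.dbo.Invoices2016;

Applications then query the view -- for example, SELECT queries that filter on InvoiceYear = 2016 are resolved against the remote member table without the application knowing where the rows live.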
Data-dependent routing. This is essentially a variation of distributed partitioned views designed for high-volume OLTP applications. A database is similarly broken down into a series of smaller databases. The difference lies in how the process of routing queries to the right database is managed. With distributed partitioned views, SQL Server itself is aware of the data partitioning structure and determines where to access the requested data. In the case of data-dependent routing, SQL Server applications or middleware services determine where the necessary data resides.
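One common way to implement the routing -- sketched here under assumed server and database names -- is a small shard map table that the application or middleware consults to decide which server to connect to for a given key:

    -- Hypothetical shard map kept in a lightweight routing database.
    CREATE TABLE dbo.ShardMap (
        YearFrom     smallint NOT NULL,
        YearTo       smallint NOT NULL,
        ServerName   sysname  NOT NULL,
        DatabaseName sysname  NOT NULL
    );

    INSERT INTO dbo.ShardMap (YearFrom, YearTo, ServerName, DatabaseName)
    VALUES (2000, 2016, N'SQLNODE1', N'InvoicesArchive'),
           (2017, 2017, N'SQLNODE2', N'Invoices2017');

    -- The application tier looks up the target before opening a connection.
    DECLARE @InvoiceYear smallint = 2017;
    SELECT ServerName, DatabaseName
    FROM dbo.ShardMap
    WHERE @InvoiceYear BETWEEN YearFrom AND YearTo;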
Linked servers. SQL Server can access remote databases as if they were local. Because of that, it's possible to configure the software to treat a scaled-out collection of databases as if they were a single large database. Linked servers offer a viable alternative when existing SQL Server applications can't be easily changed to support another scale-out approach.
However, because SQL Server connects to a series of independent databases, there's no formal coupling between them. In addition to some limitations within SQL Server, such as a lack of referential integrity between local and remote tables, there is significant processing overhead involved in accessing remote databases. SQL Server architectures that involve linked servers must be designed to minimize the need to access remote data, especially via joins between local and remote tables.
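A sketch of the basic plumbing, using an assumed remote instance name: sp_addlinkedserver registers the remote instance, four-part names reference its tables, and OPENQUERY pushes an entire query to the remote server, which helps keep the cross-server overhead down:

    -- Register the remote instance (server name, provider and host are illustrative).
    EXEC sp_addlinkedserver
        @server     = N'REMOTESQL',
        @srvproduct = N'',
        @provider   = N'SQLNCLI',
        @datasrc    = N'remotesql.example.com';

    -- Four-part name: SQL Server pulls the remote data and processes it locally.
    SELECT InvoiceID, Amount
    FROM REMOTESQL.SalesDB.dbo.Invoices2016
    WHERE CustomerID = 42;

    -- OPENQUERY: the filtering runs on the remote server, so less data crosses the wire.
    SELECT *
    FROM OPENQUERY(REMOTESQL,
        'SELECT InvoiceID, Amount FROM SalesDB.dbo.Invoices2016 WHERE CustomerID = 42');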
Peer-to-peer replication. One more mechanism for scaling out SQL Server applications is replicating data between different servers. Peer-to-peer replication is essentially the opposite of the scalable shared databases method. Rather than having multiple database servers share a single copy of a database, the servers each contain their own copy. When a write operation occurs, the change is replicated to all of the copies.
Peer-to-peer replication can handle a moderate number of database writes, but it's best suited to data that is relatively static. Because of the latency involved in the replication process, it also isn't a good fit for data that's extremely time-sensitive. It is, however, a good general-purpose scalability option for databases that process more reads than writes.
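Setting up peer-to-peer replication involves configuring a distributor and creating the same publication on every node; the fragment below is only a partial sketch of the publication step on one node, with hypothetical database and publication names and several required publication properties omitted:

    -- On one node: create a publication flagged for peer-to-peer replication.
    USE SalesDB;
    EXEC sp_addpublication
        @publication     = N'SalesDB_P2P',
        @enabled_for_p2p = N'true',
        @repl_freq       = N'continuous',
        @status          = N'active';

    -- Add the table to replicate; the peer nodes are configured with matching settings.
    EXEC sp_addarticle
        @publication   = N'SalesDB_P2P',
        @article       = N'Invoices',
        @source_object = N'Invoices',
        @source_owner  = N'dbo';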