Sergiy Serdyuk - Fotolia
SQL Server database design best practices and tips for DBAs
Good database design is a must to meet processing needs in SQL Server systems. In a webinar, consultant Koen Verbeeck offered advice on how to make that happen.
Knowing what you want from SQL Server is probably the biggest step toward running it successfully in business applications, according to consultant Koen Verbeeck. It's not just about knowing how SQL Server works, he said -- successful deployments also come from knowing your company's business goals and how to best utilize SQL Server to meet them.
To make that process work, it would be an understatement to say Verbeeck views SQL Server database design as important. "We can talk hours and days and even weeks about database design," he said in a recent webinar on SQL Server development and design best practices.
Every database technology has its advantages and disadvantages, said Verbeeck, a senior business intelligence consultant at Brussels Airport Co., which operates the Belgian capital's primary airport. He encouraged database administrators (DBAs) and developers to experiment with and explore different designs for their SQL Server systems. "You really need to evaluate your specific database needs and find out what works best for you."
Verbeeck said users should be asking themselves the following questions: What am I doing with my database today? What will I be doing with it next week, next year and in five years? When it comes to designing databases, thinking about the long term is a good idea, he advised, before going on to detail a list of design tips.
Tables and indexes in SQL Server databases
From Verbeeck's perspective, the smaller -- or narrower -- that database tables are, the better. Avoiding the use of blob and nvarchar columns in SQL Server database design is a key to keeping tables on the narrow side, he said. Verbeeck recommended not using nvarchar data types in particular, because they require twice as much space in a table as varchar columns do.
Even though data storage is less expensive than it used to be, it takes a query longer to read larger columns, Verbeeck pointed out. Conversely, if a query has less data to read, it'll perform the task that much faster.
If you use clustered indexes, however, it doesn't matter what kind of columns your SQL Server table contains, Verbeeck added. "Have a clustered index -- it will help you in the long run," he said.
A clustered index sorts the rows in a table based on their key values; when you put one in a table, the index essentially becomes the table, according to Verbeeck. He noted, though, that data staging tables don't always benefit from clustered indexes, as they can slow the process down.
Though Verbeeck vouched for clustered indexes overall, he emphasized that DBAs and developers should also examine other SQL Server indexing options before committing to a particular design.
For example, nonclustered indexes that include pointers to the rows containing particular key values are typically used to speed up SELECT queries. But Verbeeck said they can slow down other queries, especially data manipulation language, or DML, statements. Nonclustered indexes also create extra copies of the data, which take up more space in a database. This tends to result in unused indexes, which DBAs should watch for as part of the database monitoring process, Verbeeck advised.
More ways to boost database performance
columnstore indexes are especially useful for querying data warehouses, because they help accelerate large table scans, Verbeeck said. When data is stored in SQL Server in columns instead of the traditional row-based format, DBAs can control which of those columns are read or checked by the system when queries are run, making the whole process much quicker, he explained.
Verbeeck added that columnstore indexes can also be used with in-memory OLTP, an in-memory engine built into SQL Server to accelerate the performance of transactional databases.
A so-called covering index is a clustered or nonclustered index that Verbeeck said is built to include -- or cover -- all of the columns required by a particular query. As a result, covering indexes typically don't need to go back to a database table to retrieve extra columns.
When one does need to go back to the table, that step is referred to as a key lookup, which can slow down queries and should be avoided if possible, Verbeeck said. One way to do so, he suggested, is to use INCLUDE commands to cover extra columns. Also, by including keywords in specific columns, you can narrow your search as part of a query.
While good SQL Server database design is necessary, it won't solve every single problem that arises. One stopgap measure Verbeeck recommended is adding more memory. Like storage, memory isn't as expensive as it used to be -- and it's much cheaper than hiring an outside database expert to help improve your design, Verbeeck said. He compared increased memory to a bandage -- in many cases, it's a temporary fix, but it'll buy you time to come up with a more solid solution to the problem.
Give your databases room to grow
In the webinar, which was sponsored by database tools vendor Idera and posted on MSSQLTips.com, Verbeeck also recommended checking the auto-grow settings in SQL Server to make sure they align with the need for added file or log space when transactions are run. When a new database is created, he said, the default settings applied to it may not be what your application needs.
Koen VerbeeckSenior BI consultant at Brussels Airport Co.
Verbeeck also recommended allowing databases to "grow in peace." He advised planning for all the data you think you'll need in them, and then tacking on an extra year's worth of storage space. This gives you extra space and room to grow in the event that your system accumulates or produces more data than expected. However, creating a bigger database will take longer, he said.
The tempdb system database is another valuable SQL Server resource that can be used to store temporary tables, indexes, stored procedures and other database objects, but Verbeeck said it shouldn't be treated like the perfect solution to all SQL Server database design issues. He quoted Brent Ozar, a SQL Server consultant and trainer who half-jokingly refers to tempdb as "SQL Server's public toilet."
Tempdb requires monitoring, Verbeeck said, and not just for storing temporary objects and files; if a system doesn't have enough memory, tempdb will automatically make up for it, which can take valuable resources away from other programs and cause SQL Server to run slowly.
"Don't turn it into a bottleneck," he warned, adding that preallocating space for tempdb -- and making sure it's large enough to accommodate your workloads -- is the best way to prevent performance problems.
Recovery mode options for SQL Server
Verbeeck also encouraged SQL Server DBAs and developers to explore the available recovery options as part of the database design process. There are three basic recovery modes for logging transactions and restoring them when needed: full, bulk-logged and simple recovery.
Full recovery is a default setting that can be disabled, if desired. In the full recovery mode, everything done in a database is logged. It supports point-in-time recovery, which can come in handy if you require restores for specific transactions, Verbeeck said. But it can also slow down SQL Server systems, he added -- because everything is logged, your inserts and queries will run more slowly than normal.
For those users who don't require full recovery, bulk-logged recovery is an adjunct model in which database operations are logged more minimally; transactions can be recovered to the end of a previous database backup, but not to a specific point in time. In the simple recovery mode, no log backups are done, and the existing transaction log files are recycled so the storage space can be used again.
Finally, Verbeeck said null, or unknown, values in databases can create problems in comparisons and Boolean logic. "Nulls are evil," he joked. Humor aside, he strongly advised against using them in SQL Server systems. While you can declare a column as NOT NULL, Verbeeck said putting in a "dummy value" instead is a better approach.