How Table Partitioning in SQL Server Improves Query Performance (Explained in Legos)
January 14, 2021
Featured article by Brenda Bentz, Independent Technology Author
Are your very large SQL Server Enterprise database tables causing slow query performance? I have used table partitioning to improve query performance by up to 75% for some clients. Here is how it works.
What is table partitioning in MS SQL?
Table partitioning in MS SQL Server is an advanced option that breaks a table into logically smaller chunks (partitions) and then stores the chunks across one or more filegroups. Each chunk can be accessed and maintained separately. Partitioning is not visible to users; the table behaves like one logical table to the end user. Partitioning was introduced in SQL Server 2005 as an Enterprise edition feature, but starting with SQL Server 2016 SP1 it is available in all editions.
To compare this to Legos – imagine sorting through an entire Lego bin of 1000 pieces for a 4 stud by 2 stud red piece.
Now imagine that those 1000 Legos have been sorted into bins by stud dimensions and color.
The first bin is representative of a large SQL Server table and the latter scenario is representative of a large SQL Server table that has been partitioned.
How does table partitioning help improve SQL Server performance?
In cases of very large databases and very large tables (we had one with 27 billion rows, for example), table partitioning makes querying faster and more efficient by allowing a query to be applied to only the most relevant partitions of the table data.
For example, say you re-index data nightly to reduce data fragmentation. With table partitioning, you can either choose to re-index the table data in its entirety or selectively. Doing so selectively will drastically reduce the time to execute this daily action.
For some clients, I have seen between 50 and 75% improvement in performance. This is because table partitioning impacts performance at both the hardware and database tiers.
Now, think about our Lego bins sorted by size and color. It’s easy to see how much less time it would take us to locate our 4 stud by 2 stud red piece. Why? Because we would start searching in the bin with only red pieces. Let us say perhaps that we also had a bin that contained only red pieces that are 2 studs wide. The search would take even less time.
In the same way, a query executed against a table that has been split into smaller partitions on individual filegroups only has to access a fraction of the data, so results are returned faster.
Keep in mind, however, that queries that do not filter on the partitioning key are likely to perform worse than they did before partitioning was put in place.
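To make the bin analogy concrete, here is a minimal Python sketch of how a range-partitioned table could route a value to a partition and skip the partitions a query does not need. It is a toy model, not SQL Server's implementation: the monthly boundary dates and the `partition_number` and `partitions_to_scan` helpers are hypothetical, and the boundary convention is an assumption that mimics a RANGE RIGHT partition function, where each boundary value belongs to the partition on its right.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical monthly boundary values (assumption: RANGE RIGHT semantics).
BOUNDARIES = [date(2021, 1, 1), date(2021, 2, 1), date(2021, 3, 1)]


def partition_number(value, boundaries=BOUNDARIES):
    """Return the 1-based partition that holds `value`."""
    # Count the boundaries that are <= value, then shift to 1-based numbering.
    return bisect_right(boundaries, value) + 1


def partitions_to_scan(start, end, boundaries=BOUNDARIES):
    """Partitions a query must read for start <= key <= end."""
    first = partition_number(start, boundaries)
    last = partition_number(end, boundaries)
    return list(range(first, last + 1))


if __name__ == "__main__":
    total = len(BOUNDARIES) + 1
    hit = partitions_to_scan(date(2021, 2, 10), date(2021, 2, 20))
    print(f"scan partitions {hit} of {total}")
```

A query for ten days in February only touches one of the four partitions in this model, which is the "start in the red bin" effect. A query with no filter on the date would have to visit all four.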
Other Benefits of SQL Server Table Partitioning
More Efficient Backups
With table partitioning, filegroups can be backed up and restored individually. Read-only, historical, unchanging data can be excluded from your regular backup and restore process. This also helps cut down on the disk space used to store backups.
Help Automate Maintenance of Historical Data
Partitioning will help reduce the maintenance time needed to enforce an organization’s data retention policy. For example, say you have a table that requires you to keep data from the past twelve months. If it is partitioned by month, you can run a job that drops the oldest monthly partition and loads new data into the current month’s partition. In many cases, enlisting the help of a database managed services provider is essential.
Improves Performance of Database Maintenance
Additionally, maintenance such as index and statistics maintenance on larger tables is typically time consuming and resource intensive. With partitioning, indexes and statistics can be maintained by partition number. This reduces maintenance time since only a subset of the data will be updated versus the whole table.
Now think back to our Lego example. My son frequently plays with the Legos, taking out the storage bins and moving them around. I had the bins in order of the smallest to largest pieces, in sequential order. But now, he has taken down the Lego bins from their storage shelf, forgotten how they originally all fit into the storage cubbies, and placed them back out of order and with spatial gaps between each bin, creating fragmentation.
In the same way, the frequency at which data moves in and out of the tables can quickly leave tables quite messy and fragmented. Database index maintenance is one way of ensuring that the data gets reorganized on a regular basis, but it can be resource intensive when performed regularly on a large table. Imagine how much time I would spend reorganizing all of the Lego bins into their original, most optimized state if I tried to keep all of them perfect. I would not be able to get anything else done!
So instead, I pick only the storage bins that I know my son has used recently and reorganize those each night. Partitioning makes it possible to reorganize or rebuild an index on a single partition (only the most recent data, for example), shortening the time it takes to perform maintenance.
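As a rough illustration of "only the bins used recently", the Python sketch below builds the T-SQL text for rebuilding an index on just the newest partitions. The table name `dbo.Orders`, the index name `IX_Orders_OrderDate`, and both helper functions are hypothetical, and the sketch assumes the table is partitioned by ascending date so that the highest partition numbers hold the most recent data. It only generates statements; it does not connect to a server.

```python
def recent_partitions(total_partitions, count):
    """Return the numbers of the `count` highest (most recent) partitions."""
    first = max(1, total_partitions - count + 1)
    return list(range(first, total_partitions + 1))


def rebuild_statements(table, index, partitions):
    """Build one ALTER INDEX ... REBUILD statement per partition number."""
    return [
        f"ALTER INDEX {index} ON {table} REBUILD PARTITION = {number};"
        for number in partitions
    ]


if __name__ == "__main__":
    # Hypothetical names; 13 partitions, rebuild only the last 2.
    for statement in rebuild_statements(
        "dbo.Orders", "IX_Orders_OrderDate", recent_partitions(13, 2)
    ):
        print(statement)
```

The nightly job then touches two partitions instead of thirteen, which is where the shorter maintenance window comes from.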
Compress Partitions to Save Money on Disk Space
Compression can be enabled on a partition-by-partition basis. Enabling compression is most beneficial when a subset of data does not change frequently. Typically, compressed tables use extra CPU to compress and decompress the data. However, they use less I/O to read and write the data since the table takes up less space on disk. Using less space on disk will in turn help reduce disk storage costs.
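One way to picture a partition-by-partition compression policy is the small Python sketch below. It assumes, purely for illustration, that the newest partitions are "hot" and left uncompressed while older ones get page compression. The `compression_plan` and `compression_statements` helpers and the `dbo.Orders` table name are hypothetical, and the output is only the statement text.

```python
def compression_plan(total_partitions, hot_partitions):
    """Map each partition number to 'NONE' (hot) or 'PAGE' (cold)."""
    first_hot = total_partitions - hot_partitions + 1
    return {
        number: "NONE" if number >= first_hot else "PAGE"
        for number in range(1, total_partitions + 1)
    }


def compression_statements(table, plan):
    """Build one ALTER TABLE ... REBUILD statement per cold partition."""
    return [
        f"ALTER TABLE {table} REBUILD PARTITION = {number} "
        f"WITH (DATA_COMPRESSION = {setting});"
        for number, setting in sorted(plan.items())
        if setting != "NONE"
    ]


if __name__ == "__main__":
    # Hypothetical: 4 partitions, only the newest one changes frequently.
    for statement in compression_statements("dbo.Orders", compression_plan(4, 1)):
        print(statement)
```

Which partitions count as hot is a judgment call for each workload; the cutoff here is an assumption, not a recommendation.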
Real Life Table Partitioning Use Case:
A 24×7 shop inserted over a million rows a day into one table. They only used the last 45 days’ worth of data but were not permitted to delete the older data. Any type of maintenance (re-indexing, updating statistics and backups) was becoming more and more challenging, and we were running over our scheduled maintenance windows. We also started having performance issues due to locking and blocking.
We set up a sliding window to manage the partitioned table and keep the number of partitions constant.
Nightly, a new partition is created to accommodate the new data, and at the same time the oldest partition is switched out of the partitioned table so that the number of partitions stays the same. The oldest partition, once taken out of the partitioned table, is archived. The best part of managing a sliding window scenario in SQL Server is that switching a partition is a metadata operation, and hence it is significantly faster than moving the rows.
Requirements that made this work:
1. The source and target tables (or partitions) had identical columns and indexes and used the same partition column
2. The source and target tables existed on the same filegroup
3. The target table or partition receiving the data was empty
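The nightly sliding window described above can be sketched as a toy Python model. This is not how SQL Server stores data. The `SlidingWindow` class, its method names, and the dates are hypothetical, and handing a list over by reference only loosely stands in for a metadata-only switch.

```python
from collections import OrderedDict
from datetime import date, timedelta


class SlidingWindow:
    """Toy model of a daily sliding window over a partitioned table."""

    def __init__(self, first_day, days):
        # One list of rows per daily partition, oldest first.
        self.partitions = OrderedDict(
            (first_day + timedelta(days=i), []) for i in range(days)
        )
        self.archive = {}

    def insert(self, day, row):
        """Add a row to the partition for `day`."""
        self.partitions[day].append(row)

    def slide(self):
        """Add tomorrow's empty partition and switch out the oldest one."""
        newest = next(reversed(self.partitions))
        self.partitions[newest + timedelta(days=1)] = []
        oldest, rows = self.partitions.popitem(last=False)
        # The row list is handed over by reference; no rows are copied.
        self.archive[oldest] = rows
        return oldest


if __name__ == "__main__":
    window = SlidingWindow(date(2021, 1, 1), days=45)
    window.insert(date(2021, 1, 1), "first row")
    archived_day = window.slide()
    print(archived_day, len(window.partitions))
```

After each call to `slide`, the window still holds 45 partitions, and the oldest day's rows sit in the archive untouched, matching the requirement that older data is kept but no longer lives in the active table.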
When should I consider SQL Server Table Partitioning?
- The underlying hardware is unable to handle the size of the table (i.e. the size of the table is larger than allocated memory or the disks cannot handle larger I/O requests). In our Lego analogy, it’s time to partition our Legos when the sheer number of Legos is making it very difficult to locate the specific Lego we need to add to our next Lego creation.
- A table houses various ranges of data that are not accessed regularly but are still needed (recent data, not quite as recent data, historical data).
- You need to quickly delete or archive ranges of data with minimal downtime.
- You experience locking because processes need to search through various ranges of older data.
- Database maintenance is not completing within the SLA required for the maintenance window.
Considerations Before Implementing Table Partitioning
- Setting up and maintaining partitioning is not an easy task. Going back to the Lego scenario, sorting through a large bin was very time and resource intensive. Similarly, a server will be put under that same strain. Moving the data to the appropriate filegroup will require a rebuild of the clustered index. This process requires data movement, which is time consuming and resource intensive.
- Partitioning is not a set and forget solution. Processes will need to be set up to accommodate new and archived data. To draw on our analogy, as new Legos are added to the collection, we will have to add new bins and organizational methods or add them into the existing storage bin organizational system.
- Setting the correct indexes is key. A system will see performance improvement if the query is joining or filtering on the partition key. Otherwise, partitioning could hurt performance.
- Right-sizing partitions is key. If a partition contains too much data, you may not gain any benefit since you will be searching through and returning too much data. If it is too small, it will be difficult to maintain. In the same way, it would not make sense for me to create a Lego bin for only Lego Tiles (smooth pieces that don’t have any studs) as we only have 10 of those pieces in the entire set. The result would be a mostly empty bin, taking up space on the storage shelf.
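A quick back-of-the-envelope calculation can help when weighing partition sizes. The Python sketch below borrows the numbers from the use case above as assumptions (exactly 1,000,000 rows a day and a 45-day window) and compares a few granularities. Both helper functions are hypothetical.

```python
def rows_per_partition(rows_per_day, days_per_partition):
    """Approximate number of rows each partition will hold."""
    return rows_per_day * days_per_partition


def partitions_needed(retention_days, days_per_partition):
    """Partitions required to cover the retention window (rounded up)."""
    return -(-retention_days // days_per_partition)  # ceiling division


if __name__ == "__main__":
    # Assumed workload: 1,000,000 rows per day, 45 days kept active.
    for days in (1, 7, 30):
        print(
            f"{days:>2} day(s) per partition: "
            f"{partitions_needed(45, days)} partitions of about "
            f"{rows_per_partition(1_000_000, days):,} rows"
        )
```

Daily partitions give many small bins to manage, while 30-day partitions give two very large ones. The right answer depends on how the data is queried and maintained, so treat the output as a starting point for discussion rather than a rule.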
About the author
Brenda Bentz is a Senior Consultant for Fortified Data and has been a DBA since 1998. Brenda is a Database Architect and information systems analyst with extensive experience with SQL Server at an enterprise level, using a full-cycle system design process including requirement definition, proof of concept, design, n-tier implementation, and maintenance.