Skip to main content

Command Palette

Search for a command to run...

SQL Server - Query Optimization Techniques

Published
5 min read

Optimizing SQL Server queries is a balance of art and science. It involves reducing I/O operations, minimizing CPU usage, and managing memory effectively.

I. Data Retrieval & Selection Strategies

The most common performance bottlenecks occur simply because the query asks for too much data.

1. Avoid SELECT *

Fetching all columns consumes unnecessary I/O and network bandwidth. It also prevents the use of "Covering Indexes" (indexes that contain all the data required for a query).

  • Bad Practice:

      SELECT * FROM Sales.Orders WHERE OrderDate > '2023-01-01';
    
  • Best Practice:

      SELECT OrderID, CustomerID, TotalDue FROM Sales.Orders WHERE OrderDate > '2023-01-01';
    

2. Use TOP or OFFSET FETCH (Pagination)

If you only need a sample or the first $N$ rows, do not retrieve the whole dataset.

  • Edge Case: Using TOP without an ORDER BY is non-deterministic (you will get random rows). Always define a sort order.

  • Example:

      -- Standard Pagination
      SELECT OrderID, TotalDue
      FROM Sales.Orders
      ORDER BY OrderDate DESC
      OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;
    

3. UNION ALL vs. UNION

UNION removes duplicate records, which requires a hidden sorting operation (expensive). UNION ALL simply concatenates datasets.

  • Optimization: If you know the datasets do not overlap, or you don't care about duplicates, always use UNION ALL.

II. SARGability (Search ARGument ABle)

This is the most critical concept for Indexing. For an index to be used (an "Index Seek"), the column in the WHERE clause must be left alone.

4. Avoid Functions on Indexed Columns

wrapping a column in a function prevents SQL Server from using the index b-tree effectively, forcing an "Index Scan" (reading every row).

  • Bad Practice (Non-SARGable):

      -- SQL has to calculate YEAR() for every single row in the table
      SELECT * FROM Employees WHERE YEAR(HireDate) = 2023;
    
  • Best Practice (SARGable):

      -- SQL can jump directly to the start of the range
      SELECT * FROM Employees WHERE HireDate >= '2023-01-01' AND HireDate < '2024-01-01';
    

5. Avoid Leading Wildcards

  • Bad Practice: WHERE Name LIKE '%Smith' (Must scan the whole table).

  • Best Practice: WHERE Name LIKE 'Smith%' (Can use the index alphabetically).

6. Watch for Implicit Conversions

If you compare a VARCHAR column to an NVARCHAR parameter (Unicode), SQL Server must convert the entire column to match the parameter, ignoring the index.

  • Edge Case:

      DECLARE @Phone NVARCHAR(20) = '555-0199';
      -- If PhoneNumber column is VARCHAR, this causes an implicit conversion scan
      SELECT * FROM Customers WHERE PhoneNumber = @Phone;
    
  • Fix: Ensure variable data types match column data types exactly.


III. Logical Operators & Joins

7. EXISTS vs. IN

For checking existence, EXISTS is often faster because it short-circuits (stops looking once it finds a match). IN can be risky with NULL values.

  • Edge Case (The NOT IN Trap): If the subquery inside NOT IN returns even a single NULL, the entire query returns zero rows.

  • Best Practice:

      SELECT * FROM Orders o
      WHERE EXISTS (SELECT 1 FROM Customers c WHERE c.CustomerID = o.CustomerID AND c.Region = 'North');
    

8. Join Elimination & Subqueries

Modern SQL optimizers are good at rewriting subqueries as joins, but explicit Joins are generally preferred for readability and ensuring the optimizer doesn't get confused by deep nesting.

  • Optimization: Avoid joining tables you don't need columns from if you just want to filter. Use EXISTS instead.

9. Avoid DISTINCT as a "Band-aid"

Developers often use DISTINCT to fix a query that is returning duplicates due to a bad JOIN (e.g., a one-to-many join where you only wanted the "one" side). This causes an expensive Sort operation. Fix the Join logic instead.


IV. Indexing Architecture

10. The Covering Index

This is the "Holy Grail" of optimization. If an index contains (includes) all columns requested in the SELECT list, SQL Server never has to touch the actual table (Heap/Clustered Index). It serves data entirely from the smaller index file.

  • Example:

      -- Query
      SELECT EmailAddress FROM Customers WHERE LastName = 'Jones';
    
      -- Optimized Index
      CREATE NONCLUSTERED INDEX IX_Customers_LastName
      ON Customers (LastName)
      INCLUDE (EmailAddress); -- 'EmailAddress' is stored at the leaf level
    

11. Clustered vs. Non-Clustered

  • Clustered: The physical sorting of the data (usually Primary Key).

  • Non-Clustered: A separate look-up list.

  • Tip: If you frequently sort by a column (e.g., DateCreated), consider if that should be your Clustered Index, or if it needs a Non-Clustered index to support sorting.


V. Advanced & Server-Level Edge Cases

12. Parameter Sniffing

Sometimes SQL Server creates an execution plan based on the first parameter passed (e.g., a Date range with 1 row). Later, you pass a parameter that returns 1,000,000 rows, but it uses the "small" plan, crashing performance.

  • Solution:

      -- Forces a new plan to be generated for every execution
      SELECT * FROM Sales WHERE Region = @Region OPTION (RECOMPILE);
    

13. Transactions and Locking (NOLOCK)

Locking blocks other users. If you are running a heavy reporting query and data precision (exact real-time consistency) isn't 100% required, use NOLOCK.

  • Warning: This allows "Dirty Reads" (reading data currently being updated but not committed).

      SELECT * FROM HugeTable WITH (NOLOCK) WHERE ID > 1000;
    

14. Temporary Tables vs. Table Variables

  • Table Variables (@Table): Stored in memory (mostly). Good for small datasets (<100 rows). Do not have statistics, so the optimizer assumes they always have 1 row (bad for joins).

  • Temp Tables (#Table): Stored in tempdb. Good for larger datasets. Can be indexed and have statistics.


Summary Checklist for Optimization

TechniqueAreaImpact
Select Specific ColumnsI/OHigh
Remove Leading WildcardsIndexingHigh
Fix Implicit ConversionsCPU/IndexHigh
Use EXISTS over INLogicMedium
Covering Indexes (INCLUDE)I/OVery High
SARGable DatesCPU/IndexVery High

More from this blog

S

SQL Insights

31 posts