SQL Server - Query Optimization Techniques

Optimizing SQL Server queries is a balance of art and science. It involves reducing I/O operations, minimizing CPU usage, and managing memory effectively.

I. Data Retrieval & Selection Strategies

The most common performance bottlenecks occur simply because the query asks for too much data.

1. Avoid `SELECT *`

Fetching all columns consumes unnecessary I/O and network bandwidth. It also prevents the use of "Covering Indexes" (indexes that contain all the data required for a query).

Bad Practice:

  SELECT * FROM Sales.Orders WHERE OrderDate > '2023-01-01';

Best Practice:

  SELECT OrderID, CustomerID, TotalDue FROM Sales.Orders WHERE OrderDate > '2023-01-01';

2. Use `TOP` or `OFFSET FETCH` (Pagination)

If you only need a sample or the first $N$ rows, do not retrieve the whole dataset.

Edge Case: Using TOP without an ORDER BY is non-deterministic (you will get random rows). Always define a sort order.

Example:

  -- Standard Pagination
  SELECT OrderID, TotalDue
  FROM Sales.Orders
  ORDER BY OrderDate DESC
  OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;

3. `UNION ALL` vs. `UNION`

UNION removes duplicate records, which requires a hidden sorting operation (expensive). UNION ALL simply concatenates datasets.

Optimization: If you know the datasets do not overlap, or you don't care about duplicates, always use UNION ALL.

II. SARGability (Search ARGument ABle)

This is the most critical concept for Indexing. For an index to be used (an "Index Seek"), the column in the WHERE clause must be left alone.

4. Avoid Functions on Indexed Columns

wrapping a column in a function prevents SQL Server from using the index b-tree effectively, forcing an "Index Scan" (reading every row).

Bad Practice (Non-SARGable):

  -- SQL has to calculate YEAR() for every single row in the table
  SELECT * FROM Employees WHERE YEAR(HireDate) = 2023;

Best Practice (SARGable):

  -- SQL can jump directly to the start of the range
  SELECT * FROM Employees WHERE HireDate >= '2023-01-01' AND HireDate < '2024-01-01';

5. Avoid Leading Wildcards

Bad Practice: WHERE Name LIKE '%Smith' (Must scan the whole table).
Best Practice: WHERE Name LIKE 'Smith%' (Can use the index alphabetically).

6. Watch for Implicit Conversions

If you compare a VARCHAR column to an NVARCHAR parameter (Unicode), SQL Server must convert the entire column to match the parameter, ignoring the index.

Edge Case:

  DECLARE @Phone NVARCHAR(20) = '555-0199';
  -- If PhoneNumber column is VARCHAR, this causes an implicit conversion scan
  SELECT * FROM Customers WHERE PhoneNumber = @Phone;

Fix: Ensure variable data types match column data types exactly.

III. Logical Operators & Joins

7. `EXISTS` vs. `IN`

For checking existence, EXISTS is often faster because it short-circuits (stops looking once it finds a match). IN can be risky with NULL values.

Edge Case (The NOT IN Trap): If the subquery inside NOT IN returns even a single NULL, the entire query returns zero rows.

Best Practice:

  SELECT * FROM Orders o
  WHERE EXISTS (SELECT 1 FROM Customers c WHERE c.CustomerID = o.CustomerID AND c.Region = 'North');

8. Join Elimination & Subqueries

Modern SQL optimizers are good at rewriting subqueries as joins, but explicit Joins are generally preferred for readability and ensuring the optimizer doesn't get confused by deep nesting.

Optimization: Avoid joining tables you don't need columns from if you just want to filter. Use EXISTS instead.

9. Avoid `DISTINCT` as a "Band-aid"

Developers often use DISTINCT to fix a query that is returning duplicates due to a bad JOIN (e.g., a one-to-many join where you only wanted the "one" side). This causes an expensive Sort operation. Fix the Join logic instead.

IV. Indexing Architecture

10. The Covering Index

This is the "Holy Grail" of optimization. If an index contains (includes) all columns requested in the SELECT list, SQL Server never has to touch the actual table (Heap/Clustered Index). It serves data entirely from the smaller index file.

Example:

  -- Query
  SELECT EmailAddress FROM Customers WHERE LastName = 'Jones';

  -- Optimized Index
  CREATE NONCLUSTERED INDEX IX_Customers_LastName
  ON Customers (LastName)
  INCLUDE (EmailAddress); -- 'EmailAddress' is stored at the leaf level

11. Clustered vs. Non-Clustered

Clustered: The physical sorting of the data (usually Primary Key).
Non-Clustered: A separate look-up list.
Tip: If you frequently sort by a column (e.g., DateCreated), consider if that should be your Clustered Index, or if it needs a Non-Clustered index to support sorting.

V. Advanced & Server-Level Edge Cases

12. Parameter Sniffing

Sometimes SQL Server creates an execution plan based on the first parameter passed (e.g., a Date range with 1 row). Later, you pass a parameter that returns 1,000,000 rows, but it uses the "small" plan, crashing performance.

Solution:

  -- Forces a new plan to be generated for every execution
  SELECT * FROM Sales WHERE Region = @Region OPTION (RECOMPILE);

13. Transactions and Locking (`NOLOCK`)

Locking blocks other users. If you are running a heavy reporting query and data precision (exact real-time consistency) isn't 100% required, use NOLOCK.

Warning: This allows "Dirty Reads" (reading data currently being updated but not committed).
```
  SELECT * FROM HugeTable WITH (NOLOCK) WHERE ID > 1000;
```

14. Temporary Tables vs. Table Variables

Table Variables (@Table): Stored in memory (mostly). Good for small datasets (<100 rows). Do not have statistics, so the optimizer assumes they always have 1 row (bad for joins).
Temp Tables (#Table): Stored in tempdb. Good for larger datasets. Can be indexed and have statistics.

Summary Checklist for Optimization

Technique	Area	Impact
Select Specific Columns	I/O	High
Remove Leading Wildcards	Indexing	High
Fix Implicit Conversions	CPU/Index	High
Use `EXISTS` over `IN`	Logic	Medium
Covering Indexes (INCLUDE)	I/O	Very High
SARGable Dates	CPU/Index	Very High

SQL Server - Query Optimization Techniques

I. Data Retrieval & Selection Strategies

1. Avoid `SELECT *`

3. `UNION ALL` vs. `UNION`

II. SARGability (Search ARGument ABle)

4. Avoid Functions on Indexed Columns

5. Avoid Leading Wildcards

6. Watch for Implicit Conversions

III. Logical Operators & Joins

7. `EXISTS` vs. `IN`

8. Join Elimination & Subqueries

9. Avoid `DISTINCT` as a "Band-aid"

IV. Indexing Architecture

10. The Covering Index

11. Clustered vs. Non-Clustered

V. Advanced & Server-Level Edge Cases

12. Parameter Sniffing

13. Transactions and Locking (`NOLOCK`)

14. Temporary Tables vs. Table Variables

Summary Checklist for Optimization

Comments

More from this blog

SQL Server - User and Role Management

A Guide to Database Auditing Strategy

Database Data Archiving Strategy

Database Read Replicas - Guideline

Database Design Guidelines: Naming & Data Types

Command Palette

I. Data Retrieval & Selection Strategies

1. Avoid SELECT *

2. Use TOP or OFFSET FETCH (Pagination)

3. UNION ALL vs. UNION

II. SARGability (Search ARGument ABle)

4. Avoid Functions on Indexed Columns

5. Avoid Leading Wildcards

6. Watch for Implicit Conversions

III. Logical Operators & Joins

7. EXISTS vs. IN

8. Join Elimination & Subqueries

9. Avoid DISTINCT as a "Band-aid"

IV. Indexing Architecture

10. The Covering Index

11. Clustered vs. Non-Clustered

V. Advanced & Server-Level Edge Cases

12. Parameter Sniffing

13. Transactions and Locking (NOLOCK)

14. Temporary Tables vs. Table Variables

Summary Checklist for Optimization

Comments

More from this blog

1. Avoid `SELECT *`

2. Use `TOP` or `OFFSET FETCH` (Pagination)

3. `UNION ALL` vs. `UNION`

7. `EXISTS` vs. `IN`

9. Avoid `DISTINCT` as a "Band-aid"

13. Transactions and Locking (`NOLOCK`)