SQL Talk

Showing posts from September, 2017Show All

What is In-Memory Tables in SQL

What is Parameter Sniffing?

Parameter Sniffing SQL Server creates an optimal plan for a stored procedure by usin…

What do you understand by database tuning advisor?

Database Engine Tuning Advisor examines how queries are processed in the databases y…

What is SQL Profiler?

1). What is SQL Profiler? Ans: Microsoft SQL Server Profiler is a rich graphical use…

What is mean by Statistics and When to Update statistics?

Statistics determine the selectivity of the indexes. If an indexed column has unique…

What is Index Rebuild and Index Reorganize and What is the difference between index Rebuild and Index Reorganize operations?

Index fragmentation can be resolved by rebuilding and reorganizing SQL Server indexe…

How can we get the fragmentation percentage of a database index?

Using SSMS, from the Fragmentation tab of the index Properties window. Checking…

What is Fragmentation in SQL Server

Explain Index Depth, Density and Selectivity factors and how these factors affect index performance?

Index depth is the number of levels from the index root node to the leaf nodes. An …

How can you find the missing indexes that are needed to potentially improve the performance of your queries?

The Missing Index Details option in the query execution plan, if available. The s…

What is the “Forwarding Pointers issue” and how can we fix it?

What are the Physical joins available in Sql Server?

Introduction There are three types of physical join operators in SQL Server, namely …

What is In-Memory Tables in SQL

What is Parameter Sniffing?

Parameter Sniffing

SQL Server creates an optimal plan for a stored procedure by using the parameters that are passed the first time to the stored procedure is executed is called Parameter Sniffing.

SQL Server does this exercise to build the execution plan once and use it multiple times without recreating the execution plan again and again. The compilation process of the stored procedure can take some precious time for the execution of the procedure is very large. This entire exercise is done to save those repeated moments recompiling the stored procedures.

While this works out great in most of the cases, there is often a negative impact of the parameter sniffing issue for the stored procedure.

Let us see a very simple example. To begin with, let us create the following stored procedure.

CREATE OR ALTER PROC GetCustomerOrders (@CustomerID INT)
AS
SELECT *
FROM WideWorldImporters.Sales.Orders
WHERE CustomerID = @CustomerID
GO

Now let us enable the actual execution plan for your query so we can, later on, inspect the execution plans.

First, run the stored procedure with the first value.

1	`EXEC` `GetCustomerOrders 1060`

Let us observe the execution plan. You will notice in the execution plan we see operators reading 4 rows out of potential 4 rows.

SQL SERVER - Parameter Sniffing Simplest Example parametersniffing1

Next, run the stored procedure with a different value.

1	`EXEC` `GetCustomerOrders 90`

Let us observe the execution plan. You will notice in the execution plan that the operator is reading 150 rows out of potential 4 rows.

SQL SERVER - Parameter Sniffing Simplest Example parametersniffing2

Now let us look at the property of the execution plan by right-clicking on the left-most SELECT operator in the SSMS.

Pay attention to the parameter list. You will see here there is mention of the parameter @CustomerID. This will also demonstrate that the parameter is compiled with the value 1060 but currently it is running this with a value of 90.

SQL SERVER - Parameter Sniffing Simplest Example parametersniffing3

This process of building the execution plan for the stored procedure with the initial run parameter is called Parameter Sniffing.

Performance Issue

In the real world, there is no performance issue of the execution plan of the new parameter is the same as the original execution plan. However, if the newer parameter is better suited for the different execution plan there can be huge potential performance degradation.

Let us understand this further by clearing the execution plan for our stored procedure by recompiling it.

1	`EXEC` `sp_recompile` `'GetCustomerOrders'`

Now let us run the stored procedure again with the newer value of the 90 and you will see that the execution plan changes from the original execution plan. Earlier the execution plan displayed the count of the rows as 150 out of 4 which is not corrected as 150 of 150.

1	`EXEC` `GetCustomerOrders 90`

SQL SERVER - Parameter Sniffing Simplest Example parametersniffing4

If you now run the stored procedure with the earlier parameter where we got 4 rows, you will notice that that query now uses this newer plan and display 4 out of 150. This may be a potential performance issue as the query may get more resources allocated.

1	`EXEC` `GetCustomerOrders 1060`

SQL SERVER - Parameter Sniffing Simplest Example parametersniffing5

The simplest solution to overcome the parameter sniffing is to recompilation, however, it has also many different issues.

What do you understand by database tuning advisor?

Database Engine Tuning Advisor examines how queries are processed in the databases you specify, and then recommends how you can improve query processing performance by modifying database structures such as indexes, indexed views, and partitioning.

What is SQL Profiler?

1). What is SQL Profiler?

Ans: Microsoft SQL Server Profiler is a rich graphical user interface to SQL Trace for monitoring T-SQL Statements of Database Engine. The events are saved in a trace file that can later be analysed or used to replay a specific series of steps when trying to diagnose a problem.

2). What is the use of Profiler?
Ans: The functions SQL Server Profiler tool can perform have been listed below:
Creating trace
Watching trace
Storing trace
Replaying trace

What is mean by Statistics and When to Update statistics?

Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. The query optimizer uses these indexes in determining whether to choose an index or not while executing a query.

Some situations under which you should update statistics:

If there is a significant change in the key values in the index
If a large amount of data in an indexed column has been added, changed, or removed (that is if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
The database is upgraded from a previous version

Look up SQL Server books online for the following commands:

UPDATE STATISTICS,
STATS_DATE,
DBCC SHOW_STATISTICS,
CREATE STATISTICS,
DROP STATISTICS,
sp_autostats,
sp_createstats,
sp_updatestats

What is Index Rebuild and Index Reorganize and What is the difference between index Rebuild and Index Reorganize operations?

Index fragmentation can be resolved by rebuilding and reorganizing SQL Server indexes regularly. The Index Rebuild operation removes fragmentation by dropping the index and creating it again, defragmenting all index levels, compacting the index pages using the Fill Factor values specified in rebuild command, or using the existing value if not specified and updating the index statistics using FULLSCAN of all the data.

The Index Reorganize operation physically reorders leaf level pages of the index to match the logical order of the leaf nodes. The index reorganizes operation will be always performed online. Microsoft recommends fixing index fragmentation issues by rebuilding the index if the fragmentation percentage of the index exceeds 30%, where it recommends fixing the index fragmentation issue by reorganizing the index if the index fragmentation percentage exceeds 5% and less than 30%.

How can we get the fragmentation percentage of a database index?

Using SSMS, from the Fragmentation tab of the index Properties window. Checking the fragmentation percentage of all indexes in a specific database, using the UI method requires a big effort, as you need to check one index at a time.
The sys.dm_db_index_physical_stats dynamic management function, that was first Introduced in SQL Server 2005. The sys.dm_db_index_physical_stats DMF can be joined with the sys.indexes DMV to return the fragmentation percentage of all indexes under the specified database.

What is Fragmentation in SQL Server

Explain Index Depth, Density and Selectivity factors and how these factors affect index performance?

Index depth is the number of levels from the index root node to the leaf nodes. An index that is quite deep will suffer from performance degradation problem. In contrast, an index with a large number of nodes in each level can produce a very flat index structure. An index with only 3 to 4 levels is very common.
Index density is a measure of the lack of uniqueness of the data in a table. A dense column is one that has a high number of duplicates.
Index selectivity is a measure of how many rows scanned compared to the total number of rows. An index with high selectivity means a small number of rows scanned when related to the total number of rows.

How can you find the missing indexes that are needed to potentially improve the performance of your queries?

The Missing Index Details option in the query execution plan, if available.
The sys.dm_db_missing_index_details dynamic management view, that returns detailed information about missing indexes, excluding spatial indexes,
A combination of the SQL Server Profiler and the Database Engine Tuning Advisor tools.

What is the “Forwarding Pointers issue” and how can we fix it?

What are the Physical joins available in Sql Server?

Introduction

There are three types of physical join operators in SQL Server, namely Nested Loops Join, Hash Match Join, and Merge Join. In this article, we will be discussing how these physical join operators are working and what are the best practices for these different joins.

As you are aware, there are different types of logical joins, Inner Join, Outer Join (Left, Right, Full) and cross joins that you performed in order to achieve the required results as shown in the below figure.

SQL Join Chart - Custom Poster Size : SQL

Source: https://www.reddit.com/r/SQL/comments/aysflk/sql_join_chart_custom_poster_size/

Depending on data volume and the available indexes, different types of physical join operators are used. Therefore, by knowing the details of physical operators, you can improve query performance.

Nested Loops Join

Nested Loops Join has a very simple mechanism. Out of the two tables, the table with a smaller number of records is selected, and it will loop through the second table until matches are found. As you can see this is a not very scalable option for large tables. Hence this is mainly used when there is a table with a smaller number of records and the joining column is indexed in the second table.

Let us create two simple tables with the following scripts.

1
2
3
4
5
6
7
8
9
10
11
12
CREATE TABLE Table1
(ID INT)
 
INSERT INTO Table1
VALUES
(1),(5),(3),(6),(7)
CREATE TABLE Table2
(ID INT PRIMARY KEY CLUSTERED)
 
INSERT INTO Table2
VALUES
(1),(2),(5),(6),(3),(8),(7)

The above script will create two tables and insert few records. Let us join these two tables with the following query.

1
2
SELECT * FROM Table1 
INNER JOIN Table2 ON Table1.ID = Table2.ID

You will see the following query plan from the following figure for the above query.

Query execution plan for the Nested Loops Join.

As seen from the above execution plan, the smaller table is selected as the outer table and nested loops join will occur with the inner table.

It is important to remember that you don’t need to join these tables in the same order in your script. Even if you join them differently, the database engine will use the same query plan by identifying the table with smaller records. If you execute the following query, you will get the same execution as above.

1
2
SELECT * FROM Table2 
INNER JOIN Table1 ON Table2.ID = Table1.ID

There are several other usages of the Nested Loops physical join that are listed below.

Key Lookup

Not only for table joins, but Nested Loops joins are also used for key lookups. Let us run the following query in the sample database, AdventureWorks database on the Product table in which there is a clustered index on the ProductID column and a non-clustered index on the ProductNumber column.

1
2
3
SELECT ProductID, Name FROM 
Production.Product 
WHERE ProductNumber = 'CA-6738'

Since there is a non-clustered index on the ProductNumber column, mostly likely that the non-clustered index will be used. Please note that if there are large number of records, there can be a situation where it will perform a clustered index scan instead of the Non-Clustered index seek.

Let us look at the query plan for the above query.

Query Execution plan to retrieve data from clustered index.

Though no join statement is used in the above query, it has used a Nested Loops join in order to combine the Non-clustered index and retrieve the Product Name from the clustered index. If you want to avoid the Nested Loops join, you can create an include index as shown below.

1
2
3
4
CREATE UNIQUE NONCLUSTERED INDEX [AK_Product_ProductNumber2] ON [Production].[Product]
(
    [ProductNumber] ASC
) INCLUDE (Name)

Then the new query plan would look like the following and you will see that nested loops join is removed.

Include indexes to avoid physical join operators.

Since new query requirements can be covered from the non-clustered index, there is no need to fetch data from the clustered index. Therefore, the requirement of Nested Loops joins will not be there.

Cross Join

When cross joins are performed, the only possible way of joining these tables is via Nested Loops Join. Let us execute the following query to observe the different scenarios.

1
2
SELECT * FROM Sales.SalesOrderHeader 
    CROSS JOIN Production.Product

You will see from the following query plan that Nested Loops Join is used ignoring the sizes of the table.

Thus, the Cross Join queries are slower and should not be used against large volume tables.

Table Variable

Let’s see what is the behavior of the Joins when the table variable is used to join. To Demonstrate, we will use the following query with a table variable.

1
2
3
4
5
6
7
8
DECLARE @ProductTable TABLE (ProductID INT,ProductName VARCHAR(50))
INSERT INTO @ProductTable (ProductID,ProductName)
SELECT ProductID,Name FROM Production.Product 
 
SELECT D.ProductID,D.OrderQty,P.ProductName,D.UnitPrice,D.LineTotal 
FROM Sales.SalesOrderDetail D
INNER JOIN @ProductTable P
ON D.ProductID = P.ProductID

You will see the following query plan for the above join query.

This is due to the fact that the database engine is estimating that the Table variable has one record. Since a table with one record is a small table, the above query uses the Nested Loops join.

This can be verified from the query plan as shown in the below figure.

Verifying the Number of reads with table variable.

As you can see, though there are 504 records in the table, the engine has been estimated for only one record.

Merge Join

The Merge join is the most efficient join in SQL Server. In simple terms, if you see a Merge Join, your query plan is an efficient query plan and you do not need to do many changes to improve query performances. Since the Merge Join operator uses sorted data inputs, it can use any two large datasets.

Let us look at the following query.

1
2
3
SELECT* FROM Sales.SalesOrderDetail D
INNER JOIN Sales.SalesOrderHeader H
ON H.SalesOrderID = D.SalesOrderID

The following figure shows the query plan for the above query which shows Merge Join.

You need to verify how it receives the sorted input to the Merge Join. If it receives data from the index, then it would be fine. However, if SQL Server does perform any operation to sort the data stream, then you may need to look at the indexes and better try to modify the indexes in order to achieve better results.

Further, if there are duplicates for the joining conditions or many to many relationships between the join conditions, a Work table will be created in the tempdb which may result in performance issues and Tempdb contention. Therefore, if there are duplicates you may try to resolve those data duplication issues before joining the tables if possible.

Hash Match Join

The mechanism for hash match Join is to create a hash table and then match records. Hash table creates in the memory. However, since Hash Match Join will be used for a large dataset, most likely that memory will not be sufficient to hold the data. In that type of situation, Hash Match Join uses tempdb heavily. Further, Hash Match Join is a blocking join that means until the entire join is completed, users will not get the data output. These two properties make Hash Match join a slow operator to join tables in SQL Server. In the case you observe a Hash Match Join in the query plan, you need to look at how to improve the performance. In the case of a data warehouse, hash joins are fine but not for transactional systems. Mainly, you need to look at modifying the indexes or include new indexes. In addition, you may look at the options of rewriting the queries.

One important factor to note here, that Hash Match Join is utilized only for Equi joins.

Summary

In this article, we looked at different physical join operators in SQL Server namely, Nested Loops Join, Merge Join, and Hash Match Join. Nested Loops are used to join smaller tables. Further, nested loop join uses during the cross join and table variables.

Merge Joins are used to join sorted tables. This means that Merge joins are utilized when join columns are indexed in both tables while Hash Match join uses a hash table to join equi joins.

It is important to understand the usages of the different physical join operations. In order to achieve better results, we can look at the options of modifying the indexes or adding new indexes. Further, we can rewrite queries to achieve better results by changing the physical operators.

Ref:
https://www.sqlshack.com/internals-of-physical-join-operators-nested-loops-join-hash-match-join-merge-join-in-sql-server/#:~:text=There%20are%20three%20types%20of,Match%20Join%2C%20and%20Merge%20Join.

SQL Talk

What is In-Memory Tables in SQL

What is Parameter Sniffing?

What do you understand by database tuning advisor?

What is SQL Profiler?

What is mean by Statistics and When to Update statistics?

What is Index Rebuild and Index Reorganize and What is the difference between index Rebuild and Index Reorganize operations?

How can we get the fragmentation percentage of a database index?

What is Fragmentation in SQL Server

Explain Index Depth, Density and Selectivity factors and how these factors affect index performance?

How can you find the missing indexes that are needed to potentially improve the performance of your queries?

What is the “Forwarding Pointers issue” and how can we fix it?

What are the Physical joins available in Sql Server?

What is In-Memory Tables in SQL

What is Parameter Sniffing?

Performance Issue

What do you understand by database tuning advisor?

What is SQL Profiler?

What is mean by Statistics and When to Update statistics?

What is Index Rebuild and Index Reorganize and What is the difference between index Rebuild and Index Reorganize operations?

How can we get the fragmentation percentage of a database index?

What is Fragmentation in SQL Server

Explain Index Depth, Density and Selectivity factors and how these factors affect index performance?

How can you find the missing indexes that are needed to potentially improve the performance of your queries?

What is the “Forwarding Pointers issue” and how can we fix it?

What are the Physical joins available in Sql Server?

Introduction

Nested Loops Join

Key Lookup

Cross Join

Table Variable

Merge Join

Hash Match Join

Summary

Interview Questions and Answers

About Me

Blog Archive

Labels

Facebook

Recent Posts