Joins in SQL Server are used to query (retrieve) data from two or more related tables. In general, tables are related to each other using foreign key constraints.
In SQL Server, there are different types of JOINS.
1. CROSS JOIN
2. INNER JOIN
3. OUTER JOIN
Outer Joins are further divided into 3 types
1. Left Join or Left Outer Join
2. Right Join or Right Outer Join
3. Full Join or Full Outer Join
Now let's understand all the JOIN types, with examples and the differences between them.
Employee Table (tblEmployee)
Departments Table (tblDepartment)
SQL Script to create tblEmployee and tblDepartment tables
General Formula for Joins
SELECT ColumnList
FROM LeftTableName
JOIN_TYPE RightTableName
ON JoinCondition
CROSS JOIN
CROSS JOIN produces the Cartesian product of the 2 tables involved in the join. For example, the Employees table has 10 rows and the Departments table has 4 rows, so a cross join between these 2 tables produces 10 × 4 = 40 rows. A CROSS JOIN does not take an ON clause.
CROSS JOIN Query:
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
CROSS JOIN tblDepartment
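As a quick check of the 10 × 4 = 40 arithmetic, here is a minimal sketch that builds two throwaway tables and counts the rows a CROSS JOIN produces. It assumes an in-memory SQLite database (via Python's built-in sqlite3 module) as a stand-in for SQL Server, and the row contents are invented placeholders; only the row counts matter.

```python
import sqlite3


def cross_join_row_count(n_employees: int, n_departments: int) -> int:
    """Create two tables of the given sizes and count the CROSS JOIN rows."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.execute("CREATE TABLE tblEmployee (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        "CREATE TABLE tblDepartment (Id INTEGER PRIMARY KEY, DepartmentName TEXT)"
    )
    # Placeholder rows: the names are hypothetical, only the counts matter.
    conn.executemany(
        "INSERT INTO tblEmployee (Id, Name) VALUES (?, ?)",
        [(i, f"Employee{i}") for i in range(1, n_employees + 1)],
    )
    conn.executemany(
        "INSERT INTO tblDepartment (Id, DepartmentName) VALUES (?, ?)",
        [(i, f"Department{i}") for i in range(1, n_departments + 1)],
    )
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM tblEmployee CROSS JOIN tblDepartment"
    ).fetchone()
    conn.close()
    return count


print(cross_join_row_count(10, 4))  # every employee paired with every department
```

Because every row on the left is paired with every row on the right, an empty table on either side yields an empty result.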
JOIN or INNER JOIN
Write a query to retrieve Name, Gender, Salary and DepartmentName from the Employees and Departments tables. The output of the query should be as shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
INNER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
OR
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
Note: JOIN and INNER JOIN mean the same thing. It's better to use INNER JOIN, as this explicitly states your intention.
If you look at the output, we got only 8 rows, but the Employees table has 10 rows. We didn't get the JAMES and RUSSELL records. This is because DEPARTMENTID in the Employees table is NULL for these two employees, and a NULL never matches the ID column in the Departments table.
So, in summary, INNER JOIN returns only the matching rows between the two tables. Non-matching rows are eliminated.
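The effect of a NULL DepartmentId can be reproduced on a tiny data set. The sketch below is an assumption-laden miniature, not the article's actual tables: it uses SQLite through Python's sqlite3 module instead of SQL Server, keeps only the Name and DepartmentId columns, and invents the rows for Tom and Pam and the department names (James and Russell are the two employees without a department).

```python
import sqlite3


def inner_join_rows():
    """Return (Name, DepartmentName) pairs produced by an INNER JOIN."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.executescript("""
        CREATE TABLE tblDepartment (Id INTEGER PRIMARY KEY, DepartmentName TEXT);
        CREATE TABLE tblEmployee (
            Id INTEGER PRIMARY KEY,
            Name TEXT,
            DepartmentId INTEGER
        );
        -- Hypothetical sample rows for illustration only.
        INSERT INTO tblDepartment VALUES (1, 'IT'), (2, 'HR');
        INSERT INTO tblEmployee VALUES
            (1, 'Tom', 1),
            (2, 'Pam', 2),
            (3, 'James', NULL),
            (4, 'Russell', NULL);
    """)
    rows = conn.execute("""
        SELECT Name, DepartmentName
        FROM tblEmployee
        INNER JOIN tblDepartment
        ON tblEmployee.DepartmentId = tblDepartment.Id
        ORDER BY tblEmployee.Id
    """).fetchall()
    conn.close()
    return rows


print(inner_join_rows())  # James and Russell do not appear
```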
LEFT JOIN or LEFT OUTER JOIN
Now, let's say I want all the rows from the Employees table, including the JAMES and RUSSELL records. I want the output as shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
LEFT OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
OR
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
LEFT JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
Note: You can use LEFT JOIN or LEFT OUTER JOIN. The OUTER keyword is optional.
LEFT JOIN returns all the matching rows plus the non-matching rows from the left table; for the non-matching rows, the columns that come from the right table are NULL. In practice, INNER JOIN and LEFT JOIN are used extensively.
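To see the left-table rows being preserved, here is a small sketch of a LEFT JOIN on a cut-down data set. As an assumption for the sake of a runnable example, it uses SQLite through Python's sqlite3 module rather than SQL Server; the rows for Tom and Pam and the department names are invented, and the Gender and Salary columns are left out.

```python
import sqlite3


def left_join_rows():
    """Return (Name, DepartmentName) pairs produced by a LEFT JOIN."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.executescript("""
        CREATE TABLE tblDepartment (Id INTEGER PRIMARY KEY, DepartmentName TEXT);
        CREATE TABLE tblEmployee (
            Id INTEGER PRIMARY KEY,
            Name TEXT,
            DepartmentId INTEGER
        );
        -- Hypothetical sample rows for illustration only.
        INSERT INTO tblDepartment VALUES (1, 'IT'), (2, 'HR');
        INSERT INTO tblEmployee VALUES
            (1, 'Tom', 1),
            (2, 'Pam', 2),
            (3, 'James', NULL),
            (4, 'Russell', NULL);
    """)
    rows = conn.execute("""
        SELECT Name, DepartmentName
        FROM tblEmployee
        LEFT JOIN tblDepartment
        ON tblEmployee.DepartmentId = tblDepartment.Id
        ORDER BY tblEmployee.Id
    """).fetchall()
    conn.close()
    return rows


rows = left_join_rows()
# Employees without a matching department come back with None (SQL NULL).
unmatched = [name for name, department in rows if department is None]
print(rows)
print(unmatched)
```

All four employees are returned; the two without a department carry None in the DepartmentName position.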
RIGHT JOIN or RIGHT OUTER JOIN
I want all the rows from the right table. The query output should be as shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
RIGHT OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
OR
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
RIGHT JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
Note: You can use RIGHT JOIN or RIGHT OUTER JOIN. The OUTER keyword is optional.
RIGHT JOIN returns all the matching rows plus the non-matching rows from the right table.
FULL JOIN or FULL OUTER JOIN
I want all the rows from both the tables involved in the join. The query output should be as shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
FULL OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
OR
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
FULL JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id
Note: You can use FULL JOIN or FULL OUTER JOIN. The OUTER keyword is optional.
FULL JOIN returns all rows from both the left and right tables, including the non-matching rows.
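The row sets that RIGHT JOIN and FULL JOIN return can be sketched on a small data set. One caveat up front: this example runs on SQLite through Python's sqlite3 module, and older SQLite versions do not accept RIGHT or FULL joins, so both are emulated here rather than written with the SQL Server keywords. The right join is emulated by swapping the tables in a LEFT JOIN, and the full join by appending the unmatched departments to a LEFT JOIN with UNION ALL. The sample rows (Tom, Pam, and all department names) are invented.

```python
import sqlite3


def outer_join_demo():
    """Return the row sets of an emulated RIGHT JOIN and FULL JOIN."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.executescript("""
        CREATE TABLE tblDepartment (Id INTEGER PRIMARY KEY, DepartmentName TEXT);
        CREATE TABLE tblEmployee (
            Id INTEGER PRIMARY KEY,
            Name TEXT,
            DepartmentId INTEGER
        );
        -- Hypothetical rows: department 3 has no employees,
        -- James and Russell have no department.
        INSERT INTO tblDepartment VALUES
            (1, 'IT'), (2, 'HR'), (3, 'Other Department');
        INSERT INTO tblEmployee VALUES
            (1, 'Tom', 1),
            (2, 'Pam', 2),
            (3, 'James', NULL),
            (4, 'Russell', NULL);
    """)
    # All employees, with NULL for a missing department.
    left_sql = """
        SELECT Name, DepartmentName
        FROM tblEmployee
        LEFT JOIN tblDepartment
        ON tblEmployee.DepartmentId = tblDepartment.Id
    """
    # Emulated RIGHT JOIN: all departments, with NULL for a missing employee.
    right_sql = """
        SELECT Name, DepartmentName
        FROM tblDepartment
        LEFT JOIN tblEmployee
        ON tblEmployee.DepartmentId = tblDepartment.Id
    """
    # Emulated FULL JOIN: left join rows plus the departments nobody matched.
    full_sql = left_sql + " UNION ALL " + right_sql + " WHERE tblEmployee.Id IS NULL"
    result = {
        "right": set(conn.execute(right_sql).fetchall()),
        "full": set(conn.execute(full_sql).fetchall()),
    }
    conn.close()
    return result


demo = outer_join_demo()
print(len(demo["right"]), len(demo["full"]))
```

On SQL Server itself the emulation is unnecessary; the RIGHT JOIN and FULL JOIN queries shown earlier return these row sets directly.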
Joins Summary
SELF JOIN
Joining a table with itself is called a SELF JOIN. SELF JOIN is not a different type of JOIN. It can be classified under any type of JOIN: INNER, OUTER or CROSS.
Have you ever needed to join a table with itself? Consider the tblEmployee table shown below.
Write a query which gives the following result.
A MANAGER is also an EMPLOYEE. Both the EMPLOYEE and MANAGER rows are present in the same table. Here we are joining tblEmployee with itself using different alias names, E for Employee and M for Manager. We are using LEFT JOIN to also get the rows where ManagerId is NULL. You can see in the output that TODD's record is also retrieved, but the MANAGER is NULL. If you replace LEFT JOIN with INNER JOIN, you will not get TODD's record.
Self Join Query:
Select E.Name as Employee, M.Name as Manager
from tblEmployee E
Left Join tblEmployee M
On E.ManagerId = M.EmployeeId
In short, joining a table with itself is called a SELF JOIN. SELF JOIN is not a different type of JOIN. It can be classified under any type of JOIN: INNER, OUTER or CROSS. The above query is a LEFT OUTER SELF JOIN.
Inner Self Join tblEmployee table:
Select E.Name as Employee, M.Name as Manager
from tblEmployee E
Inner Join tblEmployee M
On E.ManagerId = M.EmployeeId
Cross Self Join tblEmployee table:
Select E.Name as Employee, M.Name as Manager
from tblEmployee E
Cross Join tblEmployee M
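The three self joins can be compared side by side on a small table. The sketch below makes several assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, and the five rows are illustrative sample data in which only Todd has no manager. Each query gives the table two aliases, E and M, so that the two copies can be told apart.

```python
import sqlite3


def self_join_demo():
    """Run LEFT, INNER and CROSS self joins on a small employee table."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.executescript("""
        CREATE TABLE tblEmployee (
            EmployeeId INTEGER PRIMARY KEY,
            Name TEXT,
            ManagerId INTEGER
        );
        -- Illustrative rows: Todd is the only employee without a manager.
        INSERT INTO tblEmployee VALUES
            (1, 'Mike', 3),
            (2, 'Rob', 1),
            (3, 'Todd', NULL),
            (4, 'Ben', 1),
            (5, 'Sam', 1);
    """)
    template = """
        SELECT E.Name AS Employee, M.Name AS Manager
        FROM tblEmployee E
        {join} tblEmployee M
        {on}
        ORDER BY E.EmployeeId, M.EmployeeId
    """
    on_clause = "ON E.ManagerId = M.EmployeeId"
    results = {
        "left": conn.execute(
            template.format(join="LEFT JOIN", on=on_clause)
        ).fetchall(),
        "inner": conn.execute(
            template.format(join="INNER JOIN", on=on_clause)
        ).fetchall(),
        # A cross join takes no ON clause and pairs every row with every row.
        "cross": conn.execute(
            template.format(join="CROSS JOIN", on="")
        ).fetchall(),
    }
    conn.close()
    return results


results = self_join_demo()
print(results["left"])        # includes ('Todd', None)
print(results["inner"])       # Todd is missing
print(len(results["cross"]))  # 5 rows paired with 5 rows
```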
I found this good article, "SQL Server–HOW-TO: quickly retrieve accurate row count for table" from martijnh1, which gives a good recap of each scenario.
I still need to expand this to provide a count based on a specific condition; when I figure that part out, I'll update this answer.
In the meantime, here are the details from the article:
Method 1:
Query:
SELECT COUNT(*) FROM Transactions
Comments:
Performs a full table scan. Slow on large tables.
Method 2:
Query:
SELECT CONVERT(bigint, rows)
FROM sysindexes
WHERE id = OBJECT_ID('Transactions')
AND indid < 2
Comments:
Fast way to retrieve the row count, but it depends on statistics and can be inaccurate.
To correct the counts, run DBCC UPDATEUSAGE(Database) WITH COUNT_ROWS, which can take significant time for large tables.
Method 3:
Query:
SELECT CAST(p.rows AS float)
FROM sys.tables AS tbl
INNER JOIN sys.indexes AS idx ON idx.object_id = tbl.object_id and
idx.index_id < 2
INNER JOIN sys.partitions AS p ON p.object_id=CAST(tbl.object_id AS int)
AND p.index_id=idx.index_id
WHERE ((tbl.name=N'Transactions'
AND SCHEMA_NAME(tbl.schema_id)='dbo'))
Comments:
This is how SQL Server Management Studio counts rows (look at table properties, storage, row count). Very fast, but still an approximate number of rows.
Method 4:
Query:
SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('Transactions')
AND (index_id=0 or index_id=1);
Comments:
Quick (although not as fast as method 2) and, equally important, reliable.