Tuesday, September 25, 2007

All about SQL Joins

The first thing which comes into our mind on using the word 'JOIN' is combining one object with another. The same concept also applies when using joins in SQL. Before going through the concept of joins we must first understand that why we actually need joins.

Joins are basically used in those conditions when we want to extract combined information from 2 or more tables (1 table in case of self join). Whenever any join is used SQL Server internally performs the Cartesian product of two or more tables and then selects the matching rows from the final table (table formed after Cartesian product). A Cartesian product is nothing but the multiplication of tables. In this each row of a table is multiplied by every row of other table.
Cartesian product can be explained with the help of following example:

In this example there are 2 tables namely Students and Courses. We will find the Cartesian product of these 2 tables.








Table 3 is a combination of both Students and Courses. Now if we query this newly formed table for selecting all those records where SName of Students = SName of Courses then the result will be records of all the students with their respective course. This is shown as follows:



So, this is how whenever any join is used, SQL Server internally does the Cartesian product of the tables and then selects the records based on the specified matching criteria.

I would like you to practice the concept of Cartesian product because once this concept is clear then we have won 80% of the war.


Joins in SQL


There are actually 4 types of joins in SQL namely:


1) Inner Join
2) Outer Join
3) Cross Join
4) Self Join



INNER JOIN:

The INNER JOIN returns all rows from both tables where there is a match. If there are rows in table 1 that do not have matches in table 2 then those rows will not be listed. This is can be explained with the help of following example:

Suppose I have 2 tables namely Employees and Salary







Now, if I want to know the Names and Salary of all the employees from the above 2 tables then I have to perform an inner join as follows:

SELECT A.EmpName AS Name, B.EmpSal AS Salary

FROM Employees A INNER JOIN Salary B

ON A.EmpName = B.EmpName


The result of the above query will be as follows:




You must be wondering why Renal’s salary has not been shown. The reason is because Renal’s salary is not present in the ‘Salary’ table and as I have written earlier that INNER JOIN only selects those rows which match a condition. In the above query the matching condition (A.EmpName = B.EmpName) is not fulfilled in case or Renal. Hence, Renal’s salary is not shown.


TODO: Find the Cartesian product of Employees and Salary table above and find out why Renal’s salary is not shown.

OUTER JOIN:

Outer joins are used to select those values from a table which do not have a direct match of values in the other table. Outer joins are of 3 types namely Right Outer Join, Left Outer Join and Fully Outer Join.

Right Outer Join returns all the rows from the second table even if there are no matches in the first table. If there are no matches the column value of the second table is NULL.

Left Outer Join returns all the rows from the first table even if there are no matches in the second table. If there are no matches the column value of the first table is NULL.

Full Outer Join returns all the rows from the first and second table even if there are no matches in both the tables. If there are no matches the column value of both the tables is NULL.


Left outer join can be explained with the following example:

Suppose there are 2 tables Students and OfficeBearers.





I want a list of all the students with their OfficeBearer post. For this I have to perform a LEFT OUTER JOIN on both the tables as follows:

SELECT A.SName, B.Post FROM Students A LEFT OUTER JOIN

OfficeBearers B ON A.SName = B.SName

The result of the above query will be as follows:





Here the post of Rita will be NULL because there is no post of Rita in the ‘OfficeBearers’ table.


Right outer join can be explained with the following example:

Suppose there are 2 tables Students and Books.






Now, I want the list of all the books irrespective of the fact whether they have been issued by the students or not. For this I have to run the following query:

SELECT B.BName, A.SName FROM Students A RIGHT OUTER JOIN

Books B ON A.SName = B.SName


The result of the above query will be as follows:



Here the student name associated with the ‘FLASH’ book will be NULL because there is no name as ‘Simran’ in the Students table which matches with the Books table.

Full Outer join returns all the rows from both the tables irrespective of the fact whether there are any matching entries or not. This can be explained with the help of following query

SELECT B.BName, A.SName FROM Students A RIGHT OUTER JOIN

Books B ON A.SName = B.SName

The result of the query will be as follows:



Here Catherine has not issued any book that is why BName infront of her name is NULL also the book FLASH has not been issued by any students that is why SName against it is NULL.


CROSS JOIN:

A cross join is nothing else but just the Cartesian product of both the tables. If we take into consideration the above example of Students and Books then to find the CROSS JOIN of these tables following query will be used:

SELECT B.BName, A.SName FROM Students A CROSS JOIN Books B

The result of the above query will be as follows:



SELF JOIN:

Its not necessary that joins should involve 2 or more tables. We can also join a table from within itself. Joining a table with itself can be useful when you want to compare values in a column to other values in the same column of the same table. This can be explained with the help of following example:Suppose I have a table named as product as follows:




SELECT A.PName, A.Cost, A.MaxPrice, B.PName, B.Cost, B.MaxPrice FROM Product A, Product B WHERE (A.Cost >= (B.Cost * 2))

The result of the above query will be as follows: