SQL Join, Union
SQL Left Join
A SQL Left Join
is a type of Outer Join
. While joining two tables using Left Join all the rows of the first (left) table and the rows of the second (right) table where the column values match are included in the final set. Row values on the right table for which column values do not match are taken as NULL. A Left Join should be explicit in a SELECT
statement otherwise the RDBMS will assume an Inner Join
. Left Join can be used instead of a Right Join by reversing the order of the tables.
Let us consider a sales database with product, manufacturer, customer and order tables. To get a list of all products in the warehouse all we need is a basic SELECT statement from the Product table with no Joins involved.
SELECT Products.ProductName
FROM Products
The query result will be-
Product.ProductName |
---|
HotPlate |
FlashMemory |
CVDmc |
Now let’s assume we want to get all product information but also like to know which of them are out of stock. Here we are seeking information from both the product table and the Inventory table. Hence a Join is involved. And since we wish to have all product information anyways irrespective of whether it is out of stock or not, we are doing a Left Outer Join of product table and Inventory table where the columns match for ProductId.
SELECT Products.ProductName as Product, Inventory.OutOfStock as OutOfStock
FROM Products LEFT JOIN Inventory ON Product.productId = Inventory.productId
WHERE Inventory.OutOfStock = True
Product | OutOfSTock |
---|---|
HotPlate | NULL |
FlashMemory | NULL |
CVDmc | True |
...............................................................
I. Embedding queries
We can now do a more complex search. Let’s say that we want a list of all products but also want to know if orders if any were placed for these products so far. In this case we first look at the Orders Table and find no of orders placed on each product by using a Group By. Then we use the result of this query as a new Table and name it ProductSales. We then do a Left Join of Product Table and ProductSales where the columns match on productId.
SELECT Products.ProductName as Product, ProductSales.ItemsSold as ItemsSold
FROM Products LEFT JOIN ProductSales ON Product.productId = ProductSales.prodId
WHERE ProductSales =
(SELECT Orders.ProductId as prodId, Count(Orders.ProductId) as ItemsSold
FROM ORDERS
GROUP BY Orders.productId )
The query result will be-
Product.ProductName | ItemsSold |
---|---|
HotPlate | 2 |
FlashMemory | NULL |
CVDmc | NULL |
II. Multiple Table Joins
It is also possible to do more than one Join
when we need to get data from more than two tables such that multiple conditions are met. Let’s say that we want a list of all sales managers in the warehouse and list if any product is allotted to them yet and also list if the product manufacturer is from the electronics industry.
SELECT SalesMgrData.Name as MgrName, SalesMgrData.AllotedProduct as AllotedProduct, SalesMgrData.MfgIndustry as MfgIndustry
FROM SalesMgrData
WHERE SalesMgrData =
(
(SELECT SalesMgr.Name as Name, Product.ProductId as
AllotedProduct,
FROM SalesMgr LEFT JOIN Product
ON SalesMgr.Id = Product.SMgrId
)
LEFT JOIN Mfg ON AllotedProduct = Mfg.ProductId
WHERE Mfg.industry = Electronics
)
The query result will be something like this-
MgrNam | AllotedProduct | MfgIndustry |
---|---|---|
Turner | Hotplate | NULL |
James | NULL | NULL |
Maddy | NULL | NULL |
Pam | Sndsk | Electronics |
SQL Inner Join
Reducing redundant data is an important database design practice. One of the ways this is achieved is by storing data across multiple tables that represent separate entities like a product table and a manufacturer table for a Sales database. However, often the information we seek spans two or more tables that have a relationship with each other. SQL Join
is used to link data across such tables and provide the desired query results. This linking of tables by Join is a form of intersection where column data match.
Consider a Sales database
which has a product table and a manufacturer table keeping product and manufacturer records respectively. The product table keeps product information – product name, type etc and has the primary key ProductId. The manufacturer table keeps manufacturer information like- manufacturer name, address, phone no etc and has the primary key ManufacturerId. Both tables uniquely represent a different entity but are related to each other by the common key- ManufacturerId. Now we can do a SQL JOIN of both tables to get product and manufacturer information like this.
SELECT Products.Name, Mfg. Name, Mfg.Address
FROM Products INNER JOIN Mfg ON Products.ManufacturerId = Mfg. ManufacturerId
We will explain the meaning of ‘Inner’ Join later but for now the query result will be-
Product.Name | Mfg.Name | Mfg.Address |
---|---|---|
HotPlate | Plyts Inc | 23 Franklin ST, Redcity, MA |
FlashMemory | Sndsk | --- |
RBFlask | Jeruf ICs | --- |
The results above by using the SQL Join clause can also be achieved by the use of a SQL Where clause as shown below. But Join is recommended because it is the newer syntax (ANSI92) for any queries involving joins of two or more tables vs using Where clause (ANSI89). It is also preferred for readability.
SELECT Products.ProductName, Mfg.ManufacturerName, Mfg.Address
FROM Products, Mfg
WHERE Products.ManufacturerId = Mfg. ManufacturerId
I. Types of Joins
The most basic SQL Join is a Full Join
one that will result in a join of all records of the two tables with no condition specified. For example if we just joined the Product and Mfg tables, it would give a list of all combinations of products with all manufacturers and zipcodes. Such data makes little sense though and a Full Join is rarely used unless relevant and meaningful in a certain context with a particular set of tables.
SELECT Products.ProductName, Mfg.Name, Mfg.Zipcode
FROM Products JOIN Mfg ON Products.ManufacturerId = Mfg. ManufacturerId
The query result will be
Product.ProductName | Mfg.Manufacturername | Mfg.Zipcode |
---|---|---|
HotPlate | Plyts Inc | 22911 |
HotPlate | Sndsk | 22904 |
HotPlate | Jeruf ICs | 23102 |
FlashMemory | Plyts Inc | 22911 |
FlashMemory | Sndsk | 22904 |
FlashMemory | Jeruf ICs | 23102 |
The main SQL Joins are -Inner Join
and Outer Join
. Each Join differs from the other in that which tables’ rows are being included or excluded in the final result query set. Inner Join is the most commonly used SQL Join. In Inner Join only those rows of all tables are selected for the which the column values match. An Outer Join not only joins all the matched rows of the tables but also all the unmatched rows for one or both of the tables. An Outer Join can be a Left Join or a Right Join. In Left Outer Join all the rows of the first table are selected and rows of the second table where the column values do not match are taken as NULL. Right Outer Join works in exactly the opposite way but is rarely used because a Left Join can be used instead by reversing the order of the tables.
II. Inner Join Example
If we wish to get product information from manufacturers which are within a certain zip code we use an Inner Join as below.
SELECT Products.ProductName, Mfg.ManufacturerName
FROM Products INNERJOIN Mfg ON Products.ManufacturerId = Mfg. ManufacturerId
WHERE Mfg.Zipcode = 22901
The query result will be
Product.ProductName | Mfg.Manufacturername |
---|---|
HotPlate | Plyts Inc |
FlashMemory | Sndsk |
Most RDBMSes assume Inner Join by use of the ON keyword if type of Join is not specified. It is also possible to join more than two tables using the Join Syntax.
SQL Union
SQL Union
is used to combine the resulting row-sets of multiple SQL queries into a single dataset removing any duplicate entries in the order in which the database engine delivers it. It is similar to the Logical OR operator performed on two or more sets and is different from SQL Join in that it is a union of data instead of intersection. Just like Join it is important in a Union that all attributes of the interacting (‘combining’ in Union) datasets and their numbers match and their types are compatible. Union use is often synonymous with keywords like ‘both’ or ‘all’ or ‘together’ or ‘with’ and ‘either/or’.
Let us assume that we are a company that sells beverages and condiments online and want to know who our competitors are. We are also investigating whether selling our products on an Amazon like platform may be more profitable for us so the first thing we do is to get information from Amazon on who the other sellers are for such products and how well they are selling. Let’s say the data we are currently interested in is only the product name, seller name and yearly sales numbers. For the sake of simplicity let’s also assume we have access to this information and all these attributes are available within a single table per seller. Then the SQL Query we write will be the following.
SELECT Beverages.Name, Beverages.Seller, Beverages.Yrly$Sales
FROM Beverages
Union
SELECT Condiments.Name, Condiments.Seller, Condiments.Yrly$Sales
FROM Condiments
The result of this query is
Beverages.Name | Beverages.Seller | Beverages.YrlySales |
---|---|---|
LiptonNatEnergy | Tata | $200000 |
Nescafe | Bullions | $500678 |
CappucinoSB | Starbucks | $3457710 |
BerryJamN | JulieFarms | $2500 |
FicoButter | MilkBoothInc | $677789 |
I. Sorting results by Order By
To order the resulting sets in a certain order we can use Order By
with the Union Clause. If we want to see the products with the highest sales first we can modify the above query as following.
SELECT Name, Seller, Yrly$Sales
FROM Beverages
Union
SELECT Name, Seller, Yrly$Sales
FROM Condiments
Order By Yrly$Sales DESC;
The result of this query is
Name | Seller | YrlySales |
---|---|---|
CappucinoSB | Starbucks | $3457710 |
FicoButter | MilkBoothInc | $677789 |
Nescafe | Bullions | $500678 |
LiptonNatEnergy | Tata | $200000 |
BerryJamN | JulieFarms | $2500 |
II. Union with same table
The above query gives us broad information on the sales of all beverage and condiment sellers on Amazon showing on top the ones that are selling the most. It may be intelligent however to dig further and find only specific information that is relevant to us. For example we may now only query sales data for the beverages that we sell- coffee and tea. This will be a slightly different query from above in that we are now only seeking information from one table- the beverages table but we are looking for information on both coffee and tea product sellers. What we need here is a SQL Union on the same table as shown below.
SELECT Name, Seller, Yrly$Sales
FROM Beverages
WHERE Beverages.type = Coffee
Union
SELECT Name, Seller, Yrly$Sales
FROM Beverages
WHERE Beverages.type = Tea
The result of this query is-
Names | Seller | YrlySales |
---|---|---|
LiptonNatEnergy | Tata | $200000 |
CappucinoSB | Starbucks | $3457710 |
Nescafe | Bullions | $500678 |
The same results can be obtained with a query using the IN
operator like this-
SELECT Name, Seller, Yrly$Sales
FROM Beverages
WHERE Beverages.type IN (Coffee, Tea)