
A join in SQL combines columns from one or more tables to extract the required data. When used effectively, joins can simplify queries and save a lot of time. In this blog, we will look into an easy way to visualize the data captured by various joins using Venn diagrams and their SQL equivalent.
Some of these joins are only available natively in some relational databases; for the sake of simplicity, we will only consider the most widely used open-source databases in this blog- MySQL and PostgreSQL.
Types of joins covered in this blog:
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL OUTER JOIN
LEFT JOIN EXCLUDING INNER JOIN
RIGHT JOIN EXCLUDING INNER JOIN
OUTER JOIN EXCLUDING INNER JOIN
We are using two identical tables, A and B, with two columns – id and val. We have created the table in both MySQL and PostgreSQL and inserted two values in each table.
+-------+-------------+ | Field | Type | +-------+-------------+ | id | int | | val | varchar(50) | +-------+-------------+
Values in both tables:
Table A: | Table B: |
---|---|
SELECT * FROM A; +------+------+ | id | val | +------+------+ | 1 | V | | 2 | W | +------+------+ |
SELECT * FROM B; +------+------+ | id | val | +------+------+ | 2 | X | | 3 | Y | +------+------+ |
Inner join
An inner join returns the common values in both tables based on the key column. MySQL and PostgreSQL both support it natively.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A INNER JOIN B ON A.id = B.id; |
test=# SELECT A.id, B.id, A.val, B.val FROM A INNER JOIN B ON A.id = B.id; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 2 | 2 | W | X | +------+------+------+------+ 1 row in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 2 | 2 | W | X (1 row) |
Left join
A left join or left outer join returns all the values in the leftmost tables based on the key column, even if there are no matching rows in the right table. For the rows not present in other tables, NULL will be displayed for the columns of the other table. MySQL and PostgreSQL both support it natively.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id; |
test=# SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 1 | NULL | V | NULL | | 2 | 2 | W | X | +------+------+------+------+ 2 rows in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 1 | | V | 2 | 2 | W | X (2 rows) |
Right join
A right join or right outer join returns all the values in the rightmost tables based on the key column, even if there are no matching rows in the left table. For the rows not present in other tables, NULL will be displayed for the columns of the other table. MySQL and PostgreSQL both support it natively.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id; |
test=# SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 2 | 2 | W | X | | NULL | 3 | NULL | Y | +------+------+------+------+ 2 rows in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 2 | 2 | W | X | 3 | | Y (2 rows) |
Left join excluding inner join
Left join excluding inner join returns the values in the leftmost table and excludes the common values in both tables. For the rows not present in other tables, NULL will be displayed for the columns of the other table. This is not natively supported by MySQL or Postgres, and we have to put a where condition to get the desired result.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id WHERE B.id IS NULL; |
test=# SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id WHERE B.id IS NULL; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 1 | NULL | V | NULL | +------+------+------+------+ 1 row in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 1 | | V | (1 row) |
Right join excluding inner join
Right join excluding inner join returns the values in the rightmost table and excludes the common values in both tables. For the rows not present in other tables, NULL will be displayed for the columns of the other table. This is not natively supported by MySQL or Postgres, and we have to put a condition to get the desired result.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id WHERE A.id IS NULL; |
test=# SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id WHERE A.id IS NULL; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | NULL | 3 | NULL | Y | +------+------+------+------+ 1 row in set (0.00 sec) |
id | id | val | val ----+----+-----+----- | 3 | | Y (1 row) |
Full outer join
A full outer join return combines the result set of both the left and right join and returns the values common between both and the non-matching values as well. It is supported by Postgres, but it’s not supported in MySQL natively. For MySQL, we have to use UNION to combine the results of left and right join.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id UNION SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id; |
test=# SELECT A.id, B.id, A.val, B.val FROM A FULL OUTER JOIN B ON A.id = B.id; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 1 | NULL | V | NULL | | 2 | 2 | W | X | | NULL | 3 | NULL | Y | +------+------+------+------+ 3 rows in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 1 | | V | 2 | 2 | W | X | 3 | | Y (3 rows) |
Outer join excluding inner join
An outer join excluding inner join returns the uncommon values in both tables. The matching values are omitted here. It is not supported by Postgres and MySQL natively. For MySQL, we have to use UNION to combine the results of left and right join and add a where condition. For Postgres, we have to use outer join and where condition.
MySQL : | PostgreSQL: |
---|---|
mysql> SELECT A.id, B.id, A.val, B.val FROM A LEFT JOIN B ON A.id = B.id WHERE B.id IS NULL UNION SELECT A.id, B.id, A.val, B.val FROM A RIGHT JOIN B ON A.id = B.id WHERE A.id IS NULL; |
test=# SELECT A.id, B.id, A.val, B.val FROM A FULL OUTER JOIN B ON A.id = B.id WHERE A.id IS NULL OR B.id IS NULL; |
+------+------+------+------+ | id | id | val | val | +------+------+------+------+ | 1 | NULL | V | NULL | | NULL | 3 | NULL | Y | +------+------+------+------+ 2 rows in set (0.00 sec) |
id | id | val | val ----+----+-----+----- 1 | | V | | 3 | | Y (2 rows) |
UNION and UNION ALL
UNION and UNION ALL both combine result sets of two or more select queries into a single result set. UNION produces a distinct result set where duplicate values are not shown. UNION ALL produces a result set that shows results from all the select queries. The duplicate values are not omitted; they are also shown.
UNION : | UNION ALL: |
---|---|
SELECT A.id FROM A UNION SELECT B.id FROM B; |
SELECT A.id FROM A UNION ALL SELECT B.id FROM B; |
+------+ | id | +------+ | 1 | | 2 | | 3 | +------+ |
+------+ | id | +------+ | 1 | | 2 | | 2 | | 3 | +------+ |
Conclusion
Mastering SQL joins is an essential skill for any database user. It helps in enhancing query performance, extracting relevant data, and delivering insights. In this blog, we covered the most commonly used queries for extracting data between two tables. Depending on the specific conditions, the queries can be modified further to filter out irrelevant data according to your needs. The usage and understanding of joins will help to easily create complex queries and assist in data retrieval and data analytics. Incorrect usage of joins could also make your query slower; always check the query performance using EXPLAIN to make optimizations and subsequent modifications in the query.
Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.