Have you felt like your data is scattered like puzzle pieces across different tables? What if you could bring all those pieces together, revealing a complete and clear picture? That’s where SQL joins come in!
Joins are used to transform scattered data into meaningful insights. They enable you to bridge gaps between separate tables, uncover hidden connections, and elevate your SQL skills. Whether you’re a novice or an advanced user, mastering joins is essential for anyone looking to manage data across multiple tables and sources of relational databases.
In this post, I’ll explore joins and guide you through how to connect different tables in SQL based on a real business scenario.
Business problems
Remember the hypothetical company we created in our last post, TechCorp.
A business named TechCorp is struggling to manage employee information effectively. They need a systematic approach to store, retrieve, update, and delete employee data. The lack of a centralized database makes it challenging to track employees’ details, such as names, ages, and departments. To address this issue, TechCorp aims to implement a basic SQL solution for managing the processes.
TechCorp’s IT department believes SQL joins could integrate these data sources but needs a strategy to implement this solution effectively and address current issues while remaining scalable.
The challenge is to design and implement a data integration strategy using SQL joins that will:
- Provide real-time insights into employee assignments and project statuses
- Improve resource allocation and project management
- Enhance performance evaluations and decision-making
- Streamline reporting
- Accommodate future data needs
TechCorp has two main databases:
- An employee database with personal and professional details
- A project assignment database tracks employee assignments, project status, progress, deadlines, and outcomes
Using the steps below, we’re going to see how we can solve TechCorp’s business problems.
Check out this YouTube video for a step-by-step guide to take you through the process.
Solution steps
To solve TechCorp’s business problem, we are going to first create a new table called “Project” and populate it with project details. Then, run some initial queries to get started. As you progress, we are going to use left joins to link tables and master the art of combining data. Finally, we are going to identify the people managing these projects.
Prerequisites
- SQL Server Management Studio
- If you want to see how to create a database, check out my previous blog post: Mastering SQL Aggregates and Groups for Powerful Data Analysis – Data Analytics Blog (kelaanalytics.com)
Let’s code
Let’s create the project table using the CREATE command. Remember that this command is used to create new objects in the database. For this operation, we will write the following query:
CREATE TABLE Project (
ProjectID INT PRIMARY KEY,
ProjectName VARCHAR(100),
StartDate DATE,
EndDate DATE
);
To ensure you’re working with the correct database, include the USE command at the beginning of your SQL script.
This statement selects the specified database, allowing all subsequent SQL commands to operate within that context. It’s a good practice to include this command, especially when working with multiple databases, to avoid unintended operations on the wrong database.
After setting the database, execute the command to create the table.
To confirm the table creation, use one of these methods:
- Refresh the database and check the tables to see the presence of the Project table.
- Execute a SELECT command on the newly created table. It will show as empty initially.
Insert data
Now that we have successfully created a table, the next operation would be to fill the table with some hypothetical data that we have created for this post. To run this query, we will make use of the INSERT command. This command is used to insert data into the table.
INSERT INTO Project (ProjectID, ProjectName, ProjectManagerID)
VALUES
(1, 'Project A', 1),
(2, 'Project B', 2),
(3, 'Project C', 3),
(4, 'Project D', 4),
(5, 'Project E', 5),
(6, 'Project F', 6),
(7, 'Project G', 7),
(8, 'Project H', 8),
(9, 'Project I', 9);
You can view what you’ve done by using the retrieval statement:
SELECT * FROM Project
Let’s do something more fun. Say, we decide to get the average age of the employees and group this by department.
For this, we say:
SELECT
EMP.Department,
AVG(EMP.Age) AS AverageAge
FROM
Employees AS EMP
GROUP BY
EMP.Department;
Create a JOIN statement to link tables
We will use join to link employees to their projects, which will allow us to track who is working on what, and how different aspects of the business are interconnected.
Let’s write a query to link our project and employee tables together.
Inner join
Our first task would be to write a query that essentially lists all projects along with the names and IDs of the employees who manage them.
SELECT
EMP.EmployeeID,
EMP.FirstName,
EMP.LastName,
PRJ.ProjectName
FROM
Employees AS EMP
INNER JOIN
Project AS PRJ
ON
EMP.EmployeeID = PRJ.ProjectManagerID;
This command selects columns from the Employees table and joins them with the Project table, where EmployeeID in Employees matches ProjectManagerID in Project, revealing project assignments.
Our second task will be to identify who the project managers are, we are going to create a JOIN statement that links our employee and project tables. This query will specifically select the first and last names of the project.
SELECT DISTINCT
EMP.FirstName,
EMP.LastName
FROM
Employees AS EMP
INNER JOIN
Project AS PRJ
ON
EMP.EmployeeID = PRJ.ProjectManagerID;
Left Join
To perform a left join, which returns all values from the first table and any corresponding values from the second table, modify the join type:
SELECT
EMP.EmployeeID,
EMP.FirstName,
EMP.LastName,
PRJ.ProjectName
FROM
Employees AS EMP
LEFT JOIN
Project AS PRJ
ON
EMP.EmployeeID = PRJ.ProjectManagerID;
Executing this command will show that EmployeeID 5 has no project assigned.
Conclusion
SQL joins are powerful tools that transform disconnected data into valuable insights. By combining information from multiple tables, they enable businesses to uncover hidden relationships, streamline operations, and make data-driven decisions. Whether you’re analyzing customer behavior, optimizing inventory, or managing complex projects, mastering SQL joins is essential for unlocking the full potential of your data. As databases continue to grow in size and complexity, the ability to effectively use joins will remain a crucial skill for data professionals and a key driver of business intelligence.
[…] The database for this post has been provided in a previous post; you can get up to speed by reading it here: SQL Joins and Advanced Queries. […]