6.4 — Beware of Duplicates When Joining

What You'll Learn

Why joining tables can create duplicate rows
How to diagnose and prevent unwanted duplicates
Best practices for safe joins

Sample Tables

Table sales:

SaleID	ProductID	Amount
1	101	120
2	102	80
3	101	50
4	103	90

Table products:

ProductID	ProductName
101	Apples
101	Apples
102	Oranges
103	Bananas

Why Do Duplicates Happen?

When you join tables in SQL, each row from the first table is matched with every row from the second table that meets the join condition.
If either table has duplicate values in the join column, you may get more rows than expected.

Example:

Suppose your sales table has multiple sales for the same ProductID, and your products table also has duplicate ProductID values (perhaps due to data entry errors):

sql

SELECT s.SaleID, p.ProductID, p.ProductName, s.Amount
FROM sales s
INNER JOIN products p
  ON s.ProductID = p.ProductID;

If products has two rows for the same ProductID, each sale for that product will appear twice in the result.

How to Diagnose Duplicates

Check for duplicates in your join columns before joining:

sql

SELECT ProductID, COUNT(*)
FROM products
GROUP BY ProductID
HAVING COUNT(*) > 1;

If this returns any rows, you have duplicate keys in products.

How to Prevent or Handle Duplicates

Ask to fix the data quality issue by deduplicating rows in the source tables before joining.
Ensure your join columns are unique identifiers (primary keys) in at least one table.
Option: Use SELECT DISTINCT after your join to remove duplicate results (but this only hides underlying data issues).

Key Points

Duplicates in join columns can multiply your result rows unexpectedly.
Always check your data for duplicates before joining.
Use unique keys for reliable joins.

Next Steps

Now that you understand the risks of duplicates, you’re ready to confidently join tables and build robust SQL analyses.

6.4 — Beware of Duplicates When Joining ​

What You'll Learn ​

Sample Tables ​

Why Do Duplicates Happen? ​

How to Diagnose Duplicates ​

How to Prevent or Handle Duplicates ​

Key Points ​

Next Steps ​