This project demonstrates how I cleaned and transformed a messy dataset using MySQL as part of a learning experience inspired by a YouTube tutorial. The goal was to enhance my skills in writing efficient SQL queries and applying best practices for data cleaning.

The project covers:

- Identifying and removing duplicate records.
- Handling NULL values and inconsistent data entries.
- Transforming data to improve structure and usability for analysis.
- Writing optimized SQL queries to process large datasets efficiently.

Skills developed:

- Proficiency in MySQL for data cleaning and transformation.
- Understanding of common data quality issues and solutions.
- Writing structured and maintainable SQL scripts.

| File Name | Description |
|---|---|
| README.md | Comprehensive documentation for the project. |
| layoffs.csv | Raw dataset used for the data cleaning process. |
| explore_data.sql | Queries to explore and understand the dataset. |
| remove_duplicates.sql | Queries to identify and remove duplicate records. |
| null_or_blank_values.sql | Queries to handle NULL or blank values effectively. |
| standardize_data.sql | Queries to standardize and ensure consistent data. |
| remove_unnecessary_data.sql | Queries to identify and delete unnecessary data. |
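
As a rough illustration of what a standardization step might look like (a sketch, not the actual contents of `standardize_data.sql`), the queries below trim stray whitespace from `company` and list the distinct `industry` labels so inconsistent spellings can be spotted by eye. They assume only the `layoffs_staging_02` table and the `company` and `industry` columns that appear in the queries in this README.

```sql
-- Sketch only: strip leading/trailing spaces so 'Acme ' and 'Acme' compare equal.
UPDATE layoffs_staging_02
SET company = TRIM(company);

-- List each distinct industry label once, sorted, to reveal near-duplicate spellings.
SELECT DISTINCT industry
FROM layoffs_staging_02
ORDER BY industry;
```

Reviewing the distinct values before updating anything keeps the fix targeted: only labels that are visibly inconsistent need a follow-up `UPDATE`.
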

```sql
-- Remove duplicates: rows with row_num > 1 are repeat copies of an earlier row.
DELETE
FROM layoffs_staging_02
WHERE row_num > 1;
```
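
The `DELETE` above relies on a `row_num` column already existing in `layoffs_staging_02`. One way such a column might be populated is with the `ROW_NUMBER()` window function (MySQL 8.0 or later) when the staging table is first created. The sketch below is an assumption about that setup step, not a copy of the project's script: the source table name `layoffs_staging` is hypothetical, and the `PARTITION BY` list uses only the columns visible in this README, whereas a real run would likely list every column that defines a duplicate.

```sql
-- Hypothetical setup, run before the DELETE: copy the data into a new
-- staging table and number the rows within each group of identical values.
CREATE TABLE layoffs_staging_02 AS
SELECT
    s.*,
    ROW_NUMBER() OVER (
        PARTITION BY company, location, industry  -- assumed duplicate key
    ) AS row_num
FROM layoffs_staging AS s;  -- assumed name of the unmodified copy

-- Preview what would be deleted before running the DELETE.
SELECT *
FROM layoffs_staging_02
WHERE row_num > 1;
```

The first row in each group gets `row_num = 1` and is kept; any further rows in the same group get 2, 3, and so on, which is what the `row_num > 1` filter targets. Once the duplicates are gone, the helper column can be dropped with `ALTER TABLE layoffs_staging_02 DROP COLUMN row_num;`.
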
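
```sql
-- Fill missing industry values by copying from another row
-- of the same company and location that has the industry populated.
UPDATE layoffs_staging_02 AS t1
JOIN layoffs_staging_02 AS t2
    ON t1.company = t2.company
    AND t1.location = t2.location
SET t1.industry = t2.industry
WHERE t1.industry IS NULL
    AND t2.industry IS NOT NULL;
```
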
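
The self-join above only fills rows where `industry IS NULL`, so a row whose `industry` is an empty or whitespace-only string would be skipped, and could even be copied from as if it were a valid value. A minimal sketch of a preparatory step, assuming blanks are stored as empty strings in the same table, is to convert them to `NULL` before the join runs:

```sql
-- Sketch only: treat empty or whitespace-only industry values as missing.
UPDATE layoffs_staging_02
SET industry = NULL
WHERE TRIM(industry) = '';

-- After the self-join, check which rows still have no industry;
-- these have no matching company/location row to copy from.
SELECT company, location, industry
FROM layoffs_staging_02
WHERE industry IS NULL;
```

Rows that remain `NULL` after both steps cannot be filled from the data itself and need a manual decision.
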
I would like to thank Alex The Analyst for their insightful tutorial, which served as the foundation for this project. Their guidance made learning data cleaning with MySQL much easier and more enjoyable.