Analysis

LONDON ANALYTICS: Machine Learning, SQL, Power Query & Tableau

Data Cleaning

Data cleaning is an important first step in almost every data journey, and a highly underrated skill! It is often the most time-consuming part of the data lifecycle (something I also found in my first project).

We have to ensure the data is reliable, in a format computers can read, and contains no (or minimal) missing values. Many tools can do this; I have demonstrated a few of them below.
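As a rough illustration of those three checks, here is a minimal sketch using only Python's standard library. The sample rows, the column names (`area`, `date`, `average_price`) and the `clean_rows` helper are all made up for this example; they are not the London Datastore data, and this is not the Power Query or SQL workflow used in the project.

```python
import csv
import io

# Made-up sample rows, purely for illustration (not the real dataset).
RAW = """area,date,average_price
Camden , 2019-01-01 ,"812,000"
camden,2019-02-01,
Hackney,2019-01-01,"565,000"
"""


def clean_rows(text):
    """Return (cleaned_rows, n_dropped) for a small CSV held in a string."""
    cleaned, dropped = [], 0
    for row in csv.DictReader(io.StringIO(text)):
        # Trim stray whitespace so values are consistent and machine-readable.
        row = {key: (value or "").strip() for key, value in row.items()}
        if not row["average_price"]:
            # Missing value: dropped here, though imputing may suit other cases.
            dropped += 1
            continue
        cleaned.append({
            "area": row["area"].title(),  # normalise capitalisation
            "date": row["date"],
            # Remove thousands separators so the price becomes a number.
            "average_price": int(row["average_price"].replace(",", "")),
        })
    return cleaned, dropped


rows, n_dropped = clean_rows(RAW)
print(rows, n_dropped)
```

Whether to drop or fill in a missing value depends on the dataset; dropping is simply the shortest option to show here.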

Each step was completed in order, as laid out below.

1. Acquisition

Acquiring our data from the London Data Store.

Clean my data

2. Power Query

Cleaning data using Excel’s Power Query.

Clean my data

3. SQL

Cleaning in Microsoft SQL Server Management Studio.

Clean my data

Exploratory data analysis (EDA)

Once the data has been cleaned, we can see what insights SQL and simple graphical visualisations in Seaborn can give us. The idea of an EDA is simply to explore: one query may lead to the next, eventually uncovering some interesting insights.
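To show how one query can lead to the next, here is a small sketch that runs SQL through Python's built-in `sqlite3` module. SQLite stands in for SQL Server purely so the example is self-contained, and the `prices` table and its values are invented for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (area TEXT, year INTEGER, average_price INTEGER)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("Camden", 2018, 800000),
        ("Camden", 2019, 820000),
        ("Hackney", 2018, 540000),
        ("Hackney", 2019, 570000),
    ],
)

# Query 1: which area has the highest mean price overall?
by_area = conn.execute(
    "SELECT area, AVG(average_price) AS mean_price "
    "FROM prices GROUP BY area ORDER BY mean_price DESC"
).fetchall()

# Query 2 follows from the answer to query 1:
# how did prices in the top area change from year to year?
top_area = by_area[0][0]
trend = conn.execute(
    "SELECT year, average_price FROM prices WHERE area = ? ORDER BY year",
    (top_area,),
).fetchall()

print(by_area)
print(top_area, trend)
```

The second query only exists because of what the first one returned, which is the exploratory loop in miniature.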

5. SQL

A brief EDA using SQL.

Analysis

6. Seaborn

A brief EDA using Seaborn.

Analysis

Dashboards and Predictions

After conducting our EDA, we can move on to the finished products. A dashboard offers the opportunity to present our findings interactively. Machine learning, on the other hand, lets us use historic data to predict what may happen in the future!
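The idea of predicting from historic data can be sketched with the simplest possible model: a straight line fitted by least squares. This is an assumption chosen for illustration, not necessarily the model used in the predictions project, and the figures below are toy numbers rather than real house prices.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy history: a made-up price index rising by 10 each year.
years = [2016, 2017, 2018, 2019]
price_index = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(years, price_index)

# Extrapolate the fitted trend one year beyond the data.
forecast_2020 = slope * 2020 + intercept
print(round(forecast_2020, 2))
```

Real price series are rarely this tidy, so a forecast like this is best read as a trend estimate rather than a firm prediction.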

7. Tableau Dashboard

Tableau Dashboard – London - 2014-2020.

Dashboard

8. Predictions (ML)

Machine Learning - London Housing - 1995-2021.

Insights