Step-by-step guide to Visualizations in Python

Updated: Oct 30, 2020

Create great-looking, professional visualizations in Python using Matplotlib, Seaborn, and many more packages

Data Visualization

Data visualization is the graphical representation of data. It involves producing visual elements such as charts, dashboards, graphs, and maps that give people an accessible way to understand trends, outliers, and patterns in data. How well a visualization reaches its audience depends on how creatively we present the data and on how clearly the visual communicates what the data represents.

Python for Visualization

Python is a highly popular general-purpose programming language, and it is extremely useful for data scientists who want to create beautiful visualizations. It offers a variety of packages for both data processing and visualization. In this article, we are going to use two of Python’s best-known visualization packages, Matplotlib and Seaborn.

Steps Involved in our Visualization

  1. Importing packages

  2. Importing and Cleaning Data

  3. Creating beautiful Visualizations (12 Types of Visuals)

Step-1: Importing Packages

Every process in Python, not only data visualization, starts by importing the required packages. Our primary packages are Pandas for data processing, Matplotlib for visuals, Seaborn for advanced visuals, and NumPy for scientific calculations. Let’s import!

Python Implementation:
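
A minimal sketch of the import step described here. The figure size of 12 x 6 inches is an assumed value chosen for illustration; pick whatever suits your screen.

```python
# Core packages used throughout the article
import numpy as np                 # numerical helpers
import pandas as pd                # data loading and cleaning
import matplotlib.pyplot as plt    # base plotting
import seaborn as sns              # statistical plots built on Matplotlib
from matplotlib import rcParams

# Use the 'ggplot' style sheet that ships with Matplotlib
plt.style.use("ggplot")

# Default figure size in inches (width, height); 12 x 6 is an assumption
rcParams["figure.figsize"] = (12, 6)
```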

In this step, we import all primary packages and set our graph style to ‘ggplot’ (named after the grammar of graphics). Apart from ‘ggplot’, Matplotlib ships many other styles; ‘plt.style.available’ lists them. We will also use the ‘cyberpunk’ style for specific chart types later on. Finally, we set the default size of our charts.

Step-2: Importing and Cleaning Data

This is an important step, because clean data is the basis of a good visualization. Throughout this article, we will be using a Kaggle dataset on immigration to Canada from 1980 to 2013. Follow the code for importing and cleaning the data.
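
The exact cleaning steps depend on your copy of the dataset, so here is only a hedged sketch of the idea. The column names ('OdName', 'AreaName', 'Type', 'Coverage') are assumptions about the file layout, and the rows are invented placeholders, not real immigration figures.

```python
import pandas as pd

years = list(range(1980, 1985))  # shortened; the real data runs to 2013

# Invented placeholder rows standing in for the downloaded file.
# With the real data you would start from pd.read_csv(<path to your copy>).
raw = pd.DataFrame({
    "OdName": ["Country A", "Country B", "Country C"],
    "AreaName": ["Asia", "Europe", "Africa"],
    "Type": ["Immigrants"] * 3,
    "Coverage": ["Foreigners"] * 3,
    **{year: [100 + i, 250 - i, 40 + 2 * i] for i, year in enumerate(years)},
})

# Drop columns we do not plot, give the rest readable names,
# and index the frame by country
df = (
    raw.drop(columns=["Type", "Coverage"])
       .rename(columns={"OdName": "Country", "AreaName": "Continent"})
       .set_index("Country")
)

# Make every column label a string so years can be selected consistently
df.columns = [str(col) for col in df.columns]

# Add a per-country total across all year columns
year_cols = [str(year) for year in years]
df["Total"] = df[year_cols].sum(axis=1)
```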

We have successfully imported and cleaned our dataset. Now we are set to do our visualizations using our cleaned dataset.

Step-3: Creating Beautiful Visualizations

In this step, we are going to create 12 different types of visualizations, from basic charts to advanced ones. Let’s do it!

i) Line Chart

The line chart is the most common of all visualizations, and it is very useful for observing trends and for time series analysis. We will start with a basic single line plot and then proceed to a multiple line chart.

Single Line chart Python Implementation:
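
A minimal sketch of a single line plot. The yearly values are randomly generated placeholders (not real immigration figures), used only so the example runs on its own.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

years = np.arange(1980, 2014)
rng = np.random.default_rng(0)
# Placeholder values with an upward drift; NOT real immigration figures
immigrants = 1000 + 40 * (years - 1980) + rng.integers(-150, 150, size=years.size)

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(years, immigrants, color="tab:blue", linewidth=2, marker="o", markersize=4)
ax.set_title("Immigrants per year (placeholder data)")
ax.set_xlabel("Year")
ax.set_ylabel("Number of immigrants")
```

When running this as a script, call plt.show() at the end to display the figure.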

Output :

Multiple Line chart Python Implementation:
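
For several lines, one plot call per column does the job. This sketch assumes a data frame with one column per country; the country names and numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

years = np.arange(1980, 2014)
rng = np.random.default_rng(1)
# Placeholder columns, one per hypothetical country
df_lines = pd.DataFrame(
    {
        name: base + rng.integers(0, 300, size=years.size).cumsum()
        for name, base in [("Country A", 500), ("Country B", 2000), ("Country C", 1200)]
    },
    index=years,
)

fig, ax = plt.subplots(figsize=(12, 6))
for country in df_lines.columns:
    # The label is picked up by the legend below
    ax.plot(df_lines.index, df_lines[country], linewidth=2, label=country)
ax.set_title("Immigrants per year by country (placeholder data)")
ax.set_xlabel("Year")
ax.set_ylabel("Number of immigrants")
ax.legend()
```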

Output :

All plots so far are based on the ‘ggplot’ style. Now let’s try out the multiple line chart using the ‘cyberpunk’ style, which is suitable only for specific chart types. To use the ‘cyberpunk’ style in Python, you first need to install the ‘mplcyberpunk’ package. After installing it, follow the code to produce a neon-style plot.

Cyberpunk line chart Python Implementation:

Output :

ii) Bar Chart

A bar chart is a type of representation mainly used for ranking values, and it can easily be produced in Python using Matplotlib. We are going to further divide bar charts into vertical, horizontal, and grouped bar charts. There are many other types, but these three are the most commonly used. Let’s do it in Python!

Vertical bar chart Python Implementation :
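
A hedged sketch of a vertical bar chart. It builds a small ‘df_tot’ frame of yearly totals; the structure of that frame (a single 'total' column indexed by year) is an assumption, and the values are placeholders.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

years = np.arange(1980, 2014)
rng = np.random.default_rng(2)
# Placeholder yearly totals; with the cleaned dataset this would be the
# sum over all countries for each year column
df_tot = pd.DataFrame(
    {"total": 90_000 + rng.integers(0, 5_000, size=years.size).cumsum()},
    index=years,
)

fig, ax = plt.subplots(figsize=(12, 6))
ax.bar(df_tot.index, df_tot["total"], color="tab:blue", edgecolor="black")
ax.set_title("Total immigrants per year (placeholder data)")
ax.set_xlabel("Year")
ax.set_ylabel("Number of immigrants")
```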

Output :

Horizontal bar chart Python Implementation :
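
Horizontal bars suit rankings with long category names. A minimal sketch with hypothetical countries and invented totals:

```python
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

# Placeholder totals per hypothetical country
totals = pd.Series(
    {"Country A": 520, "Country B": 310, "Country C": 450, "Country D": 120, "Country E": 275}
)
# Sort ascending so the largest bar ends up at the top of the chart
top = totals.sort_values()

fig, ax = plt.subplots(figsize=(12, 6))
ax.barh(top.index, top.values, color="tab:green")
ax.set_title("Immigrants by country (placeholder data)")
ax.set_xlabel("Number of immigrants")
```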

Output :

Grouped bar chart Python Implementation :
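
Matplotlib has no dedicated grouped-bar function, so a common approach is to shift each series sideways by a fraction of the group width. A sketch with made-up numbers:

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

years = ["1980", "1990", "2000", "2010"]
# Placeholder counts per hypothetical country
series = {
    "Country A": [120, 180, 260, 310],
    "Country B": [200, 190, 170, 160],
    "Country C": [60, 90, 150, 240],
}

x = np.arange(len(years))     # one group per year
width = 0.8 / len(series)     # each group occupies 80% of its slot

fig, ax = plt.subplots(figsize=(12, 6))
for i, (name, values) in enumerate(series.items()):
    # Centre the bars of a group around the group's tick position
    offset = (i - (len(series) - 1) / 2) * width
    ax.bar(x + offset, values, width=width, label=name)

ax.set_xticks(x)
ax.set_xticklabels(years)
ax.set_ylabel("Number of immigrants")
ax.legend()
```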

Output :

iii) Area Chart

Like line charts, area charts are extremely useful for time series analysis. An area chart looks much like a line chart; the only difference is that the space between the line and the axis is filled with color. This type of representation is further divided into simple, stacked, and unstacked area charts. Let’s dive into the code section of Area Charts!

Simple area chart Python Implementation :

For this, we are going to use the ‘df_tot’ data frame that we created while producing the vertical bar chart.
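
A minimal sketch of a simple area chart. To keep it self-contained it rebuilds a placeholder ‘df_tot’ (a single 'total' column indexed by year, which is an assumption about its shape) instead of reusing the real one.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

years = np.arange(1980, 2014)
rng = np.random.default_rng(2)
# Placeholder yearly totals, NOT real figures
df_tot = pd.DataFrame(
    {"total": 90_000 + rng.integers(0, 5_000, size=years.size).cumsum()},
    index=years,
)

fig, ax = plt.subplots(figsize=(12, 6))
# Fill the region between the curve and zero, then draw the curve on top
ax.fill_between(df_tot.index, df_tot["total"], color="tab:blue", alpha=0.4)
ax.plot(df_tot.index, df_tot["total"], color="tab:blue", linewidth=2)
ax.set_title("Total immigrants per year (placeholder data)")
ax.set_xlabel("Year")
ax.set_ylabel("Number of immigrants")
```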

Output :

We can also produce a simple area chart using the ‘cyberpunk’ plot style that we used before for the multiple line chart.

Cyberpunk simple area chart Python Implementation:

Output :

iv) Box Plot

A box plot is often used in Exploratory Data Analysis to get a statistical view of a given data frame. It also helps us observe the skewness, distribution, and outliers of the data. We are going to see how to plot vertical and horizontal box plots in Python.

Vertical box plot Python Implementation:
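
A sketch of a vertical box plot comparing three groups. The groups are randomly generated stand-ins for yearly counts of three hypothetical countries.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

rng = np.random.default_rng(3)
names = ["Country A", "Country B", "Country C"]
# Placeholder data: 34 yearly values per hypothetical country
data = [
    rng.normal(loc=mean, scale=spread, size=34)
    for mean, spread in [(500, 80), (900, 150), (300, 40)]
]

fig, ax = plt.subplots(figsize=(12, 6))
box = ax.boxplot(data)          # boxes are vertical by default
ax.set_xticks([1, 2, 3])        # boxplot places the boxes at 1, 2, 3
ax.set_xticklabels(names)
ax.set_ylabel("Number of immigrants")
```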

Output :

Horizontal box plot Python Implementation :
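
The horizontal variant only needs the orientation flipped. This sketch uses the long-standing vert=False argument; newer Matplotlib releases also offer an orientation argument for the same purpose, so check which one your version prefers. The data are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

rng = np.random.default_rng(3)
names = ["Country A", "Country B", "Country C"]
# Placeholder data: 34 yearly values per hypothetical country
data = [
    rng.normal(loc=mean, scale=spread, size=34)
    for mean, spread in [(500, 80), (900, 150), (300, 40)]
]

fig, ax = plt.subplots(figsize=(12, 6))
box = ax.boxplot(data, vert=False)   # draw the boxes horizontally
ax.set_yticks([1, 2, 3])
ax.set_yticklabels(names)
ax.set_xlabel("Number of immigrants")
```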

Output :

v) Scatter Plot

A scatter plot displays the values of (typically) two variables. It is very useful for observing the relationship between the variable on the X axis and the variable on the Y axis. Let’s produce a simple scatter plot using the ‘Iris’ dataset in Python!

Scatter plot Python Implementation:
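
A minimal scatter plot sketch. To stay self-contained it does not load the Iris file; the two arrays below are synthetic stand-ins for two measurement columns, so swap in the real columns when you have the dataset at hand.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

rng = np.random.default_rng(4)
# Synthetic stand-ins for two numeric columns (NOT the real Iris values)
x_values = rng.normal(loc=5.8, scale=0.8, size=150)
y_values = 0.2 * x_values + rng.normal(loc=1.9, scale=0.4, size=150)

fig, ax = plt.subplots(figsize=(12, 6))
points = ax.scatter(x_values, y_values, s=60, alpha=0.7, edgecolors="black")
ax.set_title("Scatter plot (synthetic data)")
ax.set_xlabel("First measurement")
ax.set_ylabel("Second measurement")
```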

Output :

vi) Histogram

A histogram is a type of chart commonly used for observing the frequency distribution of a given variable. For this type of chart, we are going to use the same Iris dataset that we used before, and Seaborn for better quality. Let’s make a histogram in Python!

Histogram Python Implementation:

Output :

vii) Bubble Plot

This type of chart is very similar to a scatter plot, but it represents three dimensions of data: the third is encoded as the size of each bubble. For this chart, we are going to generate the values using NumPy’s ‘random’ module and use Matplotlib to produce the chart. Let’s do it in Python!

Bubble plot Python Implementation:
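
A minimal sketch of a bubble plot built from random values. The number of points (40) and the size scaling (up to 1500) are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot")

rng = np.random.default_rng(6)
n_points = 40
x = rng.random(n_points)              # first dimension
y = rng.random(n_points)              # second dimension
sizes = rng.random(n_points) * 1500   # third dimension, shown as marker area

fig, ax = plt.subplots(figsize=(12, 6))
bubbles = ax.scatter(x, y, s=sizes, color="tab:blue", alpha=0.5, edgecolors="black")
ax.set_title("Bubble plot (random data)")
ax.set_xlabel("X")
ax.set_ylabel("Y")
```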

Output :

viii) Pie Chart

A pie chart is a circular statistical graphic divided into slices to represent numerical proportions of the given data. Using Matplotlib, we can produce beautiful custom pie charts. Let’s produce a pie chart in Python!

Pie chart Python Implementation:
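
A sketch of a labelled pie chart. The region names and shares are invented placeholders that happen to sum to 100.

```python
import matplotlib.pyplot as plt

plt.style.use("ggplot")

labels = ["Region A", "Region B", "Region C", "Region D", "Region E"]
shares = [45, 25, 12, 10, 8]   # placeholder proportions

fig, ax = plt.subplots(figsize=(8, 8))
wedges, texts, autotexts = ax.pie(
    shares,
    labels=labels,
    autopct="%1.1f%%",   # print each slice's percentage
    startangle=90,       # start the first slice at 12 o'clock
)
ax.set_title("Share of immigrants by region (placeholder data)")
```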

Output :

ix) Doughnut Chart

A doughnut chart is very similar to a pie chart, but it can plot more than one data series. For our visualization, we are going to use only one dataset, the Immigration dataset. Let’s do it in Python!

Doughnut chart Python Implementation:
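
One way to get a doughnut in Matplotlib is to draw a pie whose wedges have a limited width, which leaves a hole in the middle. A sketch with placeholder values; the ring width of 0.4 is an arbitrary choice.

```python
import matplotlib.pyplot as plt

plt.style.use("ggplot")

labels = ["Region A", "Region B", "Region C", "Region D"]
shares = [40, 30, 20, 10]   # placeholder proportions

fig, ax = plt.subplots(figsize=(8, 8))
wedges, texts, autotexts = ax.pie(
    shares,
    labels=labels,
    autopct="%1.1f%%",
    pctdistance=0.8,                                  # put percentages inside the ring
    startangle=90,
    wedgeprops={"width": 0.4, "edgecolor": "white"},  # ring instead of full slices
)
ax.set_title("Share of immigrants by region (placeholder data)")
```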

Output :

x) Regression Plot

Regression plots help data scientists observe patterns in a dataset during Exploratory Data Analysis (EDA) and represent the linear relationship between two variables. They also illustrate the trend between the given ‘X’ and ‘Y’ variables. So, let’s create a strong-trend and a weak-trend regression plot using Seaborn in Python!

Strong trend regression Python Implementation:
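
A hedged sketch of a regression plot with sns.regplot. The data frame and its column names ('year', 'total') are hypothetical, and the values are generated with a built-in upward slope so the trend is clearly visible.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use("ggplot")

years = np.arange(1980, 2014)
rng = np.random.default_rng(7)
# Placeholder totals: a rising line plus noise, NOT real figures
df_reg = pd.DataFrame(
    {
        "year": years,
        "total": 90_000 + 4_000 * (years - 1980) + rng.normal(0, 8_000, size=years.size),
    }
)

fig, ax = plt.subplots(figsize=(12, 6))
# regplot draws the points, the fitted line, and a confidence band
sns.regplot(
    x="year", y="total", data=df_reg, ax=ax,
    scatter_kws={"s": 60}, line_kws={"color": "tab:red"},
)
ax.set_title("Total immigrants per year (placeholder data)")

# The slope of a straight-line fit gives a quick numeric check of the trend
slope = np.polyfit(df_reg["year"], df_reg["total"], deg=1)[0]
```

A positive slope confirms what the plot shows: the fitted line rises from left to right.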

Output :

We can observe that the total number of immigrants to Canada shows a strong trend, which means the numbers are increasing year by year. Now, let’s create a weak-trend regression plot.

Weak trend regression Python Implementation:

Output :

It is clear that the total number of immigrants from Scandinavia (Denmark, Norway, and Sweden) to Canada fell year by year; hence, it followed a weak trend.

xi) Word Cloud

A word cloud is a visual representation of text data that highlights its keywords and helps people easily understand the context of the text. Unfortunately, Matplotlib doesn’t have a built-in function to create a word cloud, so we are going to use the ‘wordcloud’ package in Python. We also need a text file of an article or essay to build the cloud from. Let’s do it!

Word cloud Python Implementation:

Output :

From this word cloud chart, we observe that the given text file is all about Blockchain and its components like 'consensus', 'PoW (Proof-of-Work)', 'hash', 'block', and so on. Awesome!

xii) Lollipop Chart

This type of chart is very similar to a bar chart. Lollipop charts help in ranking values and observing trends. Creating a lollipop chart is simple in Matplotlib. Let’s do it!

Lollipop chart Python Implementation:
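
One simple way to build a lollipop chart is to draw thin lines for the sticks and round markers for the heads. This sketch does that with hlines and plot; the countries and totals are placeholders.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

# Placeholder totals per hypothetical country, sorted for ranking
totals = pd.Series(
    {"Country A": 520, "Country B": 310, "Country C": 450, "Country D": 120, "Country E": 275}
).sort_values()
positions = np.arange(len(totals))

fig, ax = plt.subplots(figsize=(12, 6))
# Sticks: one horizontal line per country, from zero to its value
ax.hlines(y=positions, xmin=0, xmax=totals.values, color="tab:blue", linewidth=2)
# Heads: a round marker at the end of each stick
ax.plot(totals.values, positions, "o", color="tab:blue", markersize=10)
ax.set_yticks(positions)
ax.set_yticklabels(totals.index)
ax.set_xlabel("Number of immigrants")
ax.set_title("Immigrants by country (placeholder data)")
```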

Output :

Final Thoughts!

Finally, we have reached the end, having learned how to create twelve different types of visualizations in Python using packages like Matplotlib, Seaborn, wordcloud, and so on. But this isn’t the end. We covered only some of the basic visuals in Python, and there are many more than you might think, such as geospatial visualizations, network graphs, and Sankey diagrams; the list goes on and on. You can find great resources on the internet, including many free online courses. Apart from learning, practical implementation is what proves your knowledge. So, start learning and get your feet wet in the world of Data Science. If you missed the code for any of the chart types, don’t worry: I’ve provided the full code for all of the visualizations.

Happy Visualizing!

Full code: