SQL is a key tool in a data science professional's armory. I'm speaking from experience: you simply cannot expect to carve out a successful career in either analytics or data science if you haven't yet picked up SQL.
And why is SQL so important?
As we move into a new decade, the rate at which we are producing and consuming data is skyrocketing by the day. To make smart decisions based on data, organizations around the world are hiring data professionals like business analysts and data scientists to mine and unearth insights from the vast treasure trove of data.
And one of the most important tools required for this is – you guessed it – SQL!
Structured Query Language (SQL) has been around for decades. It is a domain-specific language used for managing the data held in relational databases, and it is used by companies all around the world. A data analyst can use SQL to access, read, manipulate, and analyze the data stored in a database and generate useful insights to drive an informed decision-making process.
In this article, I will be discussing 8 SQL techniques/queries that will make you ready for any advanced data analysis problems. Do keep in mind that this article assumes a very basic knowledge of SQL.
Let's First Understand the Dataset
What is the best way to learn data analysis? By performing it side by side on a dataset! For this purpose, I have created a dummy dataset of a retail store. The customer data table is represented by ConsumerDetails.
Our dataset consists of the following columns:
Name – The name of the consumer
Locality – The locality of the consumer
Total_amt_spend – The total amount of money spent by the consumer in the store
Industry – The industry to which the consumer belongs
SQL Technique #1 – Counting Rows and Items
Count Function
We will begin our analysis with the simplest query, i.e., counting the number of rows in our table. We will do this using the COUNT() function.
Great! Now we know that our table has 10 rows. Using this function on a small test dataset may seem silly, but it helps a lot when your rows run into the millions!
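As a minimal sketch of the COUNT() query, assume a small invented stand-in for ConsumerDetails loaded into an in-memory SQLite database through Python's sqlite3 module. The sample rows and the choice of SQLite are assumptions made for illustration; this is not the 10-row dataset used in this article, so the count differs.

```python
import sqlite3

# Hypothetical stand-in for the ConsumerDetails table.
# These rows are invented for illustration; they are not the article's data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails ("
    "Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany(
    "INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)",
    [
        ("Asha", "North", 1200, "Retail"),
        ("Ravi", "South", 800, "Banking"),
        ("Meera", "North", 1500, "Retail"),
        ("John", "East", 600, "Healthcare"),
    ],
)

# COUNT(*) returns the number of rows in the table.
row_count = conn.execute("SELECT COUNT(*) FROM ConsumerDetails").fetchone()[0]
print(row_count)  # 4 for the four sample rows above
```

The SQL statement itself, SELECT COUNT(*) FROM ConsumerDetails, is the part that carries over to any relational database; the Python around it only sets up the sample table.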
Distinct Keyword
Data tables often contain duplicate values. To retrieve only the unique values, we use the DISTINCT keyword.
In our dataset, how can we find the unique industries that customers belong to?
You guessed it right. We can do this by selecting the Industry column with DISTINCT.
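A query for the unique industries might look like the sketch below. It again assumes an invented sample of ConsumerDetails in an in-memory SQLite database (the rows are hypothetical, not the article's data). The ORDER BY clause is an addition that only makes the output order predictable.

```python
import sqlite3

# Hypothetical stand-in for ConsumerDetails; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails ("
    "Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany(
    "INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)",
    [
        ("Asha", "North", 1200, "Retail"),
        ("Ravi", "South", 800, "Banking"),
        ("Meera", "North", 1500, "Retail"),
        ("John", "East", 600, "Healthcare"),
    ],
)

# DISTINCT collapses duplicate Industry values into one row each.
query = "SELECT DISTINCT Industry FROM ConsumerDetails ORDER BY Industry"
industries = [row[0] for row in conn.execute(query)]
print(industries)  # ['Banking', 'Healthcare', 'Retail']
```

"Retail" appears twice in the sample rows but only once in the result.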
You can even count the number of unique values in a column by combining COUNT() with DISTINCT, as in COUNT(DISTINCT Industry).
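Counting the unique industries can be sketched as follows, under the same assumptions as before: an invented sample of ConsumerDetails in an in-memory SQLite database, not the article's actual rows.

```python
import sqlite3

# Hypothetical stand-in for ConsumerDetails; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails ("
    "Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany(
    "INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)",
    [
        ("Asha", "North", 1200, "Retail"),
        ("Ravi", "South", 800, "Banking"),
        ("Meera", "North", 1500, "Retail"),
        ("John", "East", 600, "Healthcare"),
    ],
)

# COUNT(DISTINCT col) counts each distinct non-NULL value of col once.
query = "SELECT COUNT(DISTINCT Industry) FROM ConsumerDetails"
unique_industries = conn.execute(query).fetchone()[0]
print(unique_industries)  # 3 for the sample rows above
```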
SQL Technique #2 – Aggregation Functions
Aggregation functions are the foundation of any kind of data analysis. They give us an overview of the dataset. The functions we will be discussing are SUM(), AVG(), and STDDEV().
Calculate sum
We use the SUM() function to calculate the sum of a numerical column in a table.
Let's find the total amount spent across all customers.
Selecting SUM(Total_amt_spend) under the alias sum_all returns the total in a column named sum_all. The sum of the amount of money spent by consumers is Rs. 12,560.
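A minimal sketch of the SUM() query with the sum_all alias is shown below. It assumes an invented sample of ConsumerDetails in an in-memory SQLite database, so the total differs from the Rs. 12,560 reported for the article's dataset.

```python
import sqlite3

# Hypothetical stand-in for ConsumerDetails; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read result columns by name
conn.execute(
    "CREATE TABLE ConsumerDetails ("
    "Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany(
    "INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)",
    [
        ("Asha", "North", 1200, "Retail"),
        ("Ravi", "South", 800, "Banking"),
        ("Meera", "North", 1500, "Retail"),
        ("John", "East", 600, "Healthcare"),
    ],
)

# SUM() adds up the column; AS sum_all names the resulting column.
query = "SELECT SUM(Total_amt_spend) AS sum_all FROM ConsumerDetails"
row = conn.execute(query).fetchone()
total_spend = row["sum_all"]
print(total_spend)  # 4100 = 1200 + 800 + 1500 + 600
```

The alias is what makes the result readable by name (row["sum_all"]); without it, the column would be labeled with the raw expression.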
