Deciphering Data to Uncover Hidden Insights – Understanding the Data

In this article series, we will be exploring data analytics for businesses using Alibaba Cloud QuickBI and sample data from banking and financial services.

By Ranjith Udayakumar, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

"The best vision is insight." – Malcolm Forbes. When it comes to data analytics for enterprises, nothing is more important than making accurate and reliable inferences from data. It is no surprise that enterprises are investing heavily in big data analytics, as they can reap larger profits with accurate insights. However, this is often easier said than done. Data collected from real-world applications is affected by many variables, making data prediction challenging. Regardless, data analytics remains essential for many, if not all, businesses around the world.

In this article, I will walk you through the process of deciphering data to uncover hidden insights.

Is This Article Series for Me?

This article is meant for everyone! This includes students who just want to familiarize themselves with general concepts, professional data analysts who want to learn new ways to analyze data, and business decision makers who want to know how to get better insights from business data. If you are not familiar with big data and analytics, you should browse our free e-learning classes on this subject and try the Alibaba Cloud Apsara Cloud Certifications to consolidate your knowledge.

Prerequisites

This article covers the overall process of deciphering data from conceptual, practical, and best practice perspectives. Anyone with valid data can use this article as a guide to get insights from data with the help of open-source technologies. However, if you are doing data analytics for business intelligence, I strongly recommend using Alibaba Cloud QuickBI.

To use Alibaba Cloud QuickBI, you need to do the following:

  1. Create an Alibaba Cloud account.
  2. Add a valid payment method to your account.
  3. Enroll in the free trial of QuickBI Pro in your console.

Overview of the Article

For this article, we are going to be looking at:

  1. Domain - BFSI (Banking, Financial Services, and Insurance)
  2. Modules - From Understanding Data to Visual Stories
  3. Use cases - ATM Analytics, Customer 360

We will be covering the entire process of deciphering data. The overall process involves:

  1. Understanding the data
  2. Wrangling the data according to your business scenario (if needed)
  3. Ingesting the data
  4. Modelling the data
  5. Visualizing the data

This multi-part article talks about how to collect data, wrangle the data, ingest the data, model the data, and visualize the data from three viewpoints (conceptual, practical, and best practice).

In the first article in this series, we are going to see how to understand the data better.

Understanding the Data (Conceptual)

When it comes to big data, more data isn't necessarily better. Your data is only as good as your ability to understand and communicate it, which is why understanding the data is so essential.

Once you've got your data, you need to consider the following problems:

  1. What do you do with it?
  2. What should you look for?
  3. Which tools should you use?

You will need to address these questions for your data analysis to be effective. We will provide some generalized answers for the above questions in this article.

What Do You Do with It?

We should analyze the data to understand the domain it belongs to. With the domain in mind, we should ask the right questions of the data to get insights out of it. For example, if the data shows ATM location details, transaction type, number of transactions, and transaction amount, it clearly belongs to the BFSI domain.

After we determine the domain, we need to decide what types of insights we can infer from the given data. We will do this in our practical section.

What Should You Look For?

We should look for "interesting" insights. As we discussed earlier, we need to ask the right questions of the data to understand it better and decipher insights.

For example, let's assume you have some understanding of the BFSI domain. Then you should be able to differentiate the Facts (Measures) from the Dimensions (everything other than Measures) to get a clear idea of the data. Facts are the numeric values we aggregate, such as a transaction amount, while dimensions are the attributes we group or filter by, such as an ATM location.

Next, we need to understand which facts and dimensions are available and which questions we should ask of the given data. We will do this in our practical section.
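As a rough illustration of separating facts from dimensions, the sketch below makes a first pass based on value types: columns whose values are all numeric become candidate facts, and everything else becomes a dimension. The function name `split_facts_and_dimensions` and the sample rows are hypothetical. The heuristic is only a starting point, since numeric columns such as an age or a day of the month are often better treated as dimensions.

```python
from numbers import Number


def split_facts_and_dimensions(rows):
    """First-pass split: all-numeric columns are candidate facts, the rest dimensions."""
    facts, dimensions = [], []
    for column in rows[0]:
        values = [row[column] for row in rows]
        # bool is a subclass of int, so exclude it explicitly.
        is_numeric = all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        (facts if is_numeric else dimensions).append(column)
    return facts, dimensions


# Made-up rows, for illustration only.
sample_rows = [
    {"atm_name": "ATM A", "weekday": "MONDAY",
     "no_of_withdrawals": 50, "total_amount_withdrawn": 123800},
    {"atm_name": "ATM B", "weekday": "TUESDAY",
     "no_of_withdrawals": 17, "total_amount_withdrawn": 52800},
]

facts, dimensions = split_facts_and_dimensions(sample_rows)
print("Facts:", facts)
print("Dimensions:", dimensions)
```

The result should always be reviewed by someone who knows the domain, because the type of a column does not say how business users intend to use it.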

What Tools Should You Use?

We need to choose the right tool to wrangle, process, and visualize the data effectively. There are a lot of tools available in the market, each with its own unique strengths.

When deploying on the cloud, I prefer using Alibaba Cloud Quick BI, which handles the majority of the required tasks with ease and at an affordable price.

  1. Quick BI allows you to perform data analytics, exploration, and reporting on mass data with drag-and-drop features and a rich variety of visuals.
  2. Quick BI empowers enterprise users to view and explore data and make informed, data-driven decisions.

In this article, we are going to use Alibaba Cloud QuickBI as the tool to decipher the data and get insights out of it. We will explore how to do this in our practical section.

Understanding the Data (Practical)

As we discussed earlier, we are going to understand the data better with real use cases.

Use Case 1: ATM Analytics

Here we will use the data from the ATM dataset.

What Do You Do with It?

As mentioned previously, we know that this data belongs to the BFSI domain. Specifically, this data describes ATM transactions. Before digging deeper, we need to understand the domain basics and how business users will view the data, so that we can proceed to the next question.

What Should You Look For?

As we discussed earlier, we need to ask the right questions to understand the data better. First, we need to differentiate the Facts (Measures) from the Dimensions (everything other than the Measures).

The Facts include:

  1. no_of_withdrawals
  2. no_of_cub_card_withdrawals
  3. no_of_other_card_withdrawals
  4. total_amount_withdrawn
  5. amount_withdrawn_cub_card
  6. amount_withdrawn_other_card

The Dimensions include:

  1. atm_name
  2. weekday
  3. festival_religion
  4. working_day
  5. holiday_sequence

After separating the facts and dimensions, we can now ask questions about the data. Questions may include:

  1. Total number of transactions
  2. Total transaction amount
  3. Top 5 ATMs by transaction volume
  4. Top 5 ATMs by transaction amount
  5. Lowest 5 ATMs by transaction volume
  6. Lowest 5 ATMs by transaction amount
  7. Number of different transactions by ATM

These questions are key to deriving insights from the data. Without the right questions, we can't derive the value we need from the data.
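To make a few of these questions concrete, here is a minimal sketch in plain Python that sums a fact for each ATM and then ranks the ATMs. The rows are made up for illustration, and the helper names `totals_by_atm` and `rank_atms` are hypothetical. Only the column names come from the fact and dimension lists above.

```python
from collections import defaultdict

# Made-up rows for illustration only.
rows = [
    {"atm_name": "ATM A", "no_of_withdrawals": 50, "total_amount_withdrawn": 123800},
    {"atm_name": "ATM B", "no_of_withdrawals": 17, "total_amount_withdrawn": 52800},
    {"atm_name": "ATM A", "no_of_withdrawals": 44, "total_amount_withdrawn": 98500},
    {"atm_name": "ATM C", "no_of_withdrawals": 71, "total_amount_withdrawn": 40100},
]


def totals_by_atm(rows, fact):
    """Sum one fact (measure) for each value of the atm_name dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["atm_name"]] += row[fact]
    return dict(totals)


def rank_atms(totals, n=5, lowest=False):
    """Return the top (or lowest) n ATMs as (atm_name, total) pairs."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=not lowest)[:n]


volume = totals_by_atm(rows, "no_of_withdrawals")
amount = totals_by_atm(rows, "total_amount_withdrawn")

total_transactions = sum(volume.values())   # total number of transactions
total_amount = sum(amount.values())         # total transaction amount
top_by_volume = rank_atms(volume)           # top 5 ATMs by transaction volume
lowest_by_amount = rank_atms(amount, lowest=True)  # lowest 5 by amount

print(total_transactions, total_amount)
print(top_by_volume)
print(lowest_by_amount)
```

Each question pairs one fact with one dimension, which is why separating the two first makes the questions easy to express.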

Use Case 2: Customer 360

Here we will use the data from the Customer360 dataset.

What Do You Do with It?

Similar to the previous use case, we know the data belongs to the BFSI domain, and specifically that it describes bank customer details. Before digging deeper, we need to understand the domain basics and how business users will view the data, so that we can proceed to the next question.

What Should You Look For?

Similarly, we need to differentiate the Facts (Measures) and Dimensions (Other than the Measures).

The Facts are:

  1. Balance
  2. Duration
  3. Campaign
  4. Pdays
  5. Previous

The Dimensions are:

  1. Age
  2. Job
  3. Marital status
  4. Education
  5. Default
  6. Housing
  7. Loan
  8. Contact
  9. Day
  10. Month
  11. Poutcome
  12. Deposit

After separating the facts and dimensions, we can ask questions such as:

  1. Balance by job
  2. Balance by marital status
  3. Loan by age
  4. Loan by job
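The questions above can be sketched the same way: group the rows by a dimension, then aggregate a fact. The example below is a minimal sketch with made-up customers. The lowercase column names, the "yes"/"no" loan values, the choice of an average for balance, and the helper names `average_by` and `count_where` are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up customer rows; the "yes"/"no" loan values are an assumption.
customers = [
    {"job": "admin", "marital": "married", "balance": 2300, "loan": "no"},
    {"job": "technician", "marital": "single", "balance": 100, "loan": "yes"},
    {"job": "admin", "marital": "single", "balance": 700, "loan": "yes"},
    {"job": "technician", "marital": "married", "balance": 900, "loan": "no"},
]


def average_by(rows, dimension, fact):
    """Average a fact for each value of a dimension, e.g. balance by job."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[fact])
    return {key: mean(values) for key, values in groups.items()}


def count_where(rows, dimension, column, value):
    """Count rows per dimension value where column equals value, e.g. loans by job."""
    counts = defaultdict(int)
    for row in rows:
        if row[column] == value:
            counts[row[dimension]] += 1
    return dict(counts)


balance_by_job = average_by(customers, "job", "balance")
balance_by_marital = average_by(customers, "marital", "balance")
loans_by_job = count_where(customers, "job", "loan", "yes")

print(balance_by_job)
print(balance_by_marital)
print(loans_by_job)
```

Note that "Loan" is listed as a dimension, so "Loan by job" is answered here by counting customers rather than summing a measure.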

These questions are key to deriving insights from the data. Let's now look at the best practices of understanding data.

Understanding the Data (Best Practices)

Here are some of the best practices when trying to make sense out of data, particularly data relating to the two use cases above.

  1. Determine the appropriate domain, and understand the domain basics.
  2. Always ask the right questions about the data, for example:
    1. Which ATMs fall under the Transaction Volume Benchmark?
    2. Which ATMs fall under Transaction Amount Benchmark?
    3. Which ATMs fall under Hit Rate Benchmark?
    4. Which ATMs perform well irrespective of External Influences?
    5. Top Violators
    6. Income or Profitability of ATMs

  3. Have a clear understanding of Facts and Dimensions.
  4. Name the columns meaningfully.
    1. "Job" as "Job Category"
    2. "Marital" as "Marital Status"
    3. "pdays" as "Previous Days"
    4. "poutcome" as "Previous Outcome"

  5. Name the columns in title case and always use spaces instead of underscores.
    1. "Job_Category" as "Job Category"
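The two naming practices can be applied before the data is loaded into a BI tool. The sketch below is one possible approach, not a Quick BI feature: the `RENAMES` mapping reuses the examples above, while the function name `friendly_name` and the fallback rule (replace underscores with spaces and capitalize each word) are assumptions.

```python
# Explicit renames for cryptic names, taken from the examples above.
RENAMES = {
    "job": "Job Category",
    "marital": "Marital Status",
    "pdays": "Previous Days",
    "poutcome": "Previous Outcome",
}


def friendly_name(column):
    """Return a reader-friendly name for a raw column name."""
    key = column.lower()
    if key in RENAMES:
        return RENAMES[key]
    # Fall back to replacing underscores with spaces and capitalizing each word.
    return " ".join(word.capitalize() for word in column.split("_") if word)


raw_columns = ["job", "marital", "pdays", "Job_Category", "total_amount_withdrawn"]
friendly = [friendly_name(column) for column in raw_columns]
print(friendly)
```

Keeping the explicit mapping separate from the fallback rule means abbreviations such as "pdays" get a meaningful name, while already descriptive names only need their formatting cleaned up.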

Summary

I hope that this article gives you a better grasp of the basic principles of data analytics, specifically understanding your data. If you want to know more about big data and analytics, I highly recommend the Alibaba Cloud Apsara Cloud Certifications. You can advance your skills by learning, and even earn official Alibaba Cloud certifications to demonstrate your professional competency.

In the next article of this series, we will be exploring how to wrangle the data. Please ensure that you have registered on Alibaba Cloud because we will be using QuickBI for other articles in this series. Stay tuned.

"Torture the data, and it will confess to anything." – Ronald Coase
