Current State of Big Data in Finance and Opportunities

This article discusses Big Data and its forms and explains how Big Data is impacting the financial services industry.

By Alex Muchiri, an Alibaba Cloud MVP. He is the founder of Itesyl Technologies, a financial data and business banking solutions company.

In the computing context, dictionaries define data as the symbols and characters on which computations are performed. Such data could be text, images, video, or audio. Computers store, transmit, or record data using electronic signals via any type of media. So, what is Big Data?

In a nutshell, Big Data is massive data that is collected from various sources and keeps expanding over time. Because of its rapid growth, size, and complexity, such data overwhelms traditional data analysis tools such as databases, closed systems, and content management software.

The financial industry, including banks, insurance companies, payment service providers, and other players, produces massive amounts of data by the minute. At the same time, a new crop of players in the digital space is exerting ever more pressure on the traditional economy. Being a data-driven financial institution requires that companies be able to sort through their data, integrate multiple data sources, and meet user needs.

Sources of Data in the Financial Industry

Data is everywhere: customer information, account transactions, online purchase activities, service logs, social media interactions with clients, customer queries, and customer campaigns. There are also emerging sources such as wearables, IoT devices, and messages across communication platforms. Above all, financial companies maintain large databases that are a rich source of financial information and customer behavior.

Databases in the financial space can be analyzed for trends, and the data can also be exported to other tools such as Excel. However, because the data is massive and standardized, traditional tools reveal little about the full picture of a customer's engagement with the institution. Standard databases alone are therefore not the best way to extract meaningful insights from data.

An emerging trend in the financial sector is the rise of mobile. Mobile devices generate massive volumes of data today and can offer interesting insights into how customers engage with their financial services provider. Companies understand that mobile is the next frontier for winning and retaining customers and are competing hard for the opportunity. Mobile data promises to be a game-changer, giving enterprises a significant competitive advantage.

On the whole, financial databases contain excellent financial modeling data such as key economic factors, transaction patterns, comparable analysis, and securities, bonds, and derivatives reporting.

Classification of Big Data

Big Data falls into three main categories:

  • Structured Data
  • Unstructured Data
  • Semi-structured Data

Structured Data

Structured Big Data is stored in, and accessed through, standard formats such as a database or a file system. Such data is found in core banking systems, for example in account, customer information, and transactional databases. Over the years, technology has enabled robust mechanisms to handle such data and extract actionable insights from it. However, it has become too large for traditional systems such as SQL databases and tends to slow down operations. Hence, data is archived after a certain period of time and only accessed when necessary. We are talking about data in multiple zettabytes accumulated over long periods of time. The growth today is monumental owing to multiple consumer interfaces. With such massive data, we can see why the term Big Data is used. Let us see an example of structured data below:

Branch_ID  Customer_Name  Gender  Account_number  Balance
001        John Doe       Male    12345679        "1000000"
002        John Doe       Female  14345679        "1000000"
003        John Doe       Male    22345679        "1000000"
001        John Doe       Male    12345679        "1000000"
002        John Doe       Female  12345679        "1000000"

Structured data is neat but over time can become complex to the extent that working with it becomes expensive.
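Because structured data has a fixed set of columns, it can be queried directly with SQL. The following is a minimal sketch, not a production design: it loads the sample rows from the table above into an in-memory SQLite database using Python's standard library and totals the balance per branch. The table name `accounts`, the function name `balance_by_branch`, and the choice to treat Balance as an integer are assumptions made for illustration.

```python
import sqlite3

# Sample rows mirroring the table above. Balance is treated as an integer
# here (an assumption) so that it can be summed.
ROWS = [
    ("001", "John Doe", "Male", "12345679", 1000000),
    ("002", "John Doe", "Female", "14345679", 1000000),
    ("003", "John Doe", "Male", "22345679", 1000000),
    ("001", "John Doe", "Male", "12345679", 1000000),
    ("002", "John Doe", "Female", "12345679", 1000000),
]


def balance_by_branch(rows):
    """Load rows into an in-memory SQLite table and total balances per branch."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE accounts ("
            "branch_id TEXT, customer_name TEXT, gender TEXT, "
            "account_number TEXT, balance INTEGER)"
        )
        conn.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT branch_id, SUM(balance) FROM accounts "
            "GROUP BY branch_id ORDER BY branch_id"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for branch_id, total in balance_by_branch(ROWS):
        print(branch_id, total)
```

A single `GROUP BY` answers the question here, but the same query over years of un-archived transactions is where traditional systems begin to slow down.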

Unstructured Data

Unstructured data is any type of data without a known form or standard way of accessing it. It tends to be so large and diverse that extracting meaning from it is extremely challenging. Financial institutions generate massive amounts of such data from customer purchases and from communication channels such as Facebook, Twitter, and Messenger, among others. These sources also hold a wealth of information such as customer backgrounds, pictures, and videos, all of which require complex algorithms to relate. However, most financial institutions, such as banks, do not possess the capability to do such computations at scale.

Examples include:

  • Email messages
  • Word documents
  • Videos
  • Tweets
  • Customer Photos
  • Alibaba.com search queries

While some file types may have a form of structure, they are considered unstructured because such data does not fit natively into the host database. By some estimates, nearly 90% of all generated data is unstructured, and this percentage is growing.

Semi-structured Data

Semi-structured data contains elements of both structured and unstructured data. At first glance such data may seem structured (for example, it may resemble the entity-relationship definition of an SQL schema), but it still requires some additional processing to fit into a database. See the example below:

<to>Jane Doe</to>
<from>John Doe</from>
<heading>Call missed</heading>
<body>Please call me back!</body>
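The "additional processing" mentioned above can be as simple as mapping each tag to a column. The sketch below uses Python's standard `xml.etree.ElementTree` module to flatten the message into a dictionary that could then be inserted as a database row. The `<note>` root element and the function name `flatten_message` are hypothetical additions for illustration; a root element is needed for the snippet to parse as well-formed XML.

```python
import xml.etree.ElementTree as ET

# The message from the example, wrapped in a hypothetical <note> root
# element so that it parses as well-formed XML.
MESSAGE = """
<note>
  <to>Jane Doe</to>
  <from>John Doe</from>
  <heading>Call missed</heading>
  <body>Please call me back!</body>
</note>
"""


def flatten_message(xml_text):
    """Map each child element of the root to a column-like key/value pair."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


if __name__ == "__main__":
    print(flatten_message(MESSAGE))
```

The tags give the data enough structure to be parsed mechanically, yet nothing guarantees that every message carries the same tags, which is why such data is classed as semi-structured.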

What Are the Opportunities in Big Data?

Big Data is a goldmine of opportunity, but it comes with challenges of all kinds. Financial service providers (FSPs) must process all data touching a consumer and weave in advanced analytics in an "affordable, efficient, and reliable" manner. In fact, the financial services sector is one of the leading sectors in the adoption of highly intelligent data management solutions.

One of the most important use cases for Big Data is the deployment of platforms that cover the entire data lifecycle for the most intelligent applications. Banks learn from patterns and insights extracted from the data to expedite the development of new products and to enhance data modeling and analysis of existing services.

Another important trend is sentiment analysis using social media data. By analyzing Big Data from social media, banks and other FSPs can gain a 360-degree view of their customers and stay up to date with the most current trends. That way, they reduce reaction time and deliver better customer service.
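To make the idea of sentiment analysis concrete, here is a deliberately simplified sketch that scores short posts against two small word lists and flags the negative ones for follow-up. The word lists, sample posts, and function names are all hypothetical; a real deployment would rely on a trained language model rather than keyword counting.

```python
import re

# Hypothetical, deliberately tiny word lists used only for illustration.
POSITIVE = {"great", "fast", "helpful", "love", "easy"}
NEGATIVE = {"slow", "down", "fees", "hate", "failed"}


def sentiment_score(text):
    """Return positive-word count minus negative-word count for the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def flag_negative(posts):
    """Return the posts whose score is below zero, for follow-up by support."""
    return [post for post in posts if sentiment_score(post) < 0]


# Made-up sample posts.
POSTS = [
    "Love the new app, transfers are fast and easy",
    "Mobile banking is down again and my payment failed",
    "Branch opens at nine",
]

if __name__ == "__main__":
    for post in flag_negative(POSTS):
        print("Needs attention:", post)
```

Even this crude scoring shows how unstructured text can be turned into a number that a support team can sort and act on.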

Furthermore, systems that can store and retrieve massive amounts of data in structured, unstructured, and semi-structured formats can be enriched with analytics that deliver real-time insights, security, and business value. This is likely to be one of the trends that will dominate the sector in the coming years.

It is likely that FSPs will continue to advance their machine learning capabilities, integrate more data into unified platforms, and enhance their predictive capabilities. Such measures will drive new service offerings and the automation of many manual tasks in the sector using AI. According to some reviews, the industries leading in the adoption of AI are:

  • High tech companies and telcos
  • Automotive and assembly industries
  • Financial services

Of course, with vast amounts of data in the financial sector, security remains an important consideration. Data platforms of the future will leverage the power of predictive analytics, artificial intelligence, automation, and scale to solve novel problems.

Big Data Solutions by Alibaba Cloud

Alibaba Cloud offers advanced Big Data platforms to help industries derive insights faster, react to changes proactively, and exploit new opportunities. The portfolio includes:


MaxCompute

MaxCompute is a large-scale data warehousing service for general-purpose data processing. It is offered as a cloud service for security, speed, and scale. MaxCompute integrates with multiple data import solutions and can build different models from stored data.

Data Lake Analytics (DLA)

Alibaba Cloud Data Lake Analytics (DLA) is a data analytics platform that uses a serverless architecture and is delivered through the cloud. It exposes SQL interfaces to clients, which can feed in data from different sources and locations, and it provides visualizations to guide decision-making processes.

E-MapReduce Service

E-MapReduce processes large volumes of data in near real-time. For organizations with large and streaming data, the service offers a cost-effective, scalable alternative.

Quick BI

Quick BI is an intelligent cloud analytics platform that supports ApsaraDB, MySQL, and local files, with drag-and-drop integration for building data portals quickly.


DataWorks

DataWorks is an offline data development platform that leverages Open APIs and access to Alibaba Cloud's powerful ecosystem. It offers simple drag-and-drop features for creating new nodes and enables community collaboration.


DataV

The DataV visualization tool offers rich visualization capabilities through its rapid interpretation models. Data is visualized into patterns and trends on a single user interface.

Alibaba Cloud Image Search

Alibaba Cloud Image Search is an intelligent image search leveraging a powerful backend service to compare and identify images. It is based on powerful machine learning algorithms developed by Alibaba Cloud. While it is deployed in e-commerce and photo-sharing websites, it could still have use cases in the financial sector with converging data platforms.


Conclusion

Big Data is unlocking tremendous opportunities in the financial sector, and we have barely scratched the surface in this article. In this day and age, companies that have reliable technology partners such as Alibaba Cloud and are willing to try new ways of solving problems will be the winners of tomorrow.
