Managing risk is a challenging enterprise, and errors can have catastrophic consequences. Today, big data analytics using tools like Hadoop or Splunk is increasingly popular among corporations looking to mitigate risk. There is optimism that analyzing big data can yield insights that help manage risk more effectively and thus prevent disasters such as the 2008 financial crisis. For example, many banks now perform real-time analytics on customer data such as credit history, transaction history and employment history to determine more accurately which customer segments represent a high or low risk for a mortgage or loan.
In the same way, many product manufacturers use big data analytics to determine their customers' likes and dislikes, enabling them to create products that meet those customers' specific tastes. Doctors use big data to identify high-risk patients who require more immediate care. The energy industry uses big data to spot problems in the production process early, before they develop into something unmanageable. And the list goes on across many other industries.
Nevertheless, while big data offers tremendous potential to manage risk across many industries and sectors, it's important to avoid common mistakes when handling that data. These mistakes can produce inaccurate results that increase risk instead of reducing it.
Data scientists must ensure the data they are using is a relevant and complete representation of what they want to analyze (such as customer behavior, or oil pressures). Using incomplete or skewed data sets can lead to erroneous conclusions that will undermine risk management.
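As a rough illustration of the point above, the following Python sketch checks a data set for missing values and for segments that are over- or under-represented relative to a known population. The record layout, the field names (`income`, `region`), the reference shares and the 10% tolerance are all assumptions made for this example, not part of any particular tool.

```python
from collections import Counter

def missing_rate(records, field):
    """Fraction of records where `field` is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)

def segment_shares(records, field):
    """Share of each non-missing value of `field` among the records."""
    values = [r[field] for r in records if r.get(field) is not None]
    counts = Counter(values)
    total = len(values)
    return {k: v / total for k, v in counts.items()}

def skewed_segments(sample_shares, reference_shares, tolerance=0.10):
    """Segments whose sample share differs from the reference by more than `tolerance`."""
    return sorted(
        seg for seg in reference_shares
        if abs(sample_shares.get(seg, 0.0) - reference_shares[seg]) > tolerance
    )

# Hypothetical customer records; one income value is missing.
customers = [
    {"income": 52000, "region": "north"},
    {"income": None, "region": "north"},
    {"income": 61000, "region": "north"},
    {"income": 47000, "region": "south"},
]
# Assumed true population split between regions.
reference = {"north": 0.5, "south": 0.5}

income_missing = missing_rate(customers, "income")
shares = segment_shares(customers, "region")
flagged = skewed_segments(shares, reference)
print(income_missing, shares, flagged)
```

Here a quarter of the income values are missing and both regions are flagged, because the sample is three-quarters "north" while the assumed population is evenly split. Conclusions drawn from such a sample would mostly reflect one region.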
Historical data is important for generating insights to manage risk. However, analysts should also incorporate the most up-to-date data available, preferably in real time, for the most accurate insights. With the world continually in flux, what was true yesterday may not be true today.
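One simple way to guard against relying on outdated data is to check how old the newest record is before running an analysis. The sketch below is a minimal example of that idea; the function name `is_stale`, the one-day threshold and the sample timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

def is_stale(timestamps, now, max_age=timedelta(days=1)):
    """True if the newest timestamp is older than `max_age`, or if there is no data."""
    if not timestamps:
        return True
    return now - max(timestamps) > max_age

# Hypothetical sensor readings and a fixed "current" time for reproducibility.
now = datetime(2020, 1, 10, 12, 0)
readings = [datetime(2020, 1, 7, 9, 0), datetime(2020, 1, 8, 9, 0)]

stale_before = is_stale(readings, now)
stale_after = is_stale(readings + [datetime(2020, 1, 10, 11, 30)], now)
print(stale_before, stale_after)
```

The right threshold depends on the domain: a pipeline monitoring oil pressure may need data that is minutes old, while a mortgage model may tolerate data that is weeks old.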
A frequent mistake when performing big data analytics is leaving pertinent variables out of the calculations. Data scientists must ensure that all relevant variables (e.g. customer income, credit history and employment history for evaluating mortgage suitability) are captured, since even one missing variable can dramatically alter the accuracy of the result. Deciding what the pertinent variables are is not always straightforward, and often requires deep thought as well as trial-and-error iteration.
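The effect of a missing variable can be shown with a small, made-up example. The Python sketch below computes default rates for a handful of hypothetical loans, first grouped by income band alone and then by income band together with employment status. The data, field names and the `default_rate_by` helper are invented for illustration only.

```python
from collections import defaultdict

def default_rate_by(records, keys):
    """Default rate for each combination of the given grouping keys."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        totals[group] += 1
        defaults[group] += r["defaulted"]
    return {g: defaults[g] / totals[g] for g in totals}

# Hypothetical loans: same income band, different employment status.
loans = [
    {"income_band": "mid", "employed": True, "defaulted": 0},
    {"income_band": "mid", "employed": True, "defaulted": 0},
    {"income_band": "mid", "employed": False, "defaulted": 1},
    {"income_band": "mid", "employed": False, "defaulted": 1},
]

by_income = default_rate_by(loans, ["income_band"])
by_income_and_job = default_rate_by(loans, ["income_band", "employed"])
print(by_income)
print(by_income_and_job)
```

Grouped by income alone, every customer in this toy data set appears to carry the same 50% risk. Once employment status is included, the same data separates into one group with no defaults and one where every loan defaulted, which is the kind of difference a missing variable can hide.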
Perhaps the most serious mistake of all is cherry-picking the data set so that the results are skewed toward the analyst's own bias. Data scientists must be very careful not to let their subjective views affect which data sets they select for evaluation. This point seems highly relevant in today's era of 'fake news', where people listen to news they want to be factual, even if it's not. The same principle applies to big data analytics.