By Garvin Li
Note: The data in this article is fictitious and is only used for experimental purposes.
Issuing agriculture loans is a typical data mining case. Lenders build an empirical model from past years' statistics (such as a borrower's yearly income, the types of crops planted, and loan history) to predict that borrower's repayment ability.
This article is based on an agriculture loan scenario and shows how to use a linear regression algorithm to decide whether to issue loans.
Linear regression is a widely used statistical analysis method for determining the quantitative relationship between two or more variables. This article analyzes the issuing history of agriculture loans to predict whether to issue the requested loan amounts to users in the prediction set. All data analysis is performed on the Alibaba Cloud Machine Learning platform.
The specific fields are as follows:
The following is a screenshot of the data.
The following diagram shows the experiment process.
Input data is divided into two parts: the loan history data (more than 200 records) and the prediction set (71 applicants).
The goal is to predict which of the 71 applicants will receive loans based on the existing history data.
Map string-type data to numbers according to its meaning. For example, for the "region" field, map "north", "middle", and "south" to 0, 1, and 2 respectively, then convert the field to the double type by using the type conversion component, as shown in the following diagram. Model training can begin once the data is pre-processed.
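Outside the platform, the same pre-processing step can be sketched in a few lines of Python. This is only an illustration of the mapping described above: the function name, the row layout, and the sample rows are assumptions, and only the "region" field and its three values come from the article.

```python
# Codes for the "region" field, in the order given in the text.
REGION_CODES = {"north": 0, "middle": 1, "south": 2}


def encode_region(value: str) -> float:
    """Map a region label to its numeric code, returned as a float (double)."""
    key = value.strip().lower()
    if key not in REGION_CODES:
        raise ValueError(f"unknown region: {value!r}")
    return float(REGION_CODES[key])


# Hypothetical rows; real records would carry the other loan fields as well.
rows = [{"id": 1, "region": "north"}, {"id": 2, "region": "south"}]
encoded = [{**row, "region": encode_region(row["region"])} for row in rows]
print(encoded)
```

Raising an error on an unknown label, rather than silently assigning a default code, keeps mistyped values from leaking into the training data.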
Use the linear regression component to train on the history data and generate a regression model, which the prediction component then uses to predict data in the prediction set. Use the column merge component to merge the user ID, prediction score, and claim value, as shown in the following screenshot.
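For readers who want to see what the train, predict, and merge steps amount to, here is a minimal sketch in plain Python. It fits ordinary least squares through the normal equations, which is one common way to train a linear regression and not necessarily how the platform component works internally. The feature choice (income and region code), the field names, and all numbers are made up for illustration.

```python
def fit_linear_regression(X, y):
    """Fit y ~ w0 + w1*x1 + ... + wk*xk by ordinary least squares."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept term
    n = len(rows[0])
    # Augmented normal-equation system [A^T A | A^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(n):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, features):
    """Return the prediction score for one feature vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Made-up history: (yearly income, region code) -> amount repaid.
history_X = [(100, 0), (200, 1), (150, 2), (300, 0), (250, 1)]
history_y = [60.0, 112.0, 89.0, 160.0, 137.0]
weights = fit_linear_regression(history_X, history_y)

# Made-up prediction set; "claim" is the requested loan amount.
prediction_set = [
    {"id": 101, "features": (180, 1), "claim": 90.0},
    {"id": 102, "features": (120, 2), "claim": 95.0},
]
# Column merge: user ID, prediction score, claim value.
merged = [(p["id"], predict(weights, p["features"]), p["claim"]) for p in prediction_set]
for row in merged:
    print(row)
```

The merged rows put the prediction score and the claim value side by side, which is the shape the later filtering step needs.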
The prediction score indicates a user's loan repayment ability (expected loan repayment amount).
Use the regression model evaluation component to evaluate the model. The following table describes evaluation results.
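As a rough guide to what a regression evaluation measures, the sketch below computes three widely used metrics: mean absolute error, root mean squared error, and the coefficient of determination (R²). Which metrics the platform component actually reports is not shown here, so this selection is an assumption, as are the function name and the sample values.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": math.sqrt(ss_res / n), "r2": r2}


# Made-up values for illustration only.
print(regression_metrics([60.0, 112.0, 89.0], [58.0, 115.0, 90.0]))
```

MAE and RMSE are in the same unit as the loan amounts, so they can be read directly as a typical prediction error, while R² expresses how much of the variation in the actual values the model explains.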
Use the filtering and mapping components to determine which applicants can receive loans. The rule in this experiment is that an applicant receives a loan if the applicant's predicted repayment ability is greater than the requested loan amount. The rule is applied to each applicant.
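The decision rule itself is a single comparison per applicant. A minimal sketch, assuming each merged row carries an ID, a prediction score, and the requested amount (the field names and values here are hypothetical):

```python
def decide_loans(rows):
    """Mark each applicant as approved when the prediction score exceeds the claim."""
    return [
        {"id": r["id"], "approved": r["prediction_score"] > r["claim"]}
        for r in rows
    ]


# Made-up merged rows for illustration only.
merged_rows = [
    {"id": 101, "prediction_score": 102.0, "claim": 90.0},
    {"id": 102, "prediction_score": 74.0, "claim": 95.0},
    {"id": 103, "prediction_score": 80.0, "claim": 80.0},
]
decisions = decide_loans(merged_rows)
approved_ids = [d["id"] for d in decisions if d["approved"]]
print(approved_ids)
```

The comparison is strict, matching the "greater than" wording of the rule, so an applicant whose predicted repayment ability exactly equals the requested amount is not approved.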
To learn more about Alibaba Cloud Machine Learning Platform for Artificial Intelligence (PAI), visit www.alibabacloud.com/product/machine-learning