Machine learning is the science of using statistical algorithms to give computers the ability to learn from large amounts of historical data, build analytical models, and then use those models to support business development. Machine learning can currently be applied to the following scenarios:
Machine learning can be typically divided into three categories:
You can implement your own machine learning algorithms for the above scenarios with Alibaba Cloud's Machine Learning Platform for AI. Built on the Alibaba Cloud MaxCompute (ODPS) platform, Machine Learning Platform for AI integrates data processing, modeling, and online and offline prediction. It applies the proven technology of Alibaba Group to make machine learning operations simpler, which brings Artificial Intelligence (AI) within reach of machine learning users.
Note: The data in this section is created for testing only.
The parable of beer and diapers is a classic data mining case. When diapers and beer are placed next to each other on shelves, sales of both items increase. The problem is how to find the hidden correlation between two seemingly unrelated products in order to increase their sales. To solve this problem, you can use data mining algorithms such as collaborative filtering, which finds hidden correlations between customers or between products.
Collaborative filtering is a correlation rule-based algorithm. The following example shows how collaborative filtering predicts the interests of customers A and B in products X, Y, and Z. If both customers A and B have purchased products X and Y, collaborative filtering determines that they have similar shopping interests. It then recommends product Z to customer B because customer A has also purchased product Z. This is a classic example of using the similarity between users as the correlation.
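The A/B and X/Y/Z example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: customer similarity is measured with Jaccard similarity over purchased products, which the article does not specify, and the function names `jaccard` and `recommend` are hypothetical.

```python
from collections import defaultdict

# Toy purchase history mirroring the example:
# customer A bought X, Y, and Z; customer B bought X and Y.
purchases = {
    "A": {"X", "Y", "Z"},
    "B": {"X", "Y"},
}

def jaccard(a, b):
    """Jaccard similarity between two sets of purchased products (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, purchases):
    """Score products the target has not bought by the similarity of the customers who did."""
    scores = defaultdict(float)
    for other, items in purchases.items():
        if other == target:
            continue
        sim = jaccard(purchases[target], items)
        # Only products the target customer has not purchased yet are candidates.
        for item in items - purchases[target]:
            scores[item] += sim
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(recommend("B", purchases))
```

With this toy data, product Z is the only candidate for customer B, and customer A receives no recommendation because B has bought nothing that A lacks.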
You can use collaborative filtering to make product recommendations, as follows:
This experiment uses the customer shopping behavior recorded before July to find the correlations between products. The information is then used to recommend relevant products to customers and make an assessment of the recommendation results. For example, customer A purchased product X before July. Product X is strongly correlated with product Y. The system then recommends product Y to customer A after July and calculates the probability of customer A purchasing product Y.
This experiment uses data collected from TIANCHI challenges. The data is divided into two parts: shopping behavior before July and shopping behavior after July.
The attributes are as follows:
The following figure shows the data.
The experiment flowchart is as follows:
Load the shopping behavior data recorded before July, use SQL scripts to extract the shopping behavior, and then import the data into the collaborative filtering component. Set the TopN attribute of the component to 1 so that it finds the most similar item for each input item and calculates its weight. The result shows which products are most likely to be purchased by the same customer, as shown in the following figure:
The collaborative filtering result shows the correlation between products. The itemid field indicates the target product. The similarity field lists each product that is strongly correlated with the target product together with its correlation coefficient, separated by a colon (:).
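To make the TopN setting concrete, here is a minimal sketch of an item-to-item similarity computation that keeps only the N most similar items per product. It is not the platform component itself: the similarity measure (cosine similarity over binary purchase vectors), the function name `top_n_similar_items`, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def top_n_similar_items(records, top_n=1):
    """records: iterable of (user_id, item_id) purchases.
    Returns {item: [(other_item, score), ...]} with at most top_n neighbours per item."""
    items_by_user = defaultdict(set)
    for user, item in records:
        items_by_user[user].add(item)

    buyers = defaultdict(int)     # number of distinct buyers per item
    co_buyers = defaultdict(int)  # number of buyers shared by each item pair
    for items in items_by_user.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            co_buyers[(a, b)] += 1

    similar = defaultdict(list)
    for (a, b), shared in co_buyers.items():
        # Cosine similarity on binary purchase vectors (assumed measure).
        score = shared / sqrt(buyers[a] * buyers[b])
        similar[a].append((b, score))
        similar[b].append((a, score))

    # Highest score first; ties are broken by item name for determinism.
    return {
        item: sorted(neigh, key=lambda p: (-p[1], p[0]))[:top_n]
        for item, neigh in similar.items()
    }

# Made-up records for illustration only.
records = [
    ("u1", "beer"), ("u1", "diapers"),
    ("u2", "beer"), ("u2", "diapers"),
    ("u3", "beer"), ("u3", "chips"),
]
result = top_n_similar_items(records, top_n=1)
for item, neighbours in sorted(result.items()):
    print(item, neighbours)
```

With top_n=1, each product keeps a single neighbour, which matches the idea of finding the most similar item for each input item.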
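A result row of this shape can be unpacked with a small helper. The sketch below assumes that, when more than one similar item is returned, the pairs are separated by spaces; the article only confirms the colon between a product and its coefficient. The function name `parse_similarity` and the sample values are hypothetical.

```python
def parse_similarity(field, pair_sep=" "):
    """Split a similarity field such as '2000:0.5' into (item, coefficient) pairs.
    pair_sep is an assumption; only the colon separator is described in the article."""
    pairs = []
    for token in field.split(pair_sep):
        token = token.strip()
        if not token:
            continue
        # Split on the last colon so the coefficient is always the final part.
        item, _, score = token.rpartition(":")
        pairs.append((item, float(score)))
    return pairs

# Made-up row for illustration only.
row = {"itemid": "1000", "similarity": "2000:0.5"}
print(row["itemid"], parse_similarity(row["similarity"]))
```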
Step 1 shows how to list all strongly correlated products. The following procedure shows how to use the product similarity list to recommend product b to customer A after customer A purchases product a, and how to calculate the hit rate.
This figure shows the statistics components. Full table scan component 1 shows the recommendation list created from the shopping behavior before July. After duplicate rows are removed, the final list contains 18,065 entries. Full table scan component 2 shows the number of products in the recommendation list that customers actually purchased. In this experiment, customers purchased 90 of the recommended products.
Judging by the recommendation results, the experiment falls short of expectations. The reasons include the following:
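The recommend-then-check procedure can be sketched as two set operations. This is an illustration only: the function names `build_recommendations` and `count_hits`, the tuple layout, and the sample data are hypothetical, and the real experiment performs these steps with platform components rather than Python.

```python
def build_recommendations(before, similar_item):
    """before: iterable of (customer, product) purchases made before the cutoff.
    similar_item: {product: most_similar_product}, for example the TopN=1 result.
    Returns a de-duplicated set of (customer, recommended_product) pairs."""
    return {
        (customer, similar_item[product])
        for customer, product in before
        if product in similar_item
    }

def count_hits(recommendations, after):
    """Count recommendations that appear among the purchases made after the cutoff."""
    return len(recommendations & set(after))

# Made-up data: customer A bought product a, and a is most similar to b.
before = [("A", "a"), ("B", "a"), ("B", "c")]
similar_item = {"a": "b", "c": "d"}
after = [("A", "b"), ("B", "x")]

recs = build_recommendations(before, similar_item)
hits = count_hits(recs, after)
hit_rate = hits / len(recs)
print(sorted(recs), hits, hit_rate)
```

Using a set for the recommendation list removes duplicate rows automatically, so the denominator counts each customer and product pair once.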
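As a quick check of the numbers above, the hit rate can be computed directly. The sketch assumes the hit rate is defined as the number of purchased recommendations divided by the size of the de-duplicated recommendation list; the article does not spell out the formula.

```python
recommended = 18_065  # entries in the de-duplicated recommendation list
hits = 90             # recommended products that customers actually purchased

# Assumed definition: hits divided by the number of recommendations.
hit_rate = hits / recommended
print(f"hit rate: {hit_rate:.2%}")  # prints "hit rate: 0.50%"
```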
To learn more about machine learning on Alibaba Cloud, visit www.alibabacloud.com/product/machine-learning
More Posts by GarvinLi