By Harshit Khandelwal, Alibaba Cloud Community Blog author.
We live in an era of Big Data where there is an abundance of data and companies across several different industries are coming up with new and unique ways of using this data to create value for their customers. For all of this, machine learning is an important piece of the puzzle.
Machine learning (abbreviated ML) can be described as a mechanism whereby a machine learns patterns from datasets so that it can make predictions about new data. The major types of machine learning algorithms are supervised, semi-supervised, unsupervised, and reinforcement learning. A machine learning pipeline combines training data, some sort of model for that data, and a learning algorithm. After initial training, a test dataset is applied to the model to check the accuracy of the predictions made by this pipeline.
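As a rough illustration of the train-then-test idea described above, here is a minimal sketch in plain Python. The toy data, the single-feature threshold "model", and all function names are assumptions made for this example; they are not part of any Alibaba Cloud API.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """'Train' a one-feature model: the midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    """Predict label 1 when the feature exceeds the learned threshold."""
    return 1 if x > threshold else 0

def accuracy(threshold, test):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in test if predict(threshold, x) == y)
    return correct / len(test)

# Hypothetical toy data: (feature value, label)
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (6.0, 1), (6.5, 1), (7.0, 1), (7.5, 1)]

train, test = train_test_split(data)
threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print("test accuracy:", test_accuracy)
```

The point of holding back a test set is that accuracy is measured on rows the model never saw during training, which gives a more honest estimate than scoring the training data itself.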
Machine learning pipelines typically have the following steps:
Machine learning can be broadly considered a subfield of Artificial Intelligence (AI), and it is also the larger umbrella category under which you find other types of algorithms, such as deep learning algorithms.
The computing power of the machine on which these algorithms are deployed also plays a big role in how powerful the algorithm can be. In the cloud, these algorithms are instrumental to many of the services provided, and they rely on the computing power of cloud servers.
One common application of machine learning in recent years is recommendation systems. These systems use data about users to provide them with recommendations. One example is the system used by Netflix.
Netflix uses a state-of-the-art recommendation system that can provide accurate recommendations. The algorithm used takes input such as the user's viewing history, user ratings, the data of other users with similar tastes, and the time of the day the user watched the content.
This recommendation system is important because about two-thirds of the movies watched on Netflix are recommended ones. For similar services provided by Amazon and Google, the story is much the same. For Amazon, 35 percent of sales on its e-commerce platform come from recommendations, and on Google, news recommendations improved click-through rates by 38 percent.
This section walks you through a step-by-step tutorial on how to use machine learning on Alibaba Cloud. In this tutorial, you will build a basic machine learning pipeline for a binary classification algorithm.
First, procure the data. To do this, find a data source you want to work with. Some datasets are already available in the console. In this example, breast cancer data is used.
The data is as follows:
The following is the output for node 2 (the feature weights):
The result from the prediction node is as follows:
Next, the confusion matrix output is as follows:
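For readers unfamiliar with the term, a confusion matrix for a binary classifier counts true positives, false positives, true negatives, and false negatives. The following is a minimal sketch of how those counts can be computed. The labels are made up for illustration and are not the tutorial's breast cancer results; the function name is likewise an assumption.

```python
def confusion_matrix(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == 0 else "fn"] += 1
    return counts

# Hypothetical labels, for illustration only
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["tp"] + cm["tn"]) / len(actual)
precision = cm["tp"] / (cm["tp"] + cm["fp"])  # share of predicted positives that are correct
recall = cm["tp"] / (cm["tp"] + cm["fn"])     # share of actual positives that were found
print(cm, accuracy, precision, recall)
```

Precision and recall are often more informative than accuracy alone for medical data, where a false negative and a false positive carry very different costs.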
This particular machine learning pipeline uses logistic regression, a statistical model that in its basic form uses a logistic function for classification, which can be understood as the prediction of labels. Other models instead predict a numerical value, which is known as regression. The labels can be category names or binary numbers, such as 0 or 1. If there are two possible output values, the model is called binary logistic regression. If there are more than two categorical outputs, it is called multinomial logistic regression, and if the multiple categories are ordered, it is called ordinal logistic regression.
Machine learning is a means by which machines can make predictions about future data based on current data. Therefore, machine learning can use data to provide value to customers. However, the power of a machine learning algorithm is limited by the machine or device it runs on. This is also the case for servers in the cloud, where machine learning plays an important role. Finally, to develop a machine learning algorithm, you need to follow the regular steps of a pipeline.
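To make the binary case concrete, here is a minimal sketch of logistic regression on a single feature, trained with plain gradient descent. It is not the implementation used by the Alibaba Cloud console. The toy data (already centered around zero), the learning rate, the epoch count, and the function names are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # predicted probability minus true label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_label(w, b, x, threshold=0.5):
    """Turn the predicted probability into a binary label (0 or 1)."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

# Hypothetical, zero-centered toy feature with binary labels
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)
labels = [predict_label(w, b, x) for x in xs]
print("w =", w, "b =", b, "labels =", labels)
```

The model outputs a probability, and the 0.5 threshold converts it into one of the two labels. Multinomial and ordinal variants generalize the same idea to more than two categories.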