With the development of deep learning, many areas of our lives are undergoing intelligent transformation. As the team closest to users, frontend engineers also want to use AI to improve efficiency, reduce labor costs, and give users a better experience. Intelligent transformation is seen as an important growth area for the future of frontend development. However, frontend engineers have the following doubts:
Through communication with frontend engineers and research, we discovered the main reasons that prevent frontend teams from entering the AI field:
A Problem Analysis Diagram
With the rapid development of AI, intelligence has empowered many industries. We believe that there are some web scenarios where AI can be applied. However, in many cases, non-algorithm engineers cannot effectively identify and determine the scenarios where machine learning can be used. In addition, they are not sure to what extent deep learning can solve problems and whether its performance is better than traditional rule engines due to their lack of an in-depth understanding of models and algorithms. To solve this problem, we can use either of the following methods:
We know that data and models are the core elements of deep learning. If the model is a rocket engine, data is its fuel. Machine learning needs a large amount of high-quality fuel to allow it to realize its full potential. The frontend has accumulated some data over the years, and we also have advantages in data collection because we are the team closest to users.
The frontend possesses the following data:
The data can be classified into computer vision (CV) data and text data, and CV and natural language processing (NLP) are also two major focus areas of machine learning. However, frontend engineers often do not know how to process data so that it can be turned into fuel for their models. Our framework must therefore provide fast and simple data processing, along with supporting capabilities such as data quality assessment and data visualization.
For non-algorithm engineers, models and algorithms represent another huge obstacle. They always worry that they do not understand the mathematical principles of a model and do not know how to use deep learning frameworks, such as TensorFlow. This problem is both easy and difficult to solve.
It is easy to solve because years of experience have accumulated in the traditional deep learning fields, and almost every field has popular, mature models that are ready for industrial use. We only need to provide implementations of these models in the framework. Non-algorithm engineers can then use a model without any configuration and without worrying about its internal implementation. It is difficult to solve because some non-algorithm engineers find such models too much like black boxes and want to adjust them slightly based on the algorithm knowledge they already have. Therefore, the framework must also provide ways to intervene in and adjust a model.
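The "zero configuration, but adjustable" idea can be sketched as a set of defaults that callers may partially override. This is only an illustration: `TrainOptions`, `DEFAULTS`, `resolveOptions`, and every option name and default value below are hypothetical and are not taken from Pipcook's plugin parameters.

```typescript
// Hypothetical option names and defaults, for illustration only.
interface TrainOptions {
  epochs: number;
  batchSize: number;
  learningRate: number;
  freezeBaseLayers: boolean;
}

// Defaults let a non-algorithm engineer train without configuring anything.
const DEFAULTS: TrainOptions = {
  epochs: 10,
  batchSize: 16,
  learningRate: 0.001,
  freezeBaseLayers: true,
};

// Merge user overrides on top of the defaults and reject invalid values.
function resolveOptions(overrides: Partial<TrainOptions> = {}): TrainOptions {
  const merged = { ...DEFAULTS, ...overrides };
  // `!(x > 0)` also rejects NaN and undefined, not just zero and negatives.
  if (!(merged.epochs > 0) || !(merged.batchSize > 0) || !(merged.learningRate > 0)) {
    throw new RangeError("epochs, batchSize and learningRate must be positive");
  }
  return merged;
}

// Zero-configuration use.
const basic = resolveOptions();
// An engineer with some algorithm knowledge adjusts only what they understand.
const tuned = resolveOptions({ learningRate: 0.0001, freezeBaseLayers: false });

console.log(basic.epochs, tuned.learningRate, tuned.batchSize);
```

Keeping the overrides partial means the black-box path and the adjustable path share one entry point, so a user can start with the defaults and tune individual values later.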
JS vs. Python
The language problem is both simple and complex. The simple solution is to use JS, the language frontend developers know best. We therefore developed Pipcook purely in TypeScript, provided JS-based APIs, and implemented the data processing and model plugins on top of tfjs-node. However, the JS-based machine learning ecosystem is still developing, and we cannot expect it to match the richness of the Python ecosystem in the short term. A framework that uses only JS is therefore bound to be incomplete to some extent. Our solution is to provide a Python bridge for Node.js, similar in spirit to the Python interoperability offered in Swift, so that Python libraries can be called from Node.js to help the frontend team.
After solving the preceding problems, we know why we need to use machine learning, when it can be used, and how to use it. In addition, we have provided solutions for each problem from the perspective of frontend engineers. As Pipcook and the entire JS-based machine learning ecosystem gradually mature, we believe that frontend engineers will get better at using intelligent capabilities.
A Diagram of the Pipcook Architecture
After we solved the scenario, algorithm, data processing, and language problems, we designed a pipeline-based, stream-style machine learning framework for the frontend, as shown in the preceding figure. Models and data flow through the pipeline. We can embed plugins in this pipeline to process models and data and pass them downstream. Each plugin is responsible for a specific task in the machine learning lifecycle. Pipcook defines a series of specifications that allow third-party developers to write plugins that extend Pipcook's capabilities. The framework uses TensorFlow.js for model building and training, and it can also reach the Python ecosystem through Python bridging. The following sections introduce several key parts of the framework.
A Sequence Diagram
Pipcook is a pipeline-based framework that covers data collection, data access, data processing, model configuration, model training, model service deployment, and online training. A dedicated plugin is responsible for each stage. Plugins allow you to customize each stage, and pipelines allow you to connect plugins in series to implement algorithm engineering. The whole process runs on Node.js, and the plugins are managed and maintained through npm. The plugins for data processing and model service deployment can be deeply integrated with the existing frontend technology stack.
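The plugin-and-pipeline idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Pipcook's actual plugin API: `PipelineContext`, `Plugin`, `runPipeline`, and the three example plugins are hypothetical names, the plugins are synchronous for brevity (real stages such as training would be asynchronous), and the "training" step is only a stand-in.

```typescript
// Hypothetical types for illustration only; not Pipcook's actual plugin API.
interface PipelineContext {
  samples: { input: string; label: string }[];
  model?: { labels: string[] };
  log: string[];
}

// A plugin receives the current context and returns the context for the next stage.
type Plugin = (ctx: PipelineContext) => PipelineContext;

// Data collection: a real plugin would read files or call a labeling tool.
const collectData: Plugin = (ctx) => ({
  ...ctx,
  samples: [
    { input: " Button ", label: "button" },
    { input: "INPUT", label: "input" },
  ],
  log: [...ctx.log, "collect"],
});

// Data processing: normalize every sample before training.
const processData: Plugin = (ctx) => ({
  ...ctx,
  samples: ctx.samples.map((s) => ({ ...s, input: s.input.trim().toLowerCase() })),
  log: [...ctx.log, "process"],
});

// Model training: a stand-in that only records the distinct labels it has seen.
const trainModel: Plugin = (ctx) => {
  const all = ctx.samples.map((s) => s.label);
  const labels = all.filter((label, i) => all.indexOf(label) === i);
  return { ...ctx, model: { labels }, log: [...ctx.log, "train"] };
};

// Run plugins in order, passing each plugin's output downstream.
function runPipeline(plugins: Plugin[], initial: PipelineContext): PipelineContext {
  return plugins.reduce((ctx, plugin) => plugin(ctx), initial);
}

const result = runPipeline([collectData, processData, trainModel], { samples: [], log: [] });
console.log(result.log.join(" -> ")); // prints "collect -> process -> train"
```

Because every plugin shares the same signature, replacing one stage (for example, a different data processing step) does not affect the other stages.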
Pipcook defines a set of dataset specifications. This avoids the data access and conversion costs that arise when the plugins for data collection, access, and processing follow different dataset standards. It also ensures that data can be shared between different pipelines. Because these plugins follow the same protocol, they can produce standard, unified datasets regardless of which labeling tool was used. The data processing plugin also makes it easier to understand and optimize datasets.
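To illustrate why a shared dataset specification helps, the sketch below converts the exports of two labeling tools into one common shape that downstream code can rely on. Everything here is an assumption made for illustration: `UnifiedDataset`, `ToolARecord`, `ToolBExport`, and the helper functions are hypothetical and do not reproduce Pipcook's real dataset specification.

```typescript
// Hypothetical unified dataset shape; Pipcook's real specification may differ.
interface ImageSample {
  path: string;
  label: string;
}
interface UnifiedDataset {
  labels: string[]; // sorted, de-duplicated label names
  train: ImageSample[];
}

// Assumed export format of labeling tool A: one record per image.
type ToolARecord = { file: string; category: string };
// Assumed export format of labeling tool B: image paths grouped by label.
type ToolBExport = Record<string, string[]>;

function buildDataset(samples: ImageSample[]): UnifiedDataset {
  const all = samples.map((s) => s.label);
  const labels = all.filter((label, i) => all.indexOf(label) === i).sort();
  return { labels, train: samples };
}

function fromToolA(records: ToolARecord[]): UnifiedDataset {
  return buildDataset(records.map((r) => ({ path: r.file, label: r.category })));
}

function fromToolB(exported: ToolBExport): UnifiedDataset {
  const samples: ImageSample[] = [];
  for (const label of Object.keys(exported)) {
    for (const path of exported[label]) samples.push({ path, label });
  }
  return buildDataset(samples);
}

// A downstream step depends on UnifiedDataset only, never on a tool's format.
function countPerLabel(ds: UnifiedDataset): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const s of ds.train) counts[s.label] = (counts[s.label] || 0) + 1;
  return counts;
}

const fromA = fromToolA([
  { file: "1.png", category: "button" },
  { file: "2.png", category: "input" },
]);
const fromB = fromToolB({ input: ["3.png"], button: ["4.png", "5.png"] });
console.log(countPerLabel(fromA), countPerLabel(fromB));
```

Here `countPerLabel` plays the role of a data quality check: it works on both datasets without knowing which tool produced them.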
The underlying model and algorithm capabilities of Pipcook are provided by tfjs-node, the Node.js version of TensorFlow.js, which comes from the well-known TensorFlow machine learning framework. tfjs-node makes it much easier to use JS for machine learning, so our JS-based machine learning platform can build on it directly. For example, we can use mature official models (such as MobileNets), build a new model from basic operators, or use its tensor capabilities to make up for the fact that the JS platform has nothing comparable to NumPy.
As a brand new JS-based machine learning platform that has only been open source for a short time, Pipcook still has many imperfections. To push the whole frontend industry towards intelligent development, we will work to continually optimize Pipcook.
Currently, Pipcook's built-in plugins support pipelines for image classification and object detection, and the object detection pipeline relies on Python capabilities. In the future, we hope to develop models based on native tfjs-node to expand the JS-based machine learning ecosystem. In addition, Pipcook will continue to provide more plugins to support popular deep learning tasks, such as NLP and image segmentation. We also welcome third-party developers to contribute to these models.
As data volumes and model complexity increase, our computing power may prove insufficient. In the future, we will train models on multiple devices, support parallel, distributed, and asynchronous training, and use clusters to solve computing power problems.
Currently, Pipcook only supports simple solutions, such as local deployment. In the future, Pipcook will cooperate with cloud service providers, such as Alibaba Cloud, AWS, and Google Cloud, so that models can be deployed to their machine learning deployment services directly from the pipeline. This will allow you to start using prediction services as soon as training is completed.
In the future, we hope to combine the power of Alibaba's intelligent frontend team and the entire open-source community to continuously optimize Pipcook and the push for intelligent frontend capabilities it represents. This way, we can provide inclusive technical solutions for intelligent frontend capabilities, accumulate more competitive samples and models, provide intelligent code generation services with higher accuracy and availability, and improve frontend R&D efficiency. In addition, frontend engineers will no longer have to do simple and repetitive work, giving them more time to focus on challenging work.
Alibaba F(x) Team - December 10, 2020