Intelligently Generate Frontend Code from Design Files: Field Binding

By Xiaodi

As one of the four major technical directions of the Frontend Committee of Alibaba, the frontend intelligent project created tremendous value during the 2019 Double 11 Shopping Festival, automatically generating 79.34% of the code for Taobao's and Tmall's new modules. Along the way, the R&D team ran into many difficulties and put a lot of thought into how to solve them. In the series "Intelligently Generate Frontend Code from Design Files," we talk about the technologies and ideas behind the frontend intelligent project.

Overview

imgcook is an ingenious chef that cooks with visual design files of various types, such as Sketch files, Photoshop Documents (PSD), and static images. With a single click, imgcook intelligently generates maintainable frontend code from these files, including view code, field binding code, component code, and part of the business logic code. As one of the many imgcook services, intelligent field binding identifies the bindable data fields in visual design files for vertical business domains such as marketing. This improves module development efficiency and makes the transformation from design file to code more accurate. The service is divided into the following parts: data type rules, static text recognition, image field binding, and text field binding.

Position in the Architecture Diagram of imgcook

As shown in the figure, the intelligent field binding layer is divided into the following parts: data type rules, static text recognition, image field binding, and text field binding.

Hierarchical architecture of Design to Code (D2C) technology

Pre-research

The intelligent field binding layer relies on the semantic layer, which uses empirical data to determine what each node is. The field binding layer then maps the identified node to a business domain field. To improve accuracy, high-confidence rules are used as binding conditions. The analysis and optimization suggestions for existing problems are as follows:

1. The semantic layer and the field binding layer are associated too tightly.
   - Problem: The semantic layer and the field binding layer are too tightly associated, resulting in poor flexibility.
   - Optimization: Separate the semantic layer's judgment process from that of the field binding layer and remove the concept of confidence.
2. The semantic layer uses hard rules.
   - Problem: The semantic layer uses hard rules and focuses on judging whether these rules are met.
   - Optimization: Use classification algorithms for hard rules and qualitatively benchmark against the node standards established by W3C.
3. The semantic layer does not make full use of machine learning algorithms.
   - Problem: Machine learning algorithms are only used for entity recognition, syntax analysis, and translation.
   - Optimization: Use deep models for image classification and traditional machine learning algorithms for text classification.
4. Business domain fields frequently change.
   - Problem: The mapped fields vary based on different business domains.
   - Optimization: Enable different configurations to bind mappings intelligently.
5. The number of hard rules needs to be increased.
   - Problem: The number of existing hard rules is not enough.
   - Optimization: Create new rules based on design files to expand the rule layer.

Technical Solution

Field binding applies natural language processing (NLP)-based text recognition and image classification to the content of the virtual DOM (VDOM) and determines which field of the data model each node maps to. The following figure shows the core flowchart of field binding.

Core flowchart of field binding

The core of field binding is NLP-based text recognition and image classification models, which are described in detail below.
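Before looking at the models themselves, the overall flow can be illustrated with a minimal sketch. It assumes a heavily simplified VDOM node, two placeholder classifiers, and a per-domain mapping from classifier labels to data model fields. The names `VNode`, `bind_fields`, `MARKETING_FIELDS`, the label strings, and the `itemImage` field are hypothetical and are not part of imgcook.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class VNode:
    """Simplified stand-in for a VDOM node (hypothetical structure)."""
    kind: str                      # "text", "image", or "container"
    content: str = ""              # text content or image URL
    children: List["VNode"] = field(default_factory=list)
    binding: Optional[str] = None  # data model field, set by bind_fields


# Hypothetical per-domain configuration: classifier label -> data model field.
MARKETING_FIELDS: Dict[str, str] = {
    "item_title": "itemTitle",
    "shop_name": "shopName",
    "shop_desc": "shopDesc",
    "item_image": "itemImage",  # assumed field name, for illustration only
}

Classifier = Callable[[str], Optional[str]]


def bind_fields(node: VNode, classify_text: Classifier,
                classify_image: Classifier, field_map: Dict[str, str]) -> None:
    """Walk the tree and bind each text or image node to a field, if any."""
    if node.kind == "text":
        label = classify_text(node.content)
    elif node.kind == "image":
        label = classify_image(node.content)
    else:
        label = None
    # A label of None means static content: the node stays unbound.
    node.binding = field_map.get(label) if label is not None else None
    for child in node.children:
        bind_fields(child, classify_text, classify_image, field_map)


def toy_text_classifier(text: str) -> Optional[str]:
    """Keyword stand-in for the NLP model described in this article."""
    lowered = text.lower()
    if "off" in lowered:
        return "shop_desc"
    if "nike" in lowered:
        return "item_title"
    return None


def toy_image_classifier(url: str) -> Optional[str]:
    """Stand-in for the image classification model."""
    return "item_image" if "item" in url else None
```

Keeping the label-to-field mapping in a configuration object means a different business domain can swap in its own mapping without touching the classifiers, which is the separation the pre-research section argues for.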

NLP-based Text Recognition

Research

Classification and Analysis of All Dynamic Texts in Taobao's Design Files

We will use the following examples to illustrate the relationship between common fields in the business domain and texts in design files.

Product Title (itemTitle)

Design file text: Up to 16 characters
Real intent text: Nike AF1 JESTER XX

Product title design file

Store Name (shopName)

Design file text: Up to 16 characters
Real intent text: ZARA Clothing

Store name design file

Store Description (shopDesc)

Design file text: Up to 16 characters
Real intent text: Up to 50% off

Store description design file

Technology Selection

Naive Bayes

One of the problems we have with NLP-based recognition in field binding is that we do not have enough samples. In particular, we rely on tenants uploading their own samples to train models for their specific business, and tenants often do not have a large amount of data. In this case, we can use a Naive Bayes classifier. Naive Bayes is grounded in classical probability theory: its posterior probability is the prior probability multiplied by an adjustment factor, so the model has few parameters to estimate and remains usable when samples are scarce. It is also fairly robust to small data sets and noise. Bayes' theorem is expressed in the following formula:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

It is much easier to understand if we express the formula in the following form:

$$P(\text{category} \mid \text{feature}) = \frac{P(\text{feature} \mid \text{category})\,P(\text{category})}{P(\text{feature})}$$

We just need to calculate P(category|feature).
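For a text sample that has been segmented into several tokens, the "naive" part of the method is the assumption that the tokens are conditionally independent given the category. By Bayes' theorem, the posterior is proportional to P(category) multiplied by P(features|category), and the denominator P(features) is the same for every category, so it can be dropped when comparing categories. As a sketch in standard notation that does not appear in the original (c for a category, w_1, ..., w_n for the tokens of one sample), the decision rule becomes:

$$\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)$$

With few samples, many tokens never occur in a given category, which would make the product zero. A common remedy is add-one (Laplace) smoothing, where V is the vocabulary and count(w, c) is the number of times token w occurs in samples of category c. Whether the platform uses this exact smoothing is an assumption here:

$$P(w \mid c) = \frac{\operatorname{count}(w, c) + 1}{\sum_{w' \in V} \operatorname{count}(w', c) + |V|}$$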

Word Segmentation

Before classifying each sample, we need to extract its features, which means segmenting the sample into words. On the AI machine learning platform, we use Alibaba Word Segmenter (AliWS) by default. AliWS is a lexical analysis system that is widely used in various product lines of Alibaba Group. It provides the following features: ambiguity segmentation, multi-granularity segmentation, named entity recognition (NER), part-of-speech (POS) tagging, and semantic tagging. You can also maintain your own dictionaries to handle or correct word segmentation errors. In our project, NER applies to simple entities, phone numbers, time, and dates.
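AliWS itself is an internal system, so the following is only a rough illustration of what segmentation with simple entity recognition produces. It is not AliWS and does not reproduce its API. The sketch uses regular expressions to tag dates, times, and phone numbers before splitting the rest into words. The pattern formats and the `segment` and `to_features` names are assumptions made for the example.

```python
import re
from typing import List, Tuple

# Order matters: more specific entity patterns are tried first.
TOKEN_PATTERNS = [
    ("DATE", r"\d{4}-\d{2}-\d{2}"),       # e.g. 2019-11-11 (assumed format)
    ("TIME", r"\d{1,2}:\d{2}"),           # e.g. 20:00
    ("PHONE", r"\d{3}-\d{3,4}-\d{4}"),    # e.g. 400-800-1234 (assumed format)
    ("NUMBER", r"\d+(?:\.\d+)?%?"),       # e.g. 50 or 50%
    ("WORD", r"[A-Za-z]+(?:'[a-z]+)?"),   # plain words
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
)


def segment(text: str) -> List[Tuple[str, str]]:
    """Return (token_type, token_text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]


def to_features(text: str) -> List[str]:
    """Lowercase words and replace entities with a type placeholder."""
    features = []
    for token_type, token in segment(text):
        if token_type == "WORD":
            features.append(token.lower())
        else:
            features.append(f"<{token_type}>")
    return features
```

Replacing concrete dates or phone numbers with a placeholder such as `<DATE>` lets a classifier learn from the presence of an entity rather than from its specific value, which matters when samples are scarce.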

Model Construction

We use the machine learning platform for rapid model construction. The machine learning platform encapsulates the word segmentation algorithm of AliWS and the multi-class classification of Naive Bayes. The following figure shows the process of model construction.

Process of training a text NLP model

As the figure shows, the first step is to run an SQL script to pull training samples from the database and then perform word segmentation on the samples. After that, the system proportionally splits the samples into a training set and a test set. The system trains a Naive Bayes classifier on the training set and then uses the test set for prediction and evaluation. Finally, the results are uploaded to Object Storage Service (OSS) using the odpscmd command.
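The middle steps of this process (segment, split, train, evaluate) can be approximated locally with a small standard-library sketch. It is not the platform's implementation: the samples are a made-up in-memory list rather than the result of an SQL query, whitespace splitting stands in for AliWS, add-one smoothing is an assumption, and the OSS upload step is omitted.

```python
import math
import random
from collections import Counter, defaultdict
from typing import List, Optional, Tuple

Sample = Tuple[str, str]  # (text, label)

# Hypothetical samples; a real tenant would pull these from a database.
SAMPLES: List[Sample] = [
    ("nike air force sneakers", "itemTitle"),
    ("adidas running shoes", "itemTitle"),
    ("zara clothing store", "shopName"),
    ("uniqlo flagship store", "shopName"),
    ("up to 50% off", "shopDesc"),
    ("20% off today", "shopDesc"),
]


def tokenize(text: str) -> List[str]:
    """Whitespace split, standing in for a real word segmenter."""
    return text.lower().split()


def split_samples(samples: List[Sample], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle deterministically, then split into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, samples: List[Sample]) -> "NaiveBayes":
        self.label_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in samples:
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.token_counts[label][token] += 1
                self.vocab.add(token)
        self.total = sum(self.label_counts.values())
        return self

    def predict(self, text: str) -> Optional[str]:
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / self.total)
            denom = sum(self.token_counts[label].values()) + vocab_size
            for token in tokenize(text):
                score += math.log((self.token_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model: NaiveBayes, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        return 0.0
    hits = sum(model.predict(text) == label for text, label in samples)
    return hits / len(samples)
```

A typical run would call `split_samples(SAMPLES)`, fit a `NaiveBayes` on the training part, and report `accuracy` on the test part. With only six toy samples the test score is not meaningful, so the split is shown for structure only.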