As one of the four major technical directions of the Frontend Committee of Alibaba, the frontend intelligent project created tremendous value during the 2019 Double 11 Shopping Festival. The frontend intelligent project automatically generated 79.34% of the code for Taobao's and Tmall's new modules. During this period, the R&D team experienced a lot of difficulties and had many thoughts on how to solve them. In the series "Intelligently Generate Frontend Code from Design Files," we talk about the technologies and ideas behind the frontend intelligent project.
imgcook intelligently generates maintainable UI view code and logic code from various design image formats (including Sketch, Photoshop, and static images) with a single click.
Logic development is the last and most time-consuming step in the requirement implementation flow of frontend development. Across the entire frontend development process, apart from the initial static view coding, all the code for data mapping, animation, function writing, event streams, and tracking logs is an essential supplement to the static view information.
As the following figure shows, requirements are produced through the collaboration of product, user experience, and user interface designers, and programmers implement them. If requirement documents consist of design files plus product requirement documents (PRDs), then the frontend page's final code is the combination of the static view and the logic.
Requirement implementation flowchart
Frontend development is a type of graphical user interface (GUI) programming. Since the transition from the command line interface (CLI) era to the GUI era, we have been continuously exploring convenient ways to tackle UI development. In the early years of frontend development, the MVC and MVVM design ideas were proposed, and excellent frameworks and libraries such as jQuery, Backbone.js, Angular, Vue.js, and React emerged.
Separation of concerns is the guiding principle in GUI development. Views (V) and model data (M) are separated to simplify the software's internal structure. In practice, most design ideas and frameworks in UI development follow this principle, and the HTML+CSS+JS web technology stack is also a manifestation of it.
React stays close to this principle and is preferred by Alibaba. It only provides V, M, and the rendering process that relates the two. Simply put, UI = render(Data). We agree with this idea and have started the Design2Code (D2C) project to generate code based on design files.
This article discusses the business logic in the project code apart from views. Considering D2C as a design-to-code process, we need to generate the final web pages' code at the business logic layer described in this article.
The logic layer is a downstream core capability of the D2C project. All service-oriented intelligence capabilities must be implemented at this layer.
Technical architecture of D2C recognition capabilities
Most of the technical systems in the D2C project are established based on visual design files to accurately and properly recognize layout structures, field class names, and inline components. However, the logic layer needs to provide the capabilities that the D2C project lacks. This makes the business logic layer different from other layers of the project.
In addition, as an essential transition layer, the business logic layer converts the intelligence capabilities of the D2C project into the input of a visual orchestration platform. Intelligence results could be inaccurate. However, the downstream visual orchestration IDE requires a deterministic protocol to ensure that the code generated by the project can be eventually brought online. Therefore, reliably implementing business logic is a huge challenge.
In the existing D2C flowchart, in addition to the UI structure converted by a layout algorithm, the input information of the business logic layer includes the following items:
The business logic layer generates a visual orchestration protocol that carries the business logic. This protocol's fields can be classified into the following types by functionality:
In a conventional development process, UI and logic must be manually coded. In the D2C process, D2C's visual restoration capability enables automatic UI coding, which significantly reduces the time required for UI development. D2C's business logic layer is designed to implement automatic logic coding. We want to fully upgrade the D2C capabilities to unify visual restoration and logic restoration and achieve zero investment in frontend coding.
Comparison between the conventional coding process, the current D2C process, and the desired D2C process
In the preceding figure, automatic development parts are highlighted in blue, and those involving D2C capabilities are highlighted in red. In the desired process, D2C is expected to restore and implement all frontend code. However, some requirements will inevitably be missed if we use the design files as the only input. Therefore, we have added a logic pre-configuration step before restoration to ensure nothing will be missed.
Before implementing intelligent logic automation, we need to analyze how logic code is written.
After developing a static view, we need input data to generate the logic code that we want to add to the pages. The input data can come from our visual design files, development experience, special rules under existing frameworks, and PD requirement documents. We can obtain information related to the requirements from the input data and visualize it into logic nodes.
Assume that a visual design file contains a "Search" button. Based on our experience, the button is probably used to trigger a network request when clicked. By referring to the requirement document and communicating with the relevant designers, we can determine the network request method and the content to be returned. In this way, a requirement is visualized into a logic node, and the next step goes to logic coding and testing.
Frontend requirement coding flowchart
We must automate the preceding coding flow to generate logic intelligently. The requirement coding flowchart shows that requirement visualization divides the logic coding process into two stages: requirement collection and requirement implementation. We must fulfill the visualized requirements during coding.
The business logic layer provides two incremental D2C capabilities in the desired process: logic pre-configuration and logic restoration. The two incremental capabilities correspond to requirement collection and requirement implementation in the requirement flowchart. In the D2C project, they are referred to as logic recognition and logic expression and are integrated into the following two procedures of the process.
D2C logic process
In the D2C project, logic recognition is a pre-configured process. We can configure different plans for different logic nodes in advance. If a configuration is hit when we use D2C capabilities, the corresponding logic exists.
We start by analyzing the major issues facing intelligent logic generation in Taobao marketing scenarios, where D2C was initially released.
Observing the logic nodes contained in a module
Take a glance at the layout pattern above. Two modules are arranged in the same row of the page, and one loop logic is required.
The "glance" enables us to identify the layout by analyzing the page structure generated by the layout algorithm.
Let's analyze the text. "Annual cross-store discounts" indicates that the text is about customer benefits.
"Analyzing the text" is a process of extracting features from the text. Different features are usually extracted for various purposes. In this example, "cross-store," "discounts," and other words related to consumer benefits appear. We can effectively differentiate these words from others based on the AliWS word segmentation and Naive Bayes multi-class classification algorithms.
It is likely that "1000 pieces sold" should be bound to the monthly sales field.
This is also a text analysis step, and a regular expression is enough to differentiate the text accurately.
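As a minimal sketch of such a regular-expression recognizer (not imgcook's actual rule), the following Python matches sales text like "1000 pieces sold" and proposes a data binding. The pattern, the function name, and the `monthlySales` field name are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a number (optionally with thousands separators)
# followed by the words "pieces sold".
SALES_PATTERN = re.compile(r"^\s*(\d[\d,]*)\s*pieces\s+sold\s*$", re.IGNORECASE)


def recognize_monthly_sales(text):
    """Return a binding suggestion if the text looks like a sales count, else None."""
    match = SALES_PATTERN.match(text)
    if match is None:
        return None
    value = int(match.group(1).replace(",", ""))
    # "monthlySales" is an assumed field name, not one taken from the article.
    return {"field": "monthlySales", "value": value}


print(recognize_monthly_sales("1000 pieces sold"))
print(recognize_monthly_sales("Buy Now"))
```

A rule like this is cheap to run and easy to audit, which is why it suits text whose format is fixed by specification.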
The coupon in the lower-left corner has apparent visual features. Small blocks like this are usually abstracted as a business component.
The coupon in the lower-left corner is a business component whose style, text, and node quantity have data features. We can use conventional machine learning algorithms to analyze these features.
The commodity image has a white background and should probably be bound to the white background field.
The commodity image has obvious features and can be recognized easily based on the image classification algorithm.
The "Buy Now" button may redirect us to the details page. Alternatively, the outer layer of the module completes the redirection instead.
We can recognize this logic only by relying on our experience.
We will not dive into the details about other similar analysis processes in this article. We need to know the composition of the above logic nodes in actual scenarios to generate business logic intelligently. Therefore, we have analyzed Taobao marketing modules to obtain the following distribution histogram of logic nodes:
Logic analysis histogram of marketing modules
Data binding logic accounts for over 50% of all logic types, followed by tracking log logic, loop logic, business processing function logic, and component logic. In general, the module development logic for Taobao marketing scenarios is a set of enumerable, reusable, modeled, and systematic programs following specific rules and specifications. According to the Double 11 Shopping Festival performance over the past few years, the current specifications meet the business needs, and no large-scale requirement surge is expected in the short term.
Taobao marketing module development specifications
The D2C project provides many underlying intelligence means. With expert experience, we can fully retrieve and recognize the preceding logic. For example:
Random forest algorithm: A random forest is a classifier that contains multiple decision trees. Its output class is decided by the mode of individual tree output classes. It applies to both regression and classification tasks and easily shows the relative importance of a model's input features.
XGBoost multi-class classification: Extreme gradient boosting (XGBoost) is often used in algorithm competitions and has achieved remarkable results. It is a widely used open-source toolkit for parallel boosted trees and is known for its speed and accuracy.
NLP text classification: D2C analyzes text based on the AliWS word segmentation algorithm (provided by the Alibaba PAI platform) and the Naive Bayes multi-class classification algorithm. The AliWS algorithm mainly supports ambiguity segmentation, multi-granularity segmentation, named entity recognition, part-of-speech (POS) tagging, and semantic tagging. With AliWS, you can also maintain your own dictionaries to handle or correct word segmentation errors.
Image classification: D2C classifies images in business scenarios and trains a convolutional neural network (CNN) model through transfer learning from a residual neural network (ResNet). The image classification service is also deployed on the Alibaba PAI platform and follows the same productization process as the NLP text classification service.
Semantic service: The semantic service is customized by the D2C project to name classes in mobile scenarios. It builds a policy tree by using the internal expert system, uses Alinlp semantic entities, lexical analysis, translation, and other two-party services in specific identification processes, and uses the built-in IconFont service to identify small icons.
Layout algorithm: D2C develops a rule algorithm for conversion from absolute positioning to flex layout based on its built-in row and column scanning policy. It provides key features such as loop detection and local grouping.
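To make the row scanning idea in the layout item concrete, here is a simplified Python sketch that groups absolutely positioned boxes into rows by vertical overlap and emits a nested flex description. It is an illustration under assumed data shapes (`id`, `x`, `y`, `w`, `h`), not D2C's rule algorithm, and it leaves out column scanning, loop detection, and local grouping.

```python
def group_into_rows(boxes):
    """Scan boxes top to bottom; a box that starts above the current row's
    bottom edge joins that row, otherwise it opens a new row."""
    rows = []
    for box in sorted(boxes, key=lambda b: b["y"]):
        if rows and box["y"] < rows[-1]["bottom"]:
            rows[-1]["items"].append(box)
            rows[-1]["bottom"] = max(rows[-1]["bottom"], box["y"] + box["h"])
        else:
            rows.append({"bottom": box["y"] + box["h"], "items": [box]})
    # Order the boxes inside each row from left to right.
    return [sorted(row["items"], key=lambda b: b["x"]) for row in rows]


def to_flex(boxes):
    """Describe the layout as a flex column whose children are flex rows."""
    return {
        "flexDirection": "column",
        "children": [
            {"flexDirection": "row", "children": [b["id"] for b in row]}
            for row in group_into_rows(boxes)
        ],
    }


boxes = [
    {"id": "b", "x": 200, "y": 0, "w": 100, "h": 100},
    {"id": "a", "x": 0, "y": 0, "w": 100, "h": 100},
    {"id": "c", "x": 0, "y": 120, "w": 300, "h": 40},
]
print(to_flex(boxes))
```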
In addition, unique logic in some business domains is uncharacteristic and therefore implemented manually.
Finally, we decided to identify logic through multiple means, namely visual layout, text semantics, image features, and empirical rules, and to supplement the information needed for logic expression. The programs that recognize module logic are referred to as logic recognizers.
Each recognizer produces results within its own field of expertise. Together, the recognizers examine the visual design files comprehensively to reach a result that is close to human judgment.
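As one illustration of a text recognizer, the Naive Bayes classification mentioned earlier can be sketched in plain Python. This is a toy multinomial model with Laplace smoothing. Whitespace splitting stands in for AliWS word segmentation, and the training samples and labels (`benefit`, `sales`) are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (tokens, label). Collect the counts needed for prediction."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokens:
            # Laplace smoothing keeps unseen words from zeroing the probability.
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data; whitespace split stands in for AliWS segmentation.
samples = [
    ("annual cross-store discounts".split(), "benefit"),
    ("coupon discounts today".split(), "benefit"),
    ("1000 pieces sold".split(), "sales"),
    ("monthly pieces sold".split(), "sales"),
]
model = train(samples)
print(predict(model, "cross-store discounts".split()))
```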
Logic expression depends on two factors: the form and content of the expression.
D2C requires a set of protocols for carrying the intelligent output and an intelligent intervention platform to determine the expression form of logic.
Based on outstanding frontend frameworks such as React and Vue, imgcook, an application that implements the D2C capabilities, has implemented a simplified version of the data-driven (UI = render(Data)) lifecycle by referring to various competing products. You are encouraged to develop and write components based on this specification.
Custom lifecycle of imgcook
In 2019, the Alibaba Taobao marketing team upgraded the marketing module specifications to hooks-based Rax 1.0. Based on the new component specifications, imgcook provides code generation services for mobile and PC terminals separately. In this way, developers can use the lifecycle with hooks. They only need to focus on the module development method specified by imgcook, without differentiating hooks from other technical solutions.
Structure of the imgcook Tianma module project
After specifying coding rules for users, imgcook provides visualized operations to implement the code. Currently, imgcook IDE enables visualized coding for most static modules. The following figure shows the panel for frequent module logic operations:
Visualization panel of the internal edition of imgcook
For example, in the imgcook visual editor, we usually implement the typical Taobao marketing requirements in the following way (assuming that the current module is a single-loop module with two items displayed in the same row):
We abstracted the operation steps of each logic to obtain the following logic implementation procedure:
Abstract logic implementation procedure
Each column in the table shows an abstract capability of imgcook IDE, which means most logic can be implemented through configuration. As there are only a few marketing modules with large-scale business logic, we decide to generate function operators based on reuse instead of prediction. We manage the process by controlling the execution sequence and whether to generate a return value. More fine-grained arithmetic or logical operators and process control statements are not discussed here.
The medium between visual output and real code is the imgcook protocol. Next, we will discuss how to generate the protocol automatically.
The core of auto-protocol generation lies in content. The key does not lie in the operation to be performed by a node. It lies in the node to which the code will be bound and the content in the generated code. Therefore, the logic recognizers transfer the current process's content, such as the node information, global variables, and manual configurations, and inject them into the logic expression runtime. The data required for clearly and accurately expressing the current logic is named logical context in the D2C business logic layer.
Let's look at some real logical context.
The content of this logical function is a reusable xtpl template. You can directly access recognizeResult as the rendering context.
You can use the rendering context to "hold places" in the protocol to be added to express the protocol accurately.
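The placeholder idea can be sketched as follows. Python's `string.Template` stands in for the xtpl template engine here, and the generated JavaScript line and the `count` key are hypothetical. Only the name `recognizeResult` comes from the description above.

```python
from string import Template

# A reusable logic template with one placeholder, $count.
# The JavaScript statement itself is an invented example.
LOGIC_TEMPLATE = Template("this.state.items = this.state.items.slice(0, $count);")


def express(recognize_result):
    """Fill the template's placeholders from the recognizer's result."""
    return LOGIC_TEMPLATE.substitute(recognize_result)


# The recognizer found a 1:2 layout, so the list is truncated to two items.
print(express({"count": 2}))
```

Because the template is fixed and only the context varies, the output stays deterministic even when the recognition step is probabilistic.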
After in-depth analysis, the problem of intelligently generating business logic breaks down into two parts: using intelligent capabilities to recognize logic and generate context, and using that context to express the logic automatically. Once both parts proved feasible, we could start the design.
Based on the preceding derivation, we have outlined the capabilities of the business logic layer.
Logic recognition + Logic expression = Logic Intent
You need to specify the logic recognizer's recognition method based on actual scenarios. In this phase, the logic recognizer incorporates visual design files and manual rules to generate the recognition result.
Now you can consider that D2C has "confirmed the module requirement."
The logic expresser can be seen as the preconfigured version of imgcook that performs visual operations. It translates the recognition result and directly reflects the influence of the logic on the final module.
Now you can consider that D2C has "completed coding for the requirement."
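The pairing of a recognizer with an expresser can be sketched as a small data structure. The class name `LogicIntent`, the node shape, and the generated `bindClick` call below are assumptions for illustration and are not part of the imgcook API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LogicIntent:
    """A logic intent pairs a recognizer with an expresser."""
    name: str
    recognize: Callable[[dict], Optional[dict]]  # returns a context on a hit, else None
    express: Callable[[dict], str]               # turns the context into code


def run_intents(intents, node):
    """Return the generated code for every intent whose recognizer hits the node."""
    generated = []
    for intent in intents:
        context = intent.recognize(node)
        if context is not None:
            generated.append(intent.express(context))
    return generated


# Hypothetical intent: a "Buy Now" button redirects to the details page.
buy_now = LogicIntent(
    name="buy-now-redirect",
    recognize=lambda node: {"nodeId": node["id"]} if node.get("text") == "Buy Now" else None,
    express=lambda ctx: f"bindClick('{ctx['nodeId']}', goToDetail);",
)

print(run_intents([buy_now], {"id": "n1", "text": "Buy Now"}))
```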
Capabilities of the D2C business logic layer
Based on the preceding derivation and role definition, we divide the business logic layer into several modules such as logic recognition, logic expression, and logic core.
Structure of the D2C business logic layer
The D2C project was piloted in the Taobao marketing domain. To facilitate future access to other Alibaba business scenarios, we have proposed a team-specific logic scenario. A logic scenario is a collection of logic specific to a business domain. Provided that a business domain has enumerable, standardized, and customizable logic, you can build your logic scenario to facilitate module development by developers in your team.
D2C can recognize the 1:N (N blocks arranged in one row), Rows && Cols (horizontal and vertical loops), and Cols_cols_rows (loop nesting at any level) module layouts, covering most static module layouts in the marketing domain. D2C requires highly standardized design files. For example, the module layout must be 100% standardized in activities like the Double 11 Shopping Festival. In other words, standardization is a prerequisite for the D2C project. Therefore, we have raised our requirement on layout restoration accuracy for D2C design files. The restored layout must comply with the protocol. You can add annotations to the design files to ensure an accurate layout restoration structure and normal loop detection.
Layout patterns supported by D2C
Standardization is the premise of intelligence. D2C must map the identified view to the schema so that the view can express the logic properly. When D2C identifies the layout of the module view, it also retrieves the schema to ensure that each loop level corresponds to the fields of the level. At present, however, developers cannot guarantee that each module would have a complete schema. Therefore, imgcook supports inferring view models without a schema. This ensures that the D2C project functions well as a complete system even when it is only provided with the design files.
Inferring is a process of building a schema tree. In the layout structure, we regard the loop layout as a branch and each node with bound data as a leaf of the branch. The content of the leaf nodes comes from the aggregation results of modules' historical data in Taobao.
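The tree-building step can be sketched as a short recursive function. The node shape (`children`, `loop`, `name`, `field`, `type`) is an assumption, and the field names are written by hand here, whereas the article describes deriving leaf content from aggregated historical module data.

```python
def infer_schema(node):
    """Build a schema tree: loop nodes become branches (arrays),
    data-bound nodes become leaves, and plain containers are flattened."""
    schema = {}
    for child in node.get("children", []):
        if child.get("loop"):
            schema[child["name"]] = [infer_schema(child)]
        elif "field" in child:
            schema[child["field"]] = child.get("type", "string")
        else:
            schema.update(infer_schema(child))
    return schema


# Hypothetical layout: one loop whose items carry an image, a title, and a sales count.
layout = {
    "name": "module",
    "children": [
        {
            "name": "items",
            "loop": True,
            "children": [
                {"field": "itemImg", "type": "image"},
                {
                    "name": "info",
                    "children": [
                        {"field": "itemTitle"},
                        {"field": "monthlySales", "type": "number"},
                    ],
                },
            ],
        }
    ],
}
print(infer_schema(layout))
```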
The function recognizer and view expresser execute functions based on NodeVM at the business logic layer. We can learn about the connections between them based on the API definition.
In actual scenarios, D2C does not need to generate much coding logic automatically. We can manage data flows by controlling the time sequence and specifying whether a return value is generated. Process control only appears in the handle function of a lifecycle event or node event. For example, suppose that we need to implement three types of logic in a created function: inserting an image into a loop array, truncating an array depending on the 1:N layout, and sending an exposure tracking log. In this case, we can obtain the following created function by configuring the time sequence and specifying whether a return value is generated:
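A rough sketch of this idea, not the generator imgcook actually uses: the Python below assembles a JavaScript `created` handler from logic snippets, using only an execution order and a flag for whether each snippet returns a value. The snippet function names (`insertImage`, `truncateByLayout`, `sendExposureLog`) and the snippet format are hypothetical.

```python
def build_created(snippets):
    """Assemble a JavaScript `created` handler from ordered logic snippets."""
    lines = ["function created(data) {"]
    for snippet in sorted(snippets, key=lambda s: s["order"]):
        call = f"{snippet['fn']}(data)"
        if snippet["returns"]:
            # A snippet with a return value feeds its result to the next step.
            lines.append(f"  data = {call};")
        else:
            # A snippet without a return value runs only for its side effect.
            lines.append(f"  {call};")
    lines.append("  return data;")
    lines.append("}")
    return "\n".join(lines)


snippets = [
    {"fn": "sendExposureLog", "order": 3, "returns": False},
    {"fn": "insertImage", "order": 1, "returns": True},
    {"fn": "truncateByLayout", "order": 2, "returns": True},
]
print(build_created(snippets))
```

With the three snippets above, the image insertion and truncation steps are chained through `data`, and the exposure log is sent last without changing it.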
Information about some logic (uncharacteristic logic) cannot be obtained from design files. We have introduced the manual intervention step to coordinate the generation of the uncharacteristic logic.
Coordination method 1 - Parameter query: You can define a custom form to obtain user input and access it in layoutResult.userLogicConfig of the internal process.
Coordination method 2 - Logic filtering: The Developer Intervention option is available for logic configuration. When the option is selected, the module developer can block the logic.
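The two coordination methods can be sketched together. The field `userLogicConfig` on `layoutResult` is named in the text above, but the shape assumed here (a mapping keyed by logic name), the `allowIntervention` flag, and the function itself are illustrative assumptions.

```python
def apply_intervention(logic_items, layout_result, blocked):
    """Drop logic the developer blocked and attach user-supplied parameters."""
    user_config = layout_result.get("userLogicConfig", {})
    active = []
    for item in logic_items:
        # Logic filtering: only logic marked as open to developer
        # intervention can be blocked.
        if item.get("allowIntervention") and item["name"] in blocked:
            continue
        # Parameter query: answers from the custom form are attached as params.
        params = user_config.get(item["name"], {})
        active.append({**item, "params": params})
    return active


logic_items = [
    {"name": "exposureLog", "allowIntervention": True},
    {"name": "dataBinding", "allowIntervention": False},
]
layout_result = {"userLogicConfig": {"dataBinding": {"source": "itemList"}}}
print(apply_intervention(logic_items, layout_result, blocked={"exposureLog", "dataBinding"}))
```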
These control measures are reflected in the inquiry dialog box for module restoration. If you do not understand the specific intent of an item in the dialog box, contact the owner of the current logic scenario.
Development configuration dialog box of D2C
You can select one out of N logic recognizers. We provide the following diagram to help you select a recognizer that you need.
D2C logic recognizer selection workflow
Feature: Acts as a regular-expression-based NLP recognizer and provides some of the capabilities of the NLP recognizer.
A logic expresser is a combination of multiple abstract expressers. We have broken down the logic implementation process into fine-grained visual operations. We configure expressions in the background by analyzing the specific logic implementation method and then assemble the logic. When a recognizer informs an expresser that the current logic is active, the expresser automatically generates code for the logic.
Functionalities of logic expressers
During module development for the Double 11 Shopping Festival, imgcook built a set of business logic scenarios specific to Taobao marketing activities based on the intelligent logic process mentioned in this article. The logic scenarios integrate default built-in logic, for example, the margin setting logic based on visual specifications for the Taobao marketing domain, tracking log logic based on tracking log specifications for the Taobao marketing domain, and module rendering and row splitting logic based on the rax-hooks solution.
We have also configured various types of data binding and component recognition logic based on the intelligent capabilities provided by logic recognizers described in this article. These logic recognizers include the NLP text recognizer and the UI material recognizer. When a developer's visual design files contain such features, D2C automatically applies the logic code to the results.
D2C internal edition's logic scenario for the Double 11 Shopping Festival
According to the 2019 Double 11 Shopping Festival statistics, about 78.94% of the modules used imgcook's business logic generation process. 79.34% of the generated module code was used in the online code. 14 pieces of business logic were hit on average. That is, when developing a new module based on the D2C process, developers wrote at least a dozen fewer pieces of logic on average.
The performance is even better for static UI modules with simpler logic. For example, you can restore the following modules to the ready-to-test state with one click. This capability dramatically reduces the development workload.
D2C logic hits in the Double 11 Shopping Festival, in the internal development process with imgcook WebIDE
We found many problems while implementing the logic for the Double 11 Shopping Festival. For example, the following process for developing Taobao modules is not user-friendly enough: requirement > design > module development > joint frontend and backend debugging > module release. As we can see, the process reserves no time for logic development, although the system aims to generate logic based on design files and directly convert them into online modules. We have to provide manual intervention early before development to resolve this problem.
In the future, we will integrate the output of requirements and design files into the business logic layer and provide an all-in-one and closed-loop R&D experience for supported modules. Roles will be clearly defined, where designers design UIs, PDs add requirements based on UIs, and developers maintain the available logic in the background. The D2C system will significantly benefit from its practical experience in the business logic domain in the future.
The D2C intelligent logic system has proven itself and has taken a solid step forward. In the future, D2C 2.0 will focus on the following aspects:
As mentioned earlier in this article, requirement documents consist of design files and product requirement documents (PRD). D2C is a design file-based technical system. We will replace the current manual intervention process of imgcook with a PRD-based structured capability in the future in order to eliminate the involvement of developers in the process.
Based on statistics about the Double 11 Shopping Festival modules, we have proposed the concept of code availability, which is the proportion of online code that was generated by imgcook. In the future, we will provide more logic recognizers, provide more abstract but easy-to-use logic expressers, and optimize the organization of the logic core module of the business logic layer. imgcook will intelligently generate code based on metrics to adapt to actual business scenarios and provide superior intelligent services.
We will upgrade the kernel of the imgcook editor. Based on the kernel, we will derive marketing, community, and mini-program business platforms and customize logic scenarios for more Alibaba business scenarios. The D2C system can stably provide frontend intelligent capabilities.
We want to provide more intelligent capabilities, support a wider range of service scenarios, and achieve higher efficiency. The system will provide truly intelligent services for deriving and generating all module logic from the visual design files. The D2C system generates 79.34% of the online code. First, we plan to achieve the zero-R&D goal at the frontend, then zero R&D at the requirement end, and finally zero-effort throughout the process.
Alibaba F(x) Team - February 26, 2021