As one of the four major technical directions of the Frontend Committee of Alibaba, the frontend intelligent project created tremendous value during the 2019 Double 11 Shopping Festival. The frontend intelligent project automatically generated 79.34% of the code for Taobao's and Tmall's new modules. During this period, the R&D team experienced a lot of difficulties and had many thoughts on how to solve them. In the series "Intelligently Generate Frontend Code from Design Files," we talk about the technologies and ideas behind the frontend intelligent project.
In the frontend intelligent field, especially in mid-end and backend intelligence, form and table recognition plays a significant role. The mid-end and backend development work mostly consists of form and table development. Therefore, productivity can be improved significantly if the code for forms and tables can be generated intelligently from design files within seconds. This article describes how to generate code for forms and tables.
Technologies such as component recognition, layout algorithms, and material attribute recognition are used in form/table recognition.
The position of the material recognition layer in the overall architecture is shown in the following figure.
Fig: Technical architecture of D2C recognition capabilities
To generate code for general frontend components, such as forms and tables, you need to take a snapshot, paste it, and click the Recognize icon. The fields are translated into English if the field text's language is not English (for example, Chinese) and presented in lowerCamelCase.
Fig: Form recognition demonstration
Fig: Table recognition demonstration
Form and table recognition involves the following steps:
1) Detect all types of components and their coordinates using object detection technology.
2) Recognize all text fields and their coordinates using text recognition technology and translate them into English with auto-translation technology.
3) Extract various attributes such as the layout, label, type, and fields of the form or table from the component information obtained in Step 1 and the text information recognized in Step 2 using a code converter.
We train the Faster RCNN Inception V2 model and make predictions with it in object detection. For more information, see Intelligently Generate Frontend Code from Design Files: Basic Component Recognition. A general text recognition technology is used during text recognition to detect text content and coordinates. This article does not describe the details.
In the following figure, the red boxes indicate the to-be-detected object, and the green boxes indicate the to-be-recognized text.
This article focuses on extracting various attributes from a form or table using a code converter.
First, to facilitate information processing, the absolute coordinates of the to-be-detected object and the to-be-recognized text are converted into rows and columns. A specific data structure of rows and columns is a two-dimensional array in which the first dimension is columns and the second dimension is rows. The algorithmic logic includes the following steps:
The layout of the form or table can be determined based on the preceding row and column information.
A form has a two-dimensional layout consistent with the row and column information obtained through component recognition. Therefore, the row and column information obtained through component recognition can be directly mapped to the form protocol in a nested loop. A table is one-dimensional, and the table's structure can be determined as long as the table header is determined. Therefore, we can directly use the first row of the two-dimensional array of rows and columns obtained through text recognition as table headers.
After we determine the layout of a form or table, we can calculate the field values and types.
The form fields' values are obtained by translating the labels into English and presenting them in lowerCamelCase. How to extract the labels? It is common sense that labels are located either on the left or top of form fields. Therefore, the algorithmic logic is first to extract the labels on the left of the form fields. If the operation fails, extract the labels from the top of the form fields.
In tables, the table field is the table header. The table header's values can be obtained after the first row is extracted from the two-dimensional array of rows and columns obtained through text recognition. For example, you can double-check whether the first row is the table header by performing length-based filtering to filter out rows that have less than three fields.
A form may have various field types, such as input fields, checkboxes, and radio buttons. A table may have plaintext fields and links. Therefore, we must determine the types of form and table fields.
The types of form fields are obtained from the information obtained through object detection because one of the object-detection tasks is to extract the types of form fields.
The types of table fields are also obtained from the information obtained through object detection. The fields in each table column are of the same type. Therefore, only the type of the first field in each column needs to be obtained.
Automatically extracting the above information significantly reduces the workload. Use the following methods to extract more attribute information:
At this year's Apsara Conference, Daniel Zhang said that big data and computing power are the fuel and engine of the digital economy. At present, in the frontend industry, componentization is taking shape, and massive amounts of components can be used as big data. Moreover, the computing power in the industry is also constantly improving. We are going to witness how artificial intelligence technologies renovate the pattern of frontend development.
Alibaba F(x) Team - February 3, 2021
Alibaba F(x) Team - February 23, 2021
Alibaba F(x) Team - February 25, 2021
Alibaba Clouder - December 31, 2020
Alibaba F(x) Team - February 26, 2021
Alibaba F(x) Team - February 24, 2021
Alibaba Cloud experts provide retailers with a lightweight and customized big data consulting service to help you assess your big data maturity and plan your big data journey.Learn More
Alibaba Cloud provides big data consulting services to help enterprises leverage advanced data technology.Learn More
Alibaba Cloud e-commerce solutions offer a suite of cloud computing and big data services.Learn More
Step Up the Digitalization of Your Business with Alibaba Cloud 2020 Double 11 Big SaleLearn More
More Posts by Alibaba F(x) Team