Evolution of the Frontend Architecture and Serverless Practices in DataWorks

In this blog, we will explain how DataWorks evolved its frontend architecture and serverless practices through several large-scale application use cases.

By Haozhen

Released by DataWorks Team

DataWorks is a cloud-based big data development platform that offers big data operating system (OS) capabilities and professional, efficient, secure, and reliable services as an all-in-one platform. It provides comprehensive features across the entire data pipeline, including data integration, data development, data governance, data security, data services, application development, and machine learning.

Challenges

Complex Features and Technical Architecture

DataWorks, like many similar products, provides single-page applications that resemble integrated development environments (IDEs) and offer rich interactions, as shown in the following figure:

Figure 1. A data development IDE

In terms of interaction, IDE-like products resemble Visual Studio Code (VS Code), which most frontend engineers use and which contains many functional modules. In the VS Code architecture, all functional modules are provided by plug-ins. The following figure shows the most basic functional modules of a data development IDE:

Figure 2. Basic functional modules of an IDE

A data development IDE has a lot of overlaps with general-purpose IDEs in terms of basic capabilities, which are also the most common functions for an IDE product. On this basis, from the business perspective, the upper-layer business capabilities are customized, including different big data computing engines, data analytics nodes (similar to file types in VS Code), modules that are used to classify and manage data analytics nodes, and Kanban that is used to arrange and orchestrate the processes of data analytics nodes.

From a horizontal perspective, DataWorks has many functional modules. From a vertical perspective, the number of nodes in DataWorks alone has reached hundreds of millions. All the code, for both the functional modules and the nodes, is mixed in a single frontend project. Together with the basic IDE functions and other module code, the frontend project has evolved into one monolithic application. After continuous iterations and requirement upgrades, the frontend project is now far larger than it was in the first version. Currently, a single build and release takes more than 10 minutes, and the release process will remain arduous for the foreseeable future.

Complex Runtime Environments

The operating environment for DataWorks is unusually complex, even across the entire Alibaba economy. Because Apsara Stack is a private, independent, and closed hybrid cloud, the frontend cannot rely on CDN capabilities there. Ensuring that the same set of code can be reused directly inside Alibaba, in Alibaba Cloud, and in Apsara Stack is a major engineering challenge.

Figure 3. Complex runtime environments

In addition to project architecture and complexity, the different runtime environments place higher demands on how iterations and releases are managed. GitLab can resolve code conflicts between versions, but it cannot resolve conflicts between product release cycles. When multiple release requirements must be met, especially when major revisions of Apsara Stack are released on a monthly basis, we have to ship new features quickly while also packing them into Apsara Stack editions based on a waterfall model. As a result, many features are released in different version iterations and at different paces inside Alibaba, in Alibaba Cloud, and in Apsara Stack. Alibaba's circuit breaking mechanism further increases the risk of frequent releases within the available time window.

Various Demands from Different Users

The DataWorks platform is large and complex. Considering different functional requirements of Alibaba Cloud and Apsara Stack users, we need to provide features that are light, flexible, and convenient to break up and combine anytime and anywhere. Picky users always want to get the most proper solution at the least cost. To keep its competitiveness in the future, DataWorks obviously needs to focus on flexibility and the ability to evolve.

DataWorks can add switches to functional module units by using permission points. Although this method can meet the various needs of different users, it leaves a large amount of permission judgment code scattered across frontend projects, which greatly increases development difficulty and testing costs.
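To make the cost of this approach concrete, the following is a minimal sketch of permission-point gating. The permission point names and the `canUse` and `visibleModules` helpers are hypothetical and are not DataWorks APIs; the point is only that every gated module ends up repeating a check of this kind.

```typescript
// Hypothetical permission points; the real DataWorks identifiers are not given in this article.
type PermissionPoint = "datastudio.sqlNode" | "datastudio.scheduler" | "ops.alerting";

interface UserContext {
  userId: string;
  grantedPoints: PermissionPoint[];
}

// Returns true when the user has been granted the permission point for a module.
function canUse(user: UserContext, point: PermissionPoint): boolean {
  return user.grantedPoints.indexOf(point) !== -1;
}

// Each gated module needs a check like this one, which is how permission
// judgment code spreads through the whole frontend project.
function visibleModules(user: UserContext, all: PermissionPoint[]): PermissionPoint[] {
  return all.filter((point) => canUse(user, point));
}

const allModules: PermissionPoint[] = ["datastudio.sqlNode", "datastudio.scheduler", "ops.alerting"];
const apsaraUser: UserContext = { userId: "u-1", grantedPoints: ["datastudio.sqlNode"] };
console.log(visibleModules(apsaraUser, allModules));
```

Every new switch adds one more branch that has to be tested in both states, so the number of combinations to test grows quickly.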

Crude Canary Mechanism

On the whole, DataWorks relies on human intervention for canary function verification. Specifically, it depends on a function switch that is designed in advance. When we need to verify a function, a switch is configured manually at the frontend to select some users for the newly designed functional module. After a period of trial runs and adjustments, problems are gradually solved, and the impact on users is kept within a small range. However, this canary mechanism is not repeatable, cannot be applied to arbitrary features, and does not come naturally as part of the release process.

The canary mechanism is costly to implement because of the human intervention and the related design and development work. Some developers even skip canary verification to avoid these annoyances. Under a conventional architecture, the mechanism is also ineffective when the feature to be verified is highly localized, is unrelated to users or workspaces, or may not be used by users at all. In these cases, we cannot even start on the design.
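A hand-maintained switch of the kind described above might look like the following sketch. The `CanarySwitch` shape, the feature name, and the user IDs are assumptions made for illustration only.

```typescript
// Hypothetical shape of a manually configured canary switch.
interface CanarySwitch {
  feature: string;
  enabled: boolean;
  userAllowlist: string[]; // users selected by hand for the trial run
}

// A user sees the new feature only if the switch is on and the user is listed.
function isInCanary(sw: CanarySwitch, userId: string): boolean {
  return sw.enabled && sw.userAllowlist.indexOf(userId) !== -1;
}

const newEditorSwitch: CanarySwitch = {
  feature: "new-editor",
  enabled: true,
  userAllowlist: ["user-a", "user-b"],
};

// The branch has to be designed into the feature code in advance.
function pickEditor(userId: string): string {
  return isInCanary(newEditorSwitch, userId) ? "new-editor" : "legacy-editor";
}

console.log(pickEditor("user-a"), pickEditor("user-z"));
```

The allowlist is keyed by user, which is why a feature that is unrelated to users or workspaces has nothing to key the switch on.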

Shortage of Manpower

The DataWorks team is responsible for many products and must meet numerous data development requirements from inside and outside Alibaba. The frontend team consists of just over 10 members. Obviously, this team size is insufficient to cope with a heavy workload of iterations, not to mention making breakthroughs in some areas of innovation. This is also a common problem faced by R&D teams of rich interactive products.

DataWorks has made code reusable wherever possible since it adopted the React modular UI development mode. However, reuse is limited by differences in interaction patterns and styles, business distinctions, and developers' different understandings of information. For some time after componentization was promoted, it was therefore not possible to accumulate reusable business components that cover the fundamentals.

Our frontend developers expend a lot of effort in tuning styles and interactions to implement a design model that separates the frontend from the backend and is based on a mainstream frontend framework. This provides a better user experience than the frontend-backend integrated design. There is no shortcut to the improvement. We hope that every design contributed by frontend developers can be reused instead of being disposed of after a single use.

Additional Pain Points

The preceding points cannot exhaustively cover all the problems faced by the DataWorks team. These points are merely a few of the biggest challenges that we screened out just as we do for product requirements. A principle for products is that less is more. This is also applicable in the R&D field. Due to the limited size of the R&D team, we are unable to meet all the needs of our data development personnel. We hope that DataWorks will be a more open platform, similar to VS Code. The DataWorks team focuses on building a data development platform that provides basic features, while formulating a series of standards and protocols to enable third-party services to customize features in the form of plug-ins.

Cooperation and Competition

The DataWorks platform provides a range of features to assist R&D engineers with their daily work. Our users spend a lot of time working with our platform and have first-hand experience of how some features are designed. This experience is something that the R&D personnel of our platform cannot feel. Project directors (PDs) and user experience designers (UEDs) collect requirements and try out features themselves. However, without a background in data development, they are unable to experience the frustration that is unique to data development engineers after long-term use. The use of the DataWorks platform varies greatly between vertical market segments and even more between industries. DataWorks is used in different ways in finance, banking, government, education, large state-owned enterprises, Internet companies, traditional companies, and privately owned companies. Some customers may not even know how to use this platform. Users have widely varying needs, and their knowledge and skills are at different levels.

Therefore, when front-line delivery teams or companies apply DataWorks in industries that we have not considered, requirements are collected from these industries and sent to our PD for analysis. Front-line teams also package some DataWorks APIs and provide these APIs as products to customers in specific industries to help them solve problems.

New product plans are being developed. The engine team uses DataWorks to improve user-friendliness of their designed products. It is difficult to keep expanding DataWorks if developers are working only according to the schedule to meet the demands for access and customization. Therefore, in view of frontend and backend architectures and countless scenarios of cooperation and competition, we must achieve a technical revolution to break away from the existing frontend and backend R&D model, and introduce more user-side R&D capabilities. We hope that this will allow us to make DataWorks more robust.

Evolution of Architecture

Evidence-based software architecture is a highly recommended architectural idea in the book titled Expert One-on-One J2EE Development without EJB. The book emphasizes proceeding carefully step by step to find the most suitable architecture. This theory reminds us that in the continuous evolution of product forms, the technical architecture also needs to evolve based on the current product forms.

In the early stages of DataWorks, a frontend architecture of Studio products was basically refined from DataStudio, and a technical architecture of IDE products was sorted out by layer. The base layer encapsulates basic modules, such as the IDE layout, communication, internationalization, skin change, data center, codeless event tracking, event center, and message center. The upper layer encapsulates module management and node management.

As demands continue to evolve, the number of nodes in DataStudio increases rapidly. Different types of nodes overlap greatly in functions and interactions. Therefore, after the early stages, the DataWorks frontend team restructured the nodes, and refined a basic node class from the product requirements that basically includes the common node functions and interactions. A lot of node instances are derived from this base class.
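The base-class refactoring can be sketched as follows. This is an illustration under assumptions: the class names `BaseNode`, `SqlNode`, and `ShellNode` and their methods are hypothetical and are not taken from the DataStudio code base.

```typescript
// Hypothetical base class holding the functions and interactions shared by all node types.
abstract class BaseNode {
  constructor(public readonly nodeName: string, protected content: string = "") {}

  // Shared behaviour that every node type inherits unchanged.
  edit(content: string): void {
    this.content = content;
  }

  save(): string {
    return `saved ${this.kind()} node "${this.nodeName}" (${this.content.length} chars)`;
  }

  // Node-specific behaviour that each derived node must supply.
  abstract kind(): string;
  abstract run(): string;
}

class SqlNode extends BaseNode {
  kind(): string { return "sql"; }
  run(): string { return `submit SQL: ${this.content}`; }
}

class ShellNode extends BaseNode {
  kind(): string { return "shell"; }
  run(): string { return `execute script: ${this.content}`; }
}

const sqlNode = new SqlNode("daily_report");
sqlNode.edit("SELECT 1");
console.log(sqlNode.save());
console.log(sqlNode.run());
```

A new node type then only implements what differs, while saving and editing stay in one place.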

This architecture has been running stably in DataStudio for several years. The underlying layer of DataStudio depends on the MaxCompute (formerly ODPS) engine. For offline scenarios, MaxCompute provides powerful computing capabilities. However, MaxCompute cannot support scenarios with strict real-time requirements. Therefore, for different vertical sectors, DataWorks derived various Studios from DataStudio, each built on a different computing engine, including Stream Studio (based on the stream computing engine), HoloStudio (based on the interactive analytics engine), and GraphStudio (based on the graph computing engine).

Each computing engine corresponds to a new Studio. Under the pressure of rapid business expansion, each Studio integrated the Studio base capability refined from DataStudio by copying code. Meanwhile, the data analytics nodes of each Studio were derived from the basic node class. This arrangement lasted for over a year. As the Studios in different vertical sectors developed independently, their Studio bases diverged and could no longer be merged.

Recently, the DataWorks team has extracted a unified Studio base, which converges, upgrades, and maintains relatively underlying and less changeable functions. We are also working with the IDE co-building team of the Alibaba economy (kaitian) to promote the integration of the IDE base capabilities, so that common capabilities can be reused and features can be customized through a unified protocol of upper-layer plug-ins.

Meanwhile, to remove the frontend manpower bottleneck, we want to make the DataWorks platform more open, and introduce user-side R&D capabilities for DataWorks. This is also an architecture upgrade based on various personalized needs that we have to consider. Last year, DataWorks began to upgrade the Studio plug-in architecture. Based on the extracted Studio base that we mentioned before, the upper layer uses plug-ins to open up common extension capabilities of DataStudio. This method has been used to connect to a new engine (Analysis DataBase) autonomously and completely. The method greatly simplifies the process of building a data development Studio for the new engine and reduces the development workload. It also frees the labor of the frontend team. Previously, the frontend team had to customize the development of the new engine.

After each business unit (BU) in the Alibaba economy reaches a certain volume, the demand for big data development will gradually arise. Some BUs may set up their own data teams to support their demand for big data. They may even build their own big data platforms. When I was communicating with colleagues of some BUs, I found that most business-oriented BUs have requirements for big data platforms because DataWorks cannot meet their special requirements at certain points. These small differences are not universal because DataWorks as a unified development platform cannot bear all the detailed differences in business characteristics.

For this scenario, the DataWorks team created the DataWorks open platform based on the previous architecture of a unified Studio base and plug-ins. In addition, we provide the existing capabilities of DataWorks to business-oriented BUs for function assembly and combination, enabling business teams at Alibaba to quickly build their own data development platforms.

To sum up, DataWorks has gone through three stages of frontend architecture evolution: (1) monolithic applications, (2) componentization, and (3) the combination of micro frontends and plug-ins.

Monolithic Applications

During rapid iterations of each Studio, the frontend and backend continuously accumulate functional and code modules. The backend still accumulates functional modules as Spring Boot does. Although some business component libraries are accumulated at the frontend, their code is merely stacked together. The frontend and backend as a whole iterate in the same way as monolithic applications.

Monolithic applications have several benefits:

Easy to test
Easy to deploy

During rapid product iterations in the early stage, the monolithic architecture did help businesses quickly form a complete closed product loop and a complete product pipeline. However, this mode is suitable only when an application is not very complex. As applications became increasingly complex, especially as all DataWorks services developed into online IDE forms, monolithic applications became inadequate:

Heavy deployment: The entire application needs to be redeployed due to a single minor revision.
Low development efficiency: Application code keeps increasing in size, and the long compilation time affects the development efficiency.
Difficult regression testing: Application functions become increasingly complex and functional modules cannot be effectively separated, resulting in a long regression testing cycle.
In addition, it is difficult to upgrade the overall technical architecture for monolithic applications.

Componentization

After a period of time using the monolithic architecture, as the business complexity kept increasing, the size of each monolithic application also increased. With every minor change, such as a real-time build during the development or a compilation build during the release, the frontend and backend had to undergo an onerous deployment process. In addition, monolithic applications seem to become inadequate in the process of continuous service iterations.

The backend tried to introduce Spring Boot Starter to divide business modules based on service-oriented architecture (SOA). The frontend also abstracted and accumulated some interactive components in IDE form, such as Terminal, file tree, TabPanel, and LSP Editor, at the horizontal level.

SOA is similar to a service-oriented componentized architecture. It abstracts some common modules and divides a huge system into independent business module units (components) according to functional modules. Each component performs complete business logic from beginning to end and usually includes the specific tasks and functions required to complete an entire operation.

Frontend componentization is similar to service-oriented architecture (SOA), which divides a functional module of huge page interactions into independent npm packages (components). Each component performs complete business logic and page interactions from beginning to end. Usually, components are loosely coupled.

SOA and the componentized architecture partially overcame the challenges for monolithic applications, and at least abstracted some reusable functional modules to reduce the repetitive work of some functions. However, this also brings about problems:

Complex dependency: Compared with a monolithic application, the dependency relationships of code tend to be complex.
Difficult troubleshooting: It is difficult to troubleshoot third-party components.
When a problem occurs, all dependent applications have to be released again.

Micro Frontends and Plug-ins

Concepts

Some developers still have misunderstandings about Serverless, microservices, and other concepts. Let's take a look at these concepts first.

Serverless

Serverless means that developers implement the server logic, and applications run in stateless computing containers. In these containers, applications are triggered by events and fully managed by a third party, and business-layer state is stored in a database or other storage media.

Serverless represents an advanced stage of cloud-native technology development. It allows developers to focus on business logic and pay less attention to infrastructure.

Years ago, the software we developed was still in the client-server (C/S) and model-view-controller (MVC) modes, and then we had SOA. The microservices architecture emerged in recent years, and later we had cloud-native applications. Enterprise applications have changed from a monolithic architecture to SOA and then to a more fine-grained microservices architecture. At the beginning of application development, high performance and scalability were required to cope with the Internet-specific high concurrency and uninterruptibility. People have endless pursuits for software development, while we hope to strike a balance between the development complexity and efficiency.

Cloud technologies have changed our cognition of operating systems. We have understood that the computing resources, storage, and network of a system can be configured separately and scaled elastically. However, for a long time, we have never broken through the shackles (or cognition) of server when we develop applications. We have long held that an application must run on a physical or virtual server and that an application must be deployed, configured, and initialized before it can run. We also must monitor and manage servers and applications and ensure data security. Can these cloud technologies help us streamline the process? Let's just focus on the logic of our own code, and let the cloud help us implement other things.

The evolution of cloud computing from infrastructure as a service (IaaS) and platform as a service (PaaS) to the rise of Serverless computing today pushes users to focus more and more on the real business logic, rather than on operating systems, software, and other infrastructure that is not related to the business logic. At the more advanced stage of cloud-native technology development, the overall goal is to let users care about only the business code logic and leave everything else to the cloud. Currently, Serverless architecture includes two categories, backend as a service (BaaS) and function as a service (FaaS), which achieve the purpose of Serverless in two different forms.

BaaS

In BaaS, APIs call the program logic in the backend or the logic already implemented by other people, such as the Auth0 authentication service. Some BaaS features are used to manage data. Many commercial services are available on public clouds that provide commonly used open-source software. For example, Amazon Relational Database Service (Amazon RDS) can replace MySQL we deployed, and a variety of other database and storage services are also available.

FaaS

FaaS is a form of Serverless computing. Currently, AWS Lambda is the most widely used FaaS service.

Now, the first thing that comes to mind when we talk about Serverless is FaaS, which is an overstatement of its role. Essentially, FaaS is an event-driven and message-triggered service. FaaS vendors integrate various synchronous and asynchronous event sources. By subscribing to these event sources, you can trigger functions to run immediately or on a regular basis.
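The event-driven model can be illustrated with a small in-process sketch. It does not use any vendor SDK: the `EventBus` class and the event source names such as "storage:upload" are hypothetical stand-ins for the event sources that a FaaS platform integrates.

```typescript
// Hypothetical in-process model of event sources that trigger functions.
type FaasEvent = { source: string; payload: string };
type FaasHandler = (event: FaasEvent) => string;

class EventBus {
  private handlers: { [source: string]: FaasHandler[] } = {};

  // Subscribe a function to an event source.
  subscribe(source: string, handler: FaasHandler): void {
    const list = this.handlers[source] || [];
    list.push(handler);
    this.handlers[source] = list;
  }

  // Emitting an event runs every function subscribed to that source.
  emit(event: FaasEvent): string[] {
    const subscribed = this.handlers[event.source] || [];
    return subscribed.map((handler) => handler(event));
  }
}

const bus = new EventBus();
bus.subscribe("storage:upload", (e) => `make thumbnail for ${e.payload}`);
bus.subscribe("timer:daily", (e) => `clean temp files at ${e.payload}`);
console.log(bus.emit({ source: "storage:upload", payload: "photo.png" }));
```

The function author writes only the handler body; deciding when and where it runs is left to whoever owns the event source.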

Microservices

Microservices represent a new architectural style for building applications. Unlike the traditional monolithic architecture, the microservices architecture splits an application into multiple core functions. Each function is called a service, and can be built and deployed independently, meaning individual services can function (and fail) without negatively affecting the others.

Relative to Serverless, microservices can be viewed as the Serverless architectural style applied to services or APIs. Serverless is broader and also covers storage, such as Object Storage Service (OSS), and databases, such as Amazon Aurora Serverless.

BaaS and FaaS are two forms of microservices architecture. Compared with BaaS, FaaS is a more fine-grained service unit. Neither is inherently better than the other; each is suited to different scenarios.

Microservices are an architectural design pattern. In microservices architecture, business logic is split into a set of small and loosely coupled distributed components, which together constitute a large application. Each component is called a microservice, and each microservice performs a separate task or is responsible for a separate function in the overall architecture. Each microservice may be called by one or more other microservices to perform specific tasks that a large application needs to complete. The system provides a unified processing method to execute tasks such as searching or displaying pictures, or other tasks that may need to be executed multiple times. The system also restricts multiple versions of the same function generated or maintained in different places within the application. The microservices architecture has the following characteristics:

Each component is responsible for the execution of an individual function.
Components are deployed separately.
Each component contains one or more processes.
Each component has its own data storage.
A small team can maintain several microservices.
Components are replaceable.

A microservice architecture has the following differences from SOA:

Last year, the DataWorks team adopted the microservices architecture and upgraded some relatively independent business logic to microservices. The team introduced an architecture that combines micro frontends and plug-ins to achieve personalized business development and customization in the corresponding frontend.

Micro Frontends

Micro frontends represent an architecture similar to microservices. This architecture applies the microservice concept to browsers, transforming a web application from a single application into one that aggregates multiple small frontend applications. Each frontend application can be developed, deployed, and run separately.

A big advantage of backend microservices is that different technology stacks can be used to develop backend applications. Large organizations adopt the microservices architecture because it decouples dependencies between services.

In contrast, when it comes to micro frontends, people want aggregation as the result, especially for To B (to business) applications.

Considering its current business situation and different computing engines, the DataWorks team has developed numerous Studios for vertical sectors. Although the Studios overlap greatly in product interaction and functional logic, they are available to users as different products. Stream Studio supports real-time computing, while DataStudio supports offline computing. In addition, some Studio platforms are used for developing FaaS. However, users feel that because all these products come from the same company, they should be a single product. Moreover, because product features are dispersed, some common basic features are spread across different products, which greatly increases the cost of use.

Therefore, the product orientation is shifting to cohesion. The XStudio project that started at the end of 2018 expects to integrate the development nodes of all computing engines in one Studio to offer an all-in-one product experience.

Aggregation has become a trend. We also draw on the micro frontend architecture to upgrade the frontend architecture, so that functions developed in different Studio environments can be aggregated into an application and presented to users.

In the DataWorks product system, each module and node is a single application that can be developed, deployed, and run separately. Different from common micro frontend applications distributed through routing, each node and module of DataWorks adopts a distribution mechanism featured by tab switch:
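A tab-based distribution mechanism can be sketched as a host that maps tab IDs, rather than URL routes, to micro applications. The `MicroApp` lifecycle and the `TabHost` class below are hypothetical, and the sketch assumes that an inactive tab is unmounted; a real IDE may instead keep inactive tabs alive.

```typescript
// Hypothetical lifecycle contract for a micro application.
interface MicroApp {
  appName: string;
  mount(): void;
  unmount(): void;
}

class TabHost {
  private apps: { [tabId: string]: MicroApp } = {};
  private activeTab: string | null = null;
  readonly log: string[] = [];

  register(tabId: string, app: MicroApp): void {
    this.apps[tabId] = app;
  }

  // Switching tabs plays the role that a route change plays in routing-based micro frontends.
  switchTo(tabId: string): void {
    const next = this.apps[tabId];
    if (!next) throw new Error(`no micro app registered for tab ${tabId}`);
    if (this.activeTab === tabId) return;
    const current = this.activeTab !== null ? this.apps[this.activeTab] : undefined;
    if (current) current.unmount();
    next.mount();
    this.activeTab = tabId;
  }
}

// Helper that builds a micro app which records its lifecycle calls in a log.
function makeApp(appName: string, log: string[]): MicroApp {
  return {
    appName,
    mount: () => { log.push(`mount ${appName}`); },
    unmount: () => { log.push(`unmount ${appName}`); },
  };
}

const tabHost = new TabHost();
tabHost.register("tab-sql", makeApp("sql-node", tabHost.log));
tabHost.register("tab-shell", makeApp("shell-node", tabHost.log));
tabHost.switchTo("tab-sql");
tabHost.switchTo("tab-shell");
console.log(tabHost.log);
```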

Of course, we also have routing-level products with a micro frontend architecture, such as Operation Center. Click relevant modules in the left navigation pane to distribute and switch routing-level micro applications.

Plug-ins

The word plug-in consists of plug and in. In real life, power patch panels are examples of a "plug-in" application. A patch panel and the electrical appliances plugged into it constitute a physical plug-in system. The patch panel provides power for the plug-in through plug holes and gives the system the ability of plug-in. After you insert the plug of the television, you have the function of watching TV. After you insert the plug of the refrigerator, you have the function of preserving food. After you insert the plug of the desk lamp, you have the function of lighting. These are just a few examples.

The design of a good plug-in system is equivalent to the design of a patch panel. The core of the system is a hot-pluggable "patch panel", which is responsible for providing power (the plug-in API) to the plug-ins connected to the system. The capability of the system is the aggregation of all plug-in capabilities. Unlike patch panels in the physical world, software patch panels have no limit on the number of plug holes. That is, the system can accept any number of plug-ins, and its functions can be extended indefinitely.
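The patch panel analogy maps onto code fairly directly. The following is a minimal sketch, not the DataWorks plug-in API: `PluginHost`, `HostApi`, and `StudioPlugin` are hypothetical names, and a "capability" is modeled as a named command.

```typescript
// Hypothetical plug-in API: the "power" that the patch panel supplies to each plug-in.
interface HostApi {
  registerCommand(id: string, run: () => string): void;
}

interface StudioPlugin {
  id: string;
  activate(api: HostApi): void;
}

class PluginHost {
  private commands: { [id: string]: () => string } = {};
  private owners: { [commandId: string]: string } = {};

  // Plugging in: the host hands its API to the plug-in and records what it contributes.
  plug(plugin: StudioPlugin): void {
    const api: HostApi = {
      registerCommand: (id, run) => {
        this.commands[id] = run;
        this.owners[id] = plugin.id;
      },
    };
    plugin.activate(api);
  }

  // Unplugging removes everything that the plug-in contributed (hot-plugging).
  unplug(pluginId: string): void {
    Object.keys(this.owners).forEach((commandId) => {
      if (this.owners[commandId] === pluginId) {
        delete this.commands[commandId];
        delete this.owners[commandId];
      }
    });
  }

  // The capability of the system is the aggregation of all plug-in capabilities.
  capabilities(): string[] {
    return Object.keys(this.commands).sort();
  }

  execute(id: string): string {
    const run = this.commands[id];
    if (!run) throw new Error(`no plug-in provides "${id}"`);
    return run();
  }
}

const panel = new PluginHost();
panel.plug({ id: "tv", activate: (api) => api.registerCommand("watch-tv", () => "showing channel 1") });
panel.plug({ id: "lamp", activate: (api) => api.registerCommand("light", () => "lamp on") });
console.log(panel.capabilities());
panel.unplug("tv");
console.log(panel.capabilities());
```

The host itself contains no feature logic, so adding or removing a capability never requires changing or redeploying the host.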

The micro frontend architecture solves the problem of product aggregation. However, for rich interactive products such as Studios, the micro frontend architecture cannot enable fine-grained functional combination. As shown in the following figure, different nodes have different operation buttons in the top operation button area of a monolithic node application. The interaction and business logic behind the button UI are split into plug-ins. Then, they are used with the microservices and plug-in console to combine functions and modules through configuration.

In fact, micro frontends and plug-ins are both used to solve the aggregation problem of product functions. However, they solve the aggregation problem of different fields in DataWorks. The micro frontend architecture can aggregate block-level functions, whereas plug-ins can aggregate and combine fine-grained functions.

Communication and interaction between micro applications and plug-ins are very common in rich interactive Studio products. We implement standardization by using the following unified Studio base.

Unified Studio Base

The unified Studio base extracts a basic capability for IDE product forms, and encapsulates some business-layer capabilities of Studios in the upper layer. Therefore, all the upper-layer services of Studios are implanted in the Studio base in the form of plug-ins. This, in conjunction with the microservices and plug-in console introduced in the following content, has enabled us to customize Studio features through configuration and even quickly build a new Studio.

The unified Studio base can be used for unified maintenance and upgrade. It can also remove the complex engineering structure of different Studio bases from application engineering, which reduces the complexity of Studio application engineering. This is also a unified standard. The unified Studio base standardizes the interaction and communication between upper-layer plug-ins and modules. This lays a foundation for using a unified plug-in architecture to achieve communication among plug-ins across different Studios in the future.

Unified Plug-in Architecture for Studios

In the underlying layer, the micro frontend and plug-in architecture of DataWorks uses qiankun, which in turn depends on single-spa. For the requirements that qiankun cannot meet in Studio scenarios, the DataWorks team has made substantial improvements and upgrades on top of qiankun and single-spa.

Note: The micro applications and plug-ins of DataWorks are developed based on the micro frontend and plug-in architecture. They are frontend modules that can be developed, deployed, and run separately, which we also call plug-ins. Some plug-in operations depend on Studio bases. Plug-ins of this type do not match the characteristics of micro frontends.

1. Multi-instance Mode

DataWorks needs a comprehensive plug-in architecture. It needs routing-level plug-ins such as Operation Center, slot mechanism-based plug-ins such as DataStudio, and page-level plug-ins such as Data Map. The overall architecture is based on single-spa and qiankun at the underlying layer, and it solves the problem that qiankun cannot render a plug-in multiple times on the same page.
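One way to picture a multi-instance mode is a registry that creates a fresh plug-in instance for each mount container, so that several copies can coexist on one page with separate state. This sketch does not reflect the internals of qiankun or of the DataWorks implementation; `MultiInstanceRegistry` and `createCounterPlugin` are hypothetical.

```typescript
// Hypothetical plug-in instance: each mount gets its own state.
interface PluginInstance {
  containerId: string;
  state: { clicks: number };
  render(): string;
}

// A factory instead of a singleton, so every mount produces an independent instance.
function createCounterPlugin(containerId: string): PluginInstance {
  const state = { clicks: 0 };
  return {
    containerId,
    state,
    render: () => `[${containerId}] clicks=${state.clicks}`,
  };
}

class MultiInstanceRegistry {
  private instances: { [containerId: string]: PluginInstance } = {};

  // Instances are keyed by container, not by plug-in name, so one plug-in can be mounted many times.
  mount(containerId: string, factory: (id: string) => PluginInstance): PluginInstance {
    if (this.instances[containerId]) throw new Error(`container ${containerId} is already in use`);
    const instance = factory(containerId);
    this.instances[containerId] = instance;
    return instance;
  }

  renderAll(): string[] {
    return Object.keys(this.instances).sort().map((id) => this.instances[id].render());
  }
}

const registry = new MultiInstanceRegistry();
const leftPane = registry.mount("pane-left", createCounterPlugin);
const rightPane = registry.mount("pane-right", createCounterPlugin);
leftPane.state.clicks += 2;
console.log(registry.renderAll());
```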

2. Slot Mechanism

A plug-in focuses only on its own interaction form and business logic. The application determines which business system references this plug-in. Therefore, in the design process, we added a slot mechanism to ensure that an application can carry sub-plug-ins by developing a slot plug-in. Meanwhile, the slot plug-in can determine the rendering logic of its sub-plug-ins. A complete page is built by a series of slot plug-ins and sub-plug-ins.
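The slot mechanism can be sketched as a plug-in that carries sub-plug-ins and owns the logic for laying them out. The `SlotPlugin` and `SubPlugin` types below are hypothetical, and rendering returns strings so that the sketch runs without a DOM.

```typescript
// Hypothetical sub-plug-in: it knows only its own rendering, not where it is used.
interface SubPlugin {
  id: string;
  render(): string;
}

// A slot plug-in carries sub-plug-ins and decides how they are rendered.
class SlotPlugin {
  private children: SubPlugin[] = [];

  constructor(
    public readonly slotName: string,
    private readonly layout: (rendered: string[]) => string
  ) {}

  fill(child: SubPlugin): void {
    this.children.push(child);
  }

  render(): string {
    return this.layout(this.children.map((child) => child.render()));
  }
}

// Two slots with different rendering logic for the sub-plug-ins they carry.
const toolbarSlot = new SlotPlugin("toolbar", (parts) => `<toolbar>${parts.join(" | ")}</toolbar>`);
const footerSlot = new SlotPlugin("footer", (parts) => (parts.length > 0 ? `<footer>${parts[0]}</footer>` : ""));

const runButton: SubPlugin = { id: "run", render: () => "Run" };
const saveButton: SubPlugin = { id: "save", render: () => "Save" };
toolbarSlot.fill(runButton);
toolbarSlot.fill(saveButton);
footerSlot.fill(saveButton);

// A page is the concatenation of what its slot plug-ins render.
console.log(toolbarSlot.render() + footerSlot.render());
```

The same sub-plug-in appears in both slots without any change, because placement and layout are decided by the slot rather than by the sub-plug-in.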

Qiankun also provides a style isolation solution, but clears styles when plug-ins are mounted. This solution eliminates style conflicts when only one plug-in can be rendered. However, for Studio-oriented plug-in solutions, it is necessary to resolve the style conflicts that arise when multiple plug-ins are rendered at the same time.

3. Style Isolation and Version Isolation

The style isolation solution of qiankun allows only one plug-in to be rendered at a specific time. When the plug-in is mounted, extra styles are cleared. For Studio-oriented plug-in solutions, it is necessary to resolve the style conflicts that arise when multiple plug-ins are rendered at the same time. In addition, two plug-ins may reference different versions of the same library. The conflict between the global style definitions of the two library versions must also be resolved.

Similarly, conflicts also arise when plug-ins load different versions of the same JavaScript (JS) library. For JS libraries that depend on the global window object, we need to resolve these conflicts as well.
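One common way to approach both problems is to scope everything by plug-in and library version. The sketch below is an assumption on our part and is neither qiankun's mechanism nor the Widget solution: scopeName, scopeCss, and LibraryRegistry are hypothetical helpers, and the CSS rewriting is deliberately naive (flat rules only, no at-rules such as @media).

```typescript
// Hypothetical helpers for illustration only.

// Derive a scope class that is unique per plug-in and per library version.
function scopeName(plugin: string, libVersion: string): string {
  return `${plugin}--v${libVersion.replace(/\./g, "-")}`;
}

// Naive scoping: prefix every selector of every flat CSS rule with the
// scope class. Assumes no at-rules and no braces inside strings.
function scopeCss(css: string, scope: string): string {
  return css.replace(
    /([^{}]+)\{([^{}]*)\}/g,
    (_match, selectors: string, body: string) => {
      const scoped = selectors
        .split(",")
        .map((s) => `.${scope} ${s.trim()}`)
        .join(", ");
      return `${scoped} { ${body.trim()} }`;
    },
  );
}

// Version isolation for JS: plug-ins resolve libraries from a registry
// keyed by name@version instead of reading one shared window global.
class LibraryRegistry {
  private libs = new Map<string, unknown>();

  register(name: string, version: string, lib: unknown): void {
    this.libs.set(`${name}@${version}`, lib);
  }

  resolve(name: string, version: string): unknown {
    const key = `${name}@${version}`;
    if (!this.libs.has(key)) throw new Error(`Library ${key} is not registered`);
    return this.libs.get(key);
  }
}

// Two versions of the same (made-up) UI library define .btn differently.
const cssV1 = ".btn { padding: 4px; }";
const cssV2 = ".btn { padding: 8px; }";
const scopedA = scopeCss(cssV1, scopeName("plugin-a", "1.0.0"));
const scopedB = scopeCss(cssV2, scopeName("plugin-b", "2.0.0"));

const libs = new LibraryRegistry();
libs.register("ui-lib", "1.0.0", { version: "1.0.0" });
libs.register("ui-lib", "2.0.0", { version: "2.0.0" });

console.log(scopedA);
console.log(scopedB);
```

In this sketch, each plug-in's root element would carry its scope class, so the two .btn definitions no longer collide even when both plug-ins are on screen at the same time.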

Note: We would like to thank the Alibaba Cloud Management Console team for their ideas on the Widget micro frontend solution. The style isolation and version isolation solutions also used the Widget solution for reference.

4. Plug-in Nesting

In the plug-in architecture of Studios, a plug-in can be as large as an entire block or as small as a single button, and plug-ins can be organized in a tree structure. The following figure shows a whole plug-in, in which each button in the head toolbar area is a sub-plug-in:

The unified plug-in architecture of DataWorks supports rendering nested plug-ins and, together with the microservices and plug-in console, implements plug-in dependency orchestration through configuration. With the plug-in orchestration feature in the microservices console, you can assemble a complete page for an application through configuration.

In addition, the microservices console provides a more robust and easy-to-use visual orchestration capability. With this capability, you can build plug-in dependencies for an application through visual interaction, and then render a complete page at runtime.
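Configuration-driven orchestration of nested plug-ins can be sketched as a tree of plug-in names that is resolved against a registry at runtime. The configuration shape (PluginNode), the registry, and the plug-in names below are hypothetical and only illustrate the idea; they are not the format used by the DataWorks console.

```typescript
// Hypothetical configuration shape for illustration only.
interface PluginNode {
  plugin: string; // name of a registered plug-in
  children?: PluginNode[]; // nested sub-plug-ins
}

// A renderer receives the already rendered output of its children.
type Renderer = (children: string[]) => string;

// Walk the configuration tree depth-first and render bottom-up.
function renderTree(node: PluginNode, registry: Map<string, Renderer>): string {
  const renderer = registry.get(node.plugin);
  if (!renderer) throw new Error(`Plug-in "${node.plugin}" is not registered`);
  const children = (node.children ?? []).map((c) => renderTree(c, registry));
  return renderer(children);
}

const registry = new Map<string, Renderer>([
  ["page", (kids) => `page{${kids.join(",")}}`],
  ["toolbar", (kids) => `toolbar{${kids.join(",")}}`],
  ["run-button", () => "run"],
  ["save-button", () => "save"],
  ["editor", () => "editor"],
]);

// In practice this tree would come from the console as configuration data.
const pageConfig: PluginNode = {
  plugin: "page",
  children: [
    {
      plugin: "toolbar",
      children: [{ plugin: "run-button" }, { plugin: "save-button" }],
    },
    { plugin: "editor" },
  ],
};

const output = renderTree(pageConfig, registry);
console.log(output);
```

Because the page is described as data, changing which buttons appear in the toolbar is a configuration change rather than a code change.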

Microservices and Plug-in Console

Integrating frontend and backend services into DataWorks as plug-ins is an important reason why we insist on developing the DataWorks Microservice Platform (DMSP). DMSP associates the release of frontend components with backend microservices and, by using Swagger, enables frontend and backend services to be quickly deployed as business plug-ins. The frontend and backend members of the team can implement DevOps on DMSP and continuously deliver new functions to customers.

Notably, DMSP is applicable to three environments: the internal environment of Alibaba, Alibaba Cloud, and Apsara Stack. After the development of plug-ins is complete, we can use DMSP to continuously deliver them to 20 regions in Alibaba Cloud, and also package and deploy microservices in Apsara Stack in a unified manner. DMSP also shields plug-in developers from complex external deployment environments.

In the future, we hope to design most page content in DataWorks based on micro frontends and plug-ins, so that we can solve the aforementioned challenge: "light, flexible, and convenient to break up and combine anytime and anywhere." The architecture does not just drive the development pattern, but will also influence the entire potential market.

Building an Ecosystem

With the combination of the backend microservices architecture and the micro frontend and plug-in architecture, the aforementioned customization for vertical businesses also becomes possible. Industry-oriented delivery teams can leverage the micro frontend and plug-in capabilities provided by the DataWorks platform to customize an intelligent R&D platform that is fully adapted to industry-specific characteristics. We can then build a vibrant innovation ecosystem on the DataWorks platform to provide customers with more diverse choices. The microservices architecture will drive the ecosystem to evolve in a more competitive direction.

By using this architecture, the DataWorks team hopes to achieve mutually beneficial cooperation inside and outside Alibaba and create synergy among Alibaba Cloud Intelligence products.

Redefining R&D

The microservices and micro frontend architectures are independent of development languages and also bridge some technical gaps. For example, if the backend team does not use React, it can use pure JS for plug-in development. The frontend team is familiar with Node.js, so it can apply its knowledge and skills to microservices design. In this development pattern, frontend and backend engineers expand their reach into more technical fields. This is a more efficient development pattern for the business, and it helps frontend and backend engineers interact and communicate more smoothly on technical matters.

DataWorks also provides App Studio, a product dedicated to WebIDE-based development. Product functions are split at a fine granularity into microservices on the backend and plug-ins on the frontend. The WebIDE is suitable for developing small projects, including backend microservices, frontend plug-ins, and functions in FaaS, all of which can be developed online. Meanwhile, App Studio will integrate more closely with the microservices and plug-in console and streamline the R&D process, so that developers can focus on business logic and accelerate product iteration.

In response to the trends of machine learning and artificial intelligence, we have created an intelligent programming product for frontend coding: the Sophon intelligent hint plug-in. We hope that this IDE plug-in will help speed up frontend development.

Future Prospects

The combination of the microservices architecture and the micro frontend and plug-in architecture has been verified for some time in DataWorks Studio, a scenario with rich interactions. In addition to applying this combination in our own business, we expect more business parties to join us in customizing the big data development functions of DataWorks in an open manner by using this technical solution. We will also continue to open up some core capabilities of DataWorks through microservices.

For example, we recently connected AnalyticDB (ADB) to the DataWorks system by using this set of architectures. The backend accesses core APIs of DataWorks and business logic through microservices. The frontend uses the micro frontend and plug-in scheme to customize the functions of the ADB engine, such as data collection, customization of data analytics nodes, and Data Map presentation.