This topic describes a best practice for using Requirements Management in an enterprise data team.

Background information

Note: Requirements Management is available in the following regions: China (Shanghai), China (Hangzhou), China (Beijing), and China (Shenzhen).
To boost business efficiency with data, an Internet startup sets up a data team of four people in the following roles:
  • Data product manager (Jack_PD): plans and implements data products, and has a deep understanding of the business logic of the enterprise.
  • Data developers (Alice_DEV and Rose_DEV): jointly design models, develop code, and test code.
  • Data development director (Mike_DEV_TL): is responsible for the stability and reliability of the production environment.

This enterprise has improved efficiency by using DataWorks. However, the data team needs an effective tool for daily work planning to cope with rapid business growth and various changes in business demands.

The following figure shows the architecture of the enterprise and the staff who use DataWorks.
[Figure: Staffing]
  • As a salesperson, Sales_01 does not need to be added to a DataWorks workspace. Sales_01 can go directly to the Requirements Management page and create requirements.
  • Jack_PD is the data product manager who supervises the entire process from requirement creation to implementation. Jack_PD does not design or implement specific code. Therefore, Jack_PD can be added to a DataWorks workspace and assigned a role as needed.
  • Mike_DEV_TL is assigned the workspace administrator role to configure resources, review code, and deploy code in a workspace. Mike_DEV_TL is also responsible for the operations and maintenance (O&M), deployment, and security management of the workspace.
  • Alice_DEV and Rose_DEV are assigned the developer role. They are responsible only for developing code and creating deploy tasks.
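
The staffing above can be sketched as a small permission table. This is only an illustrative model, not a DataWorks API: the action names, the `can` helper, and the assumption that anyone may create a requirement are hypothetical simplifications of the list above.

```python
# Hypothetical model of the staffing described above; not a DataWorks API.
WORKSPACE_ROLES = {
    "workspace administrator": {"configure resources", "review code", "deploy code"},
    "developer": {"develop code", "create deploy task"},
}

# Sales_01 is intentionally absent: creating requirements needs no workspace role.
# Jack_PD is omitted because the role is assigned "as needed".
MEMBER_ROLE = {
    "Mike_DEV_TL": "workspace administrator",
    "Alice_DEV": "developer",
    "Rose_DEV": "developer",
}


def can(member: str, action: str) -> bool:
    """Return True if the member is allowed to perform the action."""
    if action == "create requirement":
        return True  # assumed open to non-members such as Sales_01
    role = MEMBER_ROLE.get(member)
    return role is not None and action in WORKSPACE_ROLES[role]


print(can("Rose_DEV", "create deploy task"))  # developer action
print(can("Rose_DEV", "deploy code"))         # reserved for the administrator
```

Keeping deployment out of the developer role mirrors the separation of duties in the list: developers prepare deploy tasks, and the workspace administrator reviews and deploys them.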

Objectives

  • Improved project efficiency and quality

    The responsibilities of staff and the input and output of each phase are clarified to ensure the integrity of key information and reduce invalid or repeated communication. This ensures steady project progress and improves overall efficiency.

  • Optimized daily work plan

    Each business requirement is strictly reviewed and scheduled, and development tasks are properly assigned with clear deadlines.

  • Well-established internal communication system

    An internal communication system is established within the enterprise based on the standard specifications for data development. This avoids internal conflicts caused by poor communication.

Development process

  1. Review: Evaluate the compliance and feasibility of the technology and data that are required to implement the requirement.
  2. Design: Design data models, code, and dependencies based on the data form, including the data quality and distribution.
  3. Develop: Develop code in a standardized and efficient way based on the output of the Design phase.
  4. Test: Accurately locate bugs and risks in the code to improve the quality of output data.
  5. Publish and O&M: Deploy the program that meets the publishing conditions online and ensure stable running.

Review

  1. Submit a requirement.
    Assume that Sales_01 from the sales department has a data requirement. Sales_01 logs on to the DataWorks console as a RAM user, goes to the Requirements Management page, and enters relevant information.
    1. Sales_01 logs on to the DataWorks console, finds the required workspace, and then clicks DataStudio.
    2. Sales_01 clicks the More icon in the upper-left corner and chooses All Products > Requirements Management.
    3. Sales_01 clicks Create Request.
    4. Sales_01 enters the name and content of the requirement. In the Basic Information section on the right, Sales_01 sets the Assign To parameter to Jack_PD.
    5. Sales_01 clicks Save.
  2. Review the requirement.
    1. Jack_PD brings together the relevant staff to evaluate the necessity, feasibility, risks, and implementation details of the requirement based on the development specifications for the requirement phase. Meanwhile, Jack_PD sets the status of the requirement to Reviewing.
    2. If the requirement passes the review, Jack_PD goes to the Requirements Management page and sets the status of the requirement to To Be Designed.
      If the requirement fails the review, Jack_PD goes to the Requirements Management page and sets the status of the requirement to Rejected.
    3. Jack_PD sets the owner for each phase based on the responsibilities of the staff.
      Note
      • If the team has sufficient staff, we recommend that you do not assign a tester to serve as a developer or designer at the same time.
      • The code must be reviewed in the Publish phase to ensure that the code is stable. You must assign an experienced person other than the developer and designer to review the code. Sufficient smoke tests must be performed before the code is deployed.
      • The person who creates and submits the requirement must be specified as the owner of the Acceptance Check phase.
    4. If the expected publishing date is unrealistic, Jack_PD changes the publishing date to a date that both parties agree on.
    5. Jack_PD uploads the reviewed product document to the Review phase as a reference for owners of the subsequent phases.
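
The owner-assignment notes in step 3 can be expressed as a simple check. The sketch below is hypothetical: `check_owners`, the dictionary keys, and the `code_reviewer` entry are illustrative names, not part of Requirements Management. It treats the tester recommendation as a warning, because it applies only when the team has sufficient staff, and the other two notes as errors.

```python
from typing import Dict, List, Tuple


def check_owners(creator: str, owners: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for a phase-to-owner assignment."""
    errors: List[str] = []
    warnings: List[str] = []
    builders = {owners["Design"], owners["Develop"]}

    # Recommendation only: a tester should not also be a designer or developer.
    if owners["Test"] in builders:
        warnings.append(f"{owners['Test']} tests work that they designed or developed")
    # Requirement: the code reviewer must be someone other than the builders.
    if owners["code_reviewer"] in builders:
        errors.append(f"{owners['code_reviewer']} cannot review their own code")
    # Requirement: the requirement creator owns the Acceptance Check phase.
    if owners["Acceptance Check"] != creator:
        errors.append("Acceptance Check owner must be the requirement creator")
    return errors, warnings


# Assignment used in this topic's scenario.
owners = {
    "Design": "Alice_DEV",
    "Develop": "Rose_DEV",
    "Test": "Alice_DEV",
    "code_reviewer": "Mike_DEV_TL",
    "Acceptance Check": "Sales_01",
}
errors, warnings = check_owners("Sales_01", owners)
print(errors)
print(warnings)
```

For the four-person team in this topic, the check reports no errors and a single warning, because Alice_DEV both designs and tests. That is acceptable for a small team, but it is the first thing to change when more staff become available.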

Design

  1. Using the document that is generated in the Review phase, Alice_DEV explores the data and produces the designs in accordance with the development specifications for the Design phase. Meanwhile, Alice_DEV changes the status of the requirement to Designing to advance the progress.
  2. After the designs are completed, Alice_DEV uploads the data exploration report, extract-transform-load (ETL) document, and scheduling design document to the Design phase. Then, Alice_DEV changes the status of the requirement to To Be Developed.

Develop

  1. Develop the code.
    1. Using the documents that are generated in the Design phase, Rose_DEV develops nodes in DataWorks in accordance with the specifications for code development. Meanwhile, Rose_DEV changes the status of the requirement to Developing to advance the progress.
    2. Rose_DEV clicks Associated Nodes.
    3. In the Select Associated Nodes dialog box, Rose_DEV selects required nodes from DataStudio or experiments from Machine Learning Platform for AI (PAI) and clicks OK.

      Rose_DEV returns to the Requirements Management page to verify that all the required nodes are associated with the requirement. Requirements Management automatically calculates the overall deployment progress based on the status of the nodes and displays the progress as a percentage. The percentage reaches 100% when all of the nodes are deployed.

  2. Test the code.

    After Rose_DEV develops the nodes, Rose_DEV tests the nodes, and then prepares and uploads the unit test report, publish operation document, and code review report. Meanwhile, Rose_DEV changes the status of the requirement to To Be Tested.
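
The deployment progress shown for associated nodes can be approximated as follows. This topic does not state the exact formula, so this sketch assumes the percentage is the share of associated nodes that are deployed, rounded down. The function name and the node names are hypothetical.

```python
from typing import Mapping


def deployment_progress(node_deployed: Mapping[str, bool]) -> int:
    """Percentage of associated nodes that are deployed, rounded down.

    Assumption: progress is deployed nodes divided by all associated nodes.
    """
    if not node_deployed:
        return 0  # no associated nodes yet
    deployed = sum(1 for done in node_deployed.values() if done)
    return deployed * 100 // len(node_deployed)


# Hypothetical node names associated with one requirement.
nodes = {"ods_orders": True, "dwd_orders": True, "ads_sales_report": False}
print(deployment_progress(nodes))
```

Under this assumption, the value reaches 100 only when every associated node is deployed, which matches the behavior described above.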

Test

  1. Following the development specifications for the Test phase, Alice_DEV uses test cases to perform delivery and data tests against the documents that are generated in the Develop phase. Meanwhile, Alice_DEV changes the status of the requirement to Testing to advance the progress.

    DataWorks workspaces in standard mode isolate the development environment from the production environment. You can develop code and perform smoke tests in the development environment before you deploy the code to the production environment.

    1. Alice_DEV clicks the More icon in the upper-left corner and chooses All Products > DataStudio.
    2. Alice_DEV double-clicks the required workflow. On the configuration tab of the workflow, Alice_DEV clicks the Submit icon to submit the nodes for which code has been developed to the scheduling system in the development environment.
    3. Alice_DEV clicks Run Smoke Test in Development Environment for each node to simulate code running in the production environment. You can also click View Log to check whether the scheduled time to run the node and the running result of the node are as expected.
  2. After the tests are completed, Alice_DEV prepares and uploads the delivery test report, quality assessment report, and acceptance check report. Meanwhile, Alice_DEV changes the status of the requirement to To Be Published.

Publish

  1. Based on the development specifications for the Publish phase and the documents that are generated in the Test phase, Alice_DEV submits an application for deploying the code. Alice_DEV can deploy the code after Mike_DEV_TL verifies that the code is compliant, standardized, and appropriate.

    To deploy the code, perform the following steps:

    1. Submit a deploy application.

      Alice_DEV clicks Deploy in the upper-right corner to create a deploy task for all the nodes that have been successfully run in the Test phase and submits a deploy application. After the deploy task is created, Alice_DEV needs to wait for the workspace administrator to review the deploy task.

    2. Review and deploy the nodes.

      The workspace administrator goes to the View Deploy Tasks page and checks the difference between the to-be-deployed code and the deployed code. If the code is correct, the workspace administrator clicks Deploy to deploy the code to the production environment for time-based scheduling.

      Click the More icon in the upper-left corner and choose All Products > Operation Center. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Task to view all the nodes that have been deployed to the production environment for time-based scheduling.

      You can also click Cycle Instance to view the instances that are generated for the scheduled nodes every day and view the operational logs of each instance.

    3. Configure rules to monitor the data quality.

      After the nodes are deployed, you can configure rules to monitor the data quality for the deployed nodes. This ensures the reliability of output data.

  2. After the code is deployed, Alice_DEV changes the status of the requirement to To Be Checked.

    You can click Upload to upload the description document that is generated in the Publish phase to the server for archiving.

Acceptance Check

The salesperson Sales_01 and the data product manager Jack_PD check whether the tables or APIs that are provided by the developers meet the initial business expectations. If the tables or APIs meet the expectations, Sales_01 changes the status of the requirement to Acceptance Checked.

You can click Upload to upload the description document that is generated in the Acceptance Check phase to the server for archiving.

At this point, a data requirement is implemented based on a standardized process. As the requirement manager of the data team, Jack_PD can use the advanced search and view features of Requirements Management to supervise the work of each team member. Jack_PD can also set the priorities of requirements to manage the work of the team.
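
The status changes described in this topic can be summarized as a small state machine. This is a sketch of the sequence documented here only: the `TRANSITIONS` table and the `advance` helper are hypothetical, and Requirements Management itself may permit other transitions that this topic does not cover.

```python
# Status names are taken from this topic; the transition table is an assumption
# that only the documented sequence is allowed.
TRANSITIONS = {
    "Reviewing": {"To Be Designed", "Rejected"},
    "To Be Designed": {"Designing"},
    "Designing": {"To Be Developed"},
    "To Be Developed": {"Developing"},
    "Developing": {"To Be Tested"},
    "To Be Tested": {"Testing"},
    "Testing": {"To Be Published"},
    "To Be Published": {"To Be Checked"},
    "To Be Checked": {"Acceptance Checked"},
    "Rejected": set(),
    "Acceptance Checked": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new status, or raise if the step skips the documented flow."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {target!r}")
    return target


# Walk a requirement through every phase in order.
status = "Reviewing"
for nxt in [
    "To Be Designed", "Designing", "To Be Developed", "Developing",
    "To Be Tested", "Testing", "To Be Published", "To Be Checked",
    "Acceptance Checked",
]:
    status = advance(status, nxt)
print(status)
```

A table like this makes skipped phases visible: for example, moving a requirement from Designing straight to To Be Published raises an error, because the Develop and Test phases have not produced their documents.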