Data Management (DMS) provides the Test Data Generation feature that is supported by a powerful algorithm engine. You can use this feature to generate large amounts of information at a time, such as random values, region names, and virtual IP addresses. This way, you can prepare test data with ease. This topic describes the Test Data Generation feature and shows you how to generate test data.

Prerequisites

One or more of the following databases supported by DMS are used:
  • MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, MyBase for MySQL, PolarDB-X, AnalyticDB for MySQL, and MySQL databases that are not on Alibaba Cloud
  • SQL Server: ApsaraDB RDS for SQL Server, MyBase for SQL Server, and SQL Server databases that are not on Alibaba Cloud
  • PostgreSQL: ApsaraDB RDS for PostgreSQL, PolarDB for PostgreSQL, MyBase for PostgreSQL, AnalyticDB for PostgreSQL, and PostgreSQL databases that are not on Alibaba Cloud
  • MariaDB: ApsaraDB for MariaDB TX and MariaDB databases that are not on Alibaba Cloud
  • ApsaraDB for OceanBase in MySQL mode
  • PolarDB for Oracle

Background information

In general, test data is required for functional tests or performance tests. You may use the following methods to generate test data:

  • Write test data manually. This method is inefficient and unsuitable for scenarios that require a large amount of test data.
  • Maintain existing scripts. This method is costly because scripts must be modified for each test. In addition, the data generated by using this method is not discrete enough.
  • Export data from an online environment and write the data to an offline environment. This method is not secure and may cause data leaks.

None of the preceding methods meets the needs of actual development, in which test data is required frequently. Development requires high data security, controllable data discreteness, and enough efficiency to save time for more constructive work. To this end, DMS provides the Test Data Generation feature to help you generate test data with ease.

Usage notes

  • You can use this feature to generate test data for one table at a time. To generate test data for multiple tables, use this feature multiple times.
  • A maximum of one million rows of data can be generated at a time.
  • Traffic throttling is enabled during generation. This prevents database overload caused by the instantaneous generation of massive data. With throttling in effect, generation takes approximately the following time:
    • One million rows of data can be generated for four fields in about 60 seconds.
    • One million rows of data can be generated for 40 fields in about 120 to 180 seconds.
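
Because each use of the feature covers one table and at most one million rows, a larger data set has to be split across several tickets. The following Python sketch shows one way to plan that split. The names `plan_tickets` and `MAX_ROWS_PER_TICKET` are hypothetical and are not part of DMS; the sketch only does the arithmetic and does not submit anything.

```python
MAX_ROWS_PER_TICKET = 1_000_000  # limit stated in the usage notes


def plan_tickets(rows_per_table):
    """Split each table's target row count into per-ticket row counts.

    rows_per_table maps a table name to the total number of rows wanted.
    Each planned ticket covers one table and at most MAX_ROWS_PER_TICKET rows.
    """
    plan = []
    for table, total in rows_per_table.items():
        if total <= 0:
            raise ValueError(f"row count for {table!r} must be positive")
        remaining = total
        while remaining > 0:
            batch = min(remaining, MAX_ROWS_PER_TICKET)
            plan.append((table, batch))
            remaining -= batch
    return plan


# Hypothetical example: 2.5 million rows for one table, 10 rows for another.
tickets = plan_tickets({"orders": 2_500_000, "users": 10})
```

In this example, `tickets` contains three entries for `orders` and one for `users`, so four tickets would be needed.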

Procedure

  1. Go to the DMS console V5.0.
  2. In the top navigation bar, click Database Development. In the left-side navigation pane, choose Environment Construction > Test Data Generation.
  3. On the Test Data Generation Tickets page, click Test Data Generation in the upper-right corner.
    Note: Alternatively, you can go to the SQL Console tab of the required database, right-click the required table in the table list, and choose Data Plans > Test Data Generation.
  4. On the Test data build ticket application page, set the parameters in the Application step as required and click Submit. The following table describes the parameters.
    Parameter | Description
    Task Name | Required. The name of the task. A descriptive name helps you find the ticket in subsequent operations and helps approvers understand the purpose of the ticket.
    Database Name | Required. The name of the database in the target database instance. You must have permissions to manage the database in DMS. Enter the prefix of the database name in the field and select the database from the matched results.
    Table Name | Required. The table for which you want to generate test data. Enter a keyword in the field and select a table whose name contains the keyword from the matched results. You can specify only one table per ticket.
    Configure the algorithm | Required. The algorithms that you use to generate test data. For more information, see Algorithms.
    Number of rows generated | Required. The number of rows of test data that you want to generate.
    Conflict Handling | Required. Specifies how DMS handles conflicts. Valid values:
    • Skip when encountering data conflicts: If a primary key conflict or a unique index conflict occurs when test data is being generated, DMS skips the conflicting entry and continues to generate test data.
    • Replace when encountering data conflict: If a primary key conflict or a unique index conflict occurs when test data is being generated, DMS overwrites the conflicting entry and continues to generate test data.
    Change Stakeholder | Optional. The stakeholders involved in the ticket. Specify one or more stakeholders as needed. Only users who are relevant to the ticket, including those who participate in the approval process of the ticket, can view ticket details.
    After you submit the ticket, wait for the ticket to be approved. After it is approved, the system automatically generates test data and writes it into the required database.
    Note: By default, the tickets that are submitted to generate test data are approved by database administrators (DBAs). For more information, see Test Data Generate.
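
The difference between the two Conflict Handling options can be illustrated outside DMS. The following minimal Python sketch uses SQLite's `INSERT OR IGNORE` and `INSERT OR REPLACE` statements as stand-ins for the skip and replace behaviors. This is an assumption for illustration only: this topic does not describe how DMS implements conflict handling, and the table `t` and the function `load` are hypothetical.

```python
import sqlite3


def load(rows, on_conflict):
    """Insert rows into a fresh table that already holds id=1.

    on_conflict is "IGNORE" (skip the conflicting row) or
    "REPLACE" (overwrite the existing row).
    """
    if on_conflict not in ("IGNORE", "REPLACE"):
        raise ValueError("on_conflict must be 'IGNORE' or 'REPLACE'")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'existing')")
    conn.executemany(f"INSERT OR {on_conflict} INTO t VALUES (?, ?)", rows)
    result = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
    conn.close()
    return result


# Two generated rows; the first collides with the existing primary key 1.
generated = [(1, "generated-a"), (2, "generated-b")]
skipped = load(generated, "IGNORE")    # row 1 keeps 'existing'
replaced = load(generated, "REPLACE")  # row 1 becomes 'generated-a'
```

With the skip policy the existing row survives and only the non-conflicting row is added. With the replace policy the generated row overwrites the existing one.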

Algorithms

You can use one of the following algorithms to generate test data: Random, Customize, and Enumeration.

  • Random
    • INTEGER type: Two build types are provided. If you set the build type to Self-increasing sequence, you must set the Starting value and Step parameters. If you set the build type to Interval number, you must set the Minimum value and Maximum value parameters.
    • STRING type: Two build types are provided. If you set the build type to Variable length string, you must set the Minimum length, Maximum length, and Character range parameters. If you set the build type to Do not repeat string, you must set the Options parameter.
    • TIME type: Random dates and time values can be generated based on a specified time range.
  • Customize
    The Customize algorithm can generate test data only of the STRING type, such as personal information, geographic location information, and industry-related common information.
  • Enumeration
    You must define a finite set of values from which DMS selects.
    Note: The Enumeration algorithm can be used to generate test data of the INTEGER, STRING, and TIME types.
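
To make the Random and Enumeration build types concrete, the following Python sketch approximates what each one produces. It is not the DMS algorithm engine. All function names are hypothetical, the Character range parameter is assumed to be a set of allowed characters, and TIME values are assumed to have one-second granularity.

```python
import random
import string
from datetime import datetime, timedelta


def self_increasing(start, step, count):
    """INTEGER, Self-increasing sequence: start, start + step, ..."""
    return [start + i * step for i in range(count)]


def interval_number(rng, minimum, maximum, count):
    """INTEGER, Interval number: random integers in [minimum, maximum]."""
    return [rng.randint(minimum, maximum) for _ in range(count)]


def variable_length_string(rng, min_len, max_len, chars, count):
    """STRING, Variable length string: random length, characters from chars."""
    return [
        "".join(rng.choice(chars) for _ in range(rng.randint(min_len, max_len)))
        for _ in range(count)
    ]


def random_time(rng, start, end, count):
    """TIME: random datetimes between start and end, inclusive."""
    span = int((end - start).total_seconds())
    return [start + timedelta(seconds=rng.randint(0, span)) for _ in range(count)]


def enumeration(rng, values, count):
    """Enumeration: each value is picked from a user-defined finite set."""
    return [rng.choice(values) for _ in range(count)]


rng = random.Random(0)  # seeded so repeated runs give the same data
ids = self_increasing(start=100, step=5, count=4)
ages = interval_number(rng, 18, 65, count=4)
names = variable_length_string(rng, 3, 8, string.ascii_lowercase, count=4)
created = random_time(rng, datetime(2023, 1, 1), datetime(2023, 12, 31), count=4)
status = enumeration(rng, ["new", "paid", "shipped"], count=4)
```

Passing a seeded `random.Random` instance is a design choice of this sketch that keeps the generated data reproducible between runs.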