Database coding (Database-as-Code) practice-Alibaba Cloud Developer Community

recently, we encountered a difficult problem in Apsara stack output. Customers need to upgrade the version we released two years ago to the latest version. Due to the long span, the product code and database schema have changed greatly. The product code uses a version management policy and has a clear upgrade path. However, the database does not use a code solution, resulting in a missing upgrade path, which makes the entire upgrade process very difficult. In order to make the future version upgrade go smoothly, we need to develop a unified database coding solution.

What we needed to do was to change our mindset of how we treated our database. We had to stop treating it like some special artifact or some unique scenario, and we started looking at it through the same perspective that we were treating our web code.

State based and Migrations based are two common methods to implement database coding.

State based

in the state based mode, we only need to maintain the target state of the database. Each table, stored procedure, view, and trigger is saved as a separate SQL file, which is the true representation of the state of the database object. The script required to upgrade the database is automatically generated by the tool, thus greatly reducing the maintenance cost.

However, this mode does not handle data migration scenarios well. For example, the name column of the user table is split into the first name and last name fields. This is because the data in the data table is often context-related, which means that the tool cannot reliably assume the data to generate upgrade scripts.

Migrations based

in migrations based mode, we need to maintain the database change script from one version to another. Compared with state based, this mode increases the cost and complexity of maintenance, but it allows us to control the migration process more directly so that we can handle context-related scenarios such as data migration. And since changes are described as imperative, we can review them earlier. Realized migration based representation of tool with Liquibase, Flyway, Etc.

this section describes Flyway, one of the representative tools in migrations based mode. The version used in this section is 6.0.8.

What is Flyway

Flyway is an open-source database migration tool, which can help us easily complete new database deployment and incremental upgrades. It has the following features:

  1. it can be embedded into an application or executed as a separate tool.
  2. Track the migration that has been performed.
  3. Perform a new migration.
  4. Verify the database status.

Flyway principle

Flyway works as follows:

  1. initially, create a database named flyway_schema_historythis table is used to record the execution of migration.
  2. Scan the migration scripts in the specified directory and sort them by version number from low to high.
  3. Apply the migration script to the database in sequence. The metadata table is updated accordingly.
  4. During the next incremental database upgrade, flyway finds out the new migration scripts based on the execution records in the metadata table and runs them in sequence.

because we not only need to handle schema changes, but also face some data migration scenarios, we finally chose the migrations based mode and used Flyway to help us implement database coding.

Organization structure of migration scripts

the following directory structure is designed to manage SQL migration scripts in both the public cloud and Apsara stack, and the upgrade scenario of the old version of Apsara stack.

|--{db1}
     |--flyway.conf
     |--base_sql
          |--V0.000__a.sql
          |--V0.001__b.sql
          |...
          `--V0.025__z.sql
     |--upgrade_legacy_private_cloud_sql
          |--V2.000__create_TB_t1.sql
          |--V2.001__alter_TB_a.sql
          `--V2.002__TB_b_update.sql
     |--upgrade_sql
          |--V2019.11.11.000__alter_TB_b_add_column.sql
          |--V2019.11.11.001__TB_c_insert.properties
          |--V2019.11.11.001__TB_c_insert.sql
          `--V2019.11.13.000__mix.sql
 |--{db2}
      |--flyway.conf
      ...
 |--common
      `--procedure.sql

1. Each database corresponds to an independent directory that contains the migration script and configuration information of the database. 2. The flyway.conf file in the database directory contains the connection, authentication, baseline, and other information of the database. 3. The subdirectory base_ SQL is used to store existing schemas. The version number is in the format 0.xxx. The content in this directory cannot be changed after it is determined. 4. The upgrade_legacy_private_cloud_ SQL subdirectory is used to store the migration script from the old version to the new version of Apsara stack. The format of the version is 2.xxx. 5. The upgrade_ SQL subdirectory stores the subsequent migration scripts of the public cloud and Apsara stack. Due to the fact that no version has been compiled for the public cloud, and the numbering methods for different versions of Apsara stack are different, the date is used as the version prefix of the migration script in the format yyyy.mm.dd.index. 6. DML in different environments may be different. In this scenario, you need to use a placeholder to represent the difference in the migration script. The actual value of the placeholder is rendered by a file with the same name as the properties File. For example V{yyyy.mm.dd.index}__xxx.properties. The properties file with the same name is stored in the baseline. Different values can be set for different environments. Files in the corresponding environment are copied to the upgrade_ SQL directory during runtime. 7. File common/procedure.sqlstored procedures that include index, column, and key changes. These stored procedures implement the idempotence of changes.

Migration script granularity

In general, we recommend that a script contain only one type of operations for one object. For example, Vxxx_TB_car_add_column.sqlrepresentative datasheet caradd columns, Vyyy_TB_car_insert.sqlindicates a data table. carInsert data. This mode conforms to the design idea of a single responsibility and can greatly reduce the number and complexity of merge conflicts.

Execution process

based on the preceding SQL migration script management mode, the following table describes the execution processes in different scenarios:

environmental scene execution process

Idempotence practices

ideally, each migration script runs only once on each database. However, if a migration fails, you may need to perform the successful migration step again to restore the database to the desired state. It is very helpful to write a migration script in an idempotent manner. Here we summarize the best practices for different DDL and DML idempotence implementations.

SQL Type Object Action SQL script naming convention Best Practices

in this example, the modification process is encapsulated into the following stored procedure.

DELIMITER $$
CREATE PROCEDURE `SAFE_CHANGE_COLUMN`(IN i_table_name VARCHAR(128),IN i_col_name VARCHAR(128), IN i_col_def VARCHAR(256))
BEGIN
    SET @tableName = i_table_name;
    SET @colName = i_col_name;
    SET @colDef = i_col_def;
    SET @colExists = 0;
    SELECT 1 INTO @colExists FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @tableName AND COLUMN_NAME = @colName LIMIT 1;
    IF @colExists THEN
        SET @query = CONCAT('ALTER TABLE ',@tableName,' CHANGE COLUMN ', @colName,' ',@colDef);
        PREPARE stmt FROM @query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END IF;
END$$

CICD pipeline

here, we store the database migration script and application code in the same code repository and share the same CICD process. This model is most in line with the ideas of collaboration, testing, fast feedback, continuous improvement and so on advocated by DevOps, and can realize faster, more frequent and more stable product delivery.

with the database coding solution, the maintenance and upgrade of the database becomes easy and easy.

  1. No matter how long the upgrade takes, it can be handled calmly, because there is a clear upgrade path between any version.
  2. Database schema changes and data migration behaviors become auditable.
  3. Database Migration and application code sharing CICD pipeline speed up product delivery and release.
  • Database coding (Database-as-Code) Practice (2)
Selected, One-Stop Store for Enterprise Applications
Support various scenarios to meet companies' needs at different stages of development

Start Building Today with a Free Trial to 50+ Products

Learn and experience the power of Alibaba Cloud.

Sign Up Now