After you implement the business logic and then publish and start a Realtime Compute job, optimize the job so that it meets your performance requirements.

Purposes

  • Jobs can start and run properly.
  • Jobs achieve latency and throughput that meet business performance requirements.
  • Resources are used efficiently, which reduces costs.

Procedure

The following figure shows the sequence and steps of job optimization.
  1. Optimize the SQL code

    SQL optimization means selecting an appropriate SQL implementation based on your business requirements. For example, you can optimize GROUP BY aggregations, resolve data hotspot issues, optimize the TopN algorithm, use efficient built-in functions, improve deduplication performance, or avoid using regular expressions. For more information, see Skills for optimizing the Flink SQL code.

  2. Optimize parameter settings
    • Optimize performance based on job parameter settings

      Select underlying optimization policies by setting job parameters. For example, you can enable miniBatch, which buffers input records and processes them in batches to reduce state data access. For more information, see Optimize performance based on job parameter settings.

    • Optimize upstream and downstream data storage based on parameter settings

      Optimize the read and write operations performed on the upstream and downstream storage systems. For example, you can read or write data in batches to improve throughput. You can also set cache policies to improve the efficiency of joining dimension tables. For more information, see Optimize upstream and downstream data storage based on parameter settings.

  3. Optimize resource configuration automatically

    To simplify job optimization, Realtime Compute provides the automatic configuration optimization feature, and we recommend that you use it. For more information about how to perform automatic configuration optimization in Blink V3.x, see Optimize performance by AutoScale.

  4. Optimize resource configuration manually or repeat the optimization process
    • Optimize resource configuration manually

      If automatic configuration optimization does not meet your requirements, you can configure resources manually. For more information, see Optimize performance by manual configuration.

    • Repeat the optimization process

      If the optimization result cannot meet your business requirements, repeat the optimization process.
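
To illustrate why an option such as miniBatch (step 2) can reduce state data access, the following is a minimal Python sketch. It is a conceptual simulation only, not Realtime Compute or Blink code: the CountingState class, the two counting functions, and the batch size of 4 are hypothetical names and values chosen for illustration. It compares a per-record count aggregation with one that buffers records, pre-aggregates them in memory, and then touches state once per distinct key in each batch.

```python
from collections import Counter


class CountingState:
    """Toy keyed state that counts every read and write it serves (hypothetical)."""

    def __init__(self):
        self.values = {}
        self.accesses = 0

    def get(self, key):
        self.accesses += 1
        return self.values.get(key, 0)

    def put(self, key, value):
        self.accesses += 1
        self.values[key] = value


def count_per_record(keys, state):
    """Baseline: one state read and one state write for every input record."""
    for key in keys:
        state.put(key, state.get(key) + 1)


def count_mini_batch(keys, state, batch_size):
    """Buffer up to batch_size records, pre-aggregate them in memory,
    then read and write state once per distinct key in the batch."""
    for start in range(0, len(keys), batch_size):
        partial = Counter(keys[start:start + batch_size])
        for key, delta in partial.items():
            state.put(key, state.get(key) + delta)


keys = ["a", "b", "a", "a", "c", "b", "a", "a"]

per_record = CountingState()
count_per_record(keys, per_record)

batched = CountingState()
count_mini_batch(keys, batched, batch_size=4)

print(per_record.values == batched.values)    # True: both produce the same counts
print(per_record.accesses, batched.accesses)  # 16 10
```

In this toy run, both approaches produce identical counts, but the batched version performs 10 state accesses instead of 16. The saving grows as more records in a batch share the same key, at the cost of holding records for up to one batch before results are emitted. This is the latency and throughput trade-off to weigh against your business requirements when you tune such parameters.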