Realtime Compute for Apache Flink: Process and specifications of draft development and deployment

Last Updated: Oct 24, 2024

As data volumes grow explosively and business requirements become increasingly complex, enterprises have an ever more urgent need for real-time data processing capabilities. Apache Flink, a powerful stream processing framework, has become a de facto standard for real-time computing, and standardized development and O&M processes are essential for improving R&D efficiency, data processing efficiency, and system stability. Realtime Compute for Apache Flink provides an all-in-one development, O&M, and management platform that covers the entire project lifecycle, including draft development, data debugging, operation and monitoring, Autopilot, and intelligent diagnostics. This topic describes the R&D specifications of Realtime Compute for Apache Flink, including phase planning, roles, and responsibilities, and walks through the process of developing a real-time data lakehouse.

Phase planning

  1. Requirement phase: Product managers must understand business requirements, evaluate requirements for real-time data processing, and write requirement documents.

  2. Design phase: Data architects design an architecture for processing real-time data streams based on requirement documents, including data source connection, data conversion, data storage, and data query.

  3. Development phase: Developers use tools such as Flink to implement real-time data processing logic based on design documents and perform unit tests.

  4. Testing phase: Test engineers write test cases and perform functional tests, performance tests, and exception tests to ensure the accuracy and stability of data processing.

  5. Deployment phase: O&M engineers publish the developed real-time data processing drafts as deployments to the production environment.

  6. O&M phase: O&M engineers and developers jointly monitor the system status and optimize the performance based on monitoring results.

Roles and responsibilities

  • Product manager: collects and evaluates business requirements, writes requirement documents, and communicates with technical teams to ensure that the requirements can be implemented.

  • Data architect: designs the real-time data processing architecture, including data stream architecture design and solution selection.

  • Developer: writes Flink code or develops applications, implements data processing logic, performs unit tests, and participates in code reviews and deployment debugging.

  • Test engineer: writes and runs test cases to ensure the stability and performance of data processing.

  • O&M engineer: deploys, monitors, and maintains the real-time data processing system to ensure high availability and stability.

  • Security expert: implements data encryption, maintains access control mechanisms, and configures and manages network isolation measures to ensure that the real-time data processing architecture meets security and compliance requirements.

Process of developing a real-time data lakehouse

  1. Requirement analysis

    • Data product managers work with business teams to clarify the goals and requirements for real-time data processing.

    • Data product managers determine business requirements, including the data source, data type, processing logic, and output requirements.

  2. Architecture design

    • Data architects design the real-time data processing architecture, including data source connection, data conversion, data storage, and data query.

    • Data architects select appropriate data processing tools and storage solutions.

  3. Security specifications

    • Security experts participate in the architecture design to ensure that the architecture meets security standards and compliance requirements.

    • Security experts implement security measures such as sensitive data masking, access control, and permission isolation (a masking sketch follows this list).

  4. Draft development

    • Developers perform extract, transform, and load (ETL) operations to convert and process data based on the architecture design.

    • Developers use Flink to implement data processing logic and perform unit tests (a minimal ETL sketch follows this list).

  5. Code review

    • Developers perform code reviews to ensure the quality and security of code.

    • Developers use automated tools to perform static code analysis.

  6. Testing

    • Test engineers write test cases and perform functional tests, performance tests, and exception tests.

    • Test engineers perform tests to ensure the accuracy and stability of data processing.

  7. Deployment

    • O&M engineers deploy the system to the production environment.

    • O&M engineers perform security checks and verify configurations before deployment.

  8. Monitoring and O&M

    • O&M engineers and developers jointly monitor the system status.

    • O&M engineers optimize the performance and respond to faults based on monitoring results.

  9. Performance testing

    • Test engineers perform load tests and stress tests to ensure the performance of the system under high loads (a load-generation sketch follows this list).

    • Test engineers optimize system configurations and resource allocation.

  10. Backup and restoration

    • O&M engineers back up data on a regular basis and restore data as required.

    • O&M engineers verify the integrity and restorability of backup data.

  11. Audit and compliance

    • Security experts perform security audits and compliance checks on a regular basis.

    • Security experts ensure that all operations comply with laws, regulations, and company policies.
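
The following Flink SQL sketch illustrates one way to apply the sensitive data masking described in step 3. The table name, schema, and masking rule are hypothetical examples, not a prescribed standard; the datagen connector stands in for a real source, and actual masking rules depend on your compliance requirements.

    -- Hypothetical source table; replace the connector and schema with your own.
    CREATE TEMPORARY TABLE user_events (
      user_id BIGINT,
      phone_number STRING,
      event_time TIMESTAMP(3)
    ) WITH (
      'connector' = 'datagen'
    );

    -- Expose only a masked view of the sensitive column to downstream queries.
    CREATE TEMPORARY VIEW masked_user_events AS
    SELECT
      user_id,
      CONCAT(SUBSTRING(phone_number FROM 1 FOR 3), '****',
             SUBSTRING(phone_number FROM 8)) AS masked_phone,
      event_time
    FROM user_events;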
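
For the draft development in step 4, a minimal ETL sketch might look as follows. The Kafka topic, broker address, schemas, and filter condition are all assumptions for illustration; a production draft would read from your actual source and write to a real result table instead of the print sink used here for debugging.

    -- Hypothetical Kafka source; adjust the topic, servers, and schema to your environment.
    CREATE TEMPORARY TABLE orders (
      order_id STRING,
      amount DECIMAL(10, 2),
      order_time TIMESTAMP(3)
    ) WITH (
      'connector' = 'kafka',
      'topic' = 'orders',
      'properties.bootstrap.servers' = 'broker:9092',
      'format' = 'json',
      'scan.startup.mode' = 'earliest-offset'
    );

    -- Print sink for debugging; replace with a real result table in production.
    CREATE TEMPORARY TABLE cleansed_orders (
      order_id STRING,
      amount DECIMAL(10, 2),
      order_time TIMESTAMP(3)
    ) WITH (
      'connector' = 'print'
    );

    -- The transform step of the ETL: drop invalid records before loading.
    INSERT INTO cleansed_orders
    SELECT order_id, amount, order_time
    FROM orders
    WHERE amount > 0;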
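
For the performance testing in step 9, one common technique is to generate synthetic traffic with the datagen connector and discard the output with the blackhole connector, so that only the processing logic itself is measured. The schema and the rate below are arbitrary examples; tune rows-per-second to the target load.

    -- Synthetic high-throughput source for stress testing.
    CREATE TEMPORARY TABLE load_source (
      id BIGINT,
      payload STRING
    ) WITH (
      'connector' = 'datagen',
      'rows-per-second' = '100000'
    );

    -- Blackhole sink discards all records, isolating the cost of the query.
    CREATE TEMPORARY TABLE load_sink (
      id BIGINT,
      payload STRING
    ) WITH (
      'connector' = 'blackhole'
    );

    INSERT INTO load_sink SELECT id, payload FROM load_source;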

References

  • For more information about how to develop a draft, see Draft development.

  • You can use variables to prevent security risks caused by information such as AccessKey pairs and passwords appearing in plaintext in various scenarios, including the development of SQL, JAR, and Python drafts. For more information, see Manage variables. A sketch of the variable syntax appears at the end of this topic.

  • After you develop a draft, you must publish the draft as a deployment to the production environment and configure the deployment.

  • Flink deployments support two automatic tuning modes: Autopilot and scheduled tuning. You can also use the intelligent deployment diagnostics feature to monitor the health status of deployments. For more information, see Deployment diagnostics and optimization.

  • You can grant Resource Access Management (RAM) users the minimum required permissions on upstream and downstream systems to further improve access security. For more information, see Best practice of secure access.
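
The following sketch shows how a variable can replace a plaintext password in a SQL draft, assuming a variable named mysql_password has already been created on the platform and is referenced with the ${secret_values.xxx} form described in Manage variables. The connector name, options, and endpoint are hypothetical and vary by connector version.

    -- Reference the password through a variable instead of writing it in plaintext.
    CREATE TEMPORARY TABLE result_table (
      id BIGINT,
      total DECIMAL(10, 2)
    ) WITH (
      'connector' = 'mysql',
      'hostname' = 'rm-example.mysql.rds.aliyuncs.com',
      'port' = '3306',
      'username' = 'flink_user',
      'password' = '${secret_values.mysql_password}',  -- resolved by the platform at run time
      'database-name' = 'analytics',
      'table-name' = 'result_table'
    );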