Best practices for real-time data analysis based on Confluent+Flink

Business background

In real business scenarios, some data analysis must be done in real time, including real-time PV and UV display, real-time sales data, real-time store UV, and real-time recommendation systems. For such requirements, Confluent+real-time computing Flink is an efficient solution.

Confluent is an enterprise-grade, fully managed streaming data service based on Apache Kafka and built by the original creators of Apache Kafka. It extends the advantages of Kafka with enterprise-grade features and removes the burden of managing and monitoring Kafka.

Real-time computing Flink is an enterprise-grade real-time big data computing product built by Alibaba Cloud on Apache Flink and produced by the founding team of Apache Flink. It has a single global commercial brand, provides a full product matrix, is fully compatible with the open source Flink API, and builds on the Alibaba Cloud platform to provide cloud-based commercial value-added capabilities for Flink.

1、 Preparations - Create a Confluent cluster and a real-time computing Flink cluster

1. Log in to the Confluent management console and create the Confluent cluster. For the creation steps, refer to the Confluent cluster activation documentation.

2. Log in to the real-time computing Flink management console and create a VVP cluster. Make sure the VVP cluster is created in the same region and VPC as the Confluent cluster, so that the internal domain name of Confluent can be accessed from inside VVP.

2、 Best practices - real-time statistics of player recharge amounts - Confluent+real-time computing Flink+Hologres

2.1 Create a new Confluent message queue

1. On the Confluent cluster list page, log in to the Control Center

2. Select Topics on the left, click Add a topic, create a topic named confluent-vvp-test, and set the number of partitions to 3

2.2 Configure the Hologres result table

1. Enter the Hologres console, click the Hologres instance, and add the database mydb under DB management

2. Log in to the Hologres database and create a new SQL query

3. Run the SQL statement that creates the result table in Hologres
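
The statement that creates the result table is not shown in this article. As a minimal sketch only, assuming each recharge message carries a player identifier and an integer amount, a result table could look like the following. The table name player_recharge_summary and both column names are hypothetical, not the schema used by the original job.

```sql
-- Hypothetical result table: one row per player with a running total.
-- Table and column names are assumptions made for illustration.
CREATE TABLE IF NOT EXISTS player_recharge_summary (
    user_id      TEXT NOT NULL,   -- assumed player identifier
    total_amount BIGINT,          -- assumed accumulated recharge amount
    PRIMARY KEY (user_id)         -- lets the Flink sink update the row per player
);
```

A primary key on the player identifier is the design choice that matters here: it allows each new aggregate value to overwrite the previous one instead of appending a new row.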

2.3 Create the real-time computing VVP job

1. Log in to the VVP console, select the region where the cluster is located, and click the console to enter the development interface

2. Click the Job Development tab, and then click New File. Name the file confluent-vvp-hologres and set the file type to Stream Job/SQL

3. Write the job's SQL code in the input box

4. In the advanced configuration, add the dependency file truststore.jks (this file is required to access the internal domain name, but not to access the public domain name). The fixed path prefix for dependency files is /flink/usrlib/ (here, /flink/usrlib/truststore.jks)

5. Click the Online button to publish the job

6. Find the newly published job in the job list on the operation and maintenance page, click the start button, and wait for the status to change to RUNNING, which indicates that the job started successfully.

7. On the [Topics -> Messages] page of the Control Center, send test messages one by one
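
The job code entered in step 3 is not reproduced in this article. The following is a rough sketch of what such a Flink SQL job could look like, not the original code. It assumes JSON messages with the hypothetical fields user_id and amount, a hypothetical Hologres table player_recharge_summary(user_id, total_amount), and SASL_SSL access to the Confluent internal domain name. All values in angle brackets are placeholders, and the security settings should be checked against your own cluster.

```sql
-- Source: the confluent-vvp-test topic created in 2.1.
CREATE TEMPORARY TABLE recharge_source (
    user_id STRING,   -- assumed message field
    amount  BIGINT    -- assumed message field
) WITH (
    'connector' = 'kafka',
    'topic' = 'confluent-vvp-test',
    'properties.bootstrap.servers' = '<confluent-broker-host>:<port>',
    'properties.group.id' = 'confluent-vvp-hologres',   -- assumed group id
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json',
    -- Assumed security settings for the internal domain name.
    'properties.security.protocol' = 'SASL_SSL',
    'properties.sasl.mechanism' = 'PLAIN',
    'properties.sasl.jaas.config' = 'org.apache.kafka.common.security.plain.PlainLoginModule required username="<user>" password="<password>";',
    -- Dependency file uploaded in step 4.
    'properties.ssl.truststore.location' = '/flink/usrlib/truststore.jks',
    'properties.ssl.truststore.password' = '<truststore-password>'
);

-- Sink: the hypothetical Hologres result table in database mydb.
CREATE TEMPORARY TABLE recharge_sink (
    user_id      STRING,
    total_amount BIGINT,
    PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
    'connector' = 'hologres',
    'endpoint' = '<hologres-endpoint>:<port>',
    'dbname' = 'mydb',
    'tablename' = 'player_recharge_summary',
    'username' = '<access-id>',
    'password' = '<access-key>'
);

-- Continuously maintain the total recharge amount per player.
INSERT INTO recharge_sink
SELECT user_id, SUM(amount) AS total_amount
FROM recharge_source
GROUP BY user_id;
```

Under these assumptions, a test message sent in step 7 would be a JSON object such as {"user_id": "player_1", "amount": 100}.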

2.4 View the real-time statistical effect of user recharge amount
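
One simple way to check the result is to query the result table in the Hologres SQL editor while test messages are being sent. The query below is a sketch that assumes a hypothetical result table named player_recharge_summary with the columns user_id and total_amount; substitute the names of your own table.

```sql
-- Show the players with the highest accumulated recharge amount.
-- Table and column names are assumptions for illustration.
SELECT user_id, total_amount
FROM player_recharge_summary
ORDER BY total_amount DESC
LIMIT 10;
```

Running the query again after sending another message for the same player should show an updated total rather than a new row, provided the table has a primary key on the player identifier.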

3、 Best practices - e-commerce real-time PV and UV statistics - Confluent+real-time computing Flink+RDS

3.1 Create a new Confluent message queue

1. On the Confluent cluster list page, log in to the Control Center

2. Select Topics on the left, click Add a topic, create a topic named pv-uv, and set the number of partitions to 3

3.2 Create the RDS result table

1. Log in to the RDS management console and purchase an RDS instance. Ensure that the RDS instance and the fully managed Flink cluster are in the same region and the same VPC

2. Add the virtual switch network segment (vswitch IP segment) to the RDS whitelist. For details, refer to the whitelist configuration document

3. The [vswitch IP segment] can be found in the workspace details of Flink

4. Create a [high-privilege account] on the [account management] page

5. Create a new database [confluent_vvp] under [database management] of the database instance

6. Log in to RDS using the DMS service provided by the system, entering the login name and password of the high-privilege account created above

7. Double-click the [confluent_vvp] database, open SQLConsole, paste the following table creation statement into SQLConsole, and create the result table

CREATE TABLE result_cps_total_summary_pvuv_min (
    summary_date date NOT NULL COMMENT 'Statistics date',
    summary_min varchar(255) COMMENT 'Count minutes',
    pv bigint COMMENT 'pv',
    uv bigint COMMENT 'uv',
    currenttime timestamp COMMENT 'Current time',
    primary key(summary_date, summary_min)
);

3.3 Create a real-time computing VVP job

1. In the [VVP Console], create a new file

2. Enter the job code in the SQL field

3. Click [Online], then click the start button on the operation and maintenance page and wait until the status changes to RUNNING.

4. On the [Topics -> Messages] page of the Control Center, send test messages one by one
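
The job code for step 2 is not reproduced in this article. The sketch below shows one possible shape for it and is not the original code. It assumes JSON messages with the hypothetical fields user_id and ts (an event timestamp in the JSON format's default SQL timestamp layout), and it writes to the RDS result table through the open source Flink JDBC connector. The managed service may offer a dedicated RDS connector with different option names, and the Kafka security options required for the internal domain name are omitted for brevity. Values in angle brackets are placeholders.

```sql
-- Source: the pv-uv topic created in 3.1.
CREATE TEMPORARY TABLE pv_uv_source (
    user_id STRING,        -- assumed visitor identifier
    ts      TIMESTAMP(3)   -- assumed event time of the page view
) WITH (
    'connector' = 'kafka',
    'topic' = 'pv-uv',
    'properties.bootstrap.servers' = '<confluent-broker-host>:<port>',
    'properties.group.id' = 'pv-uv-job',   -- assumed group id
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json'
);

-- Sink: the result table created in 3.2, keyed by date and minute.
CREATE TEMPORARY TABLE pv_uv_sink (
    summary_date DATE,
    summary_min  STRING,
    pv           BIGINT,
    uv           BIGINT,
    currenttime  TIMESTAMP(3),
    PRIMARY KEY (summary_date, summary_min) NOT ENFORCED
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:mysql://<rds-internal-endpoint>:3306/confluent_vvp',
    'table-name' = 'result_cps_total_summary_pvuv_min',
    'username' = '<high-privilege-account>',
    'password' = '<password>'
);

-- One row per minute: pv counts every message, uv counts distinct visitors.
INSERT INTO pv_uv_sink
SELECT
    CAST(ts AS DATE)         AS summary_date,
    DATE_FORMAT(ts, 'HH:mm') AS summary_min,
    COUNT(*)                 AS pv,
    COUNT(DISTINCT user_id)  AS uv,
    LOCALTIMESTAMP           AS currenttime
FROM pv_uv_source
GROUP BY CAST(ts AS DATE), DATE_FORMAT(ts, 'HH:mm');
```

Because the sink declares the same primary key as the result table, each new message updates the row for its minute instead of inserting a duplicate.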

3.4 Viewing PV and UV effects

The pv and uv values in the RDS result table change dynamically as messages are sent. You can also view the corresponding charts through [Data Visualization].
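
As a quick check, the latest rows of the result table can also be queried directly in SQLConsole. The query below is only a sketch; it uses the table and columns from the table creation statement in 3.2.

```sql
-- Show the most recent per-minute PV and UV rows.
SELECT summary_date, summary_min, pv, uv, currenttime
FROM result_cps_total_summary_pvuv_min
ORDER BY summary_date DESC, summary_min DESC
LIMIT 10;
```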
