This topic uses Excel as an example to describe how to use Kettle to synchronize the local Excel data to AnalyticDB for MySQL.
Kettle is a popular open source extract-transform-load (ETL) tool used for data collection, conversion, and migration. Kettle supports not only various relational databases and NoSQL databases such as HBase and MongoDB, but also niche data sources such as Microsoft Office Excel and Access. Kettle can support more data sources through extensions and plug-ins.
- Install Kettle.
- Create a database and a table in an AnalyticDB for MySQL cluster. For more information, see CREATE DATABASE and CREATE TABLE.
- Add the IP address of the device on which Kettle is installed to the whitelist of the AnalyticDB for MySQL cluster. For more information, see Configure a whitelist.
- If you need to use a public endpoint to connect to AnalyticDB for MySQL clusters, you must apply for a public endpoint. For more information, see Apply for or release a public endpoint.
- Start Kettle, and choose to create a conversion task.
to create a database connection for the conversion task.
Parameter Description Connection Name The name of the connection. We recommend that you choose an identifiable name to facilitate subsequent management. Connection Type Select MySQL. Access The access mode of the connection. Select Native (JDBC). Host Name The endpoint of the AnalyticDB for MySQL cluster to which you want to connect. You can view the connection information about the cluster on the Cluster Information page of the AnalyticDB for MySQL console. For more information, see Connect to an AnalyticDB for MySQL cluster. Database Name The name of the database in the AnalyticDB for MySQL cluster. Port Number The port number of the AnalyticDB for MySQL cluster endpoint. The default port number is 3306. User Name The account used to connect to the AnalyticDB for MySQL cluster. You can use one of the following account types:
- Privileged account
- Standard account
Password The password of the account used to connect to the AnalyticDB for MySQL cluster.Note Do not select Use Result Streaming Cursor when you configure the parameters.
- After you configure the preceding parameters, click Test. In the Database Connection Test dialog box that appears, follow the prompts to verify if the connection to the database is successful. After the connection passes the test, click OK.
- In the left-side navigation pane of Kettle, click the Core objects tab, and choose Input > Excel Input. Drag and drop Excel Input to the workspace.
- Double-click Excel Input in the workspace. In the Excel Input dialog box that appears, click Browse and Add to add an Excel file to Selected Files.
Configure parameters on tabs such as Worksheet, Content, and Field as required, and click Preview to check whether the entered data meets your requirements.
- In the left-side navigation pane of Kettle, click the Core objects tab, and choose Output > Table Output. Drag and drop Table Output to the workspace.
- Add a connection line from Excel Input to Table Output.
- Double-click Table Output in the workspace. In the Table Output dialog box that appears, configure the parameters.
- Target Schema: Enter the name of the AnalyticDB for MySQL database.
- Target Table: Enter the name of the table in the AnalyticDB for MySQL database.
- Select Specify database fields.
- Select Use batch update for inserts.
On the Database fields tab of the Table Output dialog box, click Get fields and Enter field mapping to map columns in the Excel file to those in the AnalyticDB for MySQL table.
- Click the white arrow to perform the conversion. During this period, you can check the operations log and operating status.
After the data in the Excel file is synchronized to the AnalyticDB for MySQL database, you can use AnalyticDB for MySQL to analyze the data.