You can collect the schema and lineage information of a table into Data Map so that the table's internal structure and its relationships with other tables are clearly displayed. This topic describes how to create a crawler to collect metadata from a Tablestore data source. You can view the collected metadata on the Data Map page.

  1. Go to the Data Discovery page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. Select the region in which your workspace resides, find the workspace, and click Data Analytics in the Actions column.
    4. On the DataStudio page, click the More icon in the upper-left corner and choose All Products > Data governance > DataMap.
    5. In the top navigation bar, click Data Discovery.
  2. In the left-side navigation pane, click OTS.
  3. On the OTS Metadata Crawler page, click Create Crawler.
  4. In the Create Crawler dialog box, set the parameters in each step.
    1. In the Basic Information step, set the parameters as required.
      [Figure: Basic Information step]
      Parameter             Description
      Crawler Name          Required. The name of the crawler. The name must be unique.
      Crawler Description   The description of the crawler.
      Workspace             The workspace of the data source from which you want to collect metadata.
      Data Source Type      The type of the data source from which you want to collect metadata. The default value is OTS and cannot be changed.
    2. Click Next.
    3. In the Select Collection Object step, select a data source from the Data Source drop-down list.
      If no data source is available, click Create to go to the Data Source page and add a Tablestore data source. For more information, see Add a Tablestore data source.
    4. Click Start Testing next to Test Crawler Connectivity.
    5. If the message The connectivity test is successful appears, click Next.
      If the message The connectivity test failed appears, check whether you have configured a valid data source.
    6. In the Configure Execution Plan step, configure an execution plan.
      Valid values of the Execution Plan parameter are On-demand Execution, Monthly, Weekly, Daily, and Hourly. The execution plan that is generated varies based on the execution cycle. The system collects metadata from the Tablestore data source based on the execution cycle that you specify. The following descriptions explain each value and provide examples:
      • On-demand Execution: The system collects metadata from the Tablestore data source only when you manually run the crawler, based on your business requirements.
      • Monthly: The system automatically collects metadata from the Tablestore data source once at a specific time on several specific days of each month.
        Notice: Some months do not have a 29th, 30th, or 31st day. In these months, the system does not collect metadata from the Tablestore data source on the missing dates. We recommend that you do not select the last days of a month.
        The following figure shows that the system automatically collects metadata from the Tablestore data source once at 09:00 on the 1st, 11th, and 21st days of each month. An expression is automatically generated for the Cron Expression parameter based on the values of the Date and Time parameters.
        [Figure: Monthly]
      • Weekly: The system automatically collects metadata from the Tablestore data source once at a specific time on several specific days of each week.
        The following figure shows that the system automatically collects metadata from the Tablestore data source once at 03:00 on Sunday and Monday of each week.
        [Figure: Weekly]
        If the Time parameter is not set, the system automatically collects metadata from the Tablestore data source once at 00:00:00 on the specified days of each week.
      • Daily: The system automatically collects metadata from the Tablestore data source once at a specific time of each day.
        The following figure shows that the system automatically collects metadata from the Tablestore data source once at 01:00 each day.
        [Figure: Daily]
      • Hourly: The system automatically collects metadata from the Tablestore data source once on the N × 5th minute of each hour.
        Note: For a Tablestore metadata collection task that runs hourly, you can set the minute only to a multiple of 5.
        The following figure shows that the system automatically collects metadata from the Tablestore data source on the 5th and 10th minutes of each hour.
        [Figure: Hourly]
    7. Click Next.
    8. In the Confirm Information step, check the information that you specified and click Confirm.
  5. On the OTS Metadata Crawler page, you can view the information about your crawler and manage your crawler.
    [Figure: View the crawler]
    The following descriptions show the information that you can view and the operations that you can perform:
    • You can view the status and execution plan of the crawler. You can also view the start time of the last execution, the duration of the last execution, the average execution duration, and the numbers of tables that were updated and created in the last execution.
    • You can click Details, Edit, Delete, Run, or Stop in the Actions column to perform the desired operation.
      • Details: View the crawler name and the data source and execution plan configured for the crawler.
      • Edit: Modify the configurations of the crawler.
      • Delete: Delete the crawler.
      • Run: Run a task to collect metadata from the Tablestore data source. The Run button is available only if the Execution Plan parameter is set to On-demand Execution.
      • Stop: Stop the crawler.
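
The execution plans in the Configure Execution Plan step can be illustrated with a short Python sketch. It is not the console's implementation: the function names `cron_expression` and `months_skipping` are hypothetical, and the sketch assumes a Quartz-style six-field layout (second, minute, hour, day of month, month, day of week). The expression that the console actually generates for the Cron Expression parameter may differ. The second function illustrates the notice about months that lack a 29th, 30th, or 31st day.

```python
import calendar


def cron_expression(plan, days=(), hour=0, minute=0, minutes=(0,)):
    """Build a Quartz-style cron string (assumed format, for illustration only)."""
    if plan == "monthly":
        # days: days of the month, for example (1, 11, 21)
        dom = ",".join(str(d) for d in sorted(days))
        return f"0 {minute} {hour} {dom} * ?"
    if plan == "weekly":
        # days: Quartz day-of-week names, for example ("SUN", "MON")
        dow = ",".join(days)
        return f"0 {minute} {hour} ? * {dow}"
    if plan == "daily":
        return f"0 {minute} {hour} * * ?"
    if plan == "hourly":
        # Hourly tasks accept only minutes that are multiples of 5.
        bad = [m for m in minutes if m % 5 != 0 or not 0 <= m < 60]
        if bad:
            raise ValueError(f"minutes must be multiples of 5 in 0-55: {bad}")
        mins = ",".join(str(m) for m in sorted(minutes))
        return f"0 {mins} * * * ?"
    raise ValueError(f"unsupported plan: {plan}")


def months_skipping(day, year):
    """Return the months (1-12) of `year` that have no such day of the month."""
    return [m for m in range(1, 13) if calendar.monthrange(year, m)[1] < day]


if __name__ == "__main__":
    print(cron_expression("monthly", days=(1, 11, 21), hour=9))
    print(cron_expression("weekly", days=("SUN", "MON"), hour=3))
    print(cron_expression("daily", hour=1))
    print(cron_expression("hourly", minutes=(5, 10)))
    print(months_skipping(31, 2023))
```

Under these assumptions, a Monthly plan that includes the 31st runs in only seven months of the year, which is why selecting the last days of a month is discouraged.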

Result

After the metadata in the Tablestore data source is collected, click All Data in the top navigation bar. Select OTS from the drop-down list in the upper part of the page. You can view the tables that store the collected Tablestore metadata.
[Figure: View the collected Tablestore metadata]

Click a table name, a workspace name, or a database name to view the related details.

Example 1: View the details of the mysql_ots table.
[Figure: View the details of a table]
Example 2: View all tables in the datax-bvt database.
[Figure: View all tables in a database]