This topic describes how to create a crawler that collects metadata from an E-MapReduce (EMR) data store into DataWorks. You can view the collected metadata on the Data Map page.

Procedure

  1. Go to the Data Discovery page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces. The Workspaces page appears.
    3. Find the target workspace and click Data Analytics in the Actions column.
    4. On the DataStudio page, click the icon in the upper-left corner and choose All Products > Data Map. The Data Map page appears.
    5. Click Data Discovery in the top navigation bar.
  2. On the E-MapReduce metadata collection page that appears, click New collector.
  3. In the New EMR metadata collection dialog box that appears, select an engine instance from the Select Engine drop-down list and click Authorize.
  4. On the page that appears, click the Metadata tab and click Enable.
  5. In the Confirm Operation dialog box that appears, click OK.
  6. Return to the New EMR metadata collection dialog box on the Data Map page and click Refresh.
  7. After the authorization status changes to Authorized, click Commit.
  8. On the E-MapReduce metadata collection page, find the created crawler and click Obtain All in the Actions column.
    Click Refresh in the upper-right corner of the page and verify that the value in the Status column of the created crawler changes to Collected successfully.
    Note: After full metadata is collected from the E-MapReduce data store, the system automatically synchronizes new metadata from the data store.

    To delete the created crawler, click Delete in the Actions column. In the Delete dialog box that appears, click OK.

  9. View the metadata collected from the E-MapReduce data store.
    1. Click All Data in the top navigation bar.
    2. Click the EMR Table tab.
    3. On the EMR Table tab, click the name of the table that you want to inspect to view its details.
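
Step 8 asks you to refresh the page until the Status column shows Collected successfully. If you want to script that wait, the pattern is a bounded polling loop. The following is a minimal sketch only: this topic does not describe any API for reading crawler status, so `get_status` is a hypothetical callable that you would have to supply yourself, and `wait_for_status`, `interval_s`, and `max_attempts` are illustrative names rather than DataWorks parameters.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],  # hypothetical: returns the crawler's current status text
    target: str = "Collected successfully",
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status until it returns target or attempts run out.

    Returns True if the target status was observed, False otherwise.
    """
    for attempt in range(max_attempts):
        if get_status() == target:
            return True
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False
```

Bounding the loop with `max_attempts` keeps a failed or stalled collection from blocking the script indefinitely. A `False` result means you should check the crawler on the E-MapReduce metadata collection page.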