DataWorks: Hive data management

Last Updated: Mar 26, 2026

The data catalog provides a unified interface to manage Hive metadata. This topic describes how to add a Hive data source to the catalog, create tables with fields and partitions, and manage or remove catalog entries.

The Hive data catalog uses a three-level hierarchy: data source (Hive instance) > database > table. Add a Hive data source first, then manage databases and tables within it. Only Internal Table is supported as the table type.
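
As a rough mental model, the three-level hierarchy can be sketched as nested records. This is only an illustration: the class names and the `emr_hive_demo`, `hive_work`, and `user_info` names are hypothetical and are not part of any DataWorks API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Database:
    # Second level: a database inside one Hive instance.
    name: str
    tables: List[str] = field(default_factory=list)


@dataclass
class HiveDataSource:
    # Top level: one Hive instance added to the catalog.
    name: str
    databases: Dict[str, Database] = field(default_factory=dict)

    def add_table(self, database: str, table: str) -> str:
        """Register a table under a database and return its catalog path."""
        db = self.databases.setdefault(database, Database(database))
        if table not in db.tables:
            db.tables.append(table)
        return f"{self.name} > {database} > {table}"


source = HiveDataSource("emr_hive_demo")  # hypothetical data source name
path = source.add_table("hive_work", "user_info")  # hypothetical names
print(path)
```

The point of the sketch is that a table is always addressed through its data source and database, which is why a data source must be added before any table can be managed.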

Go to the Hive data catalog page

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a region. Find the target workspace and choose Shortcuts > Data Studio in the Actions column.

  2. In the left navigation pane, click the icon that opens the Data Catalog tree. In the tree, click Hive to open the Hive data catalog management page.

Create a Hive data catalog

On the Hive data catalog management page, add existing Hive data sources as datasets.

  1. To the right of the Hive data catalog, click the icon to open the Add Instance page.

  2. On the DataWorks Data Source tab, add a Hive data source:

    • To manage the EMR computing resources attached to the new Data Studio in the current workspace, find the corresponding EMR cluster data source and click Add in the Actions column.

    • To add multiple sources at once, select multiple Hive data sources and click Batch Add below the list.

Manage a Hive data catalog

After adding a Hive data source, create and manage Hive tables in the data catalog.

Create a table

  1. Click the expand icon next to the Hive data catalog to expand the database, then click Tables.

  2. To the right of Tables, click the icon to open the Create Table page.

  3. Generate basic table and field information using one of the following methods:

    • Create a table using Copilot:

      1. In the top toolbar, click Create Table With Copilot to open the Copilot chat interface.

      2. Enter an instruction in natural language, for example: Create a user table.

      3. Click Generate And Replace. The system generates a default table name and field information.

      4. Click Accept to apply the result. To make further changes, edit the generated information manually after accepting.

    • Create a table manually:

      Parameter            Description
      Basic information    Specify a Table Name, Table Description, and other details.
      Field information    Add fields and annotations. Click Insert to add rows manually and fill in Field Name, Field Type, and other details. Alternatively, click Generate Fields or Generate Field Descriptions to let Copilot generate fields based on the table name and description.

  4. (Optional) Configure partition information. In the Partition Fields section, specify the number of partition fields in Rows and click Insert. Multiple partitions are supported. Configure Field Name, Field Type, and other parameters for each partition field.

  5. (Optional) Configure advanced settings.

    Parameter           Description
    Table type          Only Internal Table is supported.
    Storage location    Specify a custom storage path. Example: /user/hive/warehouse/hive_work.
    Storage format      Select a format based on your data structure and query pattern. See Choose a storage format below.

  6. Click Publish in the top toolbar to create the table.
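
The table definition assembled in the steps above corresponds roughly to a HiveQL CREATE TABLE statement. This topic does not show the DDL that DataWorks generates on publish, so the following is only a minimal sketch using standard HiveQL clauses (COMMENT, PARTITIONED BY, STORED AS, LOCATION). The helper functions and the `hive_work`, `user_info`, `user_id`, `user_name`, and `dt` names are hypothetical; the storage path is the example from the advanced settings.

```python
def column_list(columns):
    """Render (name, type, comment) tuples as a HiveQL column list."""
    rendered = []
    for name, hive_type, comment in columns:
        piece = f"`{name}` {hive_type}"
        if comment:
            escaped = comment.replace("'", "\\'")  # escape quotes in the literal
            piece += f" COMMENT '{escaped}'"
        rendered.append(piece)
    return ",\n  ".join(rendered)


def build_create_table(database, table, columns, partitions=(),
                       comment=None, stored_as=None, location=None):
    """Assemble a CREATE TABLE statement for a managed (internal) table."""
    clauses = [f"CREATE TABLE `{database}`.`{table}` (\n  {column_list(columns)}\n)"]
    if comment:
        escaped = comment.replace("'", "\\'")
        clauses.append(f"COMMENT '{escaped}'")
    if partitions:
        # Partition fields are declared separately from regular fields.
        clauses.append(f"PARTITIONED BY (\n  {column_list(partitions)}\n)")
    if stored_as:
        clauses.append(f"STORED AS {stored_as}")
    if location:
        clauses.append(f"LOCATION '{location}'")
    return "\n".join(clauses) + ";"


# Hypothetical table: two fields, one partition field, ORC storage.
ddl = build_create_table(
    database="hive_work",
    table="user_info",
    columns=[("user_id", "BIGINT", "User ID"), ("user_name", "STRING", "User name")],
    partitions=[("dt", "STRING", "Partition date")],
    comment="User table",
    stored_as="ORC",
    location="/user/hive/warehouse/hive_work",
)
print(ddl)
```

After publishing, the DDL tab on the table details page shows the statement that was actually used, which may differ from this sketch.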

Choose a storage format

The system automatically sets the data input format, output format, and serialization/deserialization methods based on the format you select.

Format         Best for
CSV            Simple data structures; comma-separated text
PARQUET        Big data analytics; high compression ratio; columnar storage
ORC            Complex data types; high-performance columnar storage
AVRO           Dynamic data structures; supports schema evolution
JSON           Semi-structured data; supports nested structures
SELF_DEFINE    Custom serialization/deserialization logic
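
To illustrate what "automatically sets the input format, output format, and serialization/deserialization methods" can mean in practice, the sketch below maps each format name to the SerDe and input/output format classes that stock Apache Hive ships for it. This topic does not list the classes DataWorks actually applies, so treat the mapping as an assumption; the `HIVE_FORMATS` table and `storage_clause` helper are hypothetical names.

```python
# Text-based formats share Hive's plain-text input/output format classes.
TEXT_IN = "org.apache.hadoop.mapred.TextInputFormat"
TEXT_OUT = "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"

# format name -> (SerDe, input format, output format), as in stock Apache Hive.
HIVE_FORMATS = {
    "CSV": ("org.apache.hadoop.hive.serde2.OpenCSVSerde", TEXT_IN, TEXT_OUT),
    "JSON": ("org.apache.hive.hcatalog.data.JsonSerDe", TEXT_IN, TEXT_OUT),
    "PARQUET": (
        "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    ),
    "ORC": (
        "org.apache.hadoop.hive.ql.io.orc.OrcSerde",
        "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat",
    ),
    "AVRO": (
        "org.apache.hadoop.hive.serde2.avro.AvroSerDe",
        "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat",
        "org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat",
    ),
}


def storage_clause(fmt):
    """Return the explicit HiveQL storage clause for a named format."""
    key = fmt.upper()
    if key == "SELF_DEFINE":
        # No defaults exist here: the user supplies the classes.
        raise ValueError("SELF_DEFINE requires user-supplied SerDe and format classes")
    serde, input_format, output_format = HIVE_FORMATS[key]
    return (
        f"ROW FORMAT SERDE '{serde}'\n"
        f"STORED AS INPUTFORMAT '{input_format}'\n"
        f"OUTPUTFORMAT '{output_format}'"
    )


clause = storage_clause("orc")
print(clause)
```

SELF_DEFINE is the one option with no predefined classes, which is why the sketch rejects it instead of guessing.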

Manage tables

Click the expand icon to the left of the Hive data catalog, then click Tables to open the Tables page.

  • View tables: Browse basic information for all tables. Click a table name to view its Details, Basic Information, and DDL.

  • Delete a table: Find the table and click Delete in the Actions column.

    Important

    This operation cannot be undone. Proceed with caution.
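
For orientation, the view and delete actions above have close HiveQL counterparts: SHOW CREATE TABLE and DROP TABLE. In stock Apache Hive, dropping a managed (internal) table removes its data along with its metadata. The sketch below only builds the statement text and requires an explicit confirmation flag before producing a DROP. The helper names are hypothetical, and the statements the console issues are not documented in this topic.

```python
import re

# Accept only plain identifiers so the generated statement stays well-formed.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote(name):
    """Backtick-quote a database or table name after validating it."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"`{name}`"


def show_ddl_statement(database, table):
    """Statement that returns the DDL of an existing table."""
    return f"SHOW CREATE TABLE {quote(database)}.{quote(table)};"


def drop_statement(database, table, confirmed=False):
    """Statement that drops a table; refuses unless explicitly confirmed."""
    if not confirmed:
        raise RuntimeError("dropping a table cannot be undone; pass confirmed=True")
    return f"DROP TABLE IF EXISTS {quote(database)}.{quote(table)};"


# Hypothetical database and table names.
print(show_ddl_statement("hive_work", "user_info"))
print(drop_statement("hive_work", "user_info", confirmed=True))
```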

View and remove a Hive data catalog

View a data catalog

Click the expand icon to the left of the Hive data catalog to see the added Hive data sources. Click a data source to view all Databases in that Hive instance.

Remove a data catalog

Right-click the catalog and select Detach Data Catalog from the context menu.