The Information Schema service of MaxCompute contains the metadata of key objects in a project and provides historical information about job execution, data upload, and data download.
Description
The metadata views of the Information Schema service allow you to browse and retrieve metadata.
The usage information views of the Information Schema service allow you to optimize jobs and plan resources. For example, you can analyze the metrics of a job, such as the resource consumption, running duration, and amount of processed data.
Different views have different validity periods or default retention periods. Data that exceeds the retention period is inaccessible. You can manually export data from Information Schema to a MaxCompute table to back up the data at a specified interval. This backup applies to historical data that requires a longer storage period.
- For projects that use pay-as-you-go computing resources, you are charged based on
the SQL statements that you execute to query a view of Information Schema.
Range-clustered tables
are used in the views of Information Schema to reduce the input data amount of queries and improve query performance. If you query the TASKS_HISTORY and TUNNELS_HISTORY views, we recommend that you query the data of the previous day after 6:00:00 every day. To minimize the input data amount of queries and reduce the query cost, we recommend that you do not query data of the current day. - For projects that use subscription computing resources, the compute units (CUs) that you purchased are consumed when you query views of Information Schema.
- You do not need to pay storage fees for the views of Information Schema.
insert into select * from information_schema.***
after some fields are inserted into a MaxCompute table.
Type | Views | Timeliness and retention period | Delay |
---|---|---|---|
Metadata information | TABLES | Quasi-real-time view | Online data is displayed in metadata views with a delay of about three hours. |
PARTITIONS | Quasi-real-time view | ||
COLUMNS | Quasi-real-time view | ||
UDFS | Quasi-real-time view | ||
RESOURCES | Quasi-real-time view | ||
UDF_RESOURCES | Quasi-real-time view | ||
USERS | Quasi-real-time view | ||
ROLES | Quasi-real-time view | ||
USER_ROLES | Quasi-real-time view | ||
PACKAGE_OBJECTS | Quasi-real-time view | ||
INSTALLED_PACKAGES | Quasi-real-time view | ||
SCHEMA_PRIVILEGES | Quasi-real-time view | ||
TABLE_PRIVILEGES | Quasi-real-time view | ||
COLUMN_PRIVILEGES | Quasi-real-time view | ||
UDF_PRIVILEGES | Quasi-real-time view | ||
RESOURCE_PRIVILEGES | Quasi-real-time view | ||
TABLE_LABELS | Quasi-real-time view | ||
COLUMN_LABELS | Quasi-real-time view | ||
TABLE_LABEL_GRANTS | Quasi-real-time view | ||
COLUMN_LABEL_GRANTS | Quasi-real-time view | ||
Usage information | TASKS | Real-time snapshots of running jobs | Online data is displayed in metadata views with a delay of a few seconds. This view is in public preview without SLA guarantee and will be available in the future. |
TASKS_HISTORY | Quasi-real-time view. Historical data is stored in a partitioned table, and data from the last 14 days is retained. | Online data is displayed in metadata views with a delay of about three hours. | |
TUNNELS_HISTORY | Quasi-real-time view. Historical data is stored in a partitioned table, and data from the last 14 days is retained. |
TABLES
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project. |
table_name | STRING | The name of the table. |
table_type | STRING | The type of the table. Valid values: Valid values:
|
is_partitioned | BOOLEAN | Specifies whether the table is a partitioned table. |
owner_id | STRING | The ID of the table owner. |
owner_name | STRING | Optional. The Alibaba Cloud account of the table owner. |
create_time | DATETIME | The time when the table was created. |
last_modified_time | DATETIME | The last time when the table was modified. |
data_length | BIGINT | If the table is a non-partitioned table, the value of this parameter is the size of the table data. If the table is a partitioned table, the system does not calculate the size of the table data. In this case, the value of this parameter is NULL. The PARTITIONS view includes the data size of each partition in a partitioned table. Unit: bytes. |
table_comment | STRING | The comments on the table. |
life_cycle | BIGINT | Optional. The lifecycle of the table. |
is_archived | BOOLEAN | A reserved field. |
table_exstore_type | STRING | A reserved field. |
cluster_type | STRING | The clustering type of the MaxCompute table. Valid values: HASH and RANGE. |
number_buckets | BIGINT | Optional. The number of buckets in the clustered table. The value 0 indicates that the number of buckets dynamically changes during job execution. |
view_original_text | STRING | The view text in the table of the VIRTUAL_VIEW type. |
PARTITIONS
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project. |
table_name | STRING | The name of the table. |
partition_name | STRING | The name of the partition. Example: ds='20190130' .
|
create_time | DATETIME | The time when the partition was created. |
last_modified_time | DATETIME | The last time when the table was modified. |
data_length | BIGINT | The size of the data in the partition. Unit: bytes. |
is_archived | BOOLEAN | A reserved field. |
is_exstore | BOOLEAN | A reserved field. |
cluster_type | STRING | Optional. The clustering type of the MaxCompute table. Valid values: HASH and RANGE. |
number_buckets | BIGINT | Optional. The number of buckets in the clustered table. The value 0 indicates that the number of buckets dynamically changes during job execution. |
COLUMNS
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project. |
table_name | STRING | The name of the table. |
column_name | STRING | The name of the column. |
ordinal_position | BIGINT | The serial number of the column. |
column_default | STRING | The default value of the column. |
is_nullable | STRING | Optional. The value is fixed to YES. |
data_type | STRING | The data type of the column. |
column_comment | STRING | The comments on the column. |
is_partition_key | BOOLEAN | Specifies whether the column is a partition key. |
UDFS
Field | Type | Description |
---|---|---|
udf_catalog | STRING | The value is fixed to odps .
|
udf_schema | STRING | The name of the project. |
udf_name | STRING | The name of the UDF. |
owner_id | STRING | The ID of the UDF owner. |
owner_name | STRING | Optional. The Alibaba Cloud account of the UDF owner. |
create_time | DATETIME | The time when the UDF was created. |
last_modified_time | DATETIME | The last time when the UDF was modified. |
RESOURCES
Field | Type | Description |
---|---|---|
resource_catalog | STRING | The value is fixed to odps .
|
resource_schema | STRING | The name of the project. |
resource_name | STRING | The name of the resource. |
resource_type | STRING | The type of the resource. Valid values: Py and Jar. |
owner_id | STRING | The ID of the resource owner. |
owner_name | STRING | Optional. The Alibaba Cloud account of the resource owner. |
create_time | DATETIME | The time when the resource was created. |
last_modified_time | DATETIME | The last time when the resource was modified. |
size | BIGINT | The storage space used by the resource. |
comment | STRING | The comments on the resource. |
is_temp_resource | BOOLEAN | Specifies whether the resource is a temporary resource. |
UDF_RESOURCES
Field | Type | Description |
---|---|---|
udf_catalog | STRING | The value is fixed to odps .
|
udf_schema | STRING | The name of the project. |
udf_name | STRING | The name of the UDF. |
resource_schema | STRING | The name of the project to which the resource belongs. |
resource_name | STRING | The name of the resource. |
USERS
Field | Type | Description |
---|---|---|
user_catalog | STRING | Valid values: ALIYUN and RAM. |
user_schema | STRING | The name of the project. |
user_name | STRING | Optional. The name of the user. |
user_id | STRING | The ID of the Alibaba Cloud user. |
user_label | STRING | The label of the user. |
ROLES
Field | Type | Description |
---|---|---|
role_catalog | STRING | The value is fixed to odps .
|
role_schema | STRING | The name of the project. |
role_name | STRING | The name of the role. |
role_label | STRING | The label of the role. |
comment | STRING | The comments on the role. |
USER_ROLES
Field | Type | Description |
---|---|---|
user_role_catalog | STRING | The value is fixed to odps .
|
user_role_schema | STRING | The name of the project. |
role_name | STRING | The name of the role. |
user_name | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
PACKAGE_OBJECTS
Field | Type | Description |
---|---|---|
package_catalog | STRING | The value is fixed to odps .
|
package_schema | STRING | The name of the project. |
package_name | STRING | The name of the package. |
object_type | STRING | The type of the package object. |
object_name | STRING | The name of the package object. |
column_name | STRING | The name of the table column. |
allowed_privileges | VECTOR<STRING> | The shared permissions. |
allowed_label | STRING | The shared label. |
INSTALLED_PACKAGES
Field | Type | Description |
---|---|---|
installed_package_catalog | STRING | The value is fixed to odps .
|
installed_package_schema | STRING | The name of the project. |
package_project | STRING | The name of the project in which the package was created. |
package_name | STRING | The name of the package. |
installed_time | DATETIME | Reserved. The time when the package was installed. |
allowed_label | STRING | The shared label. |
SCHEMA_PRIVILEGES
Field | Type | Description |
---|---|---|
user_catalog | STRING | The value is fixed to odps .
|
user_schema | STRING | The name of the project. |
grantee | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
privilege_type | STRING | The type of the permission. |
TABLE_PRIVILEGES
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project to which the table belongs. |
table_name | STRING | The name of the table. |
grantee | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
privilege_type | STRING | The type of the permission. |
user_schema | STRING | The name of the project to which the user belongs. |
COLUMN_PRIVILEGES
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project to which the table belongs. |
table_name | STRING | The name of the table. |
column_name | STRING | The name of the column. |
grantee | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
grantor | STRING | Optional. The current value is NULL. |
privilege_type | STRING | The type of the permission. |
user_schema | STRING | The name of the project to which the user belongs. |
UDF_PRIVILEGES
Field | Type | Description |
---|---|---|
udf_catalog | STRING | The value is fixed to odps .
|
udf_schema | STRING | The name of the project. |
udf_name | STRING | The name of the UDF. |
user_schema | STRING | The name of the project to which the user belongs. |
grantee | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
privilege_type | STRING | The type of the permission. |
RESOURCE_PRIVILEGES
Field | Type | Description |
---|---|---|
resource_catalog | STRING | The value is fixed to odps .
|
resource_schema | STRING | The name of the project. |
resource_name | STRING | The name of the resource. |
user_schema | STRING | The name of the project to which the user belongs. |
grantee | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
privilege_type | STRING | The type of the permission. |
TABLE_LABELS
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project. |
table_name | STRING | The name of the table. |
label_type | STRING | The type of the label. The value is fixed to NULL. |
label_level | STRING | The level of the label. |
COLUMN_LABELS
Field | Type | Description |
---|---|---|
table_catalog | STRING | The value is fixed to odps .
|
table_schema | STRING | The name of the project. |
table_name | STRING | The name of the table. |
column_name | STRING | The name of a column. |
label_type | STRING | The type of the label. The value is fixed to NULL. |
label_level | STRING | The level of the label. |
TABLE_LABEL_GRANTS
Field | Type | Description |
---|---|---|
table_label_grant_catalog | STRING | The value is fixed to odps .
|
table_label_grant_schema | STRING | The name of the project to which the user belongs. |
user | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
table_schema | STRING | The name of the project to which the table belongs. |
table_name | STRING | The name of the table. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
label_level | STRING | The granted level of the label. |
expired | DATETIME | The time when the authorization expires. |
COLUMN_LABEL_GRANTS
Field | Type | Description |
---|---|---|
column_label_grant_catalog | STRING | The value is fixed to odps .
|
column_label_grant_schema | STRING | The name of the project to which the user belongs. |
user | STRING | The name of the user. |
user_id | STRING | The ID of the user. |
table_schema | STRING | The name of the project to which the table belongs. |
table_name | STRING | The name of the table. |
column_name | STRING | The name of a column. |
grantor | STRING | The account that grants the permission. The current value is NULL. |
label_level | STRING | The granted level of the label. |
expired | DATETIME | The time when the authorization expires. |
TASKS
Displays the real-time snapshots of jobs. This view is used to monitor jobs in real time.
Field | Type | Description |
---|---|---|
project_name | STRING | The name of the project. |
task_name | STRING | The name of the job. |
task_type | STRING | The type of the job.
Valid values:
|
inst_id | STRING | The ID of the instance. |
status | STRING | The status of the job when data is collected. Valid values: Running and Waiting. |
owner_id | STRING | The ID of the Alibaba Cloud account that submits the job. |
owner_name | STRING | The name of the Alibaba Cloud account that submits the job. |
start_time | DATETIME | The time when the job starts. |
priority | BIGINT | The priority of the job. This parameter is applicable only to jobs that use subscription resources. |
signature | STRING | The job signature. |
queue_name | STRING | The name of the compute queue. |
cpu_usage | BIGINT | The current CPU utilization. The value of this field is calculated by using the following formula: Number of CPU cores × 100. |
mem_usage | BIGINT | The current memory usage. Unit: MB. |
gpu_usage | BIGINT | The current GPU usage. The value of this field is calculated by using the following formula: Number of GPUs × 100. |
total_cpu_usage | BIGINT | The accumulated CPU utilization. The value of this field is calculated by using the following formula: Number of CPU cores × 100 × Running duration of the job (seconds). |
total_mem_usage | BIGINT | The accumulated memory usage. The value of this field is calculated by using the following formula: Memory size (MB) × Running duration of the job (seconds). |
total_gpu_usage | BIGINT | The accumulated GPU usage. The value of this field is calculated by using the following formula: Number of GPUs × 100 × Running duration of the job (seconds). |
cpu_min_ratio | BIGINT | The ratio of the CPU utilization of the job to the total CPU utilization. This parameter is applicable only to jobs that use subscription resources. |
mem_min_ratio | BIGINT | The ratio of the memory consumed by the job to the total memory usage. This parameter is applicable only to jobs that use subscription resources. |
gpu_min_ratio | BIGINT | The ratio of the GPUs consumed by the job to the total GPU usage. This parameter is applicable only to jobs that use subscription resources. |
cpu_max_ratio | BIGINT | The ratio of the CPU utilization of the job to the maximum CPU utilization. This parameter is applicable only to jobs that use subscription resources. |
mem_max_ratio | BIGINT | The ratio of the memory consumed by the job to the maximum memory usage. This parameter is applicable only to jobs that use subscription resources. |
gpu_max_ratio | BIGINT | The ratio of the GPUs consumed by the job to the maximum GPU usage. This parameter is applicable only to jobs that use subscription resources. |
settings | STRING | The custom scheduling settings of an upper-layer application, such as DataWorks. |
additional_info | STRING | The additional information. This is a reserved field. |
TASKS_HISTORY
Field | Type | Description |
---|---|---|
task_catalog | STRING | The value is fixed to odps .
|
task_schema | STRING | The name of the project. |
task_name | STRING | The name of the job. |
task_type | STRING | The type of the job.
Valid values:
|
inst_id | STRING | The ID of the instance. |
status | STRING | The status of the job when data is collected. This is not a real-time state. |
owner_id | STRING | The ID of the Alibaba Cloud account. |
owner_name | STRING | The name of the Alibaba Cloud account. |
result | STRING | The error information displayed if an error occurs in an SQL job. |
start_time | DATETIME | The time when the job starts. |
end_time | DATETIME | The end time of the job. If the job does not end on the current day, this value is NULL. |
input_records | BIGINT | The number of records read by the job. |
output_records | BIGINT | The number of records generated by the job. |
input_bytes | BIGINT | The amount of scanned data, which is the same as that displayed on Logview. |
output_bytes | BIGINT | The number of output bytes. |
input_tables | STRING | The job input tables in the [project.table1,project.table2] format. |
output_tables | STRING | The job output tables in the [project.table1,project.table2] format. |
operation_text | STRING | The source XML file of the query statement. If the size of the source XML file exceeds 256 KB, set the value to NULL. |
signature | STRING | Optional. The job signature. |
complexity | DOUBLE | Optional. The job complexity. This parameter is available only for SQL jobs. |
cost_cpu | DOUBLE | The CPU utilization of the job. The value 100 indicates that 1 CPU core multiplies the job running duration in seconds. For example, if 10 CPU cores run for five seconds, cost_cpu is 5000, which is calculated by using the following formula: 10 × 100 × 5. |
cost_mem | DOUBLE | The memory consumed by the job. The value of this field is calculated by using the following formula: Memory size (MB) × Running duration of the job (seconds). |
settings | STRING | The information that is scheduled by the upper layer application or specified by users. The information is saved in the JSON format. The information includes the following fields: USERAGENT, BIZID, SKYNET_ID, and SKYNET_NODENAME. |
ds | STRING | The date when the data was collected. Example: 20190101. |
TUNNELS_HISTORY
Field | Type | Description |
---|---|---|
tunnel_catalog | STRING | The value is fixed to odps .
|
tunnel_schema | STRING | The name of the project. |
session_id | STRING | The session ID, which is saved in the format of TIMESTAMP (YYYYMMDDHHmmss, 14 characters) + IP address (8 characters) + numHex (8
characters) . Example: 2013060414484474e5e60a00000002.
|
operate_type | STRING | The type of the operation. Valid values:
|
tunnel_type | STRING | The type of the tunnel. Valid values: TUNNEL LOG and TUNNEL INSTANCE LOG. |
request_id | STRING | The ID of the request. |
object_type | STRING | The type of object on which the operation is performed. Valid values: TABLE and INSTANCE. |
object_name | STRING | The table name or instance ID. |
partition_spec | STRING | The partition information. Example: time=20130222,loc=beijing .
|
data_size | BIGINT | The size of data. Unit: bytes. |
block_id | BIGINT | The ID of the block uploaded by using the tunnel. This parameter is available only if operate_type is set to UPLOADLOG. Otherwise, this parameter is left empty. |
offset | BIGINT | The number of records to skip before data is downloaded. By default, the download starts from record 0. |
length | BIGINT | The number of records to download or upload in the current session. The number of downloaded records is equal to the value of this parameter. |
owner_id | STRING | The ID of the Alibaba Cloud account. |
owner_name | STRING | The name of the Alibaba Cloud account. |
start_time | DATETIME | The start time of the request. |
end_time | DATETIME | The end time of the request. |
client_ip | STRING | The IP address of the client that initiates the request. |
user_agent | STRING | The information about the user agent, which is the client that initiates the request. The information may be the Java version or the operating system. |
columns | STRING | The columns that are specified when the data is downloaded over a data tunnel. |
ds | STRING | The date when the data was collected. Example: 20190101. |