Dataphin: View the schedule resource dashboard

Last Updated: Jan 23, 2025

The schedule resource dashboard shows how scheduling resources are used and allocated within the Dataphin cluster. It lets you monitor the global resource configuration and per-task allocations so that you can improve resource utilization, reduce costs, and lower the risk of task backlog caused by resource constraints, which in turn improves platform stability.

Prerequisites

  • Ensure the Dataphin deployment team has set up Prometheus monitoring to collect the necessary data for the schedule resource dashboard.

  • Verify that the schedule resource dashboard feature is enabled in Management Center > Resource Settings within the metadata warehouse tenant. For more information, see Resource Settings.

Feature description

  • The schedule resource dashboard in Dataphin shows how the cluster manages resources from two key aspects, Resource Allocation and Resource Consumption, with a focus on CPU and memory metrics. It also lists tasks with suboptimal resource usage: tasks that underuse their allocation, which wastes resources, and tasks that overuse it, which risks memory overflow. This optimization task list helps you identify and resolve issues promptly, which improves platform stability and reduces resource costs.

  • The dashboard comprises three modules: Resource Allocation, Resource Consumption, and Suggested Optimization Tasks. Together they help you quickly identify blocking tasks that consume excessive resources and may cause task backlog. By examining snapshot values and trends, you can assess where resource utilization can be improved.

Data statistics scope

The data statistics for the schedule resource dashboard are collected as follows:

  • Data Statistics Frequency: Statistics are gathered every minute, capturing the current snapshot for allocated resources and the peak value within the past minute for consumed resources. A page refresh triggers an update of the page data.

  • Data Statistics Scope:

    • Global Resource Consumption and Global Resource Allocation: These statistics encompass the allocation and consumption values of all task instances across all operational environments under the current tenant.

    • Suggested Optimization Tasks: Statistics are limited to auto-triggered nodes (Basic and Prod projects) running in dedicated containers within the production environment, excluding SQL tasks and code template tasks in shared containers.
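The collection rule above (a snapshot for allocated resources, a one-minute peak for consumed resources) can be sketched as follows. This is only an illustration: the sub-minute sample tuples and the function name `minute_statistics` are assumptions, not part of Dataphin or its Prometheus setup.

```python
from collections import defaultdict
from datetime import datetime


def minute_statistics(samples):
    """Aggregate hypothetical raw samples into one record per minute.

    samples: iterable of (timestamp, allocated, consumed) tuples.
    Returns {minute_start: (allocated_snapshot, consumed_peak)}.
    """
    buckets = defaultdict(list)
    for ts, allocated, consumed in samples:
        # Group samples by the minute they fall into.
        buckets[ts.replace(second=0, microsecond=0)].append((ts, allocated, consumed))

    stats = {}
    for minute, rows in buckets.items():
        rows.sort(key=lambda row: row[0])
        allocated_snapshot = rows[-1][1]             # latest allocated value in the minute
        consumed_peak = max(row[2] for row in rows)  # peak consumption in the minute
        stats[minute] = (allocated_snapshot, consumed_peak)
    return stats


samples = [
    (datetime(2025, 1, 23, 10, 0, 5), 8.0, 2.5),
    (datetime(2025, 1, 23, 10, 0, 35), 10.0, 6.0),
    (datetime(2025, 1, 23, 10, 0, 55), 9.0, 4.0),
    (datetime(2025, 1, 23, 10, 1, 10), 9.0, 3.0),
]
stats = minute_statistics(samples)
```

Keeping the peak rather than the average for consumption means short spikes remain visible, which matters when checking for memory overflow risk.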

Schedule resource dashboard entry

  1. On the Dataphin home page, choose Development > Task Operation And Maintenance from the top menu bar.

  2. In the Operation Center, select Schedule Resource Dashboard from the left-side navigation pane to access the Schedule Resource Dashboard page.

Global resource allocation

Global resource allocation illustrates the distribution of CPU and memory relative to total resources at the current statistical time point and provides a historical trend chart of resource allocation.

Global resource allocation: metric explanation

The metrics for global resource allocation are detailed below.

| Metric | Description |
| --- | --- |
| Total CPU | The total CPU capacity available in the scheduling cluster, excluding system consumption. |
| Allocated CPU | The sum of CPU resources allocated to tasks at the current statistical time point. |
| CPU Allocation Rate | The ratio of allocated CPU to total CPU at the current statistical time point, expressed as a percentage to two decimal places. |
| Total Memory | The total memory capacity available in the scheduling cluster, excluding system consumption. |
| Allocated Memory | The sum of memory resources allocated to tasks at the current statistical time point. |
| Memory Allocation Rate | The ratio of allocated memory to total memory at the current statistical time point, expressed as a percentage to two decimal places. |
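As a worked illustration of the allocation rate definitions above, the following sketch computes the ratio and formats it as a percentage with two decimal places. The function name `allocation_rate` and the sample figures are hypothetical.

```python
def allocation_rate(allocated, total):
    """Return allocated / total as a percentage string with two decimal places."""
    if total <= 0:
        raise ValueError("total resources must be positive")
    return f"{allocated / total * 100:.2f}%"


cpu_rate = allocation_rate(12, 64)      # 12 cores allocated out of 64
memory_rate = allocation_rate(1, 3)     # 1 unit of memory allocated out of 3
```

The same function applies to both CPU and memory because both rates are defined as allocated resources divided by total resources.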

The resource allocation trend chart lets you select a time frame (the last 3, 6, 12, or 24 hours, or the last 3 or 7 days) to observe allocation trends. The horizontal axis is segmented according to the chosen time frame and shows the corresponding time points. The vertical axis represents the resource allocation rate, marked at 0%, 20%, 40%, 60%, 80%, and 100%. Hovering over a time point shows the CPU allocation value, CPU allocation rate, memory allocation value, and memory allocation rate at that moment.

Global resource allocation: optimization suggestions

Pay particular attention to the CPU Allocation Rate and Memory Allocation Rate. The following optimization suggestions are provided:

  • A consistently low resource allocation rate indicates underutilization, leading to potential waste. Consider downsizing the total resources to save on costs.

  • A consistently high allocation rate may result in task delays due to resource contention or failures from memory shortages. Increasing the total resources may be necessary.
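The two suggestions above could be applied to a series of allocation rates roughly as follows. The source does not define what "consistently" low or high means, so the thresholds (`low`, `high`) and the required share of samples (`share`) are assumptions chosen for illustration only.

```python
def suggest_capacity_change(rates, low=30.0, high=80.0, share=0.9):
    """Classify a series of allocation rates (in percent).

    low, high, and share are hypothetical tuning values, not Dataphin defaults.
    """
    if not rates:
        return "no data"
    low_share = sum(r < low for r in rates) / len(rates)
    high_share = sum(r > high for r in rates) / len(rates)
    if low_share >= share:
        return "consider downsizing total resources"
    if high_share >= share:
        return "consider increasing total resources"
    return "no change suggested"


hint_low = suggest_capacity_change([10.0, 15.0, 20.0, 12.0])
hint_high = suggest_capacity_change([85.0, 90.0, 95.0, 88.0])
hint_mixed = suggest_capacity_change([10.0, 90.0, 50.0])
```

Requiring most samples to sit beyond a threshold, rather than a single reading, avoids reacting to short peaks in the trend chart.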

Global resource consumption

Global resource consumption offers a comparative view of the actual resources consumed by global tasks against the pre-allocated resources, highlighting fluctuation trends.

CPU allocation and consumption trend

The metrics for global resource consumption are detailed below.

| Parameter | Description |
| --- | --- |
| CPU Allocation Value | The total CPU resources allocated to tasks at each statistical time point. |
| CPU Consumption Value | The total CPU resources consumed by tasks at each statistical time point. |
| CPU Consumption Rate | The ratio of CPU consumed to CPU allocated at each statistical time point. |
| Number of Running Instances | The total number of task instances in a running state at each statistical time point, including recurring, data backfill, and one-time instances. This count also distinguishes between instances running in shared versus dedicated containers, providing a reference for managing concurrency. |
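To make the relationship between these values concrete, here is a minimal sketch of one trend-chart data point. The `TrendPoint` class and its field names are hypothetical. They only mirror the definitions above: consumption rate is consumption divided by allocation, and the running instance count is the sum of shared and dedicated container instances.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrendPoint:
    cpu_allocated: float        # CPU Allocation Value
    cpu_consumed: float         # CPU Consumption Value
    shared_instances: int       # running instances in shared containers
    dedicated_instances: int    # running instances in dedicated containers

    @property
    def cpu_consumption_rate(self) -> Optional[float]:
        """Consumption as a percentage of allocation, or None if nothing is allocated."""
        if self.cpu_allocated == 0:
            return None
        return self.cpu_consumed / self.cpu_allocated * 100

    @property
    def running_instances(self) -> int:
        return self.shared_instances + self.dedicated_instances


point = TrendPoint(cpu_allocated=40.0, cpu_consumed=10.0,
                   shared_instances=3, dedicated_instances=5)
idle = TrendPoint(cpu_allocated=0.0, cpu_consumed=0.0,
                  shared_instances=0, dedicated_instances=0)
```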

You can select a time frame (the last 3, 6, 12, or 24 hours, or the last 3 or 7 days) to view the CPU allocation and consumption trends. The horizontal axis of the trend chart is divided into 12 segments based on the selected time frame and shows the corresponding time points. The left vertical axis indicates the resource amount, segmented into five levels based on the total available resources of the cluster. The right vertical axis displays the number of running instances.

Memory allocation and consumption trend

The metrics for global resource consumption are detailed below.

| Parameter | Description |
| --- | --- |
| Memory Allocation Value | The total memory resources allocated to tasks at each statistical time point. |
| Memory Consumption Value | The total memory resources consumed by tasks at each statistical time point. |
| Memory Consumption Rate | The ratio of memory consumed to memory allocated at each statistical time point. |
| Number of Running Instances | The total number of task instances in a running state at each statistical time point, including recurring, data backfill, and one-time instances. This count also distinguishes between instances running in shared versus dedicated containers, providing a reference for managing concurrency. |

You can select a time frame (the last 3, 6, 12, or 24 hours, or the last 3 or 7 days) to view the memory allocation and consumption trends. The horizontal axis of the trend chart is divided into 12 segments based on the selected time frame and shows the corresponding time points. The left vertical axis indicates the resource amount, segmented into five levels based on the total available resources of the cluster. The right vertical axis displays the number of running instances.
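The axis layout described for the trend charts can be approximated as follows. This is a sketch under assumptions: it treats the 12 horizontal segments as equal-width intervals ending at the current time, and the five vertical levels as equal steps up to the cluster total. The actual tick placement used by the dashboard is not documented here.

```python
from datetime import datetime, timedelta


def time_axis_ticks(end, window, segments=12):
    """Return the segment boundaries for a time window ending at `end`."""
    start = end - window
    step = window / segments
    return [start + i * step for i in range(segments + 1)]


def resource_axis_levels(total, levels=5):
    """Return evenly spaced axis marks from 0 up to the cluster total."""
    return [total * i / levels for i in range(levels + 1)]


end = datetime(2025, 1, 23, 12, 0)
ticks = time_axis_ticks(end, timedelta(hours=3))
marks = resource_axis_levels(100)
```

With a 3-hour window, each of the 12 segments spans 15 minutes; with a 7-day window, each spans 14 hours.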

When the consumption and allocation values differ significantly, review the optimization task list for tasks with low consumption rates and adjust their resource allocation to improve overall utilization. In more detail:

  • If there is a notable difference between the overall resource consumption and allocation values over a period, consider reviewing the suggested optimization tasks and potentially reducing the allocation for some tasks to improve resource utilization.

  • If the overall consumption and allocation values are closely aligned over a period, the resource allocation is generally appropriate. However, consider increasing the allocation for certain core tasks to balance efficiency and stability requirements based on their historical performance.
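One way to quantify the gap between overall consumption and allocation over a period is to compare their totals, as in the sketch below. The function names and the `gap_threshold` of 50% are assumptions for illustration. The source does not state what counts as a "notable difference".

```python
def overall_utilization(points):
    """points: list of (allocated, consumed) pairs over a period.

    Returns total consumption as a percentage of total allocation,
    or None if nothing was allocated.
    """
    total_allocated = sum(allocated for allocated, _ in points)
    total_consumed = sum(consumed for _, consumed in points)
    if total_allocated == 0:
        return None
    return total_consumed / total_allocated * 100


def allocation_review_hint(points, gap_threshold=50.0):
    """gap_threshold is a hypothetical cut-off, not a Dataphin setting."""
    utilization = overall_utilization(points)
    if utilization is None:
        return "no data"
    if utilization < gap_threshold:
        return "review suggested optimization tasks"
    return "allocation broadly appropriate"


wide_gap = allocation_review_hint([(100.0, 20.0), (100.0, 30.0)])
close_fit = allocation_review_hint([(100.0, 80.0), (100.0, 90.0)])
```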

Suggested optimization tasks

The Suggested Optimization Tasks module highlights tasks whose resource consumption rates exceed or fall below certain thresholds. You can then adjust resource allocation so that tasks have enough resources to run stably, avoiding both under-allocation that could disrupt normal scheduling and over-allocation that wastes resources.

Optimization tasks: metric explanation

You can filter Suggested Optimization Tasks by project, task type, and consumption rate threshold. The metrics for the list items are as follows:

| Metric | Description |
| --- | --- |
| Consumption Rate | The peak resource consumption of the most recent task instance run as of the statistical time. Low consumption rate list: sorted by Consumption Rate From Low To High by default. You can also sort by Recent Runtime From Long To Short. High consumption rate list: sorted by Consumption Rate From High To Low by default. |
| Recent 7 Consumption Rates | The peak resource consumption as a percentage of allocated resources for each run of the task's recurring instances and data backfill instances, based on the last 7 runs. |
| Average Runtime | The average duration of the task's recurring instances and data backfill instances over the last 7 runs. |
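The two history-based metrics above can be derived from a task's run records as sketched below. The record layout (`allocated`, `peak_consumed`, `runtime_seconds`) and the function name are hypothetical. Only the definitions (peak consumption as a percentage of allocation, averaged runtime, last 7 runs) come from the text.

```python
def recent_run_summary(runs, window=7):
    """Summarize the most recent runs of a task.

    runs: list of dicts ordered oldest to newest, each with the hypothetical keys
          "allocated", "peak_consumed", and "runtime_seconds".
    Returns (consumption rates in percent, average runtime in seconds).
    """
    recent = runs[-window:]
    rates = [round(r["peak_consumed"] / r["allocated"] * 100, 2) for r in recent]
    if not recent:
        return rates, None
    average_runtime = sum(r["runtime_seconds"] for r in recent) / len(recent)
    return rates, average_runtime


# Eight runs are recorded; only the last seven are summarized.
runs = [
    {"allocated": 8.0, "peak_consumed": float(i), "runtime_seconds": 60 * i}
    for i in range(1, 9)
]
rates, average_runtime = recent_run_summary(runs)
```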

Optimization tasks: focus suggestions

  • If a task consistently shows a resource allocation value significantly higher than its consumption value and has a long runtime, it should be prioritized for review to prevent it from impacting the execution of ad hoc queries and downstream business processes.

  • If a task's resource allocation value is significantly higher than its consumption value over a period, but the runtime is short, consider a moderate adjustment of resources to reallocate the surplus to other tasks that may be resource-constrained.

  • If a task's resource consumption value is close to its allocation value over time, monitor it to prevent delays or failures due to insufficient resources.

  • If a task's memory consumption rate reaches 100%, it is critical to address this by increasing memory allocation to prevent failures due to memory overflow, which could affect data output.
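The four suggestions above can be read as a small set of triage rules. The sketch below encodes them under assumptions: the thresholds for "significantly higher" (`low`), "close to" (`near`), and "long runtime" (`long_runtime_seconds`) are hypothetical values, since the source gives no numbers.

```python
def focus_suggestion(resource, rates, avg_runtime_seconds,
                     low=30.0, near=90.0, long_runtime_seconds=3600):
    """Map recent consumption rates (percent) to one of the focus suggestions.

    resource: "cpu" or "memory".
    low, near, long_runtime_seconds: hypothetical thresholds for illustration.
    """
    if not rates:
        return "no data"
    # Memory at 100% risks overflow, so it takes priority over other rules.
    if resource == "memory" and max(rates) >= 100.0:
        return "increase memory allocation"
    # Allocation far above consumption: urgency depends on runtime.
    if all(r <= low for r in rates):
        if avg_runtime_seconds >= long_runtime_seconds:
            return "prioritize for review"
        return "moderately reduce allocation"
    # Consumption close to allocation: watch for delays or failures.
    if all(r >= near for r in rates):
        return "monitor closely"
    return "no action suggested"


overflow = focus_suggestion("memory", [95.0, 100.0, 97.0], 600)
long_idle = focus_suggestion("cpu", [10.0, 12.0, 8.0], 7200)
short_idle = focus_suggestion("cpu", [10.0, 12.0, 8.0], 120)
tight = focus_suggestion("cpu", [92.0, 95.0, 99.0], 120)
```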

Task resource details

Follow the steps below to view detailed task resource information. Task resource details provide insights into the most recent run and the trend of the last 7 runs, aiding in the optimization of resource allocation.

| Parameter | Description |
| --- | --- |
| Task | Lets you switch between tasks to view their resource details. |
| Basic Information | Includes essential information about the target task, such as Task Name, Task Type, and Schedule Date. |
| Resource Details | Sets the scope of the resource statistics. The default is Details Of The Most Recent Run. Use the Resource Details dropdown to switch to Trend Of The Last 7 Runs. |
| Resource Allocation And Consumption Trend | Displays the trend chart for resource allocation and consumption, providing a visual representation of resource dynamics during task execution. |