The R&D platform settings help you control edit locks and the concurrency of Analyze commands during development. This topic describes how to configure edit locks, object submission, query acceleration, and storage volume update settings.
Limits
The storage volume update setting feature is available when the compute engine is E-MapReduce 3.x, E-MapReduce 5.x, CDH 5.x, CDH 6.x, FusionInsight 8.x, Cloudera Data Platform 7.x, AsiaInfo DP 5.3, ArgoDB, TDH 6.x, StarRocks, SelectDB, or Doris.
SelectDB and Doris compute engines do not support table management settings or the default compute engine module for standard modeling.
Permission description
Only super administrators, system administrators, and custom user roles that have the Manage R&D Platform Settings permission can configure the R&D platform.
R&D platform access
In the top navigation bar of the Dataphin homepage, choose Management Hub > System Settings.
In the navigation pane on the left, choose Platform Settings > R&D Platform.
Edit lock
In the Edit Lock section, click the edit icon, enable the exclusive edit lock switch, and configure the lock.
Exclusive Edit Lock: When disabled, users can overwrite each other's lock status. When enabled, after a user locks an object, other users cannot edit it until the lock is manually released or expires; only then can another user lock and edit the object.
Lock Duration: If the lock holder performs no editing actions within the lock duration, the exclusive lock expires and can be acquired by other users. The default is 30 minutes; the valid range is 5 to 120 minutes.
Auto-release When Closing Object: Automatically releases the lock when the object's editing tab is closed.
Auto-release When Submission Succeeds: Automatically releases the lock when the object is submitted successfully. The lock is not released if the submission fails.
Click OK to complete the edit lock settings.
To restore the initial system configuration, you can click Restore Defaults.
Query acceleration
When query acceleration is enabled, MCQA query acceleration speeds up all MAX_COMPUTE_SQL ad hoc queries and all SQL unit queries on the analysis platform. When the switch is turned off, the query acceleration switch is hidden for all ad hoc queries and analysis platform SQL units, and the current tenant cannot use MCQA query acceleration.
Query acceleration is supported only for the MaxCompute compute engine.
Storage volume update settings
For data tables written directly to HDFS by integration tasks, real-time development tasks, and other jobs, Hive does not update storage volume information by default, including table storage volume and partition storage volume, so storage volume information for the target tables cannot be displayed in the asset directory. Dataphin can automatically execute the Analyze command after a data table is updated to obtain the latest storage volume information. You can configure this under Management Hub > System Settings > Platform Settings > R&D Platform.
In the Storage Volume Update Settings section, click the edit icon, enable the automatic storage volume update switch, and configure the concurrent connections.
Automatic Storage Volume Update: Disabled by default. When enabled, Dataphin automatically executes the Analyze command for Hive target tables after tasks run successfully to update storage volume information. If you have many integration and real-time development tasks, and your Hive Server performs well, you can adjust the number of concurrent connections to shorten the overall execution time of update commands, ensuring that the latest storage volume information can be queried in the asset directory the next day. Note that high concurrency may consume more computing resources and affect the normal operation of other tasks. Please configure the number of concurrent connections reasonably based on your business scenario.
Maximum Connections: Allows you to set the maximum number of concurrent connections for executing Analyze commands. The default is 5, and you can set a positive integer between 1 and 200.
Important: When automatic storage volume update is enabled and an Analyze command has run for more than 24 hours, the system automatically terminates executing or waiting commands to save computing resources.
Click OK to complete the storage volume update settings.
Note: When automatic storage volume update is switched from disabled to enabled and confirmed, the configured number of concurrent connections takes effect immediately. High concurrency may consume more computing resources and affect the normal operation of other tasks, so configure the number of concurrent connections based on your business scenario.
When automatic storage volume update changes from enabled to disabled, executing or waiting Analyze commands are not affected. The storage volume of target tables for subsequently successful integration, real-time development, and other tasks will not be automatically updated. You can manually update the information by executing the Analyze command in Hive.
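If you need to refresh storage volume information manually, the Analyze command is standard HiveQL; a minimal sketch with illustrative table and partition names:

```sql
-- Refresh table-level storage statistics for a Hive table:
ANALYZE TABLE ods_log COMPUTE STATISTICS;

-- Refresh statistics for a single partition only:
ANALYZE TABLE ods_log PARTITION (ds = '20240101') COMPUTE STATISTICS;
```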
Node task related settings
In the Node Task Related Settings section, click the Edit icon to configure the default scheduling time for new tasks and object submission rules.
New
Default Priority: The default priority when creating integration tasks, computing tasks, and logical table tasks. You can select Lowest, Low, or Medium. The default is Medium.
Default Scheduling Time: Choose Random Within Interval or Fixed Time.
Random Within Interval: The default time interval is 00:00-03:00 and the default random step is 5 minutes. The end time of the interval must be later than the start time; valid times are 00:00-23:59 in hh:mm format. The random step must be an integer from 1 to 30 (minutes).
Fixed Time: The default fixed time is 00:00. Valid times are 00:00-23:59 in hh:mm format.
Python Default Version: The default Python version when creating Python computing tasks, creating Python offline computing templates, and installing Python third-party packages. You can select Python 2.7, Python 3.7, or Python 3.11. The default is Python 3.7.
Note: The Default Scheduling Time is set to Random Within Interval by default. You can change it to Fixed Time as needed.
When creating offline tasks (integration tasks, computing tasks, logical tables), the scheduling time will automatically use the default scheduling time configured here.
If the default scheduling time is set to Random Within Interval, a random time will be generated according to the configured rules.
If the default scheduling time is set to Fixed Time, the configured time will be used.
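For example, with the default Random Within Interval settings (interval 00:00-03:00, 5-minute step), one newly created daily task might be assigned 01:35 and another 02:10; with Fixed Time set to 00:00, every new task is scheduled at exactly 00:00.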
Run
Hide Logview URL When SQL Contains Account And Password Global Variables: Account and password global variables referenced in SQL appear in plaintext in the MaxCompute Logview SQL, which can leak credentials. This option is disabled by default.
If you enable it, when a MAXCOMPUTE_SQL task or logical table task references account and password global variables, the Logview URL is hidden in development environment run and data preview logs, as well as in production environment O&M logs. The Logview URL is replaced with the message: The logview URL is invisible because the current SQL is using global variable "{dp_glb_xxx}", which is of type account and password.
Note: This configuration is supported only when the compute engine is MaxCompute.
Submit
Automatic Dependency Parsing For Offline Development Object Submission: When enabled, dependency parsing is automatically triggered each time an offline development object (such as an SQL computing task or a logical table task) is submitted, updating the upstream dependency list to avoid missing upstream dependencies.
Field Type Validation For Logical Table Submission: When enabled, the system checks at submission time whether the return type of each field's calculation logic matches the declared field type. If they do not match, the submission is blocked to prevent implicit type conversion that could cause data errors.
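To make the type check concrete, here is a hedged illustration (the field, columns, and source table are hypothetical): a field declared as BIGINT whose calculation logic returns STRING would be blocked at submission.

```sql
-- Field order_amount is declared as BIGINT in the logical table.
-- This calculation logic returns STRING, so field type validation
-- would block the submission rather than allow an implicit cast:
SELECT concat(amount_yuan, amount_cents) AS order_amount
FROM dwd_order;

-- A compatible version that explicitly returns BIGINT:
SELECT cast(amount_yuan AS BIGINT) * 100 + amount_cents AS order_amount
FROM dwd_order;
```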
Offline Delete
Allow Deletion Of Published Objects In Development Environment: When enabled, objects published to the production environment (computing tasks, integration tasks, logical tables, atomic metrics, business filters, derived metrics, etc.) can be directly deleted in the development environment.
Important: Deleted objects cannot be recovered. In the development environment, if you delete a development object without publishing the deletion task to the production environment, the corresponding production object can no longer be modified, because its development counterpart no longer exists.
Default Dependency Cycle And Dependency Policy
You can modify the Default Dependency Cycle and Default Dependency Policy.
Default Dependency Cycle: You can select Current Cycle (Current Day), Previous Cycle (Previous 1 Day), Last 24 Hours, or Previous N Cycle. For Previous N Cycle, N defaults to 2 and cannot be empty.
Default Dependency Policy: You can select First Instance, Nearest Instance, All Instances, or Last Instance.
The initial default dependency cycles and policies are shown in the following table.
| Current Node Scheduling Cycle | Upstream Node Scheduling Cycle | Is Upstream Node Self-dependent | Default Dependency Cycle | Default Dependency Policy |
| --- | --- | --- | --- | --- |
| Daily/Weekly/Monthly | Daily | Yes/No | Current Cycle (Current Day) | Last Instance |
| Daily/Weekly/Monthly | Hourly/Minutely | No | Current Cycle (Current Day) | All Instances |
| Daily/Weekly/Monthly | Hourly/Minutely | Yes | Current Cycle (Current Day) | Last Instance |
| Monthly/Weekly/Daily/Hourly/Minutely | Monthly/Weekly | Yes | Current Cycle (Current Day) | Last Instance |
| Monthly/Weekly/Daily/Hourly/Minutely | Monthly/Weekly | No | Current Cycle (Current Day) | Last Instance |
| Hourly/Minutely | Daily | Yes/No | Current Cycle (Current Day) | Last Instance |
| Hourly/Minutely | Hourly/Minutely | Yes/No | Current Cycle (Current Day) | Last Instance |
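For example, under these defaults a daily task that depends on a non-self-dependent hourly upstream uses Current Cycle (Current Day) with All Instances, so it effectively waits for all of that day's upstream instances before it runs.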
After completing the configuration, click OK.
To restore the initial system configuration, you can click Restore Defaults.
Table management settings
The StarRocks, GaussDB data warehouse service (DWS), Doris, and SelectDB compute engines do not support table management settings.
In the Table Management Settings section, click the Edit icon to configure Automatically Generate Table Deletion Pending Release Item After Using SQL to Delete Table and Generate Pending Release Item When Deleting Table in Table Management.
Automatically Generate Table Deletion Pending Release Item After Using SQL To Delete Table: Enabled by default. When enabled, after a drop table statement is executed in an ad hoc query or SQL computing task in the development environment, the system automatically generates a pending release item for the table deletion. When disabled, executing a drop table table_name statement in the development environment does not generate one.
Generate Pending Release Item When Deleting Table In Table Management: Enabled by default. When enabled, deleting a table in table management generates a corresponding pending release item. When disabled, it does not.
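For instance, with the first switch enabled, running a statement like the following in a development environment ad hoc query would generate a table deletion pending release item (the table name is illustrative):

```sql
-- With the switch on, Dataphin records this deletion as a pending
-- release item so it can be published to production deliberately:
DROP TABLE IF EXISTS dev_tmp_orders;
```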
Configure Default Storage Format/Default Storage Format For External Tables. Different compute engines support different storage formats, as shown in the following table.
Note: You cannot configure the Default Storage Format when the compute engine is AnalyticDB for PostgreSQL. You can configure the Default Storage Format For External Tables only when the compute engine is MaxCompute.
A hyphen (-) in the following table indicates that the format is not supported.

| Engine | Default (Can be specified in create table statement) | hudi | delta (Delta Lake) | paimon | iceberg | kudu | parquet | avro | rcfile | orc | textfile | sequencefile | binaryfile | csv | text | json |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MaxCompute | - | - | - | - | - | - | Supported | Supported | Supported | Supported | Supported | Supported | - | - | - | - |
| Lindorm (Compute Engine) | Supported | - | - | - | Supported | - | Supported | Supported | Supported | Supported | Supported | Supported | - | - | - | - |
| Databricks | Supported | - | Supported | - | - | - | Supported | Supported | - | Supported | - | - | Supported | Supported | Supported | Supported |
| Amazon EMR | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | - | - | - | - |
| Transwarp TDH 6.x / Transwarp TDH 9.3.x | Supported | - | - | Supported | Supported | - | Supported | Supported | Supported | Supported | Supported | Supported | - | - | - | - |
| CDH 5.x / CDH 6.x / E-MapReduce 3.x / E-MapReduce 5.x / Cloudera Data Platform 7.x / Huawei FusionInsight 8.x / AsiaInfo DP 5.3 | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | - | - | - | - |
You can configure the default lifecycle for physical and logical tables that use the MaxCompute compute engine. By default, this value is empty, which means no lifecycle is set. You can enter an integer from 1 to 36,500 or quickly select 7, 14, 30, or 360 days.
Note: You can configure the default lifecycle only when the compute engine is MaxCompute.
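This default corresponds to the standard MaxCompute table lifecycle. A minimal sketch of setting a lifecycle explicitly in MaxCompute SQL (table and column names are illustrative):

```sql
-- Create a table whose data expires 30 days after it was last modified:
CREATE TABLE IF NOT EXISTS dws_trade_sum (id BIGINT, amt DOUBLE) LIFECYCLE 30;

-- Adjust the lifecycle of an existing table:
ALTER TABLE dws_trade_sum SET LIFECYCLE 14;
```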
After you complete the configuration, click OK.
To restore the initial system configuration, you can click Restore Defaults.
Default compute engine for standard modeling
Dataphin instances with Hadoop compute engines support setting the default compute engine for standard modeling, including Hive, Impala, and Spark. The compute engines have the following limitations:
If the corresponding task type is not enabled in the compute source associated with the project, the system automatically switches to the Hive compute engine. For more information, see Create a Hadoop compute source.
Hive: Cannot read source tables stored in Kudu format.
Impala: Can read source tables stored in Kudu format, but does not currently support storing logical tables as Kudu. Not recommended if you don't have source tables in Kudu format.
Note: Impala is not supported when the compute engine is Amazon EMR.
Spark: Cannot read source tables stored in Kudu format.
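To illustrate the Kudu constraint (a hedged sketch; the table and column names are hypothetical): a Kudu-backed source table is typically defined through Impala, and per the limits above only the Impala engine can read it during standard modeling.

```sql
-- Impala DDL for a Kudu-backed source table. Modeling tasks on the
-- Impala engine can read this table, while the Hive and Spark
-- engines cannot.
CREATE TABLE ods_user_events (
  event_id BIGINT,
  user_id BIGINT,
  event_time TIMESTAMP,
  PRIMARY KEY (event_id)
)
PARTITION BY HASH (event_id) PARTITIONS 4
STORED AS KUDU;
```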