This topic describes how to use a SET command to configure the time zone for a MaxCompute project.
The following types of jobs support the time zone configuration feature:
- Spark on MaxCompute
  - If tasks are submitted to the MaxCompute computing cluster, the time zone of the project is automatically obtained.
  - If tasks are submitted from spark-shell, spark-sql, or pyspark in yarn-client mode, you must add the following parameter to the spark-defaults.conf file of the driver: spark.driver.extraJavaOptions -Duser.timezone=America/Los_Angeles. The value of user.timezone is the time zone that you want to use.
- Machine Learning Platform for AI (PAI)
By default, the time zone of a MaxCompute project is UTC+8. The DATETIME, TIMESTAMP, and DATE fields and the related built-in time functions are all calculated based on UTC+8. You can use one of the following methods to configure the time zone:
- Session level: Submit the set odps.sql.timezone=<timezoneid>; statement along with a computing statement for execution.
  --Set the time zone to Asia/Tokyo.
  SET odps.sql.timezone=Asia/Tokyo;
  --Query the current time in the configured time zone.
  SELECT getdate();
  output:
  +---------------------+
  | _c0                 |
  +---------------------+
  | 2018-10-30 23:49:50 |
  +---------------------+
- Project level: Execute the setProject odps.sql.timezone=<timezoneid>; statement. Only the project owner has the permission to execute this statement.
  Notice: After the time zone of a project is configured, it is used for all time computing, and the data of existing jobs is affected. Therefore, configure the time zone only when it is necessary. We recommend that you configure time zones only for new projects.
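As a rough illustration of the session-level method, the following Python sketch builds a script in which the SET statement precedes the computing statement. The helper name with_session_timezone is hypothetical and is not part of any MaxCompute SDK. The check that rejects IDs without a region prefix is only a simple heuristic for the time zone format rule described under Limits and usage notes.

```python
def with_session_timezone(statement: str, timezone_id: str) -> str:
    """Return a script that sets the session time zone, then runs the statement.

    Hypothetical helper for illustration only; it just builds text and
    does not submit anything to MaxCompute.
    """
    # Heuristic: region-based IDs such as Asia/Tokyo contain a slash,
    # whereas offset formats such as GMT+9 do not.
    if "/" not in timezone_id:
        raise ValueError(f"expected a region-based ID such as Asia/Shanghai, got {timezone_id!r}")

    # Normalize the statement so that it ends with exactly one semicolon.
    body = statement.strip().rstrip(";").rstrip()
    if not body:
        raise ValueError("statement must not be empty")

    return f"SET odps.sql.timezone={timezone_id};\n{body};"


script = with_session_timezone("SELECT getdate();", "Asia/Tokyo")
print(script)
```

The resulting text can then be submitted as a single script, so that the setting and the query run in the same session.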
Limits and usage notes
- SQL built-in date functions, user-defined functions (UDFs), user-defined types (UDTs), user-defined joins (UDJs), and the SELECT TRANSFORM statement can obtain the timezone attribute of a project and use it as the time zone for their calculations.
- Specify the time zone as a region-based ID such as Asia/Shanghai, which accounts for daylight saving time. Do not use an offset format such as GMT+9.
- If the time zone of the SDK differs from that of the project, you must configure the GMT time zone when you convert data of the DATETIME type to the STRING type.
- After the time zone is configured, the output time of related SQL statements that you run in DataWorks differs from the real time. For dates from 1900 to 1928, the difference is 352 seconds. For dates earlier than 1900, the difference is 9 seconds.
- MaxCompute, SDK for Java, and the related client are updated to ensure that DATETIME data stored in MaxCompute is correct across multiple time zones. The versions of the required SDK for Java and related client have the -oversea suffix. The update may affect the display of DATETIME data that is earlier than January 1, 1928 in MaxCompute.
- If the local time zone is not UTC+8 when you update MaxCompute, we recommend that you also update SDK for Java and the related client. This ensures that, for DATETIME data later than January 1, 1900, the SQL-based computing results and the data transferred by using Tunnel commands are accurate and consistent. After the update, for DATETIME data earlier than January 1, 1900, the SQL-based computing results and the data transferred by using Tunnel commands may still differ by 343 seconds. For DATETIME data that is earlier than January 1, 1928 and was uploaded before SDK for Java and the related client were updated, the time in the new versions is 352 seconds earlier.
- If you do not update SDK for Java and the client to versions with the -oversea suffix, the SQL-based computing results differ from data that is transferred by using Tunnel commands. For data that is earlier than January 1, 1900, the time difference is 9 seconds. For data that is between January 1, 1900 and January 1, 1928, the time difference is 352 seconds.
Note: Modifying the time zone configuration in SDK for Java or on the related client does not affect the time zone configuration in DataWorks, so the two time zones can differ. You must evaluate how the scheduled jobs in DataWorks are affected. The time zone of a DataWorks server in the Japan (Tokyo) region is GMT+9, and that in the Singapore (Singapore) region is GMT+8.
- If you use a third-party client that is connected to MaxCompute by using Java Database Connectivity (JDBC), you must configure the time zone on the client to ensure time consistency between the client and the server.
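
To show why the client, the server, and DataWorks need to agree on a time zone, the following Python sketch renders one instant as wall-clock text under the two DataWorks server offsets mentioned in the note above, and under GMT. The instant and the render helper are made up for illustration. Fixed offsets are used here only to keep the sketch free of external time zone data; this does not change the rule that a MaxCompute time zone must be configured as a region-based ID.

```python
from datetime import datetime, timedelta, timezone

# Server offsets taken from the note above; everything else is illustrative.
TOKYO_SERVER = timezone(timedelta(hours=9), "GMT+9")
SINGAPORE_SERVER = timezone(timedelta(hours=8), "GMT+8")


def render(instant: datetime, zone: timezone) -> str:
    """Format a time-zone-aware instant as wall-clock text in the given zone."""
    return instant.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S")


# An arbitrary example instant, expressed in GMT.
instant = datetime(2018, 10, 30, 14, 49, 50, tzinfo=timezone.utc)

gmt_text = render(instant, timezone.utc)
tokyo_text = render(instant, TOKYO_SERVER)
singapore_text = render(instant, SINGAPORE_SERVER)

print("GMT:      ", gmt_text)        # 2018-10-30 14:49:50
print("GMT+9:    ", tokyo_text)      # 2018-10-30 23:49:50
print("GMT+8:    ", singapore_text)  # 2018-10-30 22:49:50
```

The same instant yields three different strings, which is why a scheduled job or a JDBC client that assumes a different time zone from the project can read or write times that are off by whole hours.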