
[Share]How to Customize Log Configurations in ODPS JDBC

Posted time: Sep 9, 2016 17:28
Abstract: This article discusses how to configure the logging function in ODPS JDBC 2.0 beta, and gives examples and descriptions of the configuration file of Logback.
There are two types of logs related to ODPS JDBC: one is directly output by internal JDBC code, and the other is output after an exception is thrown by JDBC and captured by the host application invoking JDBC API. Because the second type of log depends on how the host application processes the exceptions and configures the log system, this article will mainly discuss the first type of log.
In versions before 2.0-beta, ODPS JDBC logs could only be output to the command line terminal (standard output stream), because the underlying implementation used the ConsoleHandler of java.util.logging built into the JDK. Therefore, unless you start the host application from the command line (rather than by double-clicking an executable in a graphical interface, which has no console), the log information is hard to find. This makes problem diagnosis and troubleshooting inconvenient.
2.0-beta therefore introduces slf4j and logback so that the user can customize the log configuration: whether logs go to the terminal or to a file, whether log files are kept, how log files are rotated, and so on. To avoid conflicts between an slf4j implementation chosen for JDBC and the one used by the host application, the ODPS JDBC JAR package does not contain any slf4j implementation JARs, and it has no hard dependency on logback. Only when the host application contains no slf4j implementation does the user need to add the logback JAR packages to the classpath and pass a logback configuration file to JDBC to enable logging.
• Why is there conflict between the slf4j logging implementation frameworks?
slf4j is a logging facade created to unify the API and behavior of different logging frameworks. It provides only the API, not a concrete logging implementation, so it is normally used together with an implementation framework such as Logback or log4j. If ODPS JDBC used slf4j + Logback while the host application used slf4j + log4j, conflicts could occur at run time. Code obtains a logger by calling LoggerFactory.getLogger of slf4j, and this method binds to one specific slf4j implementation on its first invocation. The binding depends on a class called org.slf4j.impl.StaticLoggerBinder, and every slf4j implementation framework ships its own copy of this class. If several slf4j implementations are on the classpath, the one whose org.slf4j.impl.StaticLoggerBinder the classloader happens to load first is bound, and its implementation and configuration are used. The outcome is not deterministic, which makes this a typical JAR conflict. Therefore, if ODPS JDBC bundled a different slf4j implementation from the host application's, the host application might obtain loggers from the other implementation at run time, and its original log configuration might stop taking effect.
Roughly speaking, ODPS JDBC users fall into two categories: accessors and developers.
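To check whether a classpath is exposed to this conflict, you can look for every copy of the binder class that the classloader can see. The following is a minimal sketch that uses only the JDK; the class name Slf4jBindingCheck is made up for illustration, and the check only applies to slf4j versions that bind through org.slf4j.impl.StaticLoggerBinder as described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Lists every copy of the slf4j binder class visible to a classloader (illustrative helper). */
class Slf4jBindingCheck {

    // Resource path of the binder class named in the text.
    static final String BINDER_PATH = "org/slf4j/impl/StaticLoggerBinder.class";

    static List<URL> findBindings(ClassLoader loader) {
        try {
            // getResources returns one URL per JAR or directory that contains the class file.
            return new ArrayList<>(Collections.list(loader.getResources(BINDER_PATH)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> bindings = findBindings(ClassLoader.getSystemClassLoader());
        if (bindings.size() > 1) {
            System.out.println("More than one slf4j implementation is on the classpath:");
        }
        for (URL url : bindings) {
            System.out.println("  " + url);
        }
    }
}
```

If more than one URL is printed, the host application and a library have each brought their own slf4j implementation, and one of them should be removed from the classpath.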
• Accessors
Such users are generally users of data query or analysis tools. They need to add the ODPS JDBC Driver to the tool and then configure parameters such as the JDBC URL in order to run ODPS SQL queries with the tool.
• Developers
Such users are generally developers of data products. They write the query logic against the SDK or the JDBC APIs in order to develop software products for their target users.
Now let's talk about how to customize log configurations for these two types of users, and further explain the specific configurations.
1. Accessors
I'd like to explain with SQL Workbench/J as an example.
Download SQL Workbench/J to the local disk to get an executable file, double-click it, and click Manage Drivers in the dialog that pops up to load our JDBC JAR package.
Please note that SQL Workbench/J does not introduce any logging implementation of slf4j, so we can enable the default logback implementation of ODPS JDBC by adding the logback jar package.

Then create a new Connection Profile. Please note that a parameter named log_conf_file is added to the JDBC URL in the URL section.

This parameter specifies a local logback configuration file for the log configuration of ODPS JDBC. Please note that it points to the configuration file, not to the location where logs are written. Hence, please ensure that the file exists at the given path, that the application can read it, and that it follows the logback configuration file format. If this parameter is not passed, logs are only output to the console, but SQL Workbench/J has no console when it is opened by double-clicking. If you want to output the logs to a log file in a specific directory, you only need to add the corresponding FileAppender to the configuration file. Refer to the configuration instructions in Part 3.
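The checks above (the file exists and is readable) can be done before the URL is handed to a tool or to DriverManager. The following is a minimal sketch: the class and method names are made up for illustration, the log_conf_file parameter name is taken from this article, and joining parameters with '?' and '&' is an assumption about the URL syntax that you should verify against the driver documentation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Verifies a logback configuration file and appends it to a JDBC URL (illustrative helper). */
class LogConfUrl {

    static String withLogConf(String jdbcUrl, String confPath) {
        Path path = Paths.get(confPath).toAbsolutePath();
        // Fail early instead of silently falling back to console-only logging.
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IllegalArgumentException("logback configuration not readable: " + path);
        }
        // Assumption: parameters are appended like an ordinary query string.
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "log_conf_file=" + path;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java LogConfUrl <path-to-logback.xml>");
            return;
        }
        // "<endpoint>" is a placeholder, as in the rest of this article.
        System.out.println(withLogConf("jdbc:odps:<endpoint>", args[0]));
    }
}
```

The sketch does not validate the XML itself or escape special characters in the path; it only catches the most common mistake, a path that does not point to a readable file.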
Please download the JAR package and configuration file mentioned in this section from: v2.0-beta
2. Developers
Developers should be familiar with Java logging frameworks, and the application under development may already use slf4j with an implementation. If it does, there is no need to use the built-in configuration function of JDBC; instead, add a log configuration for the com.aliyun.odps.jdbc package to the existing log configuration of the application.
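If you have not applied any slf4j implementation, add the two JAR packages logback-core-1.1.7.jar and logback-classic-1.1.7.jar to your classpath along with the ODPS JDBC JAR package. If Maven is used, you can add the following dependencies:
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-core</artifactId>
  <version>1.1.7</version>
</dependency>
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <version>1.1.7</version>
</dependency>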
There are two ways to specify the configuration file. One is to put logback.xml directly on the classpath.
The other is to set the log_conf_file property in the following config, or to add the log_conf_file parameter to the JDBC URL string (see the sample picture of SQL Workbench/J). If the item is configured in both the URL and the config, the config value takes precedence.
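import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

Properties config = new Properties();
config.put("access_id", "...");
config.put("access_key", "...");
config.put("project_name", "...");
config.put("charset", "...");
config.put("log_conf_file", "...");  // path to a local logback configuration file
Connection conn = DriverManager.getConnection("jdbc:odps:<endpoint>", config);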

3. Configuration Instructions
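Next, let's learn about logback.xml:
<configuration>
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <!--*Linux or Windows:*-->
    <!--tries $HOME first; if it doesn't exist, it tries $USERPROFILE-->
    <file>${HOME:-${USERPROFILE}}/odps.log</file>
    <encoder>
      <pattern>%date %level [%thread] %logger{10}  %X{connectionId} [%file:%line] %msg%n</pattern>
    </encoder>
  </appender>

  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <!-- pattern assumed to match the FILE appender -->
      <pattern>%date %level [%thread] %logger{10}  %X{connectionId} [%file:%line] %msg%n</pattern>
    </encoder>
  </appender>

  <logger name="com.aliyun.odps.jdbc" level="debug"/>

  <root level="error">
    <appender-ref ref="FILE" />
    <appender-ref ref="STDOUT" />
  </root>
</configuration>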

This is a logback configuration that covers the common requirements. If you have no special requirements, you can copy the configuration file and use it directly. We will briefly introduce this configuration here. For a better understanding of logback configuration, please refer to the configuration section of the official documentation.
<configuration> is the root tag of the configuration file, and all sub-tags must be contained in it. The sub-tags include <appender>, <encoder>, <logger> and <root>.
Log instances in Logback form an inheritance hierarchy. <root> is the parent of all log instances; every <logger> is a descendant of it and inherits the configuration of its parent by default. The <logger> instances are in turn arranged in a hierarchy according to the prefix of their name attribute. If two log instances, com.aliyun.odps and com.aliyun.odps.jdbc, were defined in this configuration, the latter would be a child of the former. Because the log output code only exists in the com.aliyun.odps.jdbc package, we configure a single log instance, which overrides the error level of <root> with the debug level.
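The inheritance rule can be sketched in a few lines of plain Java. This is not Logback's code, only an illustration of the lookup by name prefix; the class name EffectiveLevel and the logger names passed to it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of how a logger without its own level takes the level of its nearest configured ancestor. */
class EffectiveLevel {

    static String resolve(Map<String, String> configured, String rootLevel, String loggerName) {
        String name = loggerName;
        while (!name.isEmpty()) {
            String level = configured.get(name);
            if (level != null) {
                return level; // the logger itself or its nearest configured ancestor
            }
            int dot = name.lastIndexOf('.');
            name = dot < 0 ? "" : name.substring(0, dot); // step up one level in the name hierarchy
        }
        return rootLevel; // nothing configured on the way up, so <root> applies
    }

    public static void main(String[] args) {
        Map<String, String> configured = new HashMap<>();
        configured.put("com.aliyun.odps.jdbc", "debug"); // the <logger> of the sample configuration

        // A logger inside the JDBC package inherits debug.
        System.out.println(resolve(configured, "error", "com.aliyun.odps.jdbc.SomeClass"));
        // A logger of the host application falls back to the root level.
        System.out.println(resolve(configured, "error", "org.example.HostApp"));
    }
}
```

With the sample configuration, only loggers under com.aliyun.odps.jdbc log at debug level; everything else stays at the error level of <root>.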
The relationship between the output level and severity level of the log is as follows:
TRACE < DEBUG < INFO < WARN < ERROR < OFF
From left to right, the severity level increases while the number of logs output decreases. A given log level outputs all logs of equal or higher severity. For instance, if you select the DEBUG level, all logs except TRACE will be output. OFF means no log will be output.
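The threshold rule can be written down as a small sketch. The enum below only mirrors the level names for illustration and is not Logback's own Level class.

```java
/** Sketch of the threshold rule: an event is output when its level is at or above the configured level. */
class LevelFilter {

    // Declared in increasing severity so that ordinal() can be compared directly.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, OFF }

    static boolean isOutput(Level configured, Level event) {
        // OFF is only meaningful as a configured threshold; it disables all output.
        if (configured == Level.OFF || event == Level.OFF) {
            return false;
        }
        return event.ordinal() >= configured.ordinal();
    }

    public static void main(String[] args) {
        for (Level event : new Level[] {Level.TRACE, Level.DEBUG, Level.INFO, Level.WARN, Level.ERROR}) {
            System.out.println("configured=DEBUG event=" + event + " output=" + isOutput(Level.DEBUG, event));
        }
    }
}
```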
<encoder> is used to encode log events into byte streams according to <pattern>. <pattern> indicates that the default PatternLayout will be used, where <pattern>%date %level [%thread] %logger{10} %X{connectionId} [%file:%line] %msg%n</pattern> references many built-in variables of PatternLayout. They output the date, level, thread name, log instance name, connectionId, source file name, line number, message and line separator respectively. For more built-in variables, please refer to the official layouts document.
The only variable that needs special attention is %X{connectionId}. It is a unique ID that is set via MDC for each JDBC Connection, and it identifies the JDBC logs that come from the same Connection.
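MDC is essentially a per-thread map of key/value pairs that the layout reads when it formats a log line. The following sketch uses a plain ThreadLocal as a stand-in for MDC so that it runs without any logging library; the class name and the ID value are made up for illustration, and it does not show how ODPS JDBC itself generates the ID.

```java
import java.util.HashMap;
import java.util.Map;

/** ThreadLocal stand-in for MDC, showing how a per-connection ID can reach every log line. */
class ConnectionIdContext {

    // One map per thread, which is how MDC scopes its values as well.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static void clear() {
        CONTEXT.get().clear();
    }

    /** Formats a line like "%level %X{connectionId} %msg"; a missing key renders as an empty string. */
    static String format(String level, String msg) {
        String id = CONTEXT.get().getOrDefault("connectionId", "");
        return level + " " + id + " " + msg;
    }

    public static void main(String[] args) {
        put("connectionId", "conn-1"); // hypothetical ID, set once when the connection is created
        System.out.println(format("DEBUG", "executing query"));
        clear();
    }
}
```

When several connections write to the same log file, filtering on this ID separates the lines that belong to one connection.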
<appender> will be used to output the specified logs. The current instance is configured with two appenders: FILE and STDOUT, indicating that the logs are output to the files and standard output streams respectively, where the appender in FILE specifies the storage path of the log file via <file>.
To summarize, the log instance references the appenders, and each appender in turn references an encoder, which together form the overall log configuration.
With the above configuration, logs are written in the format defined by <pattern> to /Users/emerson/odps.log (that is, odps.log in your home directory; the sample configuration resolves the home directory on both Linux and Windows).
Logback is a powerful Java logging framework. Through configuration you can also get functions such as rotation and archiving of log files, and even output of logs via JMS or email.