This topic describes how to connect Log Service to DataV and display data on a dashboard.

Background information

Real-time data dashboards are widely used during large-scale online promotions. They are based on a stream computing architecture that consists of the following modules:
  • Data collection: collects data from each source in real time.
  • Intermediate storage: uses message queues such as Kafka to decouple production systems from consumption systems.
  • Real-time computing: subscribes to real-time data and computes the data required by the dashboard based on preset rules.
  • Result storage: stores the computing results in SQL and NoSQL databases.
  • Visualization: calls APIs to retrieve the computing results and visualizes them.
Alibaba Group provides multiple products to support these modules, as shown in the following figure.
Figure 1. Data dashboard-related products
Data dashboard-related products
In addition to the preceding solution, you can also call LogSearch and Analytics API operations of Log Service to connect to DataV and display data on a dashboard.
Figure 2. Connect Log Service with DataV
Connect Log Service with DataV

Log Service allows you to analyze log data in real time by using query statements and SQL-92 syntax. You can use built-in dashboards of Log Service, Grafana, or Tableau (JDBC) to visualize data analysis results.

Features

The following two computing methods are supported:
  • Real-time computing (stream computing): fixed computing logic applied to dynamic data.
  • Batch computing (data warehousing and offline computing): dynamic computing logic applied to fixed data.
Log Service collects data in real time through LogHub. In log analysis scenarios, LogHub data can be indexed in real time. You can then use the LogSearch and Analytics feature to query and analyze log data. This method has the following benefits:
  • Quick response: After calling the relevant API operations, you can obtain query and analysis results in real time, without the need for pre-computing.
  • Real-time display: In 99.9% of cases, logs can be displayed on the data dashboard within one second after they are generated.
  • Dynamic data refresh: When you modify analysis methods or import data to Logstores, the display result is refreshed in real time.
This method has the following limits:
  • Data volume: Up to 10 billion lines of data can be computed in a single query. To compute more data, you must split the query into multiple time ranges.
  • Flexibility: Only SQL-92 syntax is supported. User-defined functions (UDFs) are not supported.
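Given the data volume limit above, a query over more data must be split into several smaller time ranges that are issued one by one. A minimal client-side sketch of such a split; the function name and the four-way split are illustrative assumptions, not part of Log Service or DataV:

```javascript
// Split a [from, to] Unix-timestamp range into `parts` consecutive
// sub-ranges that can be queried separately and merged afterwards.
function splitTimeRange(from, to, parts) {
  const step = Math.ceil((to - from) / parts);
  const ranges = [];
  for (let start = from; start < to; start += step) {
    ranges.push({ from: start, to: Math.min(start + step, to) });
  }
  return ranges;
}

// Example: split one day (2017-11-06 UTC) into four 6-hour windows.
const ranges = splitTimeRange(1509897600, 1509984000, 4);
```

Each sub-range can then be passed as the from and to parameters of a separate query, and the partial results merged on the client.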
Figure 3. Log Service benefits
Log Service benefits

Configuration process

To connect Log Service to a DataV dashboard, perform the following steps:
  1. Collect data. Connect the data source to Log Service. For more information, see Get started.
  2. Configure the index. For more information, see Overview or Collect and analyze NGINX access logs.
  3. Connect Log Service to DataV by using the DataV plug-in, which converts SQL query results into views in real time.

After completing steps 1 and 2, you can view the raw logs on the search and analysis page. The following procedure describes the substeps of step 3.

Procedure

  1. Create a DataV data source.
    1. Log on to the DataV console.
    2. Choose Data Sources > Add Source.
    In the dialog box that appears, select Log Service from the Type drop-down list.
    Figure 4. Add a data source
    Add a data source
    Parameter descriptions:
      • Type: Select Log Service.
      • Name: Enter a name for the data source.
      • AppKey: The AccessKey ID of the Alibaba Cloud account, or the AccessKey ID of a RAM user account that has permission to read data from Log Service.
      • AppSecret: The AccessKey secret of the Alibaba Cloud account, or the AccessKey secret of a RAM user account that has permission to read data from Log Service.
      • Endpoint: The endpoint of Log Service. Set this parameter based on the network type and region of your Log Service project. For more information, see Service endpoint.
  2. Create a line chart and a filter.
    1. Create a line chart.
      In the Data Response Result section on the right of the page, click set datasource. In the Set Data Source dialog box that appears, select Log Service from the Data Source Type field, select log_service_api from the Select Source field, and then enter an SQL statement in the SQL text box.
      Figure 5. Configure a data source
      Configure a data source
      An example query body and descriptions of the relevant parameters are as follows:
      {
        "projectName": "dashboard-demo",
        "logStoreName": "access-log",
        "topic": "",
        "from": ":from",
        "to": ":to",
        "query": "* | select approx_distinct(remote_addr) as uv, count(1) as pv, date_format(from_unixtime(date_trunc('hour', __time__)), '%Y/%m/%d %H:%i:%s') as time group by time order by time limit 1000",
        "line": 100,
        "offset": 0
      }
      Parameter descriptions:
      • projectName: The name of the project.
      • logStoreName: The name of the Logstore.
      • topic: The log topic. Leave this parameter empty if you have not set a log topic.
      • from, to: The start time and end time, as Unix timestamps, of the period during which logs are obtained.
        Note In this example, the parameters are set to :from and :to. When you test your parameter settings, you can enter a Unix timestamp, for example, 1509897600. After the test, change the values back to :from and :to so that you can specify the time range in the URL parameters. For example, the URL in the preview is http://datav.aliyun.com/screen/86312. When the URL http://datav.aliyun.com/screen/86312?from=1510796077&to=1510798877 is opened, the values are computed based on the specified time range.
      • query: The query condition. In this example, the query returns the number of page views (PVs) per hour. For information about the query syntax, see Syntax description.
        Note The time in the query result must be in a format such as 2017/07/11 12:00:00. Use the following syntax to align the time to the hour and convert it to the required format:
        date_format(from_unixtime(date_trunc('hour', __time__)), '%Y/%m/%d %H:%i:%s')
      • line: Use the default value 100.
      • offset: Use the default value 0.
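      Because the from and to URL parameters are plain Unix timestamps, they can be produced from ordinary dates. The following sketch builds the example preview URL; the helper function is an illustrative assumption, and the screen ID 86312 and timestamps are the ones from the example above:

```javascript
// Convert a Date to a Unix timestamp in seconds, then fill the
// :from and :to placeholders of the DataV preview URL.
function toUnixSeconds(date) {
  return Math.floor(date.getTime() / 1000);
}

const from = toUnixSeconds(new Date("2017-11-16T01:34:37Z")); // 1510796077
const to = from + 2800;
const url = `http://datav.aliyun.com/screen/86312?from=${from}&to=${to}`;
```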
      After the configurations are complete, click Data Response Result.
      Figure 6. View the data response result
      View the data response result
    2. Create a filter.
      In the Set Data Source dialog box, select Data Filter, and then click the plus sign (+) next to the Add Filter field.
      Enter a function in the New Filter text box by using the following example syntax:
      return Object.keys(data).map((key) => {
        let d = data[key];
        d["pv"] = parseInt(d["pv"]);
        return d;
      });

      The filter converts the column used by the y-axis to an integer. In this example, the y-axis shows PVs, so the pv column must be converted.

      The results contain the time and pv columns. You can set the x-axis to time and the y-axis to pv.
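      To see what the filter does, the following sketch wraps its body in a function and applies it to sample rows; the sample field names and values are made up for illustration:

```javascript
// A DataV filter receives the query result and returns an array of
// rows; parseInt turns each pv string into a number for the y-axis.
function filter(data) {
  return Object.keys(data).map((key) => {
    let d = data[key];
    d["pv"] = parseInt(d["pv"]);
    return d;
  });
}

// Sample response rows (values are made up for illustration).
const rows = filter({
  0: { time: "2017/11/06 10:00:00", pv: "120" },
  1: { time: "2017/11/06 11:00:00", pv: "98" },
});
```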

  3. Configure a pie chart.
    1. Create a carousel pie chart.
      Figure 7. Carousel pie chart
      Carousel pie chart
      Enter an SQL statement in the Query text box by using the following example syntax:
      {
        "projectName": "dashboard-demo",
        "logStoreName": "access-log",
        "topic": "",
        "from": 1509897600,
        "to": 1509984000,
        "query": "* | select count(1) as pv, method group by method",
        "line": 100,
        "offset": 0
      }

      This query computes the PV count for each method, from which the pie chart displays the ratio of each method.
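      The per-method pv counts can be turned into ratios on the client side; the following sketch shows the arithmetic with made-up counts (the function name and data are illustrative):

```javascript
// Compute each method's share of total PVs from rows like those
// returned by the pie-chart query (counts are made up for illustration).
function methodRatios(rows) {
  const total = rows.reduce((sum, r) => sum + r.pv, 0);
  return rows.map((r) => ({ method: r.method, ratio: r.pv / total }));
}

const ratios = methodRatios([
  { method: "GET", pv: 90 },
  { method: "POST", pv: 10 },
]);
```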

    2. Add a filter. Enter a function in the New Filter text box by using the following example syntax:
      return Object.keys(data).map((key) => {
        let d = data[key];
        d["pv"] = parseInt(d["pv"]);
        return d;
      });

      Enter method in the type text box and pv in the value text box for the carousel pie chart.

  4. Preview and publish the DataV dashboard.
    Click the Preview icon to preview the DataV dashboard, and then click the Publish icon to publish the DataV dashboard. You can view PV data in real time during online promotions.

Example

You need the PV statistics of your website across China during the Apsara Conference. You have configured full log data collection and enabled the LogSearch and Analytics feature in Log Service. You only need to enter a query statement in the Query text box to obtain the PV statistics.
  • On the first day of the conference, you need the unique visitor (UV) statistics. You can enter the following statement to count the unique values of the forward field in the NGINX access logs within a specified time range (for example, from October 11 to the present):
    * | select approx_distinct(forward) as uv
  • On the second day of the conference, you need the statistics for the yunqi domain name. You can enter the following statement, which adds a filter condition on the host field to the real-time query:
    host:yunqi.aliyun.com | select approx_distinct(forward) as uv
  • If the NGINX access logs contain multiple IP addresses in the forward field, you can enter the following statement to keep only the first IP address:
    host:yunqi.aliyun.com | select approx_distinct(split_part(forward,',',1)) as uv
  • On the third day of the conference, you need to exclude traffic generated by UC Browser advertisements from the access statistics. You can add a negative filter condition (not ...) to the statement and obtain the latest result in real time:
    host:yunqi.aliyun.com not url:uc-iflow  | select approx_distinct(split_part(forward,',',1)) as uv
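For intuition, split_part(forward, ',', 1) keeps only the first address of a comma-separated forward chain. The same logic in JavaScript; the function name and the sample addresses are illustrative:

```javascript
// Keep only the first client IP from an X-Forwarded-For style value,
// mirroring split_part(forward, ',', 1) in the query above.
function firstForwardedIp(forward) {
  return forward.split(",")[0].trim();
}

const ip = firstForwardedIp("203.0.113.7, 10.0.0.1, 10.0.0.2");
```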