
Background and preparations

Last Updated: May 09, 2019

Background

DataV is designed to make data visualization accessible to more users and to help non-professional engineers easily build professional visual applications through graphical user interfaces. DataV provides a variety of graphical templates to satisfy the data presentation demands of exhibitions, business monitoring, risk warning, and geographic information analysis. In most dashboard scenarios where data is updated infrequently, the DataV + DLA + OSS solution costs less than the solution that combines DataV with traditional databases.

Solution

DataV, DLA, and OSS can be combined into the following two solutions.

  • DLA + OSS + DataV

    Upload incremental data that is produced by business applications to OSS, and then synchronize the data to DataV for business data updates through periodic queries.

  • Log Service + OSS + DLA + DataV

    Use Log Service to collect the business logs that are produced by business applications, periodically ship the collected data to OSS (with a minimum delay of five minutes), and then synchronize the data to DataV for business data updates through periodic queries.


The preceding solutions have the following advantages:

  • Log Service (LOG for short, originally known as SLS) is a one-stop real-time data service developed by Alibaba Group. It is applicable to a wide range of big data scenarios. It provides log collection, intelligent query and analysis, consumption, and shipping features that improve your ability to process and analyze large volumes of logs. With its outstanding log-shipping capability, Log Service supports a variety of log formats at the data source and steadily ships logs to specified destinations.

  • Object Storage Service (OSS) is cost-effective and allows you to store your data files for as long as you need.

  • Data Lake Analytics (DLA) provides powerful log analysis capabilities, adopts a serverless architecture, and charges fees based on the volume of data scanned. It allows you to partition the data shipped to OSS by year, month, or day to improve the data hit rate and reduce the amount of data scanned. In this way, large amounts of historical data can be analyzed quickly at low cost.

  • DataV provides templates for different scenarios to facilitate your design. Graphical interfaces and configurable widgets for business modules allow you to create professional visual applications with simple drag-and-drop operations; minimal programming skills are required. Your visual applications can be published and shared with other users, which enables external users to access your data services regardless of whether they have purchased DataV.
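As a sketch of how date-based partitioning reduces the data scanned, the following DDL uses DLA's Hive-compatible syntax. The table name, columns, and OSS path are illustrative (they are not part of this tutorial's TPC-H setup), and partition metadata is assumed to be loaded with MSCK REPAIR TABLE or ALTER TABLE ... ADD PARTITION after each shipping cycle:

```sql
-- Hypothetical log table, partitioned by year/month/day
CREATE EXTERNAL TABLE access_log (
    request_time STRING,
    status INT,
    bytes_sent BIGINT
)
PARTITIONED BY (y STRING, m STRING, d STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://bucket-name/logs/';

-- Discover newly shipped partitions, then query a single day:
-- only that day's files in OSS are scanned (and billed).
MSCK REPAIR TABLE access_log;
SELECT count(*) FROM access_log
WHERE y = '2019' AND m = '05' AND d = '09';
```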

This topic uses the solution that includes DLA + OSS + DataV as an example to describe how to create a DataV dashboard.

Prerequisites

Before you use the solution that includes DLA + OSS + DataV to create a dashboard, follow these steps to prepare test data in OSS, and create an OSS schema and required tables in DLA.

Step 1: Prepare test data in OSS

  1. Activate OSS.

  2. Create a bucket.

  3. Upload the TPC-H dataset.

    Click TPC-H to download the 100 MB dataset, which contains eight data files. In OSS, create a folder for each data file, and then upload the files to the corresponding paths:


    1. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/customer/customer.tbl
    2. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/lineitem/lineitem.tbl
    3. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/nation/nation.tbl
    4. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/orders/orders.tbl
    5. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/part/part.tbl
    6. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/partsupp/partsupp.tbl
    7. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/region/region.tbl
    8. http://bucket-name.oss-cn-hangzhou.aliyuncs.com/TPC-H/supplier/supplier.tbl
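The folder-per-file layout above can be sketched in Python. The bucket name and region are placeholders taken from the example URLs; the actual upload would be done in the OSS console or with a tool such as ossutil:

```python
# The eight TPC-H table names; each data file goes into its own folder.
TABLES = ["customer", "lineitem", "nation", "orders",
          "part", "partsupp", "region", "supplier"]

def object_key(table: str) -> str:
    """OSS object key for one TPC-H data file: TPC-H/<table>/<table>.tbl"""
    return f"TPC-H/{table}/{table}.tbl"

def object_url(bucket: str, region: str, table: str) -> str:
    """Full OSS URL, assuming the public endpoint of the given region."""
    return f"http://{bucket}.oss-{region}.aliyuncs.com/{object_key(table)}"

# Print the destination for each file before uploading.
for t in TABLES:
    print(object_url("bucket-name", "cn-hangzhou", t))
```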

Step 2: Create an OSS schema

  CREATE SCHEMA tpch WITH DBPROPERTIES(
      LOCATION = 'oss://bucket-name/TPC-H/',
      catalog = 'oss'
  );

Step 3: Create OSS tables

  • Table nation

    CREATE EXTERNAL TABLE nation (
        N_NATIONKEY INT,
        N_NAME STRING,
        N_REGIONKEY INT,
        N_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/nation';
  • Table lineitem

    CREATE EXTERNAL TABLE lineitem (
        L_ORDERKEY INT,
        L_PARTKEY INT,
        L_SUPPKEY INT,
        L_LINENUMBER INT,
        L_QUANTITY DOUBLE,
        L_EXTENDEDPRICE DOUBLE,
        L_DISCOUNT DOUBLE,
        L_TAX DOUBLE,
        L_RETURNFLAG STRING,
        L_LINESTATUS STRING,
        L_SHIPDATE DATE,
        L_COMMITDATE DATE,
        L_RECEIPTDATE DATE,
        L_SHIPINSTRUCT STRING,
        L_SHIPMODE STRING,
        L_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/lineitem';
  • Table orders

    CREATE EXTERNAL TABLE orders (
        O_ORDERKEY INT,
        O_CUSTKEY INT,
        O_ORDERSTATUS STRING,
        O_TOTALPRICE DOUBLE,
        O_ORDERDATE DATE,
        O_ORDERPRIORITY STRING,
        O_CLERK STRING,
        O_SHIPPRIORITY INT,
        O_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/orders';
  • Table supplier

    CREATE EXTERNAL TABLE supplier (
        S_SUPPKEY INT,
        S_NAME STRING,
        S_ADDRESS STRING,
        S_NATIONKEY INT,
        S_PHONE STRING,
        S_ACCTBAL DOUBLE,
        S_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/supplier';
  • Table partsupp

    CREATE EXTERNAL TABLE partsupp (
        PS_PARTKEY INT,
        PS_SUPPKEY INT,
        PS_AVAILQTY INT,
        PS_SUPPLYCOST DOUBLE,
        PS_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/partsupp';
  • Table customer

    CREATE EXTERNAL TABLE customer (
        C_CUSTKEY INT,
        C_NAME STRING,
        C_ADDRESS STRING,
        C_NATIONKEY INT,
        C_PHONE STRING,
        C_ACCTBAL DOUBLE,
        C_MKTSEGMENT STRING,
        C_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/customer';
  • Table part

    CREATE EXTERNAL TABLE part (
        P_PARTKEY INT,
        P_NAME STRING,
        P_MFGR STRING,
        P_BRAND STRING,
        P_TYPE STRING,
        P_SIZE INT,
        P_CONTAINER STRING,
        P_RETAILPRICE DOUBLE,
        P_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/part';
  • Table region

    CREATE EXTERNAL TABLE region (
        R_REGIONKEY INT,
        R_NAME STRING,
        R_COMMENT STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
    LOCATION 'oss://bucket-name/TPC-H/region';

After these tables are created, refresh the page to view the eight tables in the schema.
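As a quick sanity check, you can run a query in the DLA console against the tables just created, for example counting the nations in each region (a sketch using only columns defined above):

```sql
SELECT r.R_NAME, count(*) AS nation_cnt
FROM region r
JOIN nation n ON n.N_REGIONKEY = r.R_REGIONKEY
GROUP BY r.R_NAME;
```

If the query returns rows, the schema, tables, and OSS data files are wired up correctly.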


Next step

Procedure