This topic explains the differences between common dimensions and drill-down dimensions.
Common dimension: Applicable to all scenarios. It has no acceleration index unless the ID dimension is enabled, as described later in this topic.
Drill-down dimension: Applicable to scenarios where a hierarchical relationship, such as Province > City > District, exists between dimensions. Queries at each level of the hierarchy are accelerated.
Analysis of a common dimension scenario
Take the log of an e-retailer as an example:
2017-01-01 12:00:00 | Category: Men’s Wear | Province: Zhejiang | City: Hangzhou | District: Xihu District | Gender: Male | Height: L | Quantity: 5 | Total: 100 |
Splitting the log yields the following fields: the timestamp, Category, Province, City, District, Gender, Height, Quantity, and Total.
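As a rough illustration of the splitting step, the following Python sketch parses the sample log line into a dictionary. The function name `split_log` and the `Time` key are assumptions made for this example and are not part of ARMS. The delimiter pattern accepts both `|` and the full-width `｜`.

```python
import re

# Sample log line from the example above, with delimiters normalized to "|".
LOG_LINE = (
    "2017-01-01 12:00:00|Category: Men's Wear|Province: Zhejiang|"
    "City: Hangzhou|District: Xihu District|Gender: Male|"
    "Height: L|Quantity: 5|Total: 100|"
)


def split_log(line):
    """Split one log line into a {field name: value} dictionary."""
    # Split on ASCII or full-width pipes and drop empty trailing pieces.
    parts = [p.strip() for p in re.split(r"[|｜]", line) if p.strip()]
    # The first piece is the timestamp; it has no "name: value" form.
    record = {"Time": parts[0]}
    for part in parts[1:]:
        name, _, value = part.partition(":")
        record[name.strip()] = value.strip()
    return record


record = split_log(LOG_LINE)
print(record["Category"], record["Province"], record["Quantity"])
```

All values are kept as strings here; a real pipeline would convert metrics such as Quantity and Total to numbers before aggregation.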
If the data is analyzed by Category, Gender, and Province, the corresponding dimensions are Category, Gender, and Province, and the metrics are Price and Quantity. After pre-aggregation, the data is as follows:
| Price | Quantity | Time | Gender | Category | Province |
|---|---|---|---|---|---|
| 100 | 1 | 2017-01-01 12:00:00 | Male | Men’s Wear | Zhejiang |
| 300 | 3 | 2017-01-01 12:00:00 | Female | Men’s Wear | Beijing |
To view data in the Men’s Wear category, the system must read the data of all categories, genders, and provinces, and then filter for Men’s Wear. In this case, Number of Retrieved Records is greater than Number of Result Records.
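The pre-aggregation and the full scan described above can be sketched as follows. This is a minimal conceptual model, not the ARMS implementation: the raw records are hypothetical, the time column is omitted for brevity, and the names `pre_aggregate` and `scan` are assumptions of this example.

```python
from collections import defaultdict

DIMS = ("Gender", "Category", "Province")
METRICS = ("Price", "Quantity")

# Hypothetical raw records; only the field names come from the example above.
RAW = [
    {"Gender": "Male", "Category": "Men's Wear", "Province": "Zhejiang", "Price": 100, "Quantity": 1},
    {"Gender": "Female", "Category": "Men's Wear", "Province": "Beijing", "Price": 100, "Quantity": 1},
    {"Gender": "Female", "Category": "Men's Wear", "Province": "Beijing", "Price": 200, "Quantity": 2},
    {"Gender": "Female", "Category": "Shoes", "Province": "Beijing", "Price": 50, "Quantity": 1},
]


def pre_aggregate(records):
    """Sum the metrics for every distinct combination of dimension values."""
    totals = defaultdict(lambda: dict.fromkeys(METRICS, 0))
    for rec in records:
        key = tuple(rec[d] for d in DIMS)
        for metric in METRICS:
            totals[key][metric] += rec[metric]
    return dict(totals)


def scan(aggregated, **filters):
    """Filter by reading every aggregated row, as happens without an index."""
    retrieved, result = 0, []
    for key, metrics in aggregated.items():
        retrieved += 1
        row = dict(zip(DIMS, key))
        if all(row[name] == value for name, value in filters.items()):
            result.append({**row, **metrics})
    return retrieved, result


aggregated = pre_aggregate(RAW)
retrieved, result = scan(aggregated, Category="Men's Wear")
print(retrieved, len(result))  # 3 rows read, 2 rows returned
```

Even in this tiny dataset the scan reads a row (Shoes) that does not appear in the result, which is the gap between retrieved and result records that the text describes.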
Restrictions and optimization methods
Suppose there are two million categories and you want to view data in the Men’s Wear category. The system must read about N x 2,000,000 data records, where N is presumably the average number of records per category, and then filter for Men’s Wear. Number of Retrieved Records is then far greater than Number of Result Records, and reading this much unneeded data directly slows down the query.
You can solve this problem by creating a category index.
For datasets of the common dimension type, ARMS provides an auxiliary dimension called the ID dimension. An ID dimension acts as an index dimension: its value must be specified in a query, and this is what accelerates the query. A dimension, by contrast, is an ordinary non-indexed dimension, such as Gender, Category, and Province in the previous example.
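To make the effect of an index dimension concrete, here is a small conceptual sketch in which Category is treated as the ID dimension. It only models the idea with a dictionary lookup; how ARMS stores its index is not described in this topic, and the row data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows; Category plays the role of the ID dimension.
ROWS = [
    {"Category": "Men's Wear", "Gender": "Male", "Province": "Zhejiang", "Price": 100},
    {"Category": "Men's Wear", "Gender": "Female", "Province": "Beijing", "Price": 300},
    {"Category": "Shoes", "Gender": "Female", "Province": "Beijing", "Price": 50},
]


def build_id_index(rows, id_dimension):
    """Group rows by the value of the ID dimension."""
    index = defaultdict(list)
    for row in rows:
        index[row[id_dimension]].append(row)
    return index


def query_by_id(index, id_value, **filters):
    """Read only the rows stored under id_value, then apply optional filters."""
    if id_value is None:
        raise ValueError("the ID dimension value must be specified")
    candidates = index.get(id_value, [])
    result = [
        row for row in candidates
        if all(row[name] == value for name, value in filters.items())
    ]
    return len(candidates), result


index = build_id_index(ROWS, "Category")
retrieved, result = query_by_id(index, "Men's Wear", Gender="Male")
print(retrieved, len(result))  # 2 rows read instead of all 3
```

The lookup touches only the rows of the requested category, so the number of retrieved records no longer grows with the number of categories.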
Differences between a dimension and an ID dimension
Common dimensions consist of dimensions and ID dimensions. When you query a dataset, the ID dimension cannot be blank, but a dimension can be blank. Currently, a dataset in ARMS can contain up to one ID dimension and seven dimensions.
- A dimension can be used either separately or together with other dimensions. For example, a dataset has three dimensions: A, B, and C. You can select only A, B, or C, or use the combination of B and C or the combination of A, B, and C to query data.
- Defining an ID dimension is equivalent to creating an index on that dimension. You can specify a value of the ID dimension to quickly query the desired data.
- If the data cannot be enumerated or the number of dimension values is large, ID dimensions are recommended.
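The query rules in this section can be summarized in a short validator. This is a sketch under stated assumptions: the class `CommonDataset`, the method `validate_query`, and the dimension name `UserId` are invented for illustration and do not correspond to any ARMS API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_DIMENSIONS = 7  # limit stated in this section; at most one ID dimension


@dataclass(frozen=True)
class CommonDataset:
    """Hypothetical model of a common-dimension dataset definition."""
    dimensions: Tuple[str, ...]
    id_dimension: Optional[str] = None

    def __post_init__(self):
        if len(self.dimensions) > MAX_DIMENSIONS:
            raise ValueError("a dataset can contain up to seven dimensions")

    def validate_query(self, filters):
        """Check a {dimension name: value} query against the rules above."""
        # The ID dimension, when defined, cannot be left blank.
        if self.id_dimension is not None and not filters.get(self.id_dimension):
            raise ValueError("the ID dimension cannot be blank")
        allowed = set(self.dimensions)
        if self.id_dimension is not None:
            allowed.add(self.id_dimension)
        unknown = set(filters) - allowed
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")
        # Ordinary dimensions may be omitted or combined freely.
        return True


dataset = CommonDataset(dimensions=("A", "B", "C"), id_dimension="UserId")
print(dataset.validate_query({"UserId": "u1"}))
print(dataset.validate_query({"UserId": "u1", "B": "x", "C": "y"}))
```

A query such as `{"A": "x"}` would be rejected in this model because it leaves the ID dimension blank.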
Analysis of a drill-down dimension scenario
Take an area monitored by the system as an example. System logs contain three dimensions: IDC, group, and IP address. Assume that a user needs to start from the running status of an IDC, drill down to the groups in that IDC, and then query the data of a specific machine in a group. If common dimensions are used for this problem, queries may be delayed because of the large amount of data read. Drill-down dimensions are more suitable for this fixed hierarchical query scenario.
Using the drill-down dimensions, you can create multiple levels of indexes as follows for the IDC, group, and IP address: IDC (index 1), IDC - group (index 2), and IDC - group - IP address (index 3). To query data of an IDC, use index 1. To query data of a group in a specific IDC, use index 2. To query data of an IP address of a specific group, use index 3.
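The three index levels can be pictured with the following sketch, which keeps one aggregate table per prefix of the hierarchy. The records, the `errors` metric, and the function names are hypothetical, and the dictionaries only stand in for whatever index structure ARMS actually uses.

```python
from collections import defaultdict

LEVELS = ("idc", "group", "ip")

# Hypothetical monitoring records.
RECORDS = [
    {"idc": "idc-1", "group": "web", "ip": "10.0.0.1", "errors": 2},
    {"idc": "idc-1", "group": "web", "ip": "10.0.0.2", "errors": 1},
    {"idc": "idc-1", "group": "db", "ip": "10.0.1.1", "errors": 4},
    {"idc": "idc-2", "group": "web", "ip": "10.1.0.1", "errors": 0},
]


def build_drill_down_indexes(records, metric):
    """Build index 1 (IDC), index 2 (IDC, group), and index 3 (IDC, group, IP)."""
    indexes = [defaultdict(int) for _ in LEVELS]
    for rec in records:
        for depth in range(len(LEVELS)):
            key = tuple(rec[name] for name in LEVELS[: depth + 1])
            indexes[depth][key] += rec[metric]
    return indexes


def query(indexes, *path):
    """Answer a query from the index whose depth matches the path length."""
    if not 1 <= len(path) <= len(LEVELS):
        raise ValueError("path must name between one and three levels")
    return indexes[len(path) - 1].get(tuple(path), 0)


indexes = build_drill_down_indexes(RECORDS, "errors")
print(query(indexes, "idc-1"))                     # index 1 -> 7
print(query(indexes, "idc-1", "web"))              # index 2 -> 3
print(query(indexes, "idc-1", "web", "10.0.0.2"))  # index 3 -> 1
```

Each query is a single lookup in the index for its level, so no level requires scanning the records of the levels below it.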
Drill-down dimensions are also applicable to the following scenarios: business statistics by province or region, query of the student distribution by school, grade, or class, or sales statistics by manufacturer, brand, or category.
Restrictions of drill-down dimensions
You can configure up to three drill-down dimensions in ARMS.
Drill-down dimensions have a hierarchical relationship with each other, similar to a tree structure. For example, to view data of the second dimension, you must first select a value of the first dimension. Plan drill-down dimensions carefully. For example, the first dimension is the province, the second dimension is the city, the third dimension is the district, and the metric is the citizen consumption information.
Unless required in special scenarios, do not define two completely unrelated dimensions, such as “Region” and “Commodity type”, as drill-down dimensions of the same dataset.
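The tree-like restriction can be illustrated with a nested dictionary for Province > City > District. The consumption figures and the helper names are made up for this sketch; the point is only that a deeper level cannot be selected unless every level above it is selected first.

```python
LEVELS = ("Province", "City", "District")  # at most three drill-down levels

# Hypothetical consumption totals arranged as a tree.
TREE = {
    "Zhejiang": {
        "Hangzhou": {"Xihu District": 120, "Binjiang District": 80},
        "Ningbo": {"Haishu District": 60},
    },
    "Beijing": {
        "Beijing": {"Haidian District": 200},
    },
}


def subtree_total(node):
    """Sum the metric over every leaf below a node."""
    if isinstance(node, dict):
        return sum(subtree_total(child) for child in node.values())
    return node


def query(tree, **selected):
    """Walk down the tree; every parent level must be selected first."""
    node, depth = tree, 0
    for level in LEVELS:
        if level not in selected:
            break
        node = node[selected[level]]
        depth += 1
    # Leftover selections mean a parent level was skipped or a name is unknown.
    if depth != len(selected):
        raise ValueError("select each parent level before a deeper level")
    return subtree_total(node)


print(query(TREE, Province="Zhejiang"))                   # 260
print(query(TREE, Province="Zhejiang", City="Hangzhou"))  # 200
```

Calling `query(TREE, City="Hangzhou")` raises `ValueError` in this model because the province was not selected first.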
ARMS provides a drill-down function that allows you to drill down from summary data to detailed data, either to observe the data at a finer level or to add dimensions. The purpose of drill-down is to change the dimension hierarchy and the analysis granularity.