This topic describes the best practices for using wide tables to perform tag computing in Hologres.
Background information
In the offline data warehouse model, user tag data is stored in multiple theme- and dimension-oriented tables. This is helpful to build a tag system and maintain and manage data. However, if such a data model is used in online profile analysis to organize tag data, multiple tag tables must be joined to filter tags, which is too costly for database services.
Solution
In the wide table solution, stable property tables are aggregated into wide tables
offline and JOIN operations on multiple tables are converted into operations on a
single wide table. If new tags are required, columns are added to the wide table for
these tags. This solution is suited for the following scenarios:
- Scenarios in which less than 1,000 tags are used.
- Scenarios in which data is infrequently updated.
Use example
In the following example, a wide table is used for profile analysis. SQL statements
are executed to query men (
[gender = Male]
) in the Zhejiang province ([province = Zhejiang]) whose marriage status is married
in the dws_userbase table. We recommend that you set proper indexes for the table based on the query
mode to improve query performance. For more information, see CREATE TABLE. -- The wide table.
BEGIN;
CREATE TABLE dws_userbase
(
uid text not null primary key,
province text,
gender text,
married text
... -- Other property columns.
);
call set_table_property('dws_userbase', 'distribution_key', 'uid');
call set_table_property('dws_userbase', 'bitmap_columns', 'province,gender,married');
END;
-- Query based on basic properties.
SELECT count(distinct uid) as cnt,
married
FROM dws_userbase ub
WHERE province = 'Zhejiang' and gender = 'Male'
GROUP BY married;