This function specifies the number of buckets and the minimum and maximum values of the acceptable range for a bucket. It allows you to construct equi-width buckets, in which the bucket range is divided into intervals that have an identical size. It returns the ID of the bucket into which the value of a specific expression falls. This function is an extension function of MaxCompute V2.0.
Precautions
This function supports the following data types: DECIMAL(precision,scale) in the MaxCompute V2.0 data type edition, BIGINT, INT, FLOAT, DOUBLE, and DECIMAL. For more information, see MaxCompute V2.0 data type edition.
Syntax
width_bucket(numeric <expr>, numeric <min_value>, numeric <max_value>, int <num_buckets>)
Parameters
- expr: required. This parameter specifies the expression for which you want to identify the matching bucket ID.
- min_value: required. This parameter specifies the minimum value of the acceptable range for the bucket.
- max_value: required. This parameter specifies the maximum value of the acceptable range for the bucket. The value must be greater than min_value.
- num_buckets: required. This parameter specifies the number of buckets. The value must be greater than 0.
Return value
- If the value of expr is less than that of min_value, 0 is returned.
- If the value of expr is greater than that of max_value, the value of num_buckets plus 1 is returned.
- If the value of expr is null, null is returned. In other cases, the ID of the bucket into which the value
falls is returned. The bucket ID is calculated based on the following formula:
Bucket ID = floor(num_buckets × (expr - min_value)/(max_value - min_value) + 1)
. - If the value of min_value, max_value, or num_buckets is null, null is returned.
Examples
- Example 1: The values of all input parameters are not null. Sample statement:
The following result is returned:select key,value,width_bucket(value,100,500,5) as value_group from values (1,99), (2,100), (3,199), (4,200), (5,499), (6,500), (7,501), (8,NULL) as t(key,value);
+-------+--------+-------------+ | key | value | value_group | +-------+--------+-------------+ | 1 | 99 | 0 | | 2 | 100 | 1 | | 3 | 199 | 2 | | 4 | 200 | 2 | | 5 | 499 | 5 | | 6 | 500 | 6 | | 7 | 501 | 6 | | 8 | \N | \N | +-------+--------+-------------+
- Example 2: The value of an input parameter is null. Sample statement:
The following result is returned:select key,value,width_bucket(value,100,500,null) as value_group from values (1,99), (2,100), (3,199), (4,200), (5,499), (6,500), (7,501), (8,NULL) as t(key,value);
+------+-------+-------------+ | key | value | value_group | +------+-------+-------------+ | 1 | 99 | NULL | | 2 | 100 | NULL | | 3 | 199 | NULL | | 4 | 200 | NULL | | 5 | 499 | NULL | | 6 | 500 | NULL | | 7 | 501 | NULL | | 8 | NULL | NULL | +------+-------+-------------+