COLLECT_SET
Description
The collect_set
function is used to collect and return a set of unique elements from a given set of input data. This function ensures that the returned array does not contain duplicate elements, making the result more concise and clear. When the distinct
keyword is specified, the function will perform deduplication on the result. However, note that even without specifying distinct
, the collect_set
function will still perform deduplication, so the distinct
keyword does not affect the result in this scenario.
Parameter Description
expr
: Can be an expression of any data type.
Return Result
- Returns an array, where the element type in the array is the same as the input parameter type.
- Using the
distinct
keyword does not affect the deduplication result. - The order of elements in the returned array is not guaranteed to be the same as the input order.
- Input
null
values will not be included in the result array.
Example
Example 1: Basic Usage
Example 2: Deduplication Operation
Notes
- When handling large amounts of data, please note that the
collect_set
function may consume a significant amount of memory resources. - Since the order of the elements in the returned array is not guaranteed to be the same as the input order, if sorting is needed, please use other functions such as
ARRAY_SORT
to sort the results.