COLLECT_LIST

collect_list([distinct] expr)

Description

The collect_list function is used to collect and return an array from a set of input data. Users can optionally use the distinct keyword to return a set without duplicate elements.

Parameter Description

  • expr: An expression of any type, used to collect elements from the input data.

Return Result

  • Returns an array where the element type is the same as the input parameter type.
  • If the distinct keyword is set, a deduplicated set is returned.
  • The function does not guarantee the order of elements in the return result.
  • If the input data contains null values, null will not be included in the calculation.

Usage Example

  1. Return an array containing non-duplicate elements:

    SELECT collect_list(DISTINCT col) FROM VALUES (1), (2), (1), (null) AS tab(col);
    +----------------------------+
    | collect_list(DISTINCT col) |
    +----------------------------+
    | [2,1]                      |
    +----------------------------+
  2. Return an array containing duplicate elements:

    SELECT collect_list(col) FROM VALUES (1), (2), (1), (null) AS tab(col);
    +-------------------+
    | collect_list(col) |
    +-------------------+
    | [1,2,1]           |
    +-------------------+
  3. Collecting Characters from String Data:

    SELECT collect_list(col) FROM VALUES ("apple"), ("banana"), ("cherry"), (null) AS tab(col);
    +-----------------------------+
    |      collect_list(col)      |
    +-----------------------------+
    | ["apple","banana","cherry"] |
    +-----------------------------+
  4. Collect and return an array containing null values:

    SELECT collect_list(col) FROM VALUES (true), (false), (null) AS tab(col);
    +-------------------+
    | collect_list(col) |
    +-------------------+
    | [true,false]      |
    +-------------------+