STRUCT
STRUCT is a composite data type that combines multiple fields of different types into a named field set. It is useful for storing related data in a single column — for example, storing a user's name, age, and status as a single unit, or storing structured objects inside an ARRAY.
Type selection reference for complex types:
Type Structure Best for STRUCT<f:T,...>Named fields, heterogeneous types Fixed-structure objects where field names are determined at table creation MAP<K,V>Key-value pairs, uniform key type Dynamic attributes with a variable number of fields ARRAY<T>Ordered list, uniform element type Tags, scores, ID lists
Syntax and Definition
Type declaration
STRUCT<field1_name: field1_type, field2_name: field2_type, ...>
Constructors
-- named_struct: specify field names (recommended)
named_struct('field1_name', value1, 'field2_name', value2, ...)
-- struct: auto-generates field names col1, col2, ...
struct(value1, value2, ...)
Field access : field names are case-insensitive:
named_struct('Name', 'Alice', 'age', 30).name -- Alice
named_struct('Name', 'Alice', 'age', 30).Name -- Alice (same field)
Creating Tables and Writing Data
CREATE TABLE doc_struct_demo (
id INT,
person STRUCT<name: STRING, age: INT, active: BOOLEAN>
);
INSERT INTO doc_struct_demo VALUES
(1, named_struct('name', 'Alice', 'age', 30, 'active', true)),
(2, named_struct('name', 'Bob', 'age', 25, 'active', false)),
(3, named_struct('name', 'Carol', 'age', NULL, 'active', true));
Querying the full column and accessing fields:
SELECT id, person, person.name, person.age, person.active
FROM doc_struct_demo
ORDER BY id;
id person name age active 1 {"name":"Alice","age":30,"active":true} Alice 30 true 2 {"name":"Bob","age":25,"active":false} Bob 25 false 3 {"name":"Carol","age":null,"active":true} Carol NULL true
Constructor Comparison
-- named_struct: custom field names
SELECT named_struct('company', 'Singdata', 'headcount', 5) AS s1;
-- Returns: {"company":"Singdata","headcount":5}
-- struct: field names auto-generated as col1, col2, ...
SELECT struct('Alice', 30, true) AS s2;
-- Returns: {"col1":"Alice","col2":30,"col3":true}
⚠️ Note : Field names generated by struct() (col1, col2) are not readable. It is recommended to use named_struct() to explicitly specify field names.
Modifying Field Values
STRUCT fields cannot be modified individually; use the struct_update() function:
SELECT struct_update(
named_struct('name', 'Alice', 'age', 30, 'active', true),
'age', 31
) AS updated;
-- Returns: {"name":"Alice","age":31,"active":true}
struct_update only modifies the specified field; all other fields remain unchanged.
Nested STRUCT
A STRUCT field's type can be another STRUCT; use chained dot notation to access nested fields:
SELECT named_struct(
'outer', named_struct('inner_val', 42, 'label', 'test'),
'count', 3
) AS nested;
-- Returns: {"outer":{"inner_val":42,"label":"test"},"count":3}
-- Access nested field
SELECT named_struct(
'outer', named_struct('inner_val', 42, 'label', 'test'),
'count', 3
).outer.inner_val AS deep_val;
-- Returns: 42
STRUCT Combined with ARRAY
STRUCT is commonly combined with ARRAY to store lists of structured objects:
-- Construct an array of STRUCTs
SELECT ARRAY(
named_struct('name', 'Alice', 'score', 95),
named_struct('name', 'Bob', 'score', 87)
) AS students;
-- Returns: [{"name":"Alice","score":95},{"name":"Bob","score":87}]
Use EXPLODE to expand a STRUCT array, then access individual fields:
SELECT s.name, s.score
FROM (
SELECT ARRAY(
named_struct('name', 'Alice', 'score', 95),
named_struct('name', 'Bob', 'score', 87)
) AS students
) t
LATERAL VIEW EXPLODE(students) tmp AS s;
Type Conversion
-- Convert field types
SELECT CAST(named_struct('a', 1, 'b', 2) AS STRUCT<a:BIGINT, b:BIGINT>) AS cast_struct;
-- Returns: {"a":1,"b":2} (field types become bigint)
Difference between CAST to STRING and TO_JSON :
SELECT CAST(named_struct('name', 'Alice', 'age', 30) AS STRING);
-- Returns: {Alice, 30} (values only, no field names, not JSON)
SELECT TO_JSON(named_struct('name', 'Alice', 'age', 30));
-- Returns: {"name":"Alice","age":30} (standard JSON, includes field names)
⚠️ Note : STRUCT's CAST to STRING outputs field values only, without field names, in the format {val1, val2, ...}. For JSON format, use TO_JSON.
NULL Handling
Any field in a STRUCT can be NULL:
SELECT named_struct('name', NULL, 'age', 30) AS s;
-- Returns: {"name":null,"age":30}
When filtering by a struct field, a NULL field in a comparison returns NULL (not true or false):
SELECT id, person.name
FROM doc_struct_demo
WHERE person.age > 26;
-- id=3 has age NULL, that row does not appear in results
Notes
STRUCT supports = equality comparison but does not support >, <, or other ordering comparisons, and cannot be used as a GROUP BY key or JOIN key.
Field names within a single STRUCT must be unique.
Field names are case-insensitive; person.Name and person.name access the same field.
CAST to STRING outputs values only, without field names; for JSON with field names, use TO_JSON.