Complex Types
DataArts Fabric SQL can read and write complex columns from Iceberg tables and Parquet files. For ORC files, only read operations are supported, limited to struct and array types, including their nested variations.
Complex Type DDL
Currently, DDL only supports STRUCT and ARRAY types.
The definition syntax is as follows:
- ARRAY: Defined with the syntax ARRAY<data_type>.
- STRUCT: Defined with the syntax STRUCT<col_name : data_type, ...>.
Here, data_type indicates the data type, which can be either a basic type or a complex type. For details about basic types, refer to Mapping Between DataArts Fabric SQL and LakeFormation Data Types.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
-- Create a managed table containing complex types. CREATE TABLE test_managed_table ( name TEXT, address STRUCT<zip:INT, city:TEXT>, tags ARRAY<TEXT>, logs ARRAY<STRUCT<term:TEXT, time:TIMESTAMP>> ) store as parquet; -- Create an external table containing complex types. CREATE EXTERNAL TABLE test_external_table ( name TEXT, address STRUCT<zip:INT, city:TEXT>, tags ARRAY<TEXT>, logs ARRAY<STRUCT<term:TEXT, time:TIMESTAMP>> ) store as orc location 'obs://test/test1'; -- After the table is created, you can view the full definition of LakeFormation. DESCRIBE test_managed_table ; DESCRIBE test_external_table ; |
Complex Type DML
- STRUCT type: A STRUCT is constructed using ROW(x,x,x,...), for example, ROW(1,2,3,..). If the STRUCT contains more than one column, the ROW keyword can be omitted, such as (1,2,3,...).
- ARRAY type: An ARRAY is built using ARRAY[x,x,x,..], for example, ARRAY[1,2,3,...].
- Nested type: Nested types are constructed by applying the above rules sequentially. For example, a type like struct<a:array<int>, b:array<int>> would be represented as ROW(ARRAY[1,2], ARRAY[3,4]).
1 2 3 4 5 6 7 8 9 10 11 12 13 |
-- Insert data. INSERT INTO test_managed_table VALUES('Alice', (10001,'New York'), array['friend','colleague'], array[('login','2024-01-01 00:00:00'),('login','2024-02-01 00:00:00')]); INSERT INTO test_managed_table SELECT 'Bob', (90001,'Los Angeles')::STRUCT<zip:INT, city:TEXT>, array['family'], array[('purchase','2024-02-15 00:00:00'),('purchase','2024-02-10 00:00:00')]::ARRAY<STRUCT<term:TEXT, time:TIMESTAMP>>; INSERT INTO test_managed_table SELECT 'Charlie', (60601,'Chicago')::STRUCT<zip:INT, city:TEXT>, array['friend','hobby'], array[NULL]::ARRAY<STRUCT<term:TEXT, time:TIMESTAMP>>; -- View data. SELECT * FROM test_managed_table ORDER BY name ; name | address | tags | logs ---------+-----------------------+--------------------+--------------------------------------------------------------------------------------- Alice | (10001,"New York") | {friend,colleague} | {"(login,\"Mon Jan 01 00:00:00 2024\")","(login,\"Thu Feb 01 00:00:00 2024\")"} Bob | (90001,"Los Angeles") | {family} | {"(purchase,\"Thu Feb 15 00:00:00 2024\")","(purchase,\"Sat Feb 10 00:00:00 2024\")"} Charlie | (60601,Chicago) | {friend,hobby} | {NULL} (3 rows) |
Complex Type DQL
- The Struct type allows for direct retrieval of internal elements through point queries, where the table column names should be enclosed in parentheses (()).
- Arrays function similarly to regular arrays, with specific elements accessed via indices. Note that indexing begins at 1.
DQL example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
-- Query complex data. SELECT * FROM test_managed_table; name | address | tags | logs ---------+-----------------------+--------------------+--------------------------------------------------------------------------------------- Alice | (10001,"New York") | {friend,colleague} | {"(login,\"Mon Jan 01 00:00:00 2024\")","(login,\"Thu Feb 01 00:00:00 2024\")"} Bob | (90001,"Los Angeles") | {family} | {"(purchase,\"Thu Feb 15 00:00:00 2024\")","(purchase,\"Sat Feb 10 00:00:00 2024\")"} Charlie | (60601,Chicago) | {friend,hobby} | {NULL} (3 rows) -- Extract struct elements. SELECT (address).city, (address).zip FROM test_managed_table; city | zip -------------+------- Los Angeles | 90001 New York | 10001 Chicago | 60601 (3 rows) -- Extract array elements. SELECT tags[1], tags[2] FROM test_managed_table; tags | tags ---------+---------- friend | colleague family | friend | hobby (3 rows) |
Complex Type Conversion to JSONB
Currently, complex types such as STRUCT and ARRAY can be read in JSON format by enabling a specific GUC parameter:
SET ENABLE_COMPLEX_TYPE=OFF;
After retrieving the data, you must explicitly convert it into the corresponding data type for further use.
DQL examples (for detailed operator and function descriptions of jsonb, refer to JSON/JSONB Functions and Operators.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
-- -> Extracts elements of a JSON object or array by key or index. SELECT name, address, tags, logs->0->'term' FROM test_managed_table; name | address | tags | ?column? ---------+---------------------------------------+-------------------------+------------ Alice | {"zip": 10001, "city": "New York"} | ["friend", "colleague"] | "login" Bob | {"zip": 90001, "city": "Los Angeles"} | ["family"] | "purchase" Charlie | {"zip": 60601, "city": "Chicago"} | ["friend", "hobby"] | (3 rows) -- ->> Extracts element and converts it to text. SELECT name, address FROM test_managed_table WHERE address->>'zip' = '10001'; name | address -------+------------------------------------ Alice | {"zip": 10001, "city": "New York"} (1 row) -- @> Array contains elements, where the left array includes all elements of the right array. SELECT name, tags FROM test_managed_table WHERE tags @> '["friend"]'::jsonb; name | tags ---------+------------------------- Alice | ["friend", "colleague"] Charlie | ["friend", "hobby"] (2 rows) -- Display type conversion. Use ->> to get as character and then perform type conversion. SELECT (logs->0->>'time')::TIMESTAMP FROM test_managed_table; timestamp --------------------- 2024-01-01 00:00:00 2024-02-15 00:00:00 (3 rows) |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot