Updated on 2023-02-21 GMT+08:00

Complex Type

ARRAY

Array

Example: ARRAY[1, 2, 3].

-- Create an ARRAY type table.
create table array_tb(col1 ARRAY<STRING>);

-- Insert a record of the ARRAY type.
insert into array_tb values(ARRAY['HetuEngine','Hive','Mppdb']);

--Query data.
select * from array_tb; --    [HetuEngine, Hive, Mppdb]

MAP

Data type of a key-value pair.

Example: MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]).

-- Create a Map table.
create table map_tb(col1 MAP<STRING,INT>);

-- Insert a piece of data of the Map type.
insert into map_tb values(MAP(ARRAY['foo','bar'],ARRAY[1,2]));

--Query data.
select * from map_tb; --   {bar=2, foo=1} 

ROW

ROW fields can be any supported data type or a combination of different field data types.

--Create a ROW table.
create table row_tb (id int,col1 row(a int,b varchar));

--Insert data of the ROW type.
insert into row_tb values (1,ROW(1,'HetuEngine'));

--Query data.
select * from row_tb; -- 
 id |     col1     
----|--------------
  1 | {a=1, b=HetuEngine} 

--Fields can be named. By default, a Row field is not named.
select  row(1,2e0),CAST(ROW(1, 2e0) AS ROW(x BIGINT, y DOUBLE));
           _col0          |    _col1     
------------------------|--------------
 {field0=1, field1=2.0} | {x=1, y=2.0} 
(1 row) 

--Named fields can be accessed using the domain operator ".".
select col1.b from row_tb; -- HetuEngine

--Both named and unnamed fields can be accessed through location indexes, which start from 1 and must be a constant.
select col1[1] from row_tb;   -- 1

IPADDRESS

IP address, which can be an IPv4 or IPv6 address. In the system, however, this type of address is a unified IPv6 address.

IPv4 is supported by mapping IPv4 addresses to the IPv6 address range (RFC 4291#section-2.5.5.2). When an IPv4 address is created, it is mapped to an IPv6 address. During formatting, if the data is IPv4, the data will be mapped to IPv4 again. Other addresses are formatted according to the format defined in RFC 5952.

Example:

select IPADDRESS '10.0.0.1', IPADDRESS '2001:db8::1';
  _col0   |    _col1
----------|-------------
 10.0.0.1 | 2001:db8::1
(1 row)

UUID

A standard universally unique identifier (UUID) is also called a globally unique identifier (GUID).

It complies with the format defined in RFC 4122.

Example:

select UUID '12151fd2-7586-11e9-8f9e-2a86e4085a59';
                _col0
--------------------------------------
 12151fd2-7586-11e9-8f9e-2a86e4085a59
(1 row)

HYPERLOGLOG

Base statistics.

The cost of using HyperLogLog to approximate the count value of a unique number is far less than that of using count.

For details, see HyperLogLog Function.

  • HyperLogLog

    A HyperlogLog Sketch can be used to efficiently calculate the approximate value of distinct().

    It begins with a sparse representation, then becomes a dense representation. And at this time efficiency becomes higher.

  • P4HyperLogLog

    Similar to a HyperlogLog Sketch, but it starts with a dense representation.

QDIGEST

A quantile, also referred to as a quantile point, refers to that a probability distribution range of a random variable is divided into several equal numerical points. Commonly used quantile points include a median (that is, a 2-quantile), a 4-quartile, and a 100-percentile. Quantile digest is a set of quantiles. When the data to be queried is close to a quantile, the quantile can be used as the approximate value of the data to be queried. Its precision can be adjusted, but higher precision results in expensive space overhead.

STRUCT

The bottom layer is implemented using ROW. For details, see ROW.

Example:

-- Creating a STRUCT table.
create table struct_tab (id int,col1 struct<col2: integer, col3: string>);

-- Insert data of the STRUCT type.
 insert into struct_tab VALUES(1, struct<2, 'HetuEngine'>);

--Query data.
select * from struct_tab; 
 id |        col1         
----|---------------------
  1 | {col2=2, col3=HetuEngine}