Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive

HLL Functions and Operators

Updated on 2024-08-20 GMT+08:00

Hash Functions

  • hll_hash_boolean(bool)

    Description: Hashes data of the Boolean type.

    Return type: hll_hashval

    Example:
    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_boolean(FALSE);
      hll_hash_boolean   
    ---------------------
     -5451962507482445012
    (1 row)
    
  • hll_hash_boolean(bool, int32)

    Description: Configures a hash seed (that is, change the hash policy) and hashes data of the Boolean type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_boolean(FALSE, 10);
      hll_hash_boolean  
    --------------------
     -1169037589280886076
    (1 row)
    
  • hll_hash_smallint(smallint)

    Description: Hashes data of the smallint type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_smallint(100::smallint);
      hll_hash_smallint  
    ---------------------
     962727970174027904
    (1 row)
    
NOTE:

If parameters with the same numeric value are hashed using different data types, the data will differ, because hash functions select different calculation policies for each type.

  • hll_hash_smallint(smallint, int32)

    Description: Configures a hash seed (that is, change the hash policy) and hashes data of the smallint type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_smallint(100::smallint, 10);
      hll_hash_smallint  
    ---------------------
     -9056177146160443041
    (1 row)
    
  • hll_hash_integer(integer)

    Description: Hashes data of the integer type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_integer(0);
       hll_hash_integer   
    ----------------------
     5156626420896634997
    (1 row)
    
  • hll_hash_integer(integer, int32)

    Description: Hashes data of the integer type and configures a hash seed (that is, change the hash policy).

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_integer(0, 10);
      hll_hash_integer  
    --------------------
     -5035020264353794276
    (1 row)
    
  • hll_hash_bigint(bigint)

    Description: Hashes data of the bigint type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_bigint(100::bigint);
       hll_hash_bigint   
    ---------------------
     -2401963681423227794
    (1 row)
    
  • hll_hash_bigint(bigint, int32)

    Description: Hashes data of the bigint type and configures a hash seed (that is, changes the hash policy).

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_bigint(100::bigint, 10);
       hll_hash_bigint   
    ---------------------
     -2305749404374433531
    (1 row)
    
  • hll_hash_bytea(bytea)

    Description: Hashes data of the bytea type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_bytea(E'\\x');
     hll_hash_bytea 
    ----------------
     0
    (1 row)
    
  • hll_hash_bytea(bytea, int32)

    Description: Hashes data of the bytea type and configures a hash seed (that is, changes the hash policy).

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_bytea(E'\\x', 10);
       hll_hash_bytea    
    ---------------------
     7233188113542599437
    (1 row)
    
  • hll_hash_text(text)

    Description: Hashes data of the text type.

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_text('AB');
        hll_hash_text    
    ---------------------
     -5666002586880275174
    (1 row)
    
  • hll_hash_text(text, int32)

    Description: Hashes data of the text type and configures a hash seed (that is, changes the hash policy).

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_text('AB', 10);
    hll_hash_text
    ---------------------
    -2215507121143724132
    (1 row)
    
  • hll_hash_any(anytype)

    Description: Hashes data of any type.

    Return type: hll_hashval

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    gaussdb=# SELECT hll_hash_any(1);
         hll_hash_any     
    ----------------------
     -1316670585935156930
    (1 row)
    
    gaussdb=# SELECT hll_hash_any('08:00:2b:01:02:03'::macaddr);
         hll_hash_any     
    ----------------------
     -3719950434455589360
    (1 row)
    
  • hll_hash_any(anytype, int32)

    Description: Hashes data of any type and configures a hash seed (that is, changes the hash policy).

    Return type: hll_hashval

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hash_any(1, 10);
         hll_hash_any     
    ----------------------
     7048553517657992351
    (1 row)
    
  • hll_hashval_eq(hll_hashval, hll_hashval)

    Description: Compares two pieces of data of the hll_hashval type to check whether they are the same.

    Return type: Boolean

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hashval_eq(hll_hash_integer(1), hll_hash_integer(1));
     hll_hashval_eq 
    ----------------
     t
    (1 row)
    
  • hll_hashval_ne(hll_hashval, hll_hashval)

    Description: Compares two pieces of data of the hll_hashval type to check whether they are different.

    Return type: Boolean

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_hashval_ne(hll_hash_integer(1), hll_hash_integer(1));
     hll_hashval_ne 
    ----------------
     f
    (1 row)
    

HLL Functions

There are three HLL modes: explicit, sparse, and full. When the data size is small, the explicit mode is used. In this mode, distinct values are calculated without errors. As the number of distinct values increases, the HLL mode is switched to the sparse and full modes in sequence. The two modes have no difference in the calculation result, but vary in the calculation efficiency of HLL functions and the storage space of HLL objects. The following functions can be used to view some HLL parameters:

  • hll_print(hll)

    Description: Prints some debugging parameters of an HLL.

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_print(hll_empty());
                                       hll_print
    -------------------------------------------------------------------------------
     type=1(HLL_EMPTY), log2m=14, log2explicit=10, log2sparse=12, duplicatecheck=0
    (1 row)
    
  • hll_type(hll)

    Description: Checks the type of the current HLL. The return values are described as follows: 0 indicates HLL_UNINIT, an HLL object that is not initialized. 1 indicates HLL_EMPTY, an empty HLL object. 2 indicates HLL_EXPLICIT, an HLL object in explicit mode. 3 indicates HLL_SPARSE, an HLL object in sparse mode. 4 indicates HLL_FULL, an HLL object in full mode. 5 indicates HLL_UNDEFINED, an invalid HLL object.

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_type(hll_empty());
     hll_type 
    ----------
            1
    (1 row)
    
  • hll_log2m(hll)

    Description: Checks the value of log2m in the current HLL data structure. log2m is the logarithm of the number of buckets. This value affects the error rate of calculating distinct values by HLL. The error rate = ±1.04/√(2^log2m). When the value of log2m is explicitly set to a value ranging from 10 to 16, HLL sets the number of buckets to 2log2m. When the value of log2explicit is explicitly set to –1, the built-in default value is used.

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    gaussdb=# SELECT hll_log2m(hll_empty());
     hll_log2m
    -----------
            14
    (1 row)
    
    gaussdb=# SELECT hll_log2m(hll_empty(10));
     hll_log2m
    -----------
            10
    (1 row)
    
    gaussdb=# SELECT hll_log2m(hll_empty(-1));
     hll_log2m
    -----------
            14
    (1 row)
    
  • hll_log2explicit(hll)

    Description: Queries the value of log2explicit in the current HLL data structure. Generally, the HLL changes from the explicit mode to the sparse mode and then to the full mode. This process is called the promotion hierarchy policy. You can change the value of log2explicit to change the policy. For example, if the value of log2explicit is 0, the HLL will skip the explicit mode and directly enter the sparse mode. When the value of log2explicit is explicitly set to a value ranging from 1 to 12, the HLL will switch to the sparse mode when the length of the data segment exceeds 2log2explicit. When the value of log2explicit is explicitly set to –1, the built-in default value is used.

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    gaussdb=# SELECT hll_log2explicit(hll_empty());
     hll_log2explicit
    ------------------
                   10
    (1 row)
    
    gaussdb=# SELECT hll_log2explicit(hll_empty(12, 8));
     hll_log2explicit
    ------------------
                    8
    (1 row)
    
    gaussdb=# SELECT hll_log2explicit(hll_empty(12, -1));
     hll_log2explicit
    ------------------
                   10
    (1 row)
    
  • hll_log2sparse(hll)

    Description: Queries the value of log2sparse in the current HLL data structure. Generally, the HLL changes from the explicit mode to the sparse mode and then to the full mode. This process is called the promotion hierarchy policy. You can adjust the value of log2sparse to change the policy. For example, if the value of log2sparse is 0, the system skips the sparse mode and directly enters the full mode. If the value of log2sparse is explicitly set to a value ranging from 1 to 14, the HLL will switch to the full mode when the length of the data segment exceeds 2log2sparse. When the value of log2sparse is explicitly set to –1, the built-in default value is used.

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    gaussdb=# SELECT hll_log2sparse(hll_empty());
     hll_log2sparse
    ----------------
                 12
    (1 row)
    
    gaussdb=# SELECT hll_log2sparse(hll_empty(12, 8, 10));
     hll_log2sparse
    ----------------
                 10
    (1 row)
    
    gaussdb=# SELECT hll_log2sparse(hll_empty(12, 8, -1));
     hll_log2sparse
    ----------------
                 12
    (1 row)
    
  • hll_duplicatecheck(hll)

    Description: Specifies whether duplicate check is enabled. 0: disable; 1: enable. This function is disabled by default. If there are many duplicate values, you can enable this function to improve efficiency. When the value of duplicatecheck is explicitly set to –1, the built-in default value is used.

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    gaussdb=# SELECT hll_duplicatecheck(hll_empty());
     hll_duplicatecheck
    --------------------
                      0
    (1 row)
    
    gaussdb=# SELECT hll_duplicatecheck(hll_empty(12, 8, 10, 1));
     hll_duplicatecheck
    --------------------
                      1
    (1 row)
    
    gaussdb=# SELECT hll_duplicatecheck(hll_empty(12, 8, 10, -1));
     hll_duplicatecheck
    --------------------
                      0
    (1 row)
    

Functional Functions

  • hll_empty()

    Description: Creates an empty HLL.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_empty();
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000002b05000000000000000000000000000000000000
    (1 row)
    
  • hll_empty(int32 log2m)

    Description: Creates an empty HLL and sets the log2m parameter. The parameter value ranges from 10 to 16. If the input is –1, the built-in default value is used.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    gaussdb=# SELECT hll_empty(10);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000002b04000000000000000000000000000000000000
    (1 row)
    
    gaussdb=# SELECT hll_empty(-1);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000002b05000000000000000000000000000000000000
    (1 row)
    
  • hll_empty(int32 log2m, int32 log2explicit)

    Description: Creates an empty HLL and sets the log2m and log2explicit parameters in sequence. The value of log2explicit ranges from 0 to 12. The value 0 indicates that the explicit mode is skipped. This parameter is used to set the threshold of the explicit mode. When the length of the data segment reaches 2log2explicit, the mode is switched to the sparse or full mode. If the input is –1, the built-in default value of log2explicit is used.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    gaussdb=# SELECT hll_empty(10, 4);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000001304000000000000000000000000000000000000
    (1 row)
    
    gaussdb=# SELECT hll_empty(10, -1);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000002b04000000000000000000000000000000000000
    (1 row)
    
  • hll_empty(int32 log2m, int32 log2explicit, int64 log2sparse)

    Description: Creates an empty HLL and sets the log2m, log2explicit and log2sparse parameters in sequence. The value of log2sparse ranges from 0 to 14. The value 0 indicates that the sparse mode is skipped. This parameter is used to set the threshold of the sparse mode. When the length of the data segment reaches 2log2sparse, the mode is switched to the full mode. If the input is –1, the built-in default value of log2sparse is used.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    gaussdb=# SELECT hll_empty(10, 4, 8);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000001204000000000000000000000000000000000000
    (1 row)
    
    gaussdb=# SELECT hll_empty(10, 4, -1);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000001304000000000000000000000000000000000000
    (1 row)
    
  • hll_empty(int32 log2m, int32 log2explicit, int64 log2sparse, int32 duplicatecheck)

    Description: Creates an empty HLL and sets the log2m, log2explicit, log2sparse, and duplicatecheck parameters in sequence. The value of duplicatecheck is 0 or 1, indicating whether the duplicate check mode is enabled. By default, this mode is disabled. If the input is –1, the built-in default value of duplicatecheck is used.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    gaussdb=# SELECT hll_empty(10, 4, 8, 0);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000001204000000000000000000000000000000000000
    (1 row)
    
    gaussdb=# SELECT hll_empty(10, 4, 8, -1);
                             hll_empty
    ------------------------------------------------------------
     \x484c4c00000000001204000000000000000000000000000000000000
    (1 row)
    
  • hll_add(hll, hll_hashval)

    Description: Adds hll_hashval to an HLL.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_add(hll_empty(), hll_hash_integer(1));
                                      hll_add
    ----------------------------------------------------------------------------
     \x484c4c08000002002b0900000000000000f03f3e2921ff133fbaed3e2921ff133fbaed00
    (1 row)
    
  • hll_add_rev(hll_hashval, hll)

    Description: Adds hll_hashval to an HLL. This function works the same as hll_add, except that the positions of parameters are switched.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_add_rev(hll_hash_integer(1), hll_empty());
                                    hll_add_rev
    ----------------------------------------------------------------------------
     \x484c4c08000002002b0900000000000000f03f3e2921ff133fbaed3e2921ff133fbaed00
    (1 row)
    
  • hll_eq(hll, hll)

    Description: Compares two HLLs to check whether they are the same.

    Return type: Boolean

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_eq(hll_add(hll_empty(), hll_hash_integer(1)), hll_add(hll_empty(), hll_hash_integer(2)));
     hll_eq 
    --------
     f
    (1 row)
    
  • hll_ne(hll, hll)

    Description: Compares two HLLs to check whether they are different.

    Return type: Boolean

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_ne(hll_add(hll_empty(), hll_hash_integer(1)), hll_add(hll_empty(), hll_hash_integer(2)));
     hll_ne 
    --------
     t
    (1 row)
    
  • hll_cardinality(hll)

    Description: Calculates the number of distinct values of an HLL.

    Return type: int

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_cardinality(hll_empty() || hll_hash_integer(1));
     hll_cardinality 
    -----------------
                   1
    (1 row)
    
  • hll_union(hll, hll)

    Description: Performs the UNION operation on two HLL data structures to obtain one HLL.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_union(hll_add(hll_empty(), hll_hash_integer(1)), hll_add(hll_empty(), hll_hash_integer(2)));
                                             hll_union
    --------------------------------------------------------------------------------------------
     \x484c4c10002000002b090000000000000000400000000000000000b3ccc49320cca1ae3e2921ff133fbaed00
    (1 row)
    

Aggregate Functions

  • hll_add_agg(hll_hashval)

    Description: Groups hashed data into HLL.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    -- Prepare data:
    gaussdb=# CREATE TABLE t_id(id int);
    gaussdb=# INSERT INTO t_id VALUES(generate_series(1,500));
    gaussdb=# CREATE TABLE t_data(a int, c text);
    gaussdb=# INSERT INTO t_data SELECT mod(id,2), id FROM t_id;
    
    -- Create a table and specify an HLL column:
    gaussdb=# CREATE TABLE t_a_c_hll(a int, c hll);
    
    -- Use GROUP BY on column a to group data, and insert the data to the HLL:
    gaussdb=# INSERT INTO t_a_c_hll SELECT a, hll_add_agg(hll_hash_text(c)) FROM t_data GROUP BY a;
    
    -- Calculate the number of distinct values for each group in the HLL:
    gaussdb=# SELECT a, #c AS cardinality FROM t_a_c_hll ORDER BY a;
     a |   cardinality    
    ---+------------------
     0 | 247.862354346299
     1 | 250.908710610377
    (2 rows)
    
  • hll_add_agg(hll_hashval, int32 log2m)

    Description: Groups hashed data into HLL and specifies the log2m parameter. The value ranges from 10 to 16. If the input is –1 or NULL, the built-in default value is used.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), 12)) FROM t_data;
     hll_cardinality  
    ------------------
     497.965240179228
    (1 row)
    
  • hll_add_agg(hll_hashval, int32 log2m, int32 log2explicit)

    Description: Groups hashed data into HLL and specifies the log2m and log2explicit parameters in sequence. The value of log2explicit ranges from 0 to 12. The value 0 indicates that the explicit mode is skipped. This parameter is used to set the threshold of the explicit mode. When the length of the data segment reaches 2log2explicit, the mode is switched to the sparse or full mode. If the input is –1 or NULL, the built-in default value of log2explicit is used.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 1)) FROM t_data;
     hll_cardinality  
    ------------------
     498.496062953313
    (1 row)
    
  • hll_add_agg(hll_hashval, int32 log2m, int32 log2explicit, int64 log2sparse)

    Description: Groups hashed data into HLL and sets the log2m, log2explicit, and log2sparse parameters in sequence. The value of log2sparse ranges from 0 to 14. The value 0 indicates that the sparse mode is skipped. This parameter is used to set the threshold of the sparse mode. When the length of the data segment reaches 2log2sparse, the mode is switched to the full mode. If the input is –1 or NULL, the built-in default value of log2sparse is used.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 6, 10)) FROM t_data;
     hll_cardinality  
    ------------------
     498.496062953313
    (1 row)
    
  • hll_add_agg(hll_hashval, int32 log2m, int32 log2explicit, int64 log2sparse, int32 duplicatecheck)

    Description: Groups hashed data into HLL and sets the log2m, log2explicit, log2sparse, and duplicatecheck parameters. The value of duplicatecheck can be 0 or 1, indicating whether to enable this mode. By default, this mode is disabled. If the input is –1 or NULL, the built-in default value of duplicatecheck is used.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 6, 10, -1)) FROM t_data;
     hll_cardinality  
    ------------------
     498.496062953313
    (1 row)
    
  • hll_union_agg(hll)

    Description: Performs the UNION operation on multiple pieces of data of the HLL type to obtain one HLL.

    Return type: HLL

    Example:

    1
    2
    3
    4
    5
    6
    -- Perform the UNION operation on data of the HLL type in each group to obtain one HLL, and calculate the number of distinct values:
    gaussdb=# SELECT #hll_union_agg(c) AS cardinality FROM t_a_c_hll;
       cardinality    
    ------------------
     498.496062953313
    (1 row)
    
    NOTE:

    To perform the UNION operation on data in multiple HLLs, ensure that the HLLs have the same precision. Otherwise, UNION cannot be performed. This constraint also applies to the hll_union(hll, hll) function.

Obsolete Functions

Some old HLL functions are discarded due to version upgrade. You can replace them with similar functions.

  • hll_schema_version(hll)

    Description: Checks the schema version in the current HLL. In earlier versions, the schema version is fixed at 1, which is used to verify the header of the HLL field. After refactoring, the HLL field is added to the header for verification. The schema version is no longer used.

  • hll_regwidth(hll)

    Description: Queries the bucket size in the HLL data structure. In earlier versions, the value of regwidth ranges from 1 to 5, which has a large error and limits the upper limit of the cardinality estimation. After refactoring, the value of regwidth is fixed at 6 and the variable is not used.

  • hll_expthresh(hll)

    Description: Obtains the value of expthresh in the current HLL. The hll_log2explicit(hll) function is used to replace similar functions.

  • hll_sparseon(hll)

    Description: Specifies whether the sparse mode is enabled. Use hll_log2sparse(hll) to replace similar functions. The value 0 indicates that the sparse mode is disabled.

Built-In Functions

HLL has a series of built-in functions for internal data processing. Generally, users do not need to know how to use these functions. For details, see Table 1.

Table 1 Built-In Functions

Function

Description

hll_in

Receives HLL data in string format.

hll_out

Sends HLL data in string format.

hll_recv

Receives HLL data in bytea format.

hll_send

Sends HLL data in bytea format.

hll_trans_in

Receives hll_trans_type data in string format.

hll_trans_out

Sends hll_trans_type data in string format.

hll_trans_recv

Receives hll_trans_type data in bytea format.

hll_trans_send

Sends hll_trans_type data in bytea format.

hll_typmod_in

Receives typmod data.

hll_typmod_out

Sends typmod data.

hll_hashval_in

Receives hll_hashval data.

hll_hashval_out

Sends hll_hashval data.

hll_add_trans0

Works similar to hll_add. No input parameter is specified during initialization. It is usually used on DNs in the first phase of aggregation operations.

hll_add_trans1

Works similar to hll_add. An input parameter is specified during initialization. It is usually used on DNs in the first phase of aggregation operations.

hll_add_trans2

Works similar to hll_add. Two input parameters are specified during initialization. It is usually used on DNs in the first phase of aggregation operations.

hll_add_trans3

Works similar to hll_add. Three input parameters are specified during initialization. It is usually used on DNs in the first phase of aggregation operations.

hll_add_trans4

Works similar to hll_add. Four input parameters are specified during initialization. It is usually used on DNs in the first phase of aggregation operations.

hll_union_trans

Works similar to hll_union and is used on DNs in the first phase of aggregation operations.

hll_union_collect

Works similar to hll_union and is used on DNs in the second phase of aggregation operations to summarize the results of each DN.

hll_pack

Converts the user-defined hll_trans_type to the HLL type on DNs in the third phase of aggregation operations.

hll

Converts an HLL type to another HLL type. Input parameters can be specified.

hll_hashval

Converts the bigint type to the hll_hashval type.

hll_hashval_int4

Converts the int4 type to the hll_hashval type.

Operators

  • =

    Description: Compares the values of the HLL or hll_hashval type to check whether they are the same.

    Return type: Boolean

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    --hll
    gaussdb=# SELECT (hll_empty() || hll_hash_integer(1)) = (hll_empty() || hll_hash_integer(1));
    column 
    ----------
     t
    (1 row)
    
    --hll_hashval
    gaussdb=# SELECT hll_hash_integer(1) = hll_hash_integer(1);
     ?column? 
    ----------
     t
    (1 row)
    
  • <> or !=

    Description: Compares the values of the HLL or hll_hashval type to check whether they are different.

    Return type: Boolean

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    --hll
    gaussdb=# SELECT (hll_empty() || hll_hash_integer(1)) <> (hll_empty() || hll_hash_integer(2));
     ?column? 
    ----------
     t
    (1 row)
    
    --hll_hashval
    gaussdb=# SELECT hll_hash_integer(1) <> hll_hash_integer(2);
     ?column? 
    ----------
     t
    (1 row)
    
  • ||

    Description: Represents the hll_add, hll_union, and hll_add_rev functions.

    Return type: HLL

    Example:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    --hll_add
    gaussdb=# SELECT hll_empty() || hll_hash_integer(1);
                                      ?column?
    ----------------------------------------------------------------------------
     \x484c4c08000002002b0900000000000000f03f3e2921ff133fbaed3e2921ff133fbaed00
    (1 row)
    
    --hll_add_rev
    gaussdb=# SELECT hll_hash_integer(1) || hll_empty();
                                      ?column?
    ----------------------------------------------------------------------------
     \x484c4c08000002002b0900000000000000f03f3e2921ff133fbaed3e2921ff133fbaed00
    (1 row)
    
    --hll_union
    gaussdb=# SELECT (hll_empty() || hll_hash_integer(1)) || (hll_empty() || hll_hash_integer(2));
                                              ?column?
    --------------------------------------------------------------------------------------------
     \x484c4c10002000002b090000000000000000400000000000000000b3ccc49320cca1ae3e2921ff133fbaed00
    (1 row)
    
  • #

    Description: Calculates the number of distinct values of an HLL. It works the same as the hll_cardinality function.

    Return type: int

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT #(hll_empty() || hll_hash_integer(1));
     ?column? 
    ----------
            1
    (1 row)
    

We use cookies to improve our site and your experience. By continuing to browse our site you accept our cookie policy. Find out more

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback