Help Center> GaussDB> Distributed_2.x> SQL Reference> Functions and Operators> Distribution Column Recommendation Functions
Updated on 2023-10-23 GMT+08:00

Distribution Column Recommendation Functions

Distribution column recommendation is used to recommend distribution columns and distribution modes in a distributed database. The purpose is to reduce the labor cost of selecting distribution columns during service migration or rollout.

  • sqladvisor.init(char, boolean, boolean, boolean, int, int)

Description: Initializes parameters.

Return type: Boolean

Table 1 Parameter description of init

Parameter

Type

Description

Mandatory or Not

kind

char

Recommendation type. Currently, this parameter can only be set to D.

Yes

isUseCost

boolean

Indicates whether optimizers are used. If data is available, optimizers are used.

Yes

isUseCollect

boolean

Indicates whether the analysis is started from the collected load. The default value is false.

No

isConstraintPrimaryKey

boolean

Indicates whether primary key constraints are retained. The default value is true.

No

sqlCount

int

Number of collected SQL statements. The default value is 10000. The value ranges from 1 to 100000.

No

maxMemory

int

Maximum memory occupied by distribution column recommendation. The default value is 1024. The value ranges from 1 to 10240, in MB.

No

  • sqladvisor.set_weight_params(real, real, real)

Description: Sets the weight of different components in heuristic rules. A default parameter is set when the init function is invoked. This function does not need to be invoked during analysis.

Return type: Boolean

Table 2 Parameter description of set_weight_params

Parameter

Type

Description

Mandatory or Not

joinWeight

real

Weight of JOIN. The value ranges from 0 to 1000.

Yes

goupbyWeight

real

Weight of GROUP BY. The value ranges from 0 to 1000.

Yes

qualWeight

real

Weight of predicate. The value ranges from 0 to 1000.

Yes

This function is optional. When the init function is executed, the default weights of JOIN, GROUP BY, and predicate are preset to 1.0, 0.1, and 0.05, respectively.

  • sqladvisor.set_cost_params(bigint, boolean, text)

Description: Parameter that can be set in the Whtif cost model.

Return type: Boolean

Table 3 Parameter description of set_cost_params

Parameter

Type

Description

Mandatory or Not

maxTime

bigint

Maximum recommendation duration, in minutes. If the value is less than or equal to 0, the duration is not limited by default.

Yes

isTotalSQL

boolean

Indicates whether all SQL statements are used for calculation. The value true indicates that all SQL statements are used for calculation. The value false indicates that SQL statements whose cost is too high or too low are filtered out based on the percentile.

Yes

compressLevel

text

Search space size of the recommendation algorithm. The options are low, med, and high.

Yes

  • This function is optional. When the init function is executed, maxTime is preset to –1, isTotalSQL is preset to true, and compressLevel is preset to high.
  • A lower compression level indicates longer time, and it is more likely that a better result can be achieved.
  • sqladvisor.assign_table_type(text)

Description: Specifies a table as a replication table.

Parameter: table name

Return type: Boolean

The specified replication table must be used before analyze_query and analyze_workload are invoked.

  • sqladvisor.analyze_query(text, int)

Description: Imports SQL statements to be recommended and analyzes the components of the statements.

Return type: Boolean

Table 4 Parameter description of analyze_query

Parameter

Type

Description

Mandatory or Not

query

text

SQL statement

Yes

frequency

int

Frequency of a statement in the load. The default value is 1. The value ranges from 1 to 2147483647.

No

  • If the value of the query parameter contains special characters, such as single quotation marks ('), you can use single quotation marks (') to escape the special characters.
  • This function is not supported in semi-online mode.
  • sqladvisor.analyze_workload()

Description: Analyzes the load information collected online.

Return type: Boolean

  • sqladvisor.get_analyzed_result(text)

Description: Obtains beneficial components extracted from the current table.

Parameter: text

Return type: record

The following table describes return fields.

Name

Type

Description

schema_name

text

Schema name

table_name

text

Table name

col_name

text

Column name

operator

text

Operator type

count

int

Number of times that an operator is used

  • sqladvisor.run()

Description: Performs calculation and analysis based on the specified schema and input SQL statements.

Return type: Boolean

  • sqladvisor.get_distribution_key()

Description: Obtains the recommendation result.

The analysis result is saved in a session. If the session disconnects, the result will be lost.

Return type: record

The following table describes return fields.

Name

Type

Description

db_name

text

Database name

schema_name

text

Schema name

table_name

text

Table name

distribution_type

text

Recommended distribution type

distribution_key

text

Recommended distribution column

start_time

timestamp

Recommended start time

end_time

timestamp

Recommended end time

cost_improve

text

Cost increase brought by the recommendation result

comment

text

Comment

  • sqladvisor.clean()

Description: Clears all the memory in the recommendation process of a session.

Return type: Boolean

  • sqladvisor.start_collect_workload(int, int)

Description: Starts online load collection.

Return type: Boolean

Table 5 Parameter description of start_collect_workload

Parameter

Type

Description

Mandatory or Not

sqlCount

int

Maximum number of SQL statements for online load collection. The value ranges from 1 to 100000 and the default value is 10000.

Yes

maxMemory

int

Maximum memory occupied by online load collection. The default value is 1024. The value ranges from 1 to 10240, in MB.

Yes

  • The online collection function can be invoked only by the system administrator.
  • The load of only one database can be collected at a time.
  • Currently, only common SQL statements as well as DML and DQL statements in stored procedures are supported.
  • sqladvisor.end_collect_workload()

Description: Disables online load collection.

Return type: Boolean

  • The online collection function can be disabled only by the system administrator.
  • sqladvisor.clean_workload()

Description: Clears the memory in the load.

Return type: Boolean

  • The function of clearing memory in the load can be invoked only by the system administrator.
  • You must manually execute the cleanup function.

Suggestions

  • Invoke state machines in the heuristic or WhatIf cost recommendation mode.

  • Invoke state machines in semi-online recommendation mode.