Updated on 2024-02-21 GMT+08:00

Updating Table Column Statistics

Function

This API is used to update table column statistics.

URI

POST /v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/column-statistics

Table 1 Path Parameters

Parameter

Mandatory

Type

Description

project_id

Yes

String

Project ID. For details about how to obtain the project ID, see Obtaining a Project ID.

instance_id

Yes

String

LakeFormation instance ID. The value is automatically generated when the instance is created, for example, 2180518f-42b8-4947-b20b-adfc53981a25.

catalog_name

Yes

String

Catalog name. The value should contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed.

database_name

Yes

String

Database name. The value should contain 1 to 128 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed.

table_name

Yes

String

Table name. The value should contain 1 to 256 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed.
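
The path parameter rules in Table 1 can be checked on the client before a request is sent. The following Python sketch is illustrative only: the function name is hypothetical, "letters" is assumed to mean ASCII letters, and the endpoint value must come from your own region configuration.

```python
import re

# Character rules copied from Table 1. Assumption: "letters" means ASCII
# letters. The helper name is hypothetical, not part of the LakeFormation API.
_NAME_RULES = {
    "catalog_name": re.compile(r"[A-Za-z0-9_]{1,256}"),
    "database_name": re.compile(r"[A-Za-z0-9_-]{1,128}"),
    "table_name": re.compile(r"[A-Za-z0-9_-]{1,256}"),
}


def build_column_statistics_url(endpoint, project_id, instance_id,
                                catalog_name, database_name, table_name):
    """Return the request URL, rejecting names that break the Table 1 rules."""
    names = {
        "catalog_name": catalog_name,
        "database_name": database_name,
        "table_name": table_name,
    }
    for field, value in names.items():
        if not _NAME_RULES[field].fullmatch(value):
            raise ValueError(f"invalid {field}: {value!r}")
    return (
        f"https://{endpoint}/v1/{project_id}/instances/{instance_id}"
        f"/catalogs/{catalog_name}/databases/{database_name}"
        f"/tables/{table_name}/column-statistics"
    )
```

A catalog name that contains a hyphen is rejected by this sketch, because Table 1 allows hyphens only in database and table names.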

Request Parameters

Table 2 Request header parameters

Parameter

Mandatory

Type

Description

X-Auth-Token

Yes

Array of strings

Tenant token.

Table 3 Request body parameters

Parameter

Mandatory

Type

Description

merge

No

Boolean

Whether to merge statistics. The default value is false.

table_column_statistics

Yes

TableColumnStatistics object

Table column statistics.

Table 4 TableColumnStatistics

Parameter

Mandatory

Type

Description

column_statistics_desc

Yes

TableColumnStatisticsDescription object

Table column statistics description.

column_statistics_objects

Yes

Array of ColumnStatisticsObj objects

Column statistics.

Table 5 TableColumnStatisticsDescription

Parameter

Mandatory

Type

Description

last_analyzed_time

Yes

String

Time when the statistics were last collected.

Table 6 ColumnStatisticsObj

Parameter

Mandatory

Type

Description

column_name

Yes

String

Column name. The value can contain 1 to 767 characters. Only letters, digits, and special characters (_-+*(),) are allowed.

column_type

Yes

String

Data type, including array, bigint, binary, boolean, char, date, decimal, double, float, int, interval, map, set, smallint, string, struct, timestamp, tinyint, union, and varchar.

data_type

Yes

String

Statistics type, including binaryStats, booleanStats, dateStats, decimalStats, doubleStats, longStats, and stringStats.

Enumeration values:

  • binaryStats

  • booleanStats

  • dateStats

  • decimalStats

  • doubleStats

  • longStats

  • stringStats

binary_statistics_data

No

BinaryColumnStatisticsData object

Statistics on byte arrays.

long_statistics_data

No

LongColumnStatisticsData object

Statistics on long integers.

decimal_statistics_data

No

DecimalColumnStatisticsData object

Statistics on decimal values.

string_statistics_data

No

StringColumnStatisticsData object

Statistics on strings.

double_statistics_data

No

DoubleColumnStatisticsData object

Statistics on floating point numbers.

date_statistics_data

No

DateColumnStatisticsData object

Statistics on date values.

boolean_statistics_data

No

BooleanColumnStatisticsData object

Statistics on Boolean data.
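
Table 6 marks every *_statistics_data field as optional, while data_type is mandatory. The Python sketch below assumes that each data_type value pairs with the statistics field of the matching name (for example, stringStats with string_statistics_data). That pairing is inferred from the field names and is not stated by this API reference, and the function name is hypothetical.

```python
# Assumed pairing between data_type values and the statistics fields of
# ColumnStatisticsObj, inferred from the matching names in Table 6.
STATS_FIELD_BY_DATA_TYPE = {
    "binaryStats": "binary_statistics_data",
    "booleanStats": "boolean_statistics_data",
    "dateStats": "date_statistics_data",
    "decimalStats": "decimal_statistics_data",
    "doubleStats": "double_statistics_data",
    "longStats": "long_statistics_data",
    "stringStats": "string_statistics_data",
}


def check_column_statistics_obj(obj):
    """Return a list of problems found in one ColumnStatisticsObj dict."""
    problems = []
    # The three fields marked mandatory in Table 6.
    for key in ("column_name", "column_type", "data_type"):
        if key not in obj:
            problems.append(f"missing mandatory field: {key}")
    data_type = obj.get("data_type")
    if data_type is not None:
        field = STATS_FIELD_BY_DATA_TYPE.get(data_type)
        if field is None:
            problems.append(f"unknown data_type: {data_type}")
        elif field not in obj:
            problems.append(f"{data_type} expects {field}")
    return problems
```

An empty list means the object passed these client-side checks; the service may still apply further validation.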

Table 7 BinaryColumnStatisticsData

Parameter

Mandatory

Type

Description

maximum_length

Yes

Long

Maximum length of byte arrays in a column.

average_length

Yes

Double

Average length of byte arrays in a column.

number_of_null

Yes

Long

Number of null values in a column.

Table 8 LongColumnStatisticsData

Parameter

Mandatory

Type

Description

minimum_value

Yes

Long

Minimum long integer value in a column.

maximum_value

Yes

Long

Maximum long integer value in a column.

number_of_null

Yes

Long

Number of null values in a column.

number_of_distinct_value

Yes

Long

Number of distinct long integer values in a column.

bit_vector

No

String

Bitmap used for estimating unique values.
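
As an illustration of how the mandatory fields in Table 8 relate to the column data, the following Python sketch derives them from an in-memory list of values. The function name is hypothetical, None stands for a null value, and the optional bit_vector field is left out because its encoding is not described here.

```python
def long_statistics(values):
    """Build a LongColumnStatisticsData dict from ints and None values."""
    present = [v for v in values if v is not None]
    if not present:
        # minimum_value and maximum_value are mandatory, so an all-null
        # column cannot be described by this sketch.
        raise ValueError("at least one non-null value is needed for min/max")
    return {
        "minimum_value": min(present),
        "maximum_value": max(present),
        "number_of_null": len(values) - len(present),
        "number_of_distinct_value": len(set(present)),
    }
```

In practice these figures usually come from a compute engine's analyze job rather than from client-side code; the sketch only shows what each field counts.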

Table 9 DecimalColumnStatisticsData

Parameter

Mandatory

Type

Description

minimum_value

Yes

Decimal object

Minimum decimal value in a column.

maximum_value

Yes

Decimal object

Maximum decimal value in a column.

number_of_null

Yes

Long

Number of null values in a column.

number_of_distinct_value

Yes

Long

Number of distinct decimal values in a column.

bit_vector

No

String

Bitmap used for estimating unique values.

Table 10 Decimal

Parameter

Mandatory

Type

Description

scale

No

Integer

Scale of the decimal value, that is, the number of digits to the right of the decimal point.

unscaled

No

String

Unscaled value of the decimal.

Table 11 StringColumnStatisticsData

Parameter

Mandatory

Type

Description

average_length

Yes

Double

Average length of strings in a column.

maximum_length

Yes

Long

Maximum length of strings in a column.

number_of_null

Yes

Long

Number of null values in a column.

number_of_distinct_value

Yes

Long

Number of distinct strings in a column.

bit_vector

No

String

Bitmap used for estimating unique values.
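
The mandatory fields in Table 11 can be derived in the same spirit. The Python sketch below is a minimal illustration with a hypothetical function name. It assumes that length is measured in characters, which this reference does not specify, and it omits the optional bit_vector field.

```python
def string_statistics(values):
    """Build a StringColumnStatisticsData dict from str and None values."""
    present = [v for v in values if v is not None]
    # Assumption: length is counted in characters, not bytes.
    lengths = [len(v) for v in present]
    return {
        "average_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "maximum_length": max(lengths, default=0),
        "number_of_null": len(values) - len(present),
        "number_of_distinct_value": len(set(present)),
    }
```

Reporting 0 for an empty or all-null column is a choice made for this sketch, not documented service behavior.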

Table 12 DoubleColumnStatisticsData

Parameter

Mandatory

Type

Description

minimum_value

Yes

Double

Minimum floating point number in a column.

maximum_value

Yes

Double

Maximum floating point number in a column.

number_of_null

Yes

Long

Number of null values in a column.

number_of_distinct_value

Yes

Long

Number of distinct floating point numbers in a column.

bit_vector

No

String

Bitmap used for estimating unique values.

Table 13 DateColumnStatisticsData

Parameter

Mandatory

Type

Description

minimum_value

No

String

Minimum timestamp in a column.

maximum_value

No

String

Maximum timestamp in a column.

number_of_null

Yes

Long

Number of null values in a column.

number_of_distinct_value

Yes

Long

Number of distinct timestamps in a column.

bit_vector

No

String

Bitmap used for estimating unique values.

Table 14 BooleanColumnStatisticsData

Parameter

Mandatory

Type

Description

number_of_true

Yes

Long

Number of true values in a column.

number_of_false

Yes

Long

Number of false values in a column.

number_of_null

Yes

Long

Number of null values in a column.

Response Parameters

Status code: 200

Table 15 Response body parameters

Parameter

Type

Description

column_statistics_desc

TableColumnStatisticsDescription object

Table column statistics description.

column_statistics_objects

Array of ColumnStatisticsObj objects

Column statistics.

Table 16 TableColumnStatisticsDescription

Parameter

Type

Description

last_analyzed_time

String

Time when the statistics were last collected.

Table 17 ColumnStatisticsObj

Parameter

Type

Description

column_name

String

Column name. The value can contain 1 to 767 characters. Only letters, digits, and special characters (_-+*(),) are allowed.

column_type

String

Data type, including array, bigint, binary, boolean, char, date, decimal, double, float, int, interval, map, set, smallint, string, struct, timestamp, tinyint, union, and varchar.

data_type

String

Statistics type, including binaryStats, booleanStats, dateStats, decimalStats, doubleStats, longStats, and stringStats.

Enumeration values:

  • binaryStats

  • booleanStats

  • dateStats

  • decimalStats

  • doubleStats

  • longStats

  • stringStats

binary_statistics_data

BinaryColumnStatisticsData object

Statistics on byte arrays.

long_statistics_data

LongColumnStatisticsData object

Statistics on long integers.

decimal_statistics_data

DecimalColumnStatisticsData object

Statistics on decimal values.

string_statistics_data

StringColumnStatisticsData object

Statistics on strings.

double_statistics_data

DoubleColumnStatisticsData object

Statistics on floating point numbers.

date_statistics_data

DateColumnStatisticsData object

Statistics on date values.

boolean_statistics_data

BooleanColumnStatisticsData object

Statistics on Boolean data.

Table 18 BinaryColumnStatisticsData

Parameter

Type

Description

maximum_length

Long

Maximum length of byte arrays in a column.

average_length

Double

Average length of byte arrays in a column.

number_of_null

Long

Number of null values in a column.

Table 19 LongColumnStatisticsData

Parameter

Type

Description

minimum_value

Long

Minimum long integer value in a column.

maximum_value

Long

Maximum long integer value in a column.

number_of_null

Long

Number of null values in a column.

number_of_distinct_value

Long

Number of distinct long integer values in a column.

bit_vector

String

Bitmap used for estimating unique values.

Table 20 DecimalColumnStatisticsData

Parameter

Type

Description

minimum_value

Decimal object

Minimum decimal value in a column.

maximum_value

Decimal object

Maximum decimal value in a column.

number_of_null

Long

Number of null values in a column.

number_of_distinct_value

Long

Number of distinct decimal values in a column.

bit_vector

String

Bitmap used for estimating unique values.

Table 21 Decimal

Parameter

Type

Description

scale

Integer

Scale of the decimal value, that is, the number of digits to the right of the decimal point.

unscaled

String

Unscaled value of the decimal.

Table 22 StringColumnStatisticsData

Parameter

Type

Description

average_length

Double

Average length of strings in a column.

maximum_length

Long

Maximum length of strings in a column.

number_of_null

Long

Number of null values in a column.

number_of_distinct_value

Long

Number of distinct strings in a column.

bit_vector

String

Bitmap used for estimating unique values.

Table 23 DoubleColumnStatisticsData

Parameter

Type

Description

minimum_value

Double

Minimum floating point number in a column.

maximum_value

Double

Maximum floating point number in a column.

number_of_null

Long

Number of null values in a column.

number_of_distinct_value

Long

Number of distinct floating point numbers in a column.

bit_vector

String

Bitmap used for estimating unique values.

Table 24 DateColumnStatisticsData

Parameter

Type

Description

minimum_value

String

Minimum timestamp in a column.

maximum_value

String

Maximum timestamp in a column.

number_of_null

Long

Number of null values in a column.

number_of_distinct_value

Long

Number of distinct timestamps in a column.

bit_vector

String

Bitmap used for estimating unique values.

Table 25 BooleanColumnStatisticsData

Parameter

Type

Description

number_of_true

Long

Number of true values in a column.

number_of_false

Long

Number of false values in a column.

number_of_null

Long

Number of null values in a column.

Status code: 400

Table 26 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error message.

solution_msg

String

Solution.

Status code: 404

Table 27 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error message.

solution_msg

String

Solution.

Status code: 500

Table 28 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error message.

solution_msg

String

Solution.
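
The 400, 404, and 500 response bodies share the same three fields, so a client can handle them in one place. The following Python sketch shows one possible approach. The exception class and function name are hypothetical, and solution_msg is treated as optional because a response may not include it.

```python
import json


class LakeFormationError(Exception):
    """Hypothetical exception carrying the fields of an error response body."""

    def __init__(self, status, error_code, error_msg, solution_msg=None):
        super().__init__(f"{status} {error_code}: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg
        self.solution_msg = solution_msg


def parse_error(status, body_text):
    """Turn an error response body (JSON text) into a LakeFormationError."""
    body = json.loads(body_text)
    return LakeFormationError(
        status,
        body.get("error_code"),
        body.get("error_msg"),
        body.get("solution_msg"),  # None when the field is absent
    )
```

A caller would typically raise the returned exception after receiving a non-200 status code.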

Example Requests

POST https://{endpoint}/v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/column-statistics

{
  "merge" : false,
  "table_column_statistics" : {
    "column_statistics_desc" : {
      "last_analyzed_time" : "1970-01-01T00:00:00.100+00:00"
    },
    "column_statistics_objects" : [ {
      "column_name" : "column_prefix1",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix2",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix3",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    } ]
  }
}
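
A request like the one above could be assembled with the Python standard library as sketched below. The function name is hypothetical, the URL and token are supplied by the caller, and the Content-Type header is an assumption because this reference lists only X-Auth-Token.

```python
import json
import urllib.request


def build_update_request(url, token, column_objects, last_analyzed_time,
                         merge=False):
    """Return a POST request that carries the body described in Table 3."""
    body = {
        "merge": merge,
        "table_column_statistics": {
            "column_statistics_desc": {
                "last_analyzed_time": last_analyzed_time,
            },
            "column_statistics_objects": column_objects,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-Auth-Token": token,
            # Assumption: the service expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send the request; the sketch stops before that step so that it can be run without network access.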

Example Responses

Status code: 200

OK

{
  "column_statistics_desc" : {
    "last_analyzed_time" : "2023-05-31T02:25:35.614+00:00"
  },
  "column_statistics_objects" : [ {
    "column_name" : "1f3cbc18c07434435900b9cc7ba77678e",
    "column_type" : "bigint",
    "data_type" : "longStats",
    "long_statistics_data" : {
      "minimum_value" : -1469440606,
      "maximum_value" : 1927485019,
      "number_of_null" : -762838456,
      "number_of_distinct_value" : 531813078,
      "bit_vector" : "AWioLRcudhP0QQ=="
    }
  } ]
}
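
A 200 response body can be reduced to a per-column summary as in the following Python sketch. The function name is hypothetical, and the sketch assumes that each returned object carries at most one *_statistics_data field.

```python
import json


def summarize_statistics(response_text):
    """Map each column name to (data_type, statistics dict) from a 200 body."""
    body = json.loads(response_text)
    summary = {}
    for obj in body["column_statistics_objects"]:
        # Pick the first field whose name ends with "_statistics_data";
        # fall back to an empty dict when none is present.
        stats = next(
            (v for k, v in obj.items() if k.endswith("_statistics_data")),
            {},
        )
        summary[obj["column_name"]] = (obj["data_type"], stats)
    return summary
```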

Status code: 400

Bad Request

{
  "error_code" : "common.01000001",
  "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException"
}

Status code: 401

Unauthorized

{
  "error_code" : "APIG.1002",
  "error_msg" : "Incorrect token or token resolution failed"
}

Status code: 403

Forbidden

{
  "error" : {
    "code" : "403",
    "message" : "X-Auth-Token is invalid in the request",
    "error_code" : null,
    "error_msg" : null,
    "title" : "Forbidden"
  },
  "error_code" : "403",
  "error_msg" : "X-Auth-Token is invalid in the request",
  "title" : "Forbidden"
}

Status code: 404

Not Found

{
  "error_code" : "common.01000001",
  "error_msg" : "response status exception, code: 404"
}

Status code: 408

Request Timeout

{
  "error_code" : "common.00000408",
  "error_msg" : "timeout exception occurred"
}

Status code: 500

Internal Server Error

{
  "error_code" : "common.00000500",
  "error_msg" : "internal error"
}

Status Codes

Status Code

Description

200

OK

201

Created

400

Bad Request

401

Unauthorized

403

Forbidden

404

Not Found

408

Request Timeout

500

Internal Server Error

Error Codes

See Error Codes.