Updated on 2024-02-21 GMT+08:00

Obtaining Column Statistics

Function

This API is used to obtain the statistics of one or more specified columns of a table in a single request.

URI

POST /v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/column-statistics/batch-get

Table 1 Path Parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
| instance_id | Yes | String | LakeFormation instance ID. The value is automatically generated when the instance is created, for example, 2180518f-42b8-4947-b20b-adfc53981a25. |
| catalog_name | Yes | String | Catalog name. The value must contain 1 to 256 characters. Only letters, digits, and underscores (_) are allowed. |
| database_name | Yes | String | Database name. The value must contain 1 to 128 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. |
| table_name | Yes | String | Table name. The value must contain 1 to 256 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. |
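
The URI template and the naming rules in Table 1 can be combined into a small helper. The following is a minimal sketch, not an official client: the function name `build_column_statistics_url` is hypothetical, the regular expressions are one reading of the documented rules (they assume "letters" means ASCII letters), and the endpoint value is supplied by the caller.

```python
import re
from urllib.parse import quote

# Length and character rules taken from Table 1. The regexes are this sketch's
# interpretation (ASCII letters assumed), not an official validator.
_RULES = {
    "catalog_name": re.compile(r"[A-Za-z0-9_]{1,256}"),
    "database_name": re.compile(r"[A-Za-z0-9_-]{1,128}"),
    "table_name": re.compile(r"[A-Za-z0-9_-]{1,256}"),
}


def build_column_statistics_url(endpoint, project_id, instance_id,
                                catalog_name, database_name, table_name):
    """Return the batch-get URL, rejecting names that break the Table 1 rules."""
    names = {
        "catalog_name": catalog_name,
        "database_name": database_name,
        "table_name": table_name,
    }
    for key, value in names.items():
        if not _RULES[key].fullmatch(value):
            raise ValueError(f"{key} violates the documented naming rule: {value!r}")
    # Percent-encode every path segment so that no value can alter the path.
    parts = [project_id, instance_id, catalog_name, database_name, table_name]
    project, instance, catalog, database, table = (quote(p, safe="") for p in parts)
    return (
        f"https://{endpoint}/v1/{project}/instances/{instance}"
        f"/catalogs/{catalog}/databases/{database}/tables/{table}"
        "/column-statistics/batch-get"
    )
```

Validating on the client side only saves a round trip; the service remains the authority on which names it accepts.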

Request Parameters

Table 2 Request header parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| X-Auth-Token | Yes | Array of strings | Tenant token. |

Table 3 Request body parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| column_names | Yes | Array of strings | Names of the columns whose statistics are to be obtained. |

Response Parameters

Status code: 200

Table 4 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| [items] | Array of ColumnStatisticsObj objects | OK |

Table 5 ColumnStatisticsObj

| Parameter | Type | Description |
| --- | --- | --- |
| column_name | String | Column name. The value can contain 1 to 767 characters. Only letters, digits, and special characters (_-+*(),) are allowed. |
| column_type | String | Data type, including array, bigint, binary, boolean, char, date, decimal, double, float, int, interval, map, set, smallint, string, struct, timestamp, tinyint, union, and varchar. |
| data_type | String | Statistics type. Enumeration values: binaryStats, booleanStats, dateStats, decimalStats, doubleStats, longStats, stringStats. |
| binary_statistics_data | BinaryColumnStatisticsData object | Statistics on byte arrays. |
| long_statistics_data | LongColumnStatisticsData object | Statistics on long integers. |
| decimal_statistics_data | DecimalColumnStatisticsData object | Statistics on decimal values. |
| string_statistics_data | StringColumnStatisticsData object | Statistics on strings. |
| double_statistics_data | DoubleColumnStatisticsData object | Statistics on floating point numbers. |
| date_statistics_data | DateColumnStatisticsData object | Statistics on date values. |
| boolean_statistics_data | BooleanColumnStatisticsData object | Statistics on Boolean data. |
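
Each ColumnStatisticsObj carries one `data_type` value and seven possible statistics fields, so a caller usually wants only the field that matches the type. The sketch below shows one way to pick it. The pairing in `STATS_FIELD_BY_TYPE` is inferred from the parameter names in Table 5 and is an assumption, and `select_statistics` is a hypothetical helper name.

```python
# Assumed pairing between data_type values and the field that carries the
# matching statistics. Inferred from the names in Table 5, not stated by it.
STATS_FIELD_BY_TYPE = {
    "binaryStats": "binary_statistics_data",
    "booleanStats": "boolean_statistics_data",
    "dateStats": "date_statistics_data",
    "decimalStats": "decimal_statistics_data",
    "doubleStats": "double_statistics_data",
    "longStats": "long_statistics_data",
    "stringStats": "string_statistics_data",
}


def select_statistics(column_stats):
    """Return the statistics object that matches the entry's data_type.

    Returns None when the matching field is absent from the entry.
    """
    data_type = column_stats.get("data_type")
    field = STATS_FIELD_BY_TYPE.get(data_type)
    if field is None:
        raise ValueError(f"unknown data_type: {data_type!r}")
    return column_stats.get(field)
```

For example, an entry whose `data_type` is `longStats` yields the content of `long_statistics_data`, and the other six fields are ignored.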

Table 6 BinaryColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| maximum_length | Long | Maximum length of byte arrays in a column. |
| average_length | Double | Average length of byte arrays in a column. |
| number_of_null | Long | Number of null values in a column. |

Table 7 LongColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| minimum_value | Long | Minimum long integer value in a column. |
| maximum_value | Long | Maximum long integer value in a column. |
| number_of_null | Long | Number of null values in a column. |
| number_of_distinct_value | Long | Number of distinct long integer values in a column. |
| bit_vector | String | Bitmap used for estimating unique values. |

Table 8 DecimalColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| minimum_value | Decimal object | Minimum decimal value in a column. |
| maximum_value | Decimal object | Maximum decimal value in a column. |
| number_of_null | Long | Number of null values in a column. |
| number_of_distinct_value | Long | Number of distinct decimal values in a column. |
| bit_vector | String | Bitmap used for estimating unique values. |

Table 9 Decimal

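| Parameter | Type | Description |
| --- | --- | --- |
| scale | Integer | Scale of the decimal value, that is, the number of digits after the decimal point. |
| unscaled | String | Unscaled value of the decimal, that is, its digits without the decimal point. |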
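
The two fields of the Decimal object can be recombined into a single number. The sketch below assumes the usual meaning of these names, where `scale` is the number of digits after the decimal point and the value equals unscaled × 10^(-scale). It also assumes that `unscaled` arrives as a base-10 integer string. This page does not specify the encoding, so the parsing step may need to be adapted, and `decimal_from_parts` is a hypothetical helper name.

```python
from decimal import Decimal


def decimal_from_parts(scale, unscaled):
    """Combine scale and unscaled into a Decimal: unscaled * 10**(-scale).

    Assumes unscaled is a base-10 integer string. Adjust the parsing if the
    service returns another encoding.
    """
    return Decimal(int(unscaled)).scaleb(-scale)
```

Under these assumptions, a `scale` of 2 and an `unscaled` value of "12345" represent 123.45.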

Table 10 StringColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| average_length | Double | Average length of strings in a column. |
| maximum_length | Long | Maximum length of strings in a column. |
| number_of_null | Long | Number of null values in a column. |
| number_of_distinct_value | Long | Number of distinct strings in a column. |
| bit_vector | String | Bitmap used for estimating unique values. |

Table 11 DoubleColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| minimum_value | Double | Minimum floating point number in a column. |
| maximum_value | Double | Maximum floating point number in a column. |
| number_of_null | Long | Number of null values in a column. |
| number_of_distinct_value | Long | Number of distinct floating point numbers in a column. |
| bit_vector | String | Bitmap used for estimating unique values. |

Table 12 DateColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| minimum_value | String | Minimum timestamp in a column. |
| maximum_value | String | Maximum timestamp in a column. |
| number_of_null | Long | Number of null values in a column. |
| number_of_distinct_value | Long | Number of distinct timestamps in a column. |
| bit_vector | String | Bitmap used for estimating unique values. |

Table 13 BooleanColumnStatisticsData

| Parameter | Type | Description |
| --- | --- | --- |
| number_of_true | Long | Number of true values in a column. |
| number_of_false | Long | Number of false values in a column. |
| number_of_null | Long | Number of null values in a column. |
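
The three Boolean counters can be turned into proportions on the client side. The sketch below assumes that every row of the column is counted in exactly one of the three fields, so their sum is the row count. This page does not state that explicitly, and `boolean_summary` is a hypothetical helper name.

```python
def boolean_summary(stats):
    """Derive the row count and the share of true, false, and null values.

    Assumes the three counters partition the rows of the column.
    """
    true_count = stats.get("number_of_true", 0)
    false_count = stats.get("number_of_false", 0)
    null_count = stats.get("number_of_null", 0)
    total = true_count + false_count + null_count
    if total == 0:
        # Avoid dividing by zero for an empty column.
        return {"total": 0, "true_ratio": 0.0, "false_ratio": 0.0, "null_ratio": 0.0}
    return {
        "total": total,
        "true_ratio": true_count / total,
        "false_ratio": false_count / total,
        "null_ratio": null_count / total,
    }
```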

Status code: 400

Table 14 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Status code: 404

Table 15 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Status code: 500

Table 16 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Example Requests

POST https://{endpoint}/v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/column-statistics/batch-get

{
  "column_names" : [ "column1" ]
}
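
The request above can be prepared with the Python standard library. This is a minimal sketch rather than an official SDK call: `build_batch_get_request` is a hypothetical helper, the `Content-Type: application/json` header is an assumption (this page lists only `X-Auth-Token`), and rejecting an empty list is a choice made by the sketch because `column_names` is mandatory. The function only builds the request object and does not send it.

```python
import json
import urllib.request


def build_batch_get_request(url, token, column_names):
    """Prepare (but do not send) the POST request for column statistics."""
    if not column_names:
        raise ValueError("column_names is mandatory and must not be empty")
    body = json.dumps({"column_names": list(column_names)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "X-Auth-Token": token,               # tenant token (Table 2)
            "Content-Type": "application/json",  # assumed, not listed on this page
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Reading the response body and decoding it with `json.loads` should then yield the array described under Response Parameters.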

Example Responses

Status code: 200

OK

[ {
  "column_name" : "column_name",
  "column_type" : "string",
  "data_type" : "stringStats",
  "binary_statistics_data" : {
    "maximum_length" : 0,
    "average_length" : 0,
    "number_of_null" : 0
  },
  "long_statistics_data" : {
    "minimum_value" : 0,
    "maximum_value" : 0,
    "number_of_null" : 0,
    "number_of_distinct_value" : 0,
    "bit_vector" : "string"
  },
  "decimal_statistics_data" : {
    "minimum_value" : {
      "scale" : 0,
      "unscaled" : "string"
    },
    "maximum_value" : {
      "scale" : 0,
      "unscaled" : "string"
    },
    "number_of_null" : 0,
    "number_of_distinct_value" : 0,
    "bit_vector" : "string"
  },
  "string_statistics_data" : {
    "average_length" : 0,
    "maximum_length" : 0,
    "number_of_null" : 0,
    "number_of_distinct_value" : 0,
    "bit_vector" : "string"
  },
  "double_statistics_data" : {
    "minimum_value" : 0,
    "maximum_value" : 0,
    "number_of_null" : 0,
    "number_of_distinct_value" : 0,
    "bit_vector" : "string"
  },
  "date_statistics_data" : {
    "minimum_value" : "2023-01-09T09:40:45.206Z",
    "maximum_value" : "2023-01-09T09:40:45.206Z",
    "number_of_null" : 0,
    "number_of_distinct_value" : 0,
    "bit_vector" : "string"
  },
  "boolean_statistics_data" : {
    "number_of_true" : 0,
    "number_of_false" : 0,
    "number_of_null" : 0
  }
} ]

Status code: 400

Bad Request

{
  "error_code" : "common.01000001",
  "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException"
}

Status code: 401

Unauthorized

{
  "error_code" : "APIG.1002",
  "error_msg" : "Incorrect token or token resolution failed"
}

Status code: 403

Forbidden

{
  "error" : {
    "code" : "403",
    "message" : "X-Auth-Token is invalid in the request",
    "error_code" : null,
    "error_msg" : null,
    "title" : "Forbidden"
  },
  "error_code" : "403",
  "error_msg" : "X-Auth-Token is invalid in the request",
  "title" : "Forbidden"
}

Status code: 404

Not Found

{
  "error_code" : "common.01000001",
  "error_msg" : "response status exception, code: 404"
}

Status code: 408

Request Timeout

{
  "error_code" : "common.00000408",
  "error_msg" : "timeout exception occurred"
}

Status code: 500

Internal Server Error

{
  "error_code" : "common.00000500",
  "error_msg" : "internal error"
}
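
The error examples above come in two shapes: most carry `error_code` and `error_msg` at the top level, while the 403 example also nests the details under an `error` object. A caller may want to handle both with one function. The sketch below is one possible approach, with `extract_error` as a hypothetical helper name. It prefers the top-level fields and falls back to the nested ones.

```python
import json


def extract_error(body_text):
    """Return (error_code, error_msg, solution_msg) from an error response body.

    Top-level fields win. The nested "error" object seen in the 403 example
    is used only as a fallback.
    """
    body = json.loads(body_text)
    nested = body.get("error") or {}
    code = body.get("error_code") or nested.get("error_code") or nested.get("code")
    message = body.get("error_msg") or nested.get("error_msg") or nested.get("message")
    return code, message, body.get("solution_msg")
```

`solution_msg` is returned as None when the body does not include it, as in all the examples above.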

Status Codes

| Status Code | Description |
| --- | --- |
| 200 | OK |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 408 | Request Timeout |
| 500 | Internal Server Error |

Error Codes

See Error Codes.