Updated on 2024-02-21 GMT+08:00

Setting Partition Statistics in Batches

Function

This API is used to set partition statistics in batches.

URI

POST /v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/partitions/column-statistics

Table 1 Path Parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For how to obtain the project ID, see Obtaining a Project ID. |
| instance_id | Yes | String | LakeFormation instance ID. The value is automatically generated when the instance is created, for example, 2180518f-42b8-4947-b20b-adfc53981a25. |
| catalog_name | Yes | String | Catalog name. The value should contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed. |
| database_name | Yes | String | Database name. The value should contain 1 to 128 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed. |
| table_name | Yes | String | Table name. The value should contain 1 to 256 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed. |
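
The path parameters can be checked and assembled on the client before the call is made. The following is a minimal Python sketch, not an official SDK call: the function name build_partition_statistics_path is hypothetical, and the regular expressions are one reading of the rules in Table 1 that assumes "letters" means ASCII letters.

```python
import re
from urllib.parse import quote

# Assumption: "letters" in Table 1 means ASCII letters. The service may accept more.
_NAME_RULES = {
    "catalog_name": re.compile(r"[A-Za-z0-9_]{1,256}"),
    "database_name": re.compile(r"[A-Za-z0-9_-]{1,128}"),
    "table_name": re.compile(r"[A-Za-z0-9_-]{1,256}"),
}


def build_partition_statistics_path(project_id, instance_id,
                                    catalog_name, database_name, table_name):
    """Return the request path of this API (hypothetical helper)."""
    names = {
        "catalog_name": catalog_name,
        "database_name": database_name,
        "table_name": table_name,
    }
    # Reject names that break the documented length or character rules.
    for key, value in names.items():
        if not _NAME_RULES[key].fullmatch(value):
            raise ValueError(f"{key} {value!r} does not satisfy the naming rule")
    segments = [
        "v1", project_id,
        "instances", instance_id,
        "catalogs", catalog_name,
        "databases", database_name,
        "tables", table_name,
        "partitions", "column-statistics",
    ]
    # Percent-encode every segment so that it cannot alter the path structure.
    return "/" + "/".join(quote(segment, safe="") for segment in segments)
```

Prefix the returned path with https://{endpoint} to obtain the full request URL.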

Request Parameters

Table 2 Request header parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | Array of strings | Tenant token. |

Table 3 Request body parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| need_merge | Yes | Boolean | Whether to merge the new statistics with the existing statistics. |
| statistics | Yes | Array of PartitionColumnStatistics objects | List of partition statistics. |

Table 4 PartitionColumnStatistics

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| column_statistics_desc | Yes | PartitionColumnStatisticsDescription object | Column statistics description. |
| column_statistics_objects | Yes | Array of ColumnStatisticsObj objects | Column statistics. |

Table 5 PartitionColumnStatisticsDescription

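| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| partition_values | No | Array of strings | Partition values. |
| last_analyzed_time | Yes | String | Time when the statistics were last collected, for example, 2023-05-31T02:52:16.137Z. |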
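
The nesting described in Tables 3 to 5 can be assembled as plain dictionaries. The following Python sketch is illustrative only: the helper names are hypothetical, and the timestamp layout (UTC with millisecond precision and a trailing Z) is inferred from the request example on this page rather than from a stated format rule.

```python
from datetime import datetime, timezone


def format_analyzed_time(moment):
    """Format a timezone-aware datetime like 2023-05-31T02:52:16.137Z.

    Assumption: the service accepts the layout used in the request example.
    """
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def build_partition_entry(partition_values, analyzed_at, column_statistics_objects):
    """Build one PartitionColumnStatistics object (Table 4)."""
    return {
        "column_statistics_desc": {
            "partition_values": list(partition_values),
            "last_analyzed_time": format_analyzed_time(analyzed_at),
        },
        "column_statistics_objects": list(column_statistics_objects),
    }


def build_request_body(entries, need_merge=False):
    """Build the request body (Table 3)."""
    return {"need_merge": bool(need_merge), "statistics": list(entries)}
```

Each element passed as column_statistics_objects is expected to follow the ColumnStatisticsObj structure.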

Table 6 ColumnStatisticsObj

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| column_name | Yes | String | Column name. The value can contain 1 to 767 characters. Only letters, digits, and special characters (_-+*(),) are allowed. |
| column_type | Yes | String | Data type of the column. Possible values: array, bigint, binary, boolean, char, date, decimal, double, float, int, interval, map, set, smallint, string, struct, timestamp, tinyint, union, and varchar. |
| data_type | Yes | String | Statistics type. Enumeration values: binaryStats, booleanStats, dateStats, decimalStats, doubleStats, longStats, stringStats. |
| binary_statistics_data | No | BinaryColumnStatisticsData object | Statistics on byte arrays. |
| long_statistics_data | No | LongColumnStatisticsData object | Statistics on long integers. |
| decimal_statistics_data | No | DecimalColumnStatisticsData object | Statistics on decimal values. |
| string_statistics_data | No | StringColumnStatisticsData object | Statistics on strings. |
| double_statistics_data | No | DoubleColumnStatisticsData object | Statistics on floating point numbers. |
| date_statistics_data | No | DateColumnStatisticsData object | Statistics on date values. |
| boolean_statistics_data | No | BooleanColumnStatisticsData object | Statistics on Boolean data. |
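
Table 6 lists seven data_type values and seven optional statistics fields. The Python sketch below pairs them by name. This pairing is an assumption: the request example on this page confirms it only for stringStats and string_statistics_data, and the helper name make_column_statistics_obj is hypothetical.

```python
# Assumption: each data_type goes with the *_statistics_data field of the same
# name. The page lists both sets but does not state the pairing explicitly.
STATS_FIELD_BY_DATA_TYPE = {
    "binaryStats": "binary_statistics_data",
    "booleanStats": "boolean_statistics_data",
    "dateStats": "date_statistics_data",
    "decimalStats": "decimal_statistics_data",
    "doubleStats": "double_statistics_data",
    "longStats": "long_statistics_data",
    "stringStats": "string_statistics_data",
}


def make_column_statistics_obj(column_name, column_type, data_type, stats):
    """Build one ColumnStatisticsObj with the statistics under the paired field."""
    try:
        field = STATS_FIELD_BY_DATA_TYPE[data_type]
    except KeyError:
        raise ValueError(f"unknown data_type: {data_type!r}") from None
    return {
        "column_name": column_name,
        "column_type": column_type,
        "data_type": data_type,
        field: dict(stats),
    }
```

For example, make_column_statistics_obj("c0", "boolean", "booleanStats", {...}) places the given statistics under boolean_statistics_data.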

Table 7 BinaryColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| maximum_length | Yes | Long | Maximum length of byte arrays in a column. |
| average_length | Yes | Double | Average length of byte arrays in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |

Table 8 LongColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| minimum_value | Yes | Long | Minimum long integer value in a column. |
| maximum_value | Yes | Long | Maximum long integer value in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |
| number_of_distinct_value | Yes | Long | Number of distinct long integer values in a column. |
| bit_vector | No | String | Bitmap used for estimating unique values. |

Table 9 DecimalColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| minimum_value | Yes | Decimal object | Minimum decimal value in a column. |
| maximum_value | Yes | Decimal object | Maximum decimal value in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |
| number_of_distinct_value | Yes | Long | Number of distinct decimal values in a column. |
| bit_vector | No | String | Bitmap used for estimating unique values. |

Table 10 Decimal

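| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| scale | No | Integer | Scale of the decimal value, that is, the number of digits to the right of the decimal point. |
| unscaled | No | String | Unscaled value of the decimal number. |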

Table 11 StringColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| average_length | Yes | Double | Average length of strings in a column. |
| maximum_length | Yes | Long | Maximum length of strings in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |
| number_of_distinct_value | Yes | Long | Number of distinct strings in a column. |
| bit_vector | No | String | Bitmap used for estimating unique values. |
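
As an illustration of what the mandatory StringColumnStatisticsData fields represent, the Python sketch below derives them from an in-memory list of values. It rests on assumptions this page does not confirm: lengths are counted in characters, null values are excluded from the average, and the distinct count is exact. The function name is hypothetical, and bit_vector is left out because its encoding is not described here.

```python
def string_column_statistics(values):
    """Summarise a list of str-or-None values using the Table 11 field names."""
    present = [value for value in values if value is not None]
    lengths = [len(value) for value in present]
    return {
        # Average over non-null values only; 0.0 when there are none.
        "average_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "maximum_length": max(lengths, default=0),
        "number_of_null": len(values) - len(present),
        # Exact distinct count; an engine may report an estimate instead.
        "number_of_distinct_value": len(set(present)),
    }
```

In practice these figures usually come from the compute engine that analyzes the table rather than from client-side code.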

Table 12 DoubleColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| minimum_value | Yes | Double | Minimum floating point number in a column. |
| maximum_value | Yes | Double | Maximum floating point number in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |
| number_of_distinct_value | Yes | Long | Number of distinct floating point numbers in a column. |
| bit_vector | No | String | Bitmap used for estimating unique values. |

Table 13 DateColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| minimum_value | No | String | Minimum timestamp in a column. |
| maximum_value | No | String | Maximum timestamp in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |
| number_of_distinct_value | Yes | Long | Number of distinct timestamps in a column. |
| bit_vector | No | String | Bitmap used for estimating unique values. |

Table 14 BooleanColumnStatisticsData

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| number_of_true | Yes | Long | Number of true values in a column. |
| number_of_false | Yes | Long | Number of false values in a column. |
| number_of_null | Yes | Long | Number of null values in a column. |

Response Parameters

Status code: 400

Table 15 Response body parameters

| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Status code: 404

Table 16 Response body parameters

| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Status code: 500

Table 17 Response body parameters

| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code. |
| error_msg | String | Error message. |
| solution_msg | String | Solution. |

Example Requests

POST https://{endpoint}/v1/{project_id}/instances/{instance_id}/catalogs/{catalog_name}/databases/{database_name}/tables/{table_name}/partitions/column-statistics

{
  "need_merge" : false,
  "statistics" : [ {
    "column_statistics_desc" : {
      "partition_values" : [ "value1", "value2" ],
      "last_analyzed_time" : "2023-05-31T02:52:16.137Z"
    },
    "column_statistics_objects" : [ {
      "column_name" : "column_prefix0",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix1",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix2",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix3",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix4",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix5",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix6",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix7",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix8",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    }, {
      "column_name" : "column_prefix9",
      "column_type" : "string",
      "data_type" : "stringStats",
      "string_statistics_data" : {
        "average_length" : 10,
        "maximum_length" : 100,
        "number_of_null" : 30,
        "number_of_distinct_value" : 20,
        "bit_vector" : "FwAAAAAAAAAAAA=="
      }
    } ]
  } ]
}
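
The ten column entries in the request above differ only in column_name, so the same body can be generated in a loop. The Python sketch below does that and wraps the body in a POST request using the standard library. The function names are hypothetical, and the Content-Type header is an assumption, because this page documents only X-Auth-Token.

```python
import json
import urllib.request


def example_body(column_count=10):
    """Rebuild the example request body with column_prefix0..N-1."""
    columns = [
        {
            "column_name": f"column_prefix{index}",
            "column_type": "string",
            "data_type": "stringStats",
            "string_statistics_data": {
                "average_length": 10,
                "maximum_length": 100,
                "number_of_null": 30,
                "number_of_distinct_value": 20,
                "bit_vector": "FwAAAAAAAAAAAA==",
            },
        }
        for index in range(column_count)
    ]
    return {
        "need_merge": False,
        "statistics": [{
            "column_statistics_desc": {
                "partition_values": ["value1", "value2"],
                "last_analyzed_time": "2023-05-31T02:52:16.137Z",
            },
            "column_statistics_objects": columns,
        }],
    }


def prepare_request(url, token, body):
    """Create the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Auth-Token": token,
            # Assumption: a JSON content type is expected for the body.
            "Content-Type": "application/json",
        },
    )
```

To send the prepared request, pass it to urllib.request.urlopen. A non-2xx status raises urllib.error.HTTPError, whose read() method returns the error body.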

Example Responses

Status code: 400

Bad Request

{
  "error_code" : "common.01000001",
  "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException"
}

Status code: 401

Unauthorized

{
  "error_code" : "APIG.1002",
  "error_msg" : "Incorrect token or token resolution failed"
}

Status code: 403

Forbidden

{
  "error" : {
    "code" : "403",
    "message" : "X-Auth-Token is invalid in the request",
    "error_code" : null,
    "error_msg" : null,
    "title" : "Forbidden"
  },
  "error_code" : "403",
  "error_msg" : "X-Auth-Token is invalid in the request",
  "title" : "Forbidden"
}

Status code: 404

Not Found

{
  "error_code" : "common.01000001",
  "error_msg" : "response status exception, code: 404"
}

Status code: 408

Request Timeout

{
  "error_code" : "common.00000408",
  "error_msg" : "timeout exception occurred"
}

Status code: 500

Internal Server Error

{
  "error_code" : "common.00000500",
  "error_msg" : "internal error"
}

Status Codes

| Status Code | Description |
|---|---|
| 200 | OK |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 408 | Request Timeout |
| 500 | Internal Server Error |
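
A client can treat 200 and 201 as success and read error_code, error_msg, and solution_msg from any other response. The Python sketch below is one possible way to do this, with hypothetical function names. It falls back to the raw text when the body is not valid JSON, which is a defensive assumption rather than documented behavior.

```python
import json


def is_success(status_code):
    """Return True for the success codes listed in the Status Codes table."""
    return status_code in (200, 201)


def describe_error(status_code, body_text):
    """Return (error_code, error_msg, solution_msg) for an error response."""
    try:
        body = json.loads(body_text)
    except ValueError:
        # Body is not JSON: report the status code and the raw text.
        return (str(status_code), body_text, None)
    if not isinstance(body, dict):
        return (str(status_code), body_text, None)
    return (
        body.get("error_code") or str(status_code),
        body.get("error_msg") or "",
        body.get("solution_msg"),
    )
```

solution_msg is returned as None when the response does not include it, as in the example responses on this page.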

Error Codes

See Error Codes.