Updated on 2024-06-03 GMT+08:00

GS_STATISTIC_EXT_HISTORY

GS_STATISTIC_EXT_HISTORY is a system catalog that manages historical multi-column statistics. It stores historical extended statistics about tables in the database, including multi-column statistics and expression statistics (to be supported later). You can specify which extended statistics to collect. Only system administrators can access this system catalog.

Table 1 GS_STATISTIC_EXT_HISTORY columns

Name

Type

Description

starelid

oid

Table or index that the described column belongs to.

starelkind

"char"

Type of the object. 'c' indicates an ordinary table, and 'p' indicates a partitioned table.

stainherit

boolean

Specifies whether statistics are collected for objects that have an inheritance relationship.

statimestamp

timestamp with time zone

Time when the statistics are collected.

stanullfrac

real

Fraction of column entries that are null.

stawidth

integer

Average stored width, in bytes, of non-null entries.

stadistinct

real

Number of distinct, non-null data values in the column for database nodes.
  • A value greater than 0 indicates the actual number of distinct values.
  • A value less than 0 indicates the negative of the ratio of the number of distinct values to the total number of rows. For example, if stadistinct is -0.5, the number of distinct values is the total number of rows multiplied by 0.5.
  • The value 0 indicates that the number of distinct values is unknown.

standistinct

real

Number of distinct non-null data values in the column on DN1.
  • A value greater than 0 indicates the actual number of distinct values.
  • A value less than 0 indicates the negative of the ratio of the number of distinct values to the total number of rows. For example, if standistinct is -0.5, the number of distinct values is the total number of rows multiplied by 0.5.
  • The value 0 indicates that the number of distinct values is unknown.
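
The sign convention shared by stadistinct and standistinct can be illustrated with a short sketch. The function name estimate_distinct and the total_rows argument are hypothetical; the row count is assumed to be supplied by the caller and is not a column of this catalog.

```python
def estimate_distinct(stadistinct: float, total_rows: int):
    """Interpret a stadistinct/standistinct value as an absolute distinct count.

    Returns None when the value is 0 (number of distinct values unknown).
    total_rows is an assumption of this sketch, supplied by the caller.
    """
    if stadistinct > 0:
        # Positive: the value is already the number of distinct values.
        return stadistinct
    if stadistinct < 0:
        # Negative: the magnitude is the ratio of distinct values to total rows.
        return -stadistinct * total_rows
    # Zero: the number of distinct values is unknown.
    return None


print(estimate_distinct(250.0, 10_000))  # 250.0
print(estimate_distinct(-0.5, 10_000))   # 5000.0
print(estimate_distinct(0.0, 10_000))    # None
```

Storing a ratio instead of a count lets the estimate scale with the table as it grows, which is why a negative value has to be multiplied by the current row count before use.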

standvfunc

"char"

Algorithm used to calculate the NDV based on the statistics.

  • d: The original DUJ1 algorithm is used for estimation.
  • c: The C19 algorithm is used for estimation.

staorigin

"char"

Source of the statistics collection mode.

  • a: The collection is triggered by AUTOANALYZE.
  • m: The collection is triggered by manual ANALYZE.
  • g: The collection is performed by the gsstat thread, which is triggered to run ANALYZE when a large amount of data is inserted.

stakindN

smallint

Code number indicating the type of statistics stored in slot N of the pg_statistic row.

The value of N ranges from 1 to 5.

staopN

oid

Operator used to generate the statistics stored in slot N. For example, a histogram slot shows the < operator that defines the sort order of the data.

The value of N ranges from 1 to 5.

stakey

int2vector

Array of column IDs.

stanumbersN

real[]

Numerical statistics of the appropriate type for slot N. The value is NULL if the slot does not involve numerical values.

The value of N ranges from 1 to 5.

stavaluesN

anyarray

Column data values of the appropriate type for slot N. The value is NULL if the slot type does not store any data values. The elements of each array are of the specific column's data type, so these columns cannot be typed more specifically than anyarray.

The value of N ranges from 1 to 5.

staexprs

pg_node_tree

Expression corresponding to the extended statistics information.

stasource

"char"

Source of extended statistics:
  • 'a': indicates that the statistics are created automatically, which is controlled by the GUC parameter auto_statistic_ext_columns.
  • 'm': indicates that a user creates the statistics manually using ANALYZE tablename ((column list)) or ALTER TABLE tablename ADD STATISTICS ((column list)).

stastatus

"char"

Status of extended statistics:
  • 'a': active and available.
  • 'd': disabled. The statistics are not collected, and the optimizer does not use them when generating a plan. You can use the ALTER TABLE tablename DISABLE/ENABLE STATISTICS ((column list)) syntax to change the status of extended statistics.

staextname

name

Alias of the column group for which the multi-column statistics are collected.
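
When rows from this catalog are inspected in a client program, the single-character codes in starelkind, standvfunc, staorigin, stasource, and stastatus are easier to read once mapped to labels. The following is a minimal sketch; the names CODE_LABELS and describe_code are hypothetical, and the labels are transcribed from the column descriptions in Table 1.

```python
# Code-to-label maps transcribed from the column descriptions in Table 1.
CODE_LABELS = {
    "starelkind": {"c": "ordinary table", "p": "partitioned table"},
    "standvfunc": {"d": "original DUJ1 algorithm", "c": "C19 algorithm"},
    "staorigin": {"a": "AUTOANALYZE", "m": "manual ANALYZE", "g": "gsstat thread"},
    "stasource": {"a": "automatically created", "m": "manually created"},
    "stastatus": {"a": "active", "d": "disabled"},
}


def describe_code(column: str, code: str) -> str:
    """Return a readable label for a single-character catalog code.

    Unknown columns or codes are reported as such rather than guessed.
    """
    labels = CODE_LABELS.get(column)
    if labels is None:
        return f"{column}: not a coded column in this sketch"
    return labels.get(code, f"{column}: unrecognized code {code!r}")


print(describe_code("staorigin", "g"))  # gsstat thread
print(describe_code("stastatus", "d"))  # disabled
print(describe_code("stastatus", "x"))  # stastatus: unrecognized code 'x'
```

Keeping the maps keyed by column name matters because the same letter means different things in different columns: 'a' is AUTOANALYZE in staorigin but active in stastatus.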