Help Center > > Developer Guide> Data Import> Analyzing a Table

Analyzing a Table

Updated at: Sep 17, 2021 GMT+08:00

The execution plan generator needs to use table statistics to generate the most effective query execution plan to improve query performance. After data is imported, you are advised to run the ANALYZE statement to update table statistics. The statistics are stored in the PG_STATISTIC system catalog.

Analyzing a Table

ANALYZE supports row-store, column-store, HDFS, and OBS tables in ORC or CarbonData format. ANALYZE can also collect statistics about specified columns of a local table. For details about ANALYZE, see ANALYZE | ANALYSE.

  1. Update table statistics.

    Do ANALYZE to the product_info table.
    ANALYZE product_info;

Automatically Analyzing a Table

GaussDB(DWS) provides automatic table analysis for the following two scenarios.

  • If a query contains a table that has no statistics and the execution plan does not use Fast Query Shipping (FQS), the GUC parameter autoanalyze is used to control the automatic collection of table statistics in this scenario. In this case, the query statement generates a better execution plan after the statistics are successfully collected, and then executes the original query statement. This greatly improves the query performance.
  • The GUC parameter autovacuum is used when the data volume of a table changes greatly and reaches the specified threshold. When autovacuum is set to on, the system starts the autovacuum thread at regular intervals to trigger automatic analysis for tables in the following two scenarios: an empty table that has more than 50 data rows inserted, or a table that already has data rows and the number of data rows has reached the threshold (50 + 10% × reltuples, where reltuples indicates the total number of rows in the table).
  • The autoanalyze function supports only the default sampling mode and not the percentage sampling mode.
  • The autoanalyze function does not collect multi-column statistics, which only supports percentage sampling.
  • The autoanalyze function is automatically triggered if statistics get lost during the query. The autoanalyze function cannot be triggered for foreign tables or temporary tables with the ON COMMIT [DELETE ROWS | DROP] option.
  • The autoanalyze function triggered by a scheduled autovacuum thread supports only row-store and column-store tables. It does not support foreign tables, HDFS tables, OBS foreign tables, temporary tables, unlogged tables, or toast tables.
  • The autovacuum function also depends on the following two GUC parameters in addition to autovacuum:
    • track_counts: This parameter must be set to on to enable statistics collection about the database.
    • autovacuum_max_workers: This parameter must be set to a value greater than 0 to specify the maximum number of concurrent autovacuum threads.

Did you find this page helpful?

Submit successfully!

Thank you for your feedback. Your feedback helps make our documentation better.

Failed to submit the feedback. Please try again later.

Which of the following issues have you encountered?

Please complete at least one feedback item.

Content most length 200 character

Content is empty.

OK Cancel