
Failed to Query Table Statistics by Partition Using Non-Standard Time Format When the Partition Column in the Table Creation Statement is timestamp

Symptom

When the partition column in the table creation statement is of the timestamp type, querying table statistics by partition using a non-standard time format fails, and the result of the show partitions command on the table is incorrect.

For example, running the following command reports an error:

desc formatted test_hive_orc_snappy_internal_table partition(a='2016-8-1 11:45:5');
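The symptom can be reproduced with a sketch like the following. The table name and partition value come from this example; the table schema and the inserted row are assumptions made only for illustration.

-- Hypothetical table definition; the actual schema may differ.
CREATE TABLE test_hive_orc_snappy_internal_table (id INT)
PARTITIONED BY (a timestamp)
STORED AS ORC;

-- Write a partition using a non-standard time format.
INSERT INTO test_hive_orc_snappy_internal_table PARTITION (a='2016-8-1 11:45:5') VALUES (1);

-- Querying the statistics with the same non-standard value then fails.
desc formatted test_hive_orc_snappy_internal_table partition(a='2016-8-1 11:45:5');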

Solution

The spark.sql.hive.convertInsertingPartitionedTable switch controls whether the Hive or the Datasource write logic is used when data is inserted into partitioned tables. With the Hive table logic, timestamp values are not automatically formatted; with the Datasource table logic, they are.

If the partition field is written as a='2016-8-1 11:45:5', the Datasource logic automatically formats it to a='2016-08-01 11:45:05', so querying with the original non-standard value reports an error.
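The following sketch illustrates this behavior. It assumes the configuration is enabled when the session is started and reuses the table from the example above; the exact show partitions output depends on your Spark version.

-- Assume the session uses the Datasource logic, for example:
-- spark-sql --conf spark.sql.hive.convertInsertingPartitionedTable=true
INSERT INTO test_hive_orc_snappy_internal_table PARTITION (a='2016-8-1 11:45:5') VALUES (1);

-- The stored partition value is automatically formatted:
show partitions test_hive_orc_snappy_internal_table;
-- a=2016-08-01 11:45:05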

To correctly query the table statistics, perform the following operation:

If spark.sql.hive.convertInsertingPartitionedTable is set to true, the Datasource table logic is used. In this case, query the statistics with the formatted partition value:

desc formatted test_hive_orc_snappy_internal_table partition(a='2016-08-01 11:45:05');