Updated on 2025-05-29 GMT+08:00

Possible Causes of Data Table Bloat

To ensure the service performance of the database, the delete operation does not immediately release the space of historical version rows. Instead, the rows are marked as deleted. The space is then asynchronously reclaimed. If the recycling operation is blocked or a logic defect occurs, the deleted space cannot be reused, and the newly inserted rows continuously expand the file, leading to tablespace bloat. Space bloat may occur in database indexes, heap tables, and log files. In this section, the principle of heap tablespace bloat is discussed.

For Astore, the multi-version mechanism stores historical versions in the heap table to form a version chain. When a tuple is deleted or updated, the original tuple becomes a dead tuple. Recycling can be triggered only when the transaction ID of the operation transaction is less than oldestxmin. Therefore, Astore is prone to have space bloat issues.

Ustore uses the multi-version management mechanism to store historical versions in undo logs. The historical versions can be cleared after transactions are submitted. In addition, Ustore uses in-place update instead of the append update mode of Astore. The update operation does not cause heap tablespace bloat. Therefore, in frequent update scenarios, the space stability of Ustore is much better than that of Astore. However, this does not mean that Ustore is not subject to space bloat. Based on the backtracking and summary of historical problems, the main causes of heap tablespace bloat are classified into the following types:

  1. A long transaction exists. As a result, oldestxmin is not properly updated. After a transaction is deleted and submitted, the space cannot be reused.
  2. The AUTOVACUUM cleanup is inefficient and cannot keep up with the rate at which dead tuples are generated.
  3. FSM is incorrect or inaccurate.