Updated on 2024-10-24 GMT+08:00

High Table Fragmentation Rate

Scenario

A high table fragmentation rate is a common problem in RDS for MySQL instances. A table is fragmented when its data and indexes are scattered across physical blocks that are discontinuous or partly empty, so the table is not stored on disk as compactly as it could be.

Fragmentation is caused by write operations on table data, such as deletions, updates, and insertions. If data rows in a table are frequently modified and moved, the data segments of the table become discontinuous.

Impact and Risk

  • Tablespace bloat

    A high table fragmentation rate leaves a large amount of allocated but unused space in the instance, which wastes storage.

  • Poor query optimization

    If the table fragmentation rate is too high, the optimizer may be unable to use indexes correctly and effectively, which affects execution plan selection and degrades query performance.

  • Slow SQL execution

    If the table fragmentation rate is too high, SQL statements need extra I/O to scan scattered and partly empty blocks. As a result, queries and updates slow down and the response time increases.

Troubleshooting

Method 1: Use DBA Assistant to view the storage usage of your DB instance in real time, so that you can act before storage space runs out.

  1. Log in to the management console.
  2. Click the region icon in the upper left corner and select a region.
  3. Click the service list icon in the upper left corner of the page and choose Databases > Relational Database Service.
  4. On the Instances page, click the DB instance name.
  5. In the navigation pane, choose DBA Assistant > Real-Time Diagnosis.
  6. Click the Storage Analysis tab. On the displayed page, you can view the fragment space and fragmentation rate in the top 50 databases and tables area.
    Figure 1 Top 50 databases and tables

Method 2: Run commands to view the fragmentation rate.

  1. Run the following command to analyze a table and update the table statistics:
    ANALYZE TABLE table_name;
  2. Run the following commands to view details about a table:
    SELECT 
        table_name,
        data_length,
        data_free
    FROM
        information_schema.tables
    WHERE
        table_schema = 'database_name' 
        AND
        table_name = 'table_name';
    • table_name: name of the table
    • data_length: size of data stored in the table (in bytes)
    • data_free: size of free space of the table (in bytes)

    As a rough first estimate, the fragmentation rate is the ratio of data_free to data_length: the larger the ratio, the more fragmented the table.
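The ratio can also be computed directly in the query. The following is a minimal sketch that reuses the database_name and table_name placeholders from step 2. The free_to_data_pct alias is an illustrative name, not a field of information_schema.tables.

```sql
-- Rough fragmentation estimate for one table, as a percentage.
-- 'database_name' and 'table_name' are placeholders; replace them with real names.
SELECT
    table_name,
    data_length,
    data_free,
    -- NULLIF returns NULL instead of dividing by zero when data_length is 0
    ROUND(data_free / NULLIF(data_length, 0) * 100, 2) AS free_to_data_pct
FROM
    information_schema.tables
WHERE
    table_schema = 'database_name'
    AND table_name = 'table_name';
```

Run ANALYZE TABLE first, as in step 1, so that the statistics behind data_length and data_free are current. Treat the result as an indication only, because these values are estimates.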

Possible Causes

Cause 1: Parallel Migration During DRS Full Migration

During a full migration, DRS uses row-level parallel migration to ensure migration performance and transmission stability. If the source database data is compact, the data may become fragmented and bloated after it is migrated to RDS for MySQL. As a result, the storage usage of the destination database can be much higher than that of the source database.

Cause 2: Table Fragmentation After a Large Number of Deletions

When data is deleted, RDS for MySQL does not reclaim the storage occupied by the deleted data. It only marks the data as deleted and reuses the space if new data is written later. If no new data fills the space, the tablespace stays bloated and the table becomes fragmented.

You can run the SQL statement shown below to query details about a table. The DATA_FREE field in the command output indicates the size of tablespace fragments.

SELECT * FROM information_schema.tables WHERE table_schema = 'db_name' AND table_name = 'table_name'\G
Figure 2 Command output
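To find which tables in a database hold the most free space after large deletions, the same view can be queried for all tables at once. The following is a sketch only: db_name is a placeholder, the data_mb, index_mb, and free_mb aliases are illustrative names, and the limit of 10 rows is an arbitrary choice.

```sql
-- Tables in one database, ordered by free space (largest first).
-- 'db_name' is a placeholder; the row limit is arbitrary.
SELECT
    table_name,
    ROUND(data_length / 1024 / 1024, 2)  AS data_mb,   -- data size in MB
    ROUND(index_length / 1024 / 1024, 2) AS index_mb,  -- index size in MB
    ROUND(data_free / 1024 / 1024, 2)    AS free_mb    -- free space in MB
FROM
    information_schema.tables
WHERE
    table_schema = 'db_name'
    AND table_type = 'BASE TABLE'   -- skip views
ORDER BY
    data_free DESC
LIMIT 10;
```

For InnoDB tables, DATA_FREE reports the free space of the tablespace that the table belongs to. The per-table figures are therefore meaningful only for tables stored in their own tablespace files.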

Solution

Optimize the table fragmentation rate in the following scenarios:

  • The instance has been running for a long period of time.

    Data operations, such as insertion, update, and deletion, may generate table fragments.

  • There are a large number of data changes.

    A large number of data changes may cause fragments.

  • Database performance deteriorates.

    If you identify obvious performance deterioration when querying a given amount of data, you may need to check the fragmentation rate.

  • The storage space is insufficient.

    If the storage space usage is too high, you can check the fragment space and defragment the table to release the storage space.

To keep the table fragmentation rate under control, you are advised to periodically analyze the fragments of frequently accessed tables, clear the fragments, and optimize tablespaces to improve performance.

To optimize a table, run the following command:

OPTIMIZE TABLE table_name;

The OPTIMIZE TABLE statement locks the table only for a short period of time, but the overall execution time depends on the table size. The operation generally takes a long time and consumes many resources. In particular, free storage space of 1.5 times the size of the table to be optimized must be reserved. To avoid affecting your workloads, you are advised to optimize tables during off-peak hours.
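Before running the statement, you can estimate how much free storage to reserve. The following sketch assumes that the table size can be approximated as data_length plus index_length, which is an assumption made here and not a definition given in this document. The table_mb and reserve_mb aliases are illustrative names.

```sql
-- Estimate the free space to reserve before OPTIMIZE TABLE (1.5 times the table size).
-- Assumption: table size is approximated as data_length + index_length.
-- 'database_name' and 'table_name' are placeholders.
SELECT
    table_name,
    ROUND((data_length + index_length) / 1024 / 1024, 2)       AS table_mb,
    ROUND((data_length + index_length) * 1.5 / 1024 / 1024, 2) AS reserve_mb
FROM
    information_schema.tables
WHERE
    table_schema = 'database_name'
    AND table_name = 'table_name';
```

Compare reserve_mb with the available storage of the instance before you run OPTIMIZE TABLE. If less space is available, free up or expand the storage first.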