High Table Fragmentation Rate
Scenario
A high table fragmentation rate is a common problem in RDS for MySQL instances. Table fragmentation means that table data and indexes are scattered across different physical blocks. These blocks may be discontinuous or partly empty, so table data and indexes are not stored optimally on disk.
This problem is caused by operations (such as deletion, update, and insertion) on table data. If data rows in tables are frequently modified and moved, data segments in the tables become discontinuous.
Impact and Risk
- Tablespace bloat
A high table fragmentation rate leaves a large amount of allocated but unused space in the instance, which wastes storage.
- Poor query optimization
If the table fragmentation rate is too high, the optimizer may not be able to use indexes effectively, which can affect execution plan selection and degrade query performance.
- Slow SQL execution
If the table fragmentation rate is too high, extra I/O is required to scan discontinuous or partly empty blocks when SQL statements are executed. As a result, query and update operations slow down and response times increase.
Troubleshooting
Method 1: Use DBA Assistant to view the storage usage of your DB instance in real time to prevent storage space from being insufficient.
- Log in to the management console.
- Click in the upper left corner and select a region.
- Click in the upper left corner of the page and choose Databases > Relational Database Service.
- On the Instances page, click the DB instance name.
- In the navigation pane, choose DBA Assistant > Real-Time Diagnosis.
- Click the Storage Analysis tab. On the displayed page, you can view the fragment space and fragmentation rate in the top 50 databases and tables area.
Figure 1 Top 50 databases and tables
Method 2: Run commands to view the fragmentation rate.
- Run the following command to analyze a table and update the table statistics:
ANALYZE TABLE table_name;
- Run the following command to view details about a table:
SELECT table_name, data_length, data_free FROM information_schema.tables WHERE table_schema = 'database_name' AND table_name = 'table_name';
- table_name: name of the table
- data_length: size of data stored in the table (in bytes)
- data_free: size of free space of the table (in bytes)
Generally, you can preliminarily determine the fragmentation rate based on the ratio of data_free to data_length.
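The ratio described above can be computed in the query itself. The following is a minimal sketch that reuses the placeholders database_name and table_name; the alias frag_ratio_pct is made up for illustration and is not a column of information_schema.tables. Treat the result as a rough indicator only, because the values come from table statistics.

```sql
-- Sketch: ratio of data_free to data_length for one table, as a percentage.
-- 'database_name' and 'table_name' are placeholders; frag_ratio_pct is an illustrative alias.
SELECT table_name,
       data_length,
       data_free,
       -- NULLIF returns NULL for an empty table instead of dividing by zero
       ROUND(data_free / NULLIF(data_length, 0) * 100, 2) AS frag_ratio_pct
FROM information_schema.tables
WHERE table_schema = 'database_name'
  AND table_name = 'table_name';
```

Run ANALYZE TABLE first, as in the previous step, so that the statistics behind data_length and data_free are current.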
Possible Causes
Cause 1: Parallel Migration During DRS Full Migration
During a full migration, DRS uses row-level parallel migration to ensure migration performance and transmission stability. If the source database data is compact, the data may bloat after it is migrated to RDS for MySQL, resulting in a high fragmentation rate. As a result, storage usage of the destination database can be much higher than that of the source database.
Cause 2: Table Fragmentation After a Large Number of Deletions
When data is deleted, RDS for MySQL does not reclaim the storage occupied by the deleted data. Instead, it only marks the data as deleted and reuses the space when new data is written. If no new data fills the space, the tablespace stays bloated and the table becomes fragmented.
You can run the SQL statement shown below to query details about a table. The DATA_FREE field in the command output indicates the size of tablespace fragments.
select * from information_schema.tables where table_schema='db_name' and table_name = 'table_name'\G
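After a large deletion it is not always obvious which tables hold the most free space. The following is a minimal sketch that lists the tables of one schema ordered by DATA_FREE. The placeholder db_name is the same as above, the aliases data_mb, index_mb, and free_mb are made up for illustration, and the limit of 50 is an arbitrary choice.

```sql
-- Sketch: tables in one schema with the most free (fragment) space.
-- 'db_name' is a placeholder; data_mb, index_mb, and free_mb are illustrative aliases.
SELECT table_name,
       engine,
       table_rows,                                        -- an estimate for InnoDB tables
       ROUND(data_length / 1024 / 1024, 2)  AS data_mb,   -- data size in MB
       ROUND(index_length / 1024 / 1024, 2) AS index_mb,  -- index size in MB
       ROUND(data_free / 1024 / 1024, 2)    AS free_mb    -- free space in MB
FROM information_schema.tables
WHERE table_schema = 'db_name'
  AND table_type = 'BASE TABLE'                           -- skip views
ORDER BY data_free DESC
LIMIT 50;
```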
Solution
Optimize the table fragmentation rate in the following scenarios:
- The instance has been running for a long period of time.
Data operations, such as insertion, update, and deletion, may generate table fragments.
- There are a large number of data changes.
- Database performance deteriorates.
If you identify obvious performance deterioration when querying a given amount of data, you may need to check the fragmentation rate.
- The storage space is insufficient.
If the storage space usage is too high, you can check the fragment space and defragment the table to release the storage space.
To solve the problem of high table fragmentation rate, you are advised to periodically analyze fragments of frequently accessed tables, clear the fragments, and optimize tablespaces to improve performance.
To optimize a table, run the following command:
OPTIMIZE TABLE table_name;
The OPTIMIZE TABLE statement locks the table for a short period of time, and the overall execution time depends on the table size. For large tables, the execution generally takes a long time and consumes many resources. Reserve free storage space of at least 1.5 times the size of the table to be optimized. To avoid impacting your workloads, optimize tables during off-peak hours.
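The guidance above can be sketched as a three-step pass for a single table. This is an illustration, not a prescribed procedure: database_name and table_name are placeholders, the aliases table_mb and reserve_mb are made up, and taking the table size as data_length plus index_length is an assumption.

```sql
-- Sketch of an off-peak maintenance pass for one table.

-- 1. Estimate the space to reserve (about 1.5 times the current table size).
--    Assumption: table size is approximated as data_length + index_length.
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 2)       AS table_mb,
       ROUND((data_length + index_length) * 1.5 / 1024 / 1024, 2) AS reserve_mb
FROM information_schema.tables
WHERE table_schema = 'database_name'
  AND table_name = 'table_name';

-- 2. Rebuild the table to release the fragment space.
OPTIMIZE TABLE table_name;

-- 3. Refresh the statistics, then compare data_free with the value seen before step 2.
ANALYZE TABLE table_name;
SELECT table_name, data_length, data_free
FROM information_schema.tables
WHERE table_schema = 'database_name'
  AND table_name = 'table_name';
```

For InnoDB tables, OPTIMIZE TABLE typically reports "Table does not support optimize, doing recreate + analyze instead". This note is expected and means that the table is being rebuilt.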