Updated on 2023-03-06 GMT+08:00

Out of Memory (OOM) Errors

RDS for MySQL Memory Description

The memory of an RDS for MySQL instance can be roughly divided into two parts: globally shared memory and session-level private memory.

  • Shared memory is allocated upon the creation of an instance based on parameter settings and is shared by all connections.
  • Private memory is allocated by the system upon connection to the RDS for MySQL instance and is released only when the connection is released.

Inefficient SQL statements or improper database parameter settings may increase memory usage and even cause an OOM error during peak hours.

Scenario

The memory usage of an RDS for MySQL instance increased sharply. An OOM error occurred and then the instance restarted.

Troubleshooting

  1. Check the memory usage. In this example, it shot up around 16:30.
    Figure 1 Memory usage
  2. Check for slow SQL queries. The number of slow SQL queries increased sharply in that period.
    Figure 2 Slow SQL queries
  3. Check the disk throughput. There were a large number of read and write operations being performed on the disk in that period.
    Figure 3 Disk throughput
  4. Analyze slow query logs generated in that period. There were a large number of multi-value INSERT statements, which cause every session to request a large amount of session-level memory at the same time. Therefore, an OOM error occurred.
    Figure 4 Slow query logs

Solution

  1. For the OOM error caused by multi-value INSERT statements, reduce the amount of data inserted at a time and disconnect sessions to release memory. You can run the show full processlist command to check whether there are sessions with high memory usage.
  2. Set the session-level memory parameter to an appropriate value. You can estimate the maximum memory based on the following formula: Global memory + Session-level memory x Maximum number of sessions. Note that setting performance_schema to ON also causes memory overhead.
  3. Upgrade the instance specifications to maintain the memory usage within a proper range, preventing a sudden increase in traffic from causing an OOM crash.