Out of Memory (OOM) Errors
TaurusDB Memory Description
The memory of a TaurusDB instance can be roughly divided into two parts: globally shared memory and session-level private memory.
- Shared memory is allocated upon the creation of an instance based on parameter settings and is shared by all connections.
- Private memory is allocated when a connection to the TaurusDB instance is established and is released only when that connection is closed.
Inefficient SQL statements or improper database parameter settings may increase memory usage and even cause an OOM error during peak hours.
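As a rough illustration of how the two parts add up, the following sketch computes an upper-bound estimate as global memory plus session-level memory multiplied by the maximum number of sessions. The parameter names are standard MySQL variables; which of them count toward each part on a given TaurusDB instance, and all the values used here, are assumptions for illustration and not TaurusDB defaults.

```python
MIB = 1024 * 1024


def estimate_max_memory(global_bytes, per_session_bytes, max_sessions):
    """Upper bound: global memory + session-level memory x maximum sessions."""
    return global_bytes + per_session_bytes * max_sessions


# Illustrative values only (assumptions, not TaurusDB defaults).
global_params = {
    "innodb_buffer_pool_size": 8192 * MIB,
    "innodb_log_buffer_size": 16 * MIB,
}
session_params = {
    "sort_buffer_size": 1 * MIB,
    "join_buffer_size": 1 * MIB,
    "read_buffer_size": 1 * MIB,
    "read_rnd_buffer_size": 1 * MIB,
}
max_connections = 1000  # assumed session limit

total = estimate_max_memory(
    sum(global_params.values()),
    sum(session_params.values()),
    max_connections,
)
print(f"Estimated maximum memory: {total / MIB:.0f} MiB")
```

The result is a worst case in which every session uses all of its buffers at once. It is useful for checking whether the parameter settings could exceed the instance memory, not for predicting typical usage.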
Scenario
The memory usage of a TaurusDB instance increased sharply at 16:30. An OOM error occurred and then the instance rebooted.
Troubleshooting
- Check the memory usage. In this example, it shot up around 16:30.
    Figure 1 Memory usage  
- Check for slow SQL queries. The number of slow SQL queries increased sharply in that period.
    Figure 2 Slow SQL queries  
- Check the disk throughput. There were a large number of read and write operations being performed on the disk in that period.
    Figure 3 Disk throughput  
- Analyze slow query logs generated in that period. There were a large number of multi-value INSERT statements, which cause every session to request a large amount of session-level memory at the same time. Therefore, an OOM error occurred.
    Figure 4 Slow query logs  
Solution
- For the OOM error caused by multi-value INSERT statements, reduce the number of rows inserted per statement and disconnect sessions to release memory. You can run the SHOW FULL PROCESSLIST command to list the current sessions and the statements they are running, which helps identify sessions that are likely to be using a large amount of memory.
- Set the session-level memory parameters to appropriate values. You can estimate the maximum memory usage with the following formula: Global memory + Session-level memory x Maximum number of sessions. Note that setting performance_schema to ON also adds memory overhead.
- Upgrade the instance specifications to maintain the memory usage within a proper range, preventing a sudden increase in traffic from causing an OOM crash.
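One way to reduce the amount of data inserted at a time is to split a large multi-value INSERT into smaller batches on the client side. The sketch below is a minimal illustration under stated assumptions: the table name `t_demo`, its columns, and the batch size of 1000 are hypothetical, and the `%s` placeholder style assumes a typical MySQL-compatible Python driver. It only builds the statements and parameter lists; executing them is left to the driver you use.

```python
def chunked(rows, batch_size):
    """Yield consecutive slices of rows with at most batch_size items each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def build_insert(table, columns, batch):
    """Build one parameterized multi-value INSERT for a single batch."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_placeholder] * len(batch))
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in batch for value in row]
    return sql, params


# Hypothetical data: 2500 rows for a hypothetical table t_demo(id, name).
rows = [(i, f"name-{i}") for i in range(2500)]
statements = [
    build_insert("t_demo", ("id", "name"), batch)
    for batch in chunked(rows, 1000)
]
print(len(statements))
```

Each resulting statement carries at most 1000 rows, so a single session requests less memory per statement. A suitable batch size depends on the row width and the session-level memory settings, so treat 1000 as a starting point to tune.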