Troubleshooting High Memory Usage

Updated on 2025-02-13 GMT+08:00

Description

The memory usage of an RDS for PostgreSQL instance includes the usage of both shared memory and local memory.

  • Shared memory: It is mainly used for the data buffer and WAL buffer to improve the read and write performance. It also stores some global information, such as process and lock information.

    The shared_buffers parameter determines how much shared memory is initially requested. For an RDS for PostgreSQL instance, this parameter defaults to 25% of the physical memory, and the recommended range is 25% to 40%. Increasing it beyond 40% of the physical memory brings little additional benefit, because RDS for PostgreSQL runs on a file system that maintains its own cache; data would then be buffered twice, which hurts performance.

  • Local memory: Backend services need local memory to temporarily store data that does not need to be stored globally. Local memory is specified by the following parameters:
    • temp_buffers specifies the maximum amount of memory used for temporary buffers within each database session.
    • work_mem specifies the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. Note that this limit applies to each sort or hash operation, not to each SQL statement, so a single complex query can consume several multiples of work_mem.
    • maintenance_work_mem specifies the maximum amount of memory to be used by maintenance operations.
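The sizing guidance above can be sketched as simple arithmetic. A minimal illustration, assuming a hypothetical 8 GiB instance and hypothetical session parameters (none of these values are RDS defaults):

```python
# Sketch of the sizing rules above. All instance sizes and parameter
# values below are hypothetical examples, not RDS defaults.

def shared_buffers_band_mb(physical_mem_mb: int) -> tuple[int, int]:
    """Default (25% of physical memory) and practical upper bound (40%)."""
    return physical_mem_mb * 25 // 100, physical_mem_mb * 40 // 100

def local_mem_upper_bound_mb(sessions: int, work_mem_mb: int,
                             ops_per_session: int, temp_buffers_mb: int) -> int:
    """work_mem applies per sort/hash operation, temp_buffers per session."""
    return sessions * (work_mem_mb * ops_per_session + temp_buffers_mb)

default_mb, upper_mb = shared_buffers_band_mb(8192)  # e.g. an 8 GiB instance
print(default_mb, upper_mb)                          # 2048 3276
# 200 sessions, work_mem=4MB, 3 concurrent sorts each, temp_buffers=8MB
print(local_mem_upper_bound_mb(200, 4, 3, 8))        # 4000
```

The second function shows why local memory can dwarf any single parameter: it scales with both the number of sessions and the number of concurrent sort/hash operations per query.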

Impact

Memory headroom is required for a production DB instance. In normal cases, the memory usage should stay below 70%. If the memory usage remains above this limit, you are advised to upgrade the memory specifications. High memory usage may trigger an alarm and cause the following problems:

  • Data is frequently swapped between memory and disks, which consumes a large number of CPU resources. As a result, the database performance deteriorates and data reads and writes are affected.
  • In severe cases, an out-of-memory (OOM) problem may occur. If it does, the database service process restarts, existing database connections are interrupted, and new connections cannot be established. The HA process then restarts the DB instance. During this period, the instance replays the WAL logs generated between the last checkpoint and the time the OOM problem occurred to ensure transaction consistency.

Cause Analysis

High memory usage is generally caused by an increase in connections, active connections, slow SQL queries, TPS, or persistent connections. If the memory usage increases sharply or does not meet your expectation, analyze the causes as follows:

Figure 1 Cause analysis

Troubleshooting

  • Connections/Active Connections
    • On the Cloud Eye console, check whether the memory usage, connection usage, number of database connections, and number of active connections increase or decrease simultaneously in the target period.
    • Run the following SQL statement to check the maximum number of connections allowed for the instance:
      show max_connections;
    • Run the following SQL statement to check the number of active connections to the instance:
      select count(1) from pg_stat_activity where state <> 'idle';
    • Run the following SQL statement to check the number of idle connections:
      select count(1) from pg_stat_activity where state = 'idle';
  • Slow SQL Statements
    • On the Cloud Eye console, check whether the memory usage, number of SQL statements executed for more than 1s, number of SQL statements executed for more than 3s, and number of SQL statements executed for more than 5s increase or decrease simultaneously in the target period.
    • Run the following SQL statement to view the top three slow SQL statements (for RDS for PostgreSQL 10 and later versions) and check whether the SQL statements in the returned query field use JOIN or ORDER BY:

      select (extract(epoch from now() - query_start)) query_time, datname, usename, client_addr, wait_event, state, query from pg_stat_activity where state not like 'idle%' and query_start is not null and backend_type = 'client backend' and pid <> pg_backend_pid() order by 1 desc limit 3;

    • Query the pg_stat_statements view to obtain statistics and query the SQL statement that consumes the most shared memory. For details, see 4.
  • TPS

    On the Cloud Eye console, check whether the memory usage and TPS increase or decrease simultaneously in the target period.

  • Persistent Connections
    • Run the SQL statement shown below to view the top three persistent connections (for RDS for PostgreSQL 10 and later versions). In the command output, the conn_time field indicates the connection lifetime, and the query field indicates the SQL statement executed by the process.
      select (extract(epoch from now()-backend_start)) conn_time, datname, pid, usename, client_addr, wait_event_type, wait_event, state, query from pg_stat_activity where backend_type = 'client backend' order by conn_time desc nulls last limit 3;
    • Persistent connections cache certain information, such as query results, transaction information, and lock information, in the database. If many persistent connections are maintained for a long period of time, the cached information accumulates accordingly, occupying more memory. To further locate the fault, query the pg_stat_statements view based on the value of the query field obtained in the previous step and check how much shared memory the SQL statement has used.
      select userid::regrole, dbid, shared_blks_hit, shared_blks_dirtied from pg_stat_statements where query =  'query';
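The connection checks above can also be mirrored offline. A sketch that classifies rows shaped like pg_stat_activity output the way the two count queries do (the sample rows are hypothetical):

```python
# Sketch: count active vs. idle sessions the way the SQL above does,
# over rows shaped like pg_stat_activity output. Sample data is hypothetical.

def connection_counts(rows: list[dict]) -> tuple[int, int]:
    """Return (active, idle): 'idle' matches state = 'idle'; everything
    else (including 'idle in transaction') counts as active, because the
    query above uses state <> 'idle'."""
    active = sum(1 for r in rows if r["state"] != "idle")
    idle = sum(1 for r in rows if r["state"] == "idle")
    return active, idle

rows = [{"state": "active"}, {"state": "idle"},
        {"state": "idle in transaction"}, {"state": "idle"}]
print(connection_counts(rows))  # (2, 2)
```

Note that sessions in the 'idle in transaction' state are counted as active by these queries, which is often exactly what you want when hunting for memory held by open transactions.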

Solution

  • Too Many Connections or Active Connections

    If there are too many total or idle connections, release idle connections by running the SQL statement shown below or by configuring a connection timeout on the clients, or use a connection pool to reduce the overhead of establishing new database connections. If there are too many active connections, reduce the number of concurrent requests or upgrade the memory specifications. You can also kill unnecessary sessions to reduce the memory usage. For details, see Killing Sessions.

    NOTE:

    Killing a session may cause service disconnection. Your applications should be able to reconnect to the instance.

    select pg_terminate_backend(pid) from pg_stat_activity where state = 'idle';
  • Too Many Slow SQL Statements

    Locate the SQL statements that consume a large amount of memory, then optimize them or upgrade the memory specifications.

  • High TPS

    Reduce the number of transactions or upgrade the memory specifications.

  • Too Many Persistent Connections/Long Connection Lifetime

    Periodically release persistent connections, because maintaining them for a long time accumulates cached information and uses up memory.
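For the idle-session cleanup shown under the first solution, a slightly safer variant terminates only sessions that have been idle for longer than a threshold, using the state_change column of pg_stat_activity. A sketch that builds such a statement (the 10-minute threshold and the helper name are hypothetical):

```python
# Sketch: build a pg_terminate_backend statement that targets only sessions
# idle for longer than a threshold, instead of all idle sessions at once.
# The threshold and helper name are hypothetical examples.

def terminate_idle_sql(idle_minutes: int) -> str:
    return (
        "select pg_terminate_backend(pid) from pg_stat_activity "
        "where state = 'idle' "
        f"and now() - state_change > interval '{idle_minutes} minutes';"
    )

print(terminate_idle_sql(10))
```

As with the statement in the solution above, killing a session may disconnect services, so applications should be able to reconnect.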

FAQ

Q: Why does the memory usage increase when pg_dump is used to export data? How do I avoid this problem?

A: When pg_dump is used to export data, a process accesses all objects, such as tables and indexes, in the target database to obtain their structure data. If the accessed tables or indexes are too large, large RelCache (relation cache) or CatCache (system catalog cache) entries may be created that cannot be released. As a result, the memory usage increases and an OOM problem may even occur.

Suggestions for executing a pg_dump task:

  1. Do not perform DDL operations.
  2. Monitor the metric of slow SQL statements. If there is a lock conflict, kill the conflicting process.
  3. Execute the pg_dump task during off-peak hours.
  4. Decrease the values of shared_buffers and work_mem to half or a quarter of the current values, or less. After the task is complete, restore the parameters.
  5. Upgrade the memory specifications.
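Suggestion 4 above amounts to simple arithmetic; a sketch with hypothetical starting values (not RDS defaults):

```python
# Sketch for suggestion 4: compute reduced values for shared_buffers and
# work_mem before a pg_dump run. Starting values are hypothetical examples.

def reduced_mb(value_mb: int, factor: int) -> int:
    """Divide a memory parameter by 2 or 4, keeping at least 1 MB."""
    return max(1, value_mb // factor)

shared_buffers_mb, work_mem_mb = 2048, 16
print(reduced_mb(shared_buffers_mb, 2), reduced_mb(work_mem_mb, 4))  # 1024 4
```

Remember to record the original values first so the parameters can be restored after the export completes.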
