Updated on 2024-10-14 GMT+08:00

Events Supported by Event Monitoring

Table 1 Resource exception events

Event Source

Event Name

Event ID

Event Severity

Description

Solution

Impact

RDS

DB instance creation failure

createInstanceFailed

Major

A DB instance fails to be created because the number of disks is insufficient, the quota is insufficient, or underlying resources are exhausted.

Check the number of disks and the quota. Release resources, then create the DB instance again.

DB instances cannot be created.

Failed to synchronize cross-region backups

crossRegionBackupSyncFailed

Minor

Generally, this problem is caused by insufficient underlying network and replication resources.

If this event is continuously reported, submit a service ticket to adjust the underlying resource allocation.

Backups cannot be used for restoration in the destination region.

Full backup failure

fullBackupFailed

Major

A single full backup failure does not affect files that have already been backed up successfully, but it prolongs incremental backup restoration during point-in-time recovery (PITR).

Create a manual backup again.

Backup failed.

Primary/standby switchover failure

activeStandBySwitchFailed

Major

The standby DB instance does not take over workloads from the primary DB instance due to network or server failures. The original primary DB instance continues to provide services within a short time.

Check whether the connection between your application and the database is re-established.

None

Replication status abnormal

abnormalReplicationStatus

Major

The possible causes are as follows:
  • The replication delay between the primary and standby instances is too long, which usually occurs when a large amount of data is written to databases or a large transaction is processed. During peak hours, data may be blocked.
  • The network between the primary and standby instances is disconnected.

Submit a service ticket.

Your applications are not affected because this event does not interrupt data read and write.

Replication status recovered

replicationStatusRecovered

Major

The replication delay between the primary and standby instances is within the normal range, or the network connection between them has been restored.

No action is required.

None

DB instance faulty

faultyDBInstance

Major

A single or primary DB instance was faulty due to a disaster or a server failure.

Check whether an automated backup policy has been configured for the DB instance and submit a service ticket.

The database service may be unavailable.

DB instance recovered

DBInstanceRecovered

Major

RDS rebuilds the standby DB instance through its high availability mechanism. This event is reported after the instance is rebuilt.

No action is required.

None

Failure of changing single DB instance to primary/standby

singleToHaFailed

Major

A fault occurs when RDS is creating the standby DB instance or configuring replication between the primary and standby DB instances. The fault may occur because resources are insufficient in the data center where the standby DB instance is located.

Submit a service ticket.

Your applications are not affected because this event does not interrupt data read and write of the DB instance.

Database process restarted

DatabaseProcessRestarted

Major

The database process is stopped due to insufficient memory or high load.

Log in to the Cloud Eye console. Check whether the memory usage increases sharply, the CPU usage is too high for a long time, or the storage space is insufficient. You can increase the CPU and memory specifications or optimize the service logic.

Downtime occurs. In this case, RDS automatically restarts the database process and attempts to recover the workloads.

Instance storage full

instanceDiskFull

Major

This is usually caused by excessive data space usage.

Scale up the instance.

The DB instance becomes read-only because the storage space is full, and data cannot be written to the database.

Instance storage full recovered

instanceDiskFullRecovered

Major

The instance disk is recovered.

No action is required.

The instance is restored and supports both read and write operations.

Maximum MySQL instance connections reached

mysqlConnectionsFull

Major

The number of connections has reached the supported maximum because service traffic surged.

  • Release unnecessary connections.
  • Reduce the load by limiting traffic.
  • Upgrade the instance class to allow more connections.

New connections cannot be established.

The number of MySQL instance connections has been reduced to below the maximum number of connections

mysqlConnectionsFullRecovered

Major

The number of instance connections has been reduced to below the maximum number of connections.

Check whether services are running properly.

The number of instance connections has been reduced to below the maximum number of connections.

New connection errors caused by a MySQL overload

highLoadInstanceConnectionsAbnormal

Major

New connections cannot be set up or are abnormal because resources such as CPU, memory, storage, or network bandwidth are insufficient.

  • Scale up system resources such as CPU, memory, and storage.
  • Adjust MySQL parameters, for example, increase the connection pool size and adjust the cache size.
  • Identify the abnormal sessions and kill them so that the database can recover.

New connections cannot be set up or are abnormal.

New connection error caused by MySQL overload has been recovered

highLoadInstanceConnectionsAbnormalRevocered

Major

The new connection error caused by MySQL overload has been recovered.

Check whether services are running properly.

The new connection error caused by MySQL overload has been recovered.

Kafka connection failed

kafkaConnectionFailed

Major

The network is unstable or the Kafka server does not work properly.

Check your network connection and the Kafka server status.

Audit logs cannot be sent to the Kafka server.

Database proxy

Proxy instance access to DB instance failure

proxy_connection_failure_cause_security_group

Major

No rules in the security group of the DB instance allow the proxy instance to access the DB instance.

Add the proxy instance address to the rules of the security group.

Service requests routed through the proxy instance are interrupted.

Connection failure between proxy instance and DB instance

proxy_connection_failure_to_db

Major

The proxy instance failed to establish a new connection with the primary DB instance, and it may fail to establish a new connection with a read replica. The DB instance or proxy instance is overloaded, or the network between them is abnormal.

Change values of related parameters based on metrics (connections, active connections, and CPU usage) of the DB instance and proxy instance. If the metrics are normal, submit a service ticket.

Service requests routed through the proxy instance are interrupted.

Connection failure between database proxy and read replica

proxy_connection_failure_to_replica

Minor

The proxy instance failed to establish a new connection with a read replica. The read replica is overloaded, or the network between the proxy instance and read replica is abnormal.

Change values of related parameters based on metrics (connections, active connections, and CPU usage) of the read replica. If the metrics are normal, submit a service ticket.

Read requests routed through the proxy instance are partially interrupted.
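
Several events in Table 1 come in fault/recovery pairs, so a consumer of these events may want to track which faults are still open. The following is a minimal sketch of that idea, not an official client. The event IDs are copied from Table 1 exactly as listed. The pairing of faultyDBInstance with DBInstanceRecovered is inferred from the event names. The (instance_id, event_id) tuple shape and the open_faults helper are hypothetical and do not reflect the actual Cloud Eye event payload.

```python
# Fault event ID -> recovery event ID, taken from Table 1.
# IDs are kept exactly as they appear in the table.
RECOVERY_FOR = {
    "abnormalReplicationStatus": "replicationStatusRecovered",
    "faultyDBInstance": "DBInstanceRecovered",  # pairing inferred from names
    "instanceDiskFull": "instanceDiskFullRecovered",
    "mysqlConnectionsFull": "mysqlConnectionsFullRecovered",
    "highLoadInstanceConnectionsAbnormal":
        "highLoadInstanceConnectionsAbnormalRevocered",
}

# Reverse lookup: recovery event ID -> fault event ID.
FAULT_FOR = {recovery: fault for fault, recovery in RECOVERY_FOR.items()}


def open_faults(events):
    """Return the set of (instance_id, fault_event_id) pairs not yet recovered.

    `events` is an iterable of (instance_id, event_id) tuples in
    chronological order. This input shape is an assumption made for
    illustration only.
    """
    unresolved = set()
    for instance_id, event_id in events:
        if event_id in RECOVERY_FOR:
            # A fault was reported: remember it until its recovery arrives.
            unresolved.add((instance_id, event_id))
        elif event_id in FAULT_FOR:
            # A recovery was reported: clear the matching fault, if any.
            unresolved.discard((instance_id, FAULT_FOR[event_id]))
        # Events without a recovery counterpart are ignored here.
    return unresolved


if __name__ == "__main__":
    history = [
        ("db-1", "instanceDiskFull"),
        ("db-2", "mysqlConnectionsFull"),
        ("db-1", "instanceDiskFullRecovered"),
    ]
    print(open_faults(history))
```

Events that have no recovery counterpart in Table 1, such as fullBackupFailed, would need separate handling based on the Solution column.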

Table 2 Operation events

Event Source

Event Name

Event ID

Event Severity

Description

RDS

Reset administrator password

resetPassword

Major

The password of the database administrator is reset.

Operate DB instance

instanceAction

Major

The storage space is scaled or the instance class is changed.

Delete DB instance

deleteInstance

Minor

The DB instance is deleted.

Modify backup policy

setBackupPolicy

Minor

The backup policy is modified.

Modify parameter group

updateParameterGroup

Minor

The parameter group is modified.

Delete parameter group

deleteParameterGroup

Minor

The parameter group is deleted.

Reset parameter group

resetParameterGroup

Minor

The parameter group is reset.

Change database port

changeInstancePort

Major

The database port is changed.

Primary/standby switchover or failover

PrimaryStandbySwitched

Major

A switchover or failover is performed.
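
The severities in Table 2 could be used to decide which operation events deserve a notification. The sketch below encodes the table as a lookup and keeps only the Major events. The major_operations helper and the idea of receiving events as a plain list of event IDs are assumptions made for illustration.

```python
# Event ID -> severity, copied from Table 2.
OPERATION_EVENT_SEVERITY = {
    "resetPassword": "Major",
    "instanceAction": "Major",
    "deleteInstance": "Minor",
    "setBackupPolicy": "Minor",
    "updateParameterGroup": "Minor",
    "deleteParameterGroup": "Minor",
    "resetParameterGroup": "Minor",
    "changeInstancePort": "Major",
    "PrimaryStandbySwitched": "Major",
}


def major_operations(event_ids):
    """Return the event IDs that Table 2 lists as Major, preserving order.

    IDs that do not appear in Table 2 are dropped.
    """
    return [
        event_id
        for event_id in event_ids
        if OPERATION_EVENT_SEVERITY.get(event_id) == "Major"
    ]


if __name__ == "__main__":
    seen = ["deleteInstance", "resetPassword", "changeInstancePort"]
    print(major_operations(seen))
```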