Updated on 2024-10-30 GMT+08:00

Troubleshooting Redis Connection Failures

Overview

This topic describes common causes of Redis connection failures and how to resolve them.

Problem Classification

To troubleshoot abnormal connections to a Redis instance, check the following items:

Connection Between the Redis Instance and the ECS

The ECS where the client is located must be in the same VPC as the Redis instance and be able to communicate with the Redis instance.

  • For a Redis 3.0 or professional edition instance, check the security group rules of the instance and the ECS.

    Correctly configure security group rules for the ECS and the Redis instance to allow the Redis instance to be accessed. For details, see How Do I Configure a Security Group?

  • For a DCS Redis 4.0/5.0/6.0 basic instance, check the whitelist of the instance.

    If the instance has a whitelist, ensure that the client IP address is included in the whitelist. Otherwise, the connection will fail. For details, see Managing IP Address Whitelist. If the client IP address changes, add the new IP address to the whitelist.

  • Check the regions of the Redis instance and the ECS.

    If the Redis instance and the ECS are not in the same region, create another Redis instance in the same region as the ECS and migrate data from the old instance to the new instance by referring to Migration Solution Notes.

  • Check the VPCs of the Redis instance and the ECS.

    By default, different VPCs cannot communicate with each other, so an ECS cannot access a Redis instance in a different VPC. To allow the ECS to access the Redis instance across VPCs, establish a VPC peering connection.

    For more information on how to create and use VPC peering connections, see VPC Peering Connection.
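
Before examining security groups, whitelists, or VPC settings in detail, it can help to confirm from the ECS whether the instance port is reachable at all. The following is a minimal sketch using only the Python standard library. The function name can_reach and the default port 6379 are illustrative assumptions; substitute the address and port of your own instance.

```python
import socket


def can_reach(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A failure here points to a network-level problem (security group,
    whitelist, VPC, or region), not to a wrong password.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:  # covers refused connections, timeouts, DNS errors
        print(f"Cannot reach {host}:{port}: {exc}")
        return False
```

For example, can_reach("{dcs_instance_address}", 6379) returning False from the ECS suggests checking the items above before looking at the client configuration.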

Public Access to Redis 3.0

Before accessing a Redis instance through a public network, ensure that the instance supports public access. For details, see the public access explanation.

  • Symptom: "Error: Connection reset by peer" is displayed or a message is displayed indicating that the remote host forcibly closes an existing connection.
    • Possible cause 1: The security group is incorrectly configured.

      Solution: Correctly configure the security group of the Redis instance, and then access the instance by following the public access instructions.

    • Possible cause 2: The VPC subnet where the Redis instance resides is associated with a network ACL that denies outbound traffic.

      Solution: Check whether the subnet is associated with a network ACL and whether the ACL denies outbound traffic. If it does, remove the ACL restriction.
    • Possible cause 3: SSL encryption has been enabled, but Stunnel was not configured for the connection. Instead, the IP address displayed on the console was used.

      Solution: When enabling SSL encryption, install and configure the Stunnel client. For details, see Connecting to Redis with SSL Encryption. In the command for connecting to the Redis instance, the address must be set to the IP address and port number of the Stunnel client. Do not use the public access address and port displayed on the console.

  • Symptom: Public access has been automatically disabled.

    Cause: The EIP that was bound to the DCS Redis instance has been unbound. As a result, public access is automatically disabled.

    Solution: Enable public access for the instance and bind an EIP to the instance on the management console. Then, try again.

Password

If the instance password is incorrect, the port can still be accessed but the authentication will fail. If you forget the password, you can reset the password. For details, see Resetting Instance Passwords.
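
To tell an authentication failure apart from a network failure, you can send a single AUTH command over a plain TCP connection and inspect the reply. The sketch below encodes the command in the Redis serialization protocol (RESP) using only the Python standard library. The function names encode_command and check_auth are illustrative, and the sketch assumes that SSL encryption is not enabled on the connection being tested.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def check_auth(host: str, port: int, password: str, timeout: float = 3.0) -> str:
    """Return 'ok', 'network: ...', or 'auth: ...' for one AUTH attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(encode_command("AUTH", password))
            reply = sock.recv(1024)
    except OSError as exc:
        # The port could not be reached or the connection was dropped.
        return f"network: {exc}"
    if reply.startswith(b"+OK"):
        return "ok"
    # Any other reply (typically an error line starting with "-") means the
    # port is reachable but the server did not accept the password.
    return "auth: " + reply.decode(errors="replace").strip()
```

A result beginning with "auth:" indicates that the port is reachable and only the password needs attention.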

Instance Configuration

If a connection to Redis is rejected because the maximum number of client connections has been reached, log in to the DCS console, go to the instance details page, and increase the value of the maxclients parameter. For details, see Modifying Configuration Parameters of an Instance.
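
To judge whether the connection limit is the likely cause, you can compare the connected_clients field in the output of the INFO command with the configured maxclients value. The following sketch parses INFO text, which consists of "key:value" lines and "#" section headers. The function names are illustrative, and the maxclients value is passed in by the caller because this sketch does not assume how you retrieve it.

```python
def parse_info(info_text: str) -> dict:
    """Parse the text returned by the Redis INFO command into a dict."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def client_usage(info_text: str, maxclients: int) -> float:
    """Return connected clients as a fraction of maxclients."""
    connected = int(parse_info(info_text)["connected_clients"])
    return connected / maxclients
```

A value close to 1.0 suggests that new connections are being rejected because the limit has been reached.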

Client Connections

  • The connection fails when you use redis-cli to connect to a Redis Cluster instance.
    Solution: Check whether -c is added to the connection command. Ensure that the correct connection command is used when connecting to the cluster nodes.
    • Run the following command to connect to a Redis Cluster instance:

      ./redis-cli -h {dcs_instance_address} -p 6379 -a {password} -c

    • Run the following command to connect to a single-node, master/standby, or Proxy Cluster instance:

      ./redis-cli -h {dcs_instance_address} -p 6379 -a {password}

    For details, see Access Using redis-cli.

  • Error "Read timed out" or "Could not get a resource from the pool" occurs.

    Solution:

    • Check whether the KEYS command has been used. This command consumes a lot of resources and can easily block Redis. Use the SCAN command instead, and avoid running it frequently.
    • Check whether the DCS instance is Redis 3.0. Redis 3.0 uses SATA disks. During AOF persistence, the disk performance may occasionally deteriorate and cause a connection failure. In this case, disable AOF persistence if data persistence is not required. Alternatively, use a DCS Redis 4.0 or later instance, which uses SSDs that offer higher performance.
  • Error "unexpected end of stream" occurs and causes service exceptions.

    Solution:

  • The connection is interrupted.

    Solution:

    • Modify the application timeout duration.
    • Optimize the service to avoid slow queries.
    • Replace the KEYS command with the SCAN command.
  • If an error occurs when you use the Jedis connection pool, see Troubleshooting a Jedis Connection Pool Error.
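
The recommendation to replace KEYS with SCAN can be sketched as a cursor loop. The example assumes a client object that exposes a scan(cursor, match, count) method returning a (next_cursor, keys) pair, as the redis-py client does. The function name scan_keys and the default batch size are illustrative.

```python
def scan_keys(client, pattern: str = "*", count: int = 100):
    """Yield keys matching pattern in small batches instead of one KEYS call.

    Each SCAN call returns only a limited batch, so the server is not
    blocked for the duration of a full keyspace traversal.
    """
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys
        if cursor == 0:  # a cursor of 0 means the iteration is complete
            break
```

Because SCAN still traverses the whole keyspace over successive calls, it remains advisable to run such a loop sparingly on busy instances.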

Bandwidth

If the bandwidth reaches the upper limit of the corresponding instance specifications, Redis connections may time out.

You can view the Flow Control Times metric to check whether the bandwidth has reached the upper limit.

Then, check whether the instance has big keys and hot keys. If a single key is too large or overloaded, operations on the key may occupy too many bandwidth resources. For details about big keys and hot keys, see Analyzing Big Keys and Hot Keys.
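
As a rough illustration of why a single big or hot key can exhaust bandwidth, the traffic generated by reads of one key is approximately the value size multiplied by the request rate. The sketch below performs this arithmetic. The function name and the figures in the usage note are hypothetical, and protocol overhead is ignored.

```python
def bandwidth_mbit_per_s(value_bytes: int, requests_per_second: float) -> float:
    """Estimate the bandwidth consumed by reads of one key, in Mbit/s."""
    bits_per_second = value_bytes * 8 * requests_per_second
    return bits_per_second / 1_000_000
```

For example, a 1,000,000-byte value read 100 times per second gives bandwidth_mbit_per_s(1_000_000, 100), which evaluates to 800.0 Mbit/s for that key alone.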

Redis Performance

Connections to an instance may become slow or time out if the CPU usage spikes due to resource-consuming commands such as KEYS, or if too much memory is used because no expiration time is set for keys or expired keys remain in memory. In these cases, take the following measures:

  • Use the SCAN command instead of the KEYS command, or disable the KEYS command.
  • Check the monitoring data and configure alarm rules. For details, see Setting Alarm Rules for Critical Metrics.

    For example, you can view the Memory Usage and Used Memory metrics to keep track of the instance memory usage, and view the Connected Clients metric to determine whether the connection limit of the instance has been reached.

  • Check whether the instance has big keys and hot keys.

    For details about how to analyze big keys and hot keys, see Analyzing Big Keys and Hot Keys.
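
To check whether keys without an expiration time contribute to high memory usage, you can sample part of the keyspace and count the keys that have no TTL. The sketch below assumes a client object with scan_iter(match, count) and ttl(key) methods, as provided by the redis-py client, where ttl returns -1 for a key that exists but has no expiration. The function name and the sample limit are illustrative.

```python
def count_keys_without_ttl(client, sample_limit: int = 1000, pattern: str = "*"):
    """Return (checked, persistent) for a bounded sample of keys.

    The sample is bounded so that the check itself does not put
    sustained load on the instance.
    """
    checked = 0
    persistent = 0
    for key in client.scan_iter(match=pattern, count=100):
        if checked >= sample_limit:
            break
        checked += 1
        if client.ttl(key) == -1:  # -1: the key exists but never expires
            persistent += 1
    return checked, persistent
```

A high proportion of persistent keys in the sample suggests reviewing whether expiration times should be set when the keys are written.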