Updated on 2024-06-19 GMT+08:00

Troubleshooting High Bandwidth Usage of a DCS Redis Instance

Overview

DCS Redis instances sit close to application services, so they handle a large volume of data access requests and consume network bandwidth accordingly. The maximum bandwidth depends on the instance specifications. When it is exceeded, flow control is triggered and connections are discarded, which may increase service latency and cause client connection exceptions. This section describes how to troubleshoot high bandwidth usage of a DCS Redis instance.

Procedure

  1. Check the bandwidth usage.

    Check the bandwidth usage of an instance in a specified period. For details, see Viewing Metrics.

    Generally, if the input and output flows increase rapidly and remain above 80% of the instance's maximum bandwidth, the bandwidth may become insufficient.

    The bandwidth usage is calculated as follows and is shown in the following figure:

    Bandwidth usage = (Input flow + Output flow)/(2 x Maximum bandwidth) x 100%

    Figure 1 Bandwidth usage
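
    As a worked example of the bandwidth usage formula, the following minimal Python sketch computes the percentage and compares it with the 80% guideline mentioned above. The function name and the sample readings are hypothetical, and all three values are assumed to be in the same unit.

```python
def bandwidth_usage_percent(input_flow, output_flow, max_bandwidth):
    """Bandwidth usage = (input flow + output flow) / (2 x max bandwidth) x 100%."""
    if max_bandwidth <= 0:
        raise ValueError("max_bandwidth must be positive")
    return (input_flow + output_flow) / (2 * max_bandwidth) * 100


# Hypothetical readings, all in the same unit (for example, Mbit/s).
usage = bandwidth_usage_percent(input_flow=40, output_flow=120, max_bandwidth=96)
print(f"{usage:.1f}%")  # 83.3%

# The 80% guideline from this step: sustained values above it suggest
# that the bandwidth may become insufficient.
if usage > 80:
    print("Bandwidth usage is above 80% of the maximum bandwidth")
```

    A single reading above 80% is not conclusive on its own. The guideline applies to input and output flows that rise rapidly and stay above that level.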

    A bandwidth usage above 100% does not necessarily mean that flow control has been triggered. To check whether flow control actually occurred, view the Flow Control Times metric.

    Conversely, flow control may be triggered even if the bandwidth usage is below 100%. The real-time bandwidth usage is reported once per reporting period, whereas flow control is checked every second. Traffic may surge for a few seconds and fall back within a single reporting period, so by the time the bandwidth usage is reported, it may already have returned to a normal level.
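
    The gap between per-second checks and per-period reporting can be illustrated with a small simulation. This is only a sketch under stated assumptions: the reporting period is taken to be 60 seconds, the reported value is modeled as the last per-second sample in the period, and the sample values are invented. None of these details come from the DCS implementation.

```python
def summarize_period(per_second_usage, limit=100.0):
    """Return (value visible at report time, number of seconds above the limit).

    Assumption: the reported value is the last per-second sample of the period.
    """
    if not per_second_usage:
        raise ValueError("per_second_usage must not be empty")
    reported = per_second_usage[-1]
    seconds_over = sum(1 for usage in per_second_usage if usage > limit)
    return reported, seconds_over


# Hypothetical 60-second period: steady 55% usage with a 3-second burst.
samples = [55.0] * 60
samples[20:23] = [130.0, 145.0, 120.0]

reported, seconds_over = summarize_period(samples)
print(reported, seconds_over)  # 55.0 3
```

    In this example the reported usage looks normal, although three per-second checks saw usage above 100%. This is why the Flow Control Times metric is worth checking alongside the bandwidth usage.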

  2. Optimize the bandwidth usage.

    1. Check whether the service access traffic matches the expected bandwidth consumption. A mismatch shows up, for example, when the bandwidth usage grows at a different rate from the QPS. In that case, check the input flow and output flow metrics to determine whether the traffic increase comes from read services or write services. If the bandwidth usage on a single node increases, use the cache analysis function to detect big keys by referring to Analyzing Big Keys and Hot Keys. Then optimize big keys (keys larger than 10 KB), for example, by splitting them, accessing them less frequently, or deleting unnecessary ones.
    2. If the bandwidth usage is still high, scale up the instance to a larger memory size to carry more network traffic. For details, see Modifying Specifications.

      Before the scale-up, you can buy a pay-per-use instance to test whether the desired specifications can meet the service load requirements. After the test is complete, you can release the instance by referring to Deleting an Instance.
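
    For the big key check in step 2, the cache analysis function is the method this procedure recommends. As a complementary client-side sketch, the following Python function lists keys whose memory footprint exceeds the 10 KB threshold. It assumes a redis-py-style client object that provides `scan_iter` and `memory_usage`. Whether the MEMORY USAGE command is available on a particular instance type is an assumption, and the function name is hypothetical.

```python
BIG_KEY_THRESHOLD_BYTES = 10 * 1024  # 10 KB, the big key size named in step 2


def find_big_keys(client, threshold=BIG_KEY_THRESHOLD_BYTES, scan_count=100):
    """Return (key, size_in_bytes) pairs above the threshold, largest first.

    `client` is assumed to behave like a redis-py client: `scan_iter` walks
    the keyspace incrementally and `memory_usage` wraps MEMORY USAGE.
    """
    big_keys = []
    for key in client.scan_iter(count=scan_count):
        size = client.memory_usage(key)
        # memory_usage returns None if the key disappeared during the scan.
        if size is not None and size > threshold:
            big_keys.append((key, size))
    return sorted(big_keys, key=lambda item: item[1], reverse=True)
```

    MEMORY USAGE reports the memory footprint of a key, including internal overhead, so the result approximates the payload size rather than matching it exactly. Scanning adds load of its own, so a sketch like this is better run during off-peak hours.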