Retry Mechanism for GeminiDB Redis Clients

A client-side retry mechanism helps keep applications available and stable when the network is unstable or a server is temporarily faulty.

Temporary faults typically have the following causes.

Cause	Description
HA is triggered.	GeminiDB Redis API automatically monitors node health. If a node breaks down, a primary/standby switchover or shard takeover is triggered automatically. HA is typically triggered when a GeminiDB process on a node restarts due to OOM or a hardware fault, or when nodes are automatically added or removed during scaling or specification changes. In these scenarios, clients may be disconnected for a few seconds or commands may time out.
The network fluctuates.	Complex network environments between clients and GeminiDB Redis servers may cause occasional network jitter and data retransmission. In this case, requests initiated by clients may temporarily fail.
Servers are overloaded.	Heavy loads or slow queries on GeminiDB Redis servers may prevent client requests from receiving an immediate response. As a result, the requests time out.

When setting retry rules on clients, follow the best practices below.

Best Practice	Description
Configure a suitable retry interval and retry count.	Choose the retry interval and the number of retries based on business requirements. Too many retries prolong recovery from a fault, and too short an interval can overwhelm servers. In heavy-load scenarios, increase the retry interval exponentially so that a large number of concurrent retries does not bring servers down.
Retry only idempotent operations.	A command may already have been executed on the server when the client sees a timeout while waiting for the result. Retrying it then executes the command again. Therefore, retry only idempotent operations (for example, the SET command), whose result is the same no matter how many times they run. For non-idempotent operations (for example, the INCR command), first confirm that duplicate execution can be tolerated, because each retry may increase the counter value again.
Generate client logs.	Configure the client to log each retry, including the connected IP address and port number, the failed command, and the keys involved, to facilitate troubleshooting.
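
The three practices above can be combined in one small wrapper. The following Python sketch is illustrative only and uses the standard library alone: call_with_retry, TransientError, and make_flaky are hypothetical names that do not belong to any Redis SDK, and TransientError stands in for whatever timeout or connection exceptions your client raises. The delay values are placeholders, not recommendations.

```python
import logging
import random
import time

logger = logging.getLogger("redis.retry")


class TransientError(Exception):
    """Hypothetical stand-in for a client's timeout or connection errors."""


def call_with_retry(operation, *, idempotent, max_attempts=4,
                    base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Run operation(), retrying transient failures with capped exponential backoff."""
    # A non-idempotent command gets a single attempt: a command that timed
    # out may already have been executed on the server.
    attempts = max_attempts if idempotent else 1
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError as exc:
            if attempt == attempts:
                logger.error("giving up after %d attempt(s): %s", attempt, exc)
                raise
            # Double the ceiling after every failure, cap it, then pick a
            # random delay below it so that many clients do not retry at once.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay = random.uniform(0, ceiling)
            logger.warning("attempt %d failed (%s); retrying in %.3fs",
                           attempt, exc, delay)
            sleep(delay)


def make_flaky(failures, result="OK"):
    """Return a callable that fails `failures` times and then returns `result`."""
    state = {"calls": 0}

    def op():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("simulated timeout")
        return result

    op.state = state  # exposed so callers can inspect the call count
    return op
```

For example, call_with_retry(make_flaky(2), idempotent=True) succeeds on the third attempt, whereas the same call with idempotent=False raises TransientError after the first failure. In a real client, the log lines would also carry the server address, the command, and the keys.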

The following SDK code examples are for reference only.

Jedis

In JedisPool mode, Jedis 4.0.0 or later supports retries. The following uses Jedis 4.0.0 as an example:

package nosql.cloud.huawei.jedis;
 
import redis.clients.jedis.DefaultJedisClientConfig;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisClientConfig;
import redis.clients.jedis.UnifiedJedis;
import redis.clients.jedis.providers.PooledConnectionProvider;
import java.time.Duration;
 
// UnifiedJedis API supported in Jedis >= 4.0.0
public class UnifiedJedisDemo {
    private static final int MAX_ATTEMPTS = 5;
    private static final Duration MAX_TOTAL_RETRIES_DURATION = Duration.ofSeconds(15);
    public static void main(String[] args) {
        // Basic connection config
        JedisClientConfig jedisClientConfig = DefaultJedisClientConfig.builder().password("xxx").build();
        // Implement retry
        PooledConnectionProvider provider =
            new PooledConnectionProvider(HostAndPort.from("{ip}:{port}"), jedisClientConfig);
        UnifiedJedis jedis = new UnifiedJedis(provider, MAX_ATTEMPTS, MAX_TOTAL_RETRIES_DURATION);
        try {
            System.out.println("set key: " + jedis.set("key", "value"));
        } catch (Exception e) {
            // Thrown when either the maximum number of attempts (MAX_ATTEMPTS)
            // or the maximum total retry time (MAX_TOTAL_RETRIES_DURATION) is reached
            e.printStackTrace();
        }
    }
}

Redisson

package nosql.cloud.huawei.jedis;
import org.redisson.Redisson;
import org.redisson.api.RBucket;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;
 
public class RedissonDemo {
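    private static final int TIME_OUT = 3000;       // response timeout, in milliseconds
    private static final int RETRY_ATTEMPTS = 5;    // retries before a command fails
    private static final int RETRY_INTERVAL = 1500; // wait between retries, in milliseconds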
    
    public static void main(String[] args) {
        Config config = new Config();
        config.useSingleServer()
              .setPassword("xxx")
              .setTimeout(TIME_OUT)
              .setRetryAttempts(RETRY_ATTEMPTS)
              .setRetryInterval(RETRY_INTERVAL)
              .setAddress("redis://{ip}:{port}");
        RedissonClient redissonClient = Redisson.create(config);
        RBucket<String> bucket = redissonClient.getBucket("key");
        bucket.set("value");
        // Release client threads and connections so that the JVM can exit
        redissonClient.shutdown();
    }
}

Go-redis

package main
 
import (
       "context"
       "fmt"
       "time"
 
       "github.com/redis/go-redis/v9"
)
 
var ctx = context.Background()
 
func main() {
 
    client := redis.NewClient(&redis.Options{
        Addr:            "localhost:6379",
        Password:        "",              // no password set
        DB:              0,               // use default DB
        MaxRetries:      3,               // maximum number of retries per command
        MinRetryBackoff: 1 * time.Second, // minimum backoff between retries
        MaxRetryBackoff: 2 * time.Second, // maximum backoff between retries
    })
 
    // Execute command
    err := client.Set(ctx, "key", "value", 0).Err()
    if err != nil {
        panic(err)
    }
 
    // Test
    pong, err := client.Ping(ctx).Result()
    if err != nil {
        fmt.Println("Failed:", err)
        return
    }
    fmt.Println("Success:", pong)
}

Redis-py

import redis
from redis.backoff import ExponentialBackoff
from redis.exceptions import BusyLoadingError, ConnectionError, TimeoutError
from redis.retry import Retry
 
# Run 3 retries with exponential backoff strategy
retry_strategy = Retry(ExponentialBackoff(), 3)
 
# Redis client with retries
client = redis.Redis(
    host='localhost',
    port=6379,
    retry=retry_strategy,
    # Retry on custom errors
    retry_on_error=[BusyLoadingError, ConnectionError, TimeoutError],
    # Retry on timeout
    retry_on_timeout=True,
)
 
try:
    client.ping()
    print("Connected to Redis!")
except (ConnectionError, TimeoutError):
    # Raised only after all retries have been used up
    print("Failed to connect to Redis after retries.")
 
try:
    client.set('key', 'value')
    print("Set key and value success!")
except (ConnectionError, TimeoutError):
    print("Failed to set key after retries.")
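
When choosing the retry count and backoff bounds for a client like the one above, it helps to estimate how long a single command can block before it finally fails. The following sketch assumes a plain capped exponential backoff with no jitter. The function names and the base, cap, and timeout values are illustrative assumptions, and the formula does not reproduce any particular SDK's exact schedule.

```python
def backoff_schedule(retries, base, cap):
    """Delay in seconds before each retry: min(cap, base * 2**n)."""
    return [min(cap, base * 2 ** n) for n in range(retries)]


def worst_case_seconds(retries, base, cap, command_timeout):
    """Upper bound on the time spent on one command that never succeeds."""
    # One initial attempt plus `retries` retries, each of which may wait for
    # the full command timeout, plus the sleeps between attempts.
    waits = sum(backoff_schedule(retries, base, cap))
    return (retries + 1) * command_timeout + waits


print(backoff_schedule(3, base=0.5, cap=2.0))  # [0.5, 1.0, 2.0]
print(worst_case_seconds(3, base=0.5, cap=2.0, command_timeout=1.0))  # 7.5
```

If the resulting bound exceeds what the calling request can tolerate, reduce the retry count or the cap rather than the base delay, so that the first retries still give the server time to recover.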

Hiredis

Hiredis is a minimalistic C client library and does not provide a built-in automatic retry mechanism. You need to implement the retry logic yourself.

The following simple example retries the connection in a loop with a delay between attempts. Command retries can be implemented in a similar way.

#include <hiredis/hiredis.h>
#include <stdio.h>
#include <unistd.h>
 
redisContext* connect_with_retry(const char *hostname, int port, int max_retries, int retry_interval) {
    redisContext *c = NULL;
    int attempt = 0;
 
    while (attempt < max_retries) {
        c = redisConnect(hostname, port);
        if (c != NULL && c->err == 0) {
            printf("Connection success!\n");
            return c;
        }
 
        if (c != NULL) {
            printf("Connection error: %s\n", c->errstr);
            redisFree(c);
        } else {
            printf("Connection failed\n");
        }
 
        attempt++;
        /* Sleep only if another attempt will follow */
        if (attempt < max_retries) {
            printf("Retrying in %d seconds...\n", retry_interval);
            sleep(retry_interval);
        }
    }
 
    return NULL;
}
 
int main() {
    const char* hostname = "127.0.0.1";
    int port = 6379;
    int max_retries = 5;
    int retry_interval = 2;
 
    redisContext *c = connect_with_retry(hostname, port, max_retries, retry_interval);
    if (c == NULL) {
        printf("Failed to connect to Redis after %d attempts\n", max_retries);
        return 1;
    }
 
    redisFree(c);
    return 0;
}
