Updated on 2022-08-31 GMT+08:00

How Do I Handle the Error "Connection reset by peer" That Occurs When Spring Boot Uses ES?

Issue

When a Spring Boot application uses the Elasticsearch (ES) RestHighLevelClient to connect to an ES cluster, the error "Connection reset by peer" is reported.

Symptom

The TCP connection is interrupted, and service data fails to be written.

Possible Causes

There are many possible causes. For example, the peer closed the connection; a firewall, switch, or VPN on the path was faulty; the keepalive settings were incorrect; the server node that the client was connected to changed; or the network was unstable.

Procedure

  • Method 1

    Increase the connection timeout of RestHighLevelClient. The default value is 1000 ms. You can increase it to 10000 ms. The following example also sets the socket timeout to 60000 ms.

    // endpoint, port, and credentialsProvider are assumed to be defined elsewhere.
    RestClientBuilder builder = RestClient.builder(new HttpHost(endpoint, port))
            .setHttpClientConfigCallback(httpClientBuilder ->
                    httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider))
            .setRequestConfigCallback(requestConfigBuilder ->
                    requestConfigBuilder
                            .setConnectTimeout(10000)    // connection timeout: 10 s
                            .setSocketTimeout(60000));   // socket (read) timeout: 60 s
    return new RestHighLevelClient(builder);

    To set the timeout for a single request, call timeout() on the request object, for example: request.timeout(TimeValue.timeValueSeconds(60));

  • Method 2

    Create a scheduled task in Spring Boot that periodically pings ES, so that the connection does not stay idle long enough to be closed.

    // Runs every 60 s, starting 60 s after application startup.
    @Scheduled(fixedRate = 60000, initialDelay = 60000)
    public void keepConnectionAlive() {
        log.debug("Trying to ping Elasticsearch");
        try {
            // ping() returns true if the cluster responds to the request.
            final boolean reachable = restHighLevelClient.ping(RequestOptions.DEFAULT);
            log.debug("Ping to Elasticsearch returned {}", reachable);
        } catch (Exception e) {
            log.warn("Ping to Elasticsearch failed", e);
        }
    }

  • Method 3

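    Set the RestHighLevelClient keepalive time to 15 minutes, which is shorter than the 30-minute keepalive duration on the server. One way to do this, assuming the Apache HTTP async client that RestHighLevelClient is built on, is to call httpClientBuilder.setKeepAliveStrategy((response, context) -> TimeUnit.MINUTES.toMillis(15)) inside setHttpClientConfigCallback.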

  • Method 4

    Catch the exception in your code and retry the request.
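
The retry approach in Method 4 can be sketched roughly as follows. This is a minimal illustration that uses only the JDK. The class name RetryOnReset, the method withRetry, and the linear backoff are hypothetical choices made for this example and are not part of the ES client. In a real application, the Callable would wrap a RestHighLevelClient call such as an index or search request.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of the ES client.
final class RetryOnReset {

    /**
     * Runs the call and retries it when an IOException (for example,
     * "Connection reset by peer") is thrown, up to maxAttempts attempts in total.
     * Waits backoffMillis * attempt between attempts (simple linear backoff).
     */
    static <T> T withRetry(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts failed: rethrow the last I/O error.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that fails twice and then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset by peer");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

Retrying only on IOException keeps request errors such as invalid queries from being retried. Write requests should be retried only if they are idempotent, for example when the document ID is set explicitly.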

Reference

  • TCP connections

    TCP connections are classified into persistent connections and short-lived connections. A short-lived TCP connection is closed as soon as the data packets have been sent. A persistent TCP connection uses the keepalive timer function and remains open for a certain period of time after data packets are sent.

  • TCP keepalive mechanism

    The keepalive mechanism is implemented using a timer. When the timer expires, the server sends a keepalive probe packet and expects an ACK message in response. If the client does not respond, the server terminates the connection. If the client responds, the keepalive timer is reset.

    The keepalive duration on the server is set to 30 minutes. In Linux, three parameters control the keepalive behavior: tcp_keepalive_time (how long a connection must be idle before the first keepalive probe is sent), tcp_keepalive_intvl (interval between keepalive probes), and tcp_keepalive_probes (the number of unanswered probes sent before the connection is considered dead).

  • http-keepalive

    The http-keepalive mechanism allows a single TCP connection to be reused for multiple HTTP requests instead of opening a new connection for each request. The http-keepalive duration is refreshed each time a packet is transmitted. If the http-keepalive duration expires, the client and server did not exchange packets during this period. In this case, the connection is automatically closed and released.

    By contrast, the tcp-keepalive mechanism keeps an idle TCP connection open, as long as the peer responds to probes, until the connection is deliberately closed.
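
As a rough illustration of how the three Linux parameters described under "TCP keepalive mechanism" combine, the following sketch estimates how long it takes for an unresponsive peer to be detected. The class and method names are hypothetical. The values 7200 s, 75 s, and 9 are the commonly documented Linux defaults and are an assumption of this example, not settings taken from this article.

```java
// Hypothetical helper for illustration only.
final class TcpKeepaliveTiming {

    /**
     * Approximate number of seconds between the last data packet and the moment
     * an unresponsive peer is declared dead:
     * idle time before the first probe + (probe interval * number of probes).
     */
    static long secondsUntilDeadPeerDetected(long keepaliveTime,
                                             long keepaliveIntvl,
                                             int keepaliveProbes) {
        return keepaliveTime + keepaliveIntvl * keepaliveProbes;
    }

    public static void main(String[] args) {
        // Assumed Linux defaults: tcp_keepalive_time=7200,
        // tcp_keepalive_intvl=75, tcp_keepalive_probes=9.
        long seconds = secondsUntilDeadPeerDetected(7200, 75, 9);
        System.out.println(seconds); // 7875
    }
}
```

With these assumed defaults, the first probe is sent only after two hours of idle time, which is much longer than the 30-minute keepalive duration on the server. This is one reason an idle client connection can be closed by the server before the operating system's TCP keepalive has sent any probe.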