Configuring Container Health Check
Scenario
A health check periodically checks the health of running containers. If no health check is configured, the cluster cannot detect application exceptions or automatically restart the container to recover the application. As a result, a pod may be in the Running state while its application is unavailable or abnormal.
Kubernetes provides three types of health check probes to monitor the applications in containers for system stability and high availability.
- Liveness probe: checks whether a container is still alive. It is similar to the ps command that checks whether a process exists. If a liveness probe fails, the cluster restarts the container. If the liveness probe is successful, no further action is taken.
- Readiness probe: checks whether the application in a container is ready to receive traffic. In some scenarios, an application has started, but it is not ready to provide services because a large amount of disk data needs to be loaded or external services need to be initialized. In this case, you can use a readiness probe to prevent traffic from being routed to the application. If the readiness probe fails, the CCE cluster temporarily removes the pod from the endpoint list of the Service to block external requests. If the readiness probe is successful, the container is considered ready and can receive traffic.
- Startup probe: checks whether the application in a container has started. The cluster starts the liveness and readiness checks only after the startup probe succeeds, so that these checks do not interfere with application startup. This kind of probe works well for containers that take a long time to start. It prevents such containers from being incorrectly considered abnormal and terminated before initialization is complete.
For more information, see Liveness, Readiness, and Startup Probes and how to configure them.
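The relationship between the three probes can be sketched in a single manifest. The following is a minimal illustration rather than a recommended configuration: the pod name probe-demo and all timing values are assumptions chosen for the example, and the nginx image serving HTTP on port 80 is reused from the examples in this document.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # Hypothetical name, for illustration only
spec:
  containers:
  - name: container-1
    image: nginx:latest
    ports:
    - containerPort: 80
    startupProbe:               # Runs first; the other probes wait until it succeeds.
      httpGet:
        path: /
        port: 80
      periodSeconds: 5          # Assumed value
      failureThreshold: 12      # Assumed value: about 12 x 5 s = 60 s allowed for startup
    livenessProbe:              # On repeated failure, the container is restarted.
      httpGet:
        path: /
        port: 80
      periodSeconds: 10         # Assumed value
      failureThreshold: 3
    readinessProbe:             # On repeated failure, the pod is removed from Service endpoints.
      httpGet:
        path: /
        port: 80
      periodSeconds: 5          # Assumed value
      failureThreshold: 3
```

In this sketch all three probes call the same endpoint. In practice, a readiness probe often targets a path that also verifies dependencies, while a liveness probe checks only that the process still responds.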
Configuring Liveness Probes
A liveness probe detects issues where a container is running but unresponsive, such as deadlocks.
- Log in to the CCE console.
- When creating a workload, select Health Check in Container Information.
- Enable the liveness probe.
Figure 1 Liveness probe status setting
Table 1 Probe parameters
Columns: Check Method, Description, Specific Parameter, and Common Parameter. The common parameters apply to every check method.
Check Method: HTTP (httpGet)
Description: This method applies to containers that expose HTTP or HTTPS services. The cluster periodically sends an HTTP or HTTPS GET request to the target container's endpoint. If the response status code is within the range 200–399, the probe is considered successful. If the response status code is outside this range, the probe fails, and the kubelet stops and restarts the container.
Specific Parameter:
- Path: the HTTP or HTTPS path to check, as defined by the container image. It must be an absolute path starting with a slash (/).
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
- Host Address: the target host IP address for the request. If unspecified, it defaults to the pod IP address.
- Protocol: the protocol used for the request. It must match the protocol exposed by the container service.
- Request Header: the HTTP or HTTPS header to include in the request, specified as a key-value pair.
Common Parameter:
- Period (periodSeconds): the interval between probe checks, in seconds. For example, if this parameter is set to 30, the health check is performed every 30 seconds.
- Delay (initialDelaySeconds): the grace period for container startup before health checks begin, in seconds. For example, if this parameter is set to 30, the health check starts 30 seconds after the container starts.
- Timeout (timeoutSeconds): the maximum time to wait for a probe response, in seconds. If it is set to 0 or left blank, the default value (1) is used. For example, if this parameter is set to 10, the probe fails if the response takes longer than 10 seconds.
- Success Threshold (successThreshold): the minimum number of consecutive successful probes required to mark a container healthy after a failure. The default value is 1, which is also the minimum value. This parameter must be set to 1 for liveness and startup probes. For example, if this parameter is set to 1, the container recovers after one successful probe following a failure.
- Failure Threshold (failureThreshold): the number of consecutive probe failures required to mark a container unhealthy. The default value is 3, and the minimum value is 1. For liveness probes, if the number of consecutive failures reaches this threshold, the container is marked unhealthy, and the kubelet restarts the container. For readiness probes, if the number of consecutive failures reaches this threshold, the pod is marked unready and removed from Service endpoints. In this case, the container receives no new traffic but is not restarted.
Check Method: TCP (tcpSocket)
Description: This method applies to containers that expose TCP services (such as databases, caches, and custom TCP applications). The cluster periodically attempts to set up a TCP connection with the target container. If the connection is established, the probe succeeds. Otherwise, the probe fails, and the kubelet stops and restarts the container.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
Check Method: Command (exec)
Description: Specify an executable command for the cluster to run periodically inside the container. If the command exits with code 0, the probe is successful. Otherwise, the probe fails, and the kubelet stops and restarts the container.
CAUTION: Avoid command-based probes in high-load environments because they consume system resources. If system resources are insufficient, such as high CPU usage or a locked filesystem, probes may time out and fail. If you must use them, follow these guidelines:
- Increase the failure threshold and timeout to prevent transient resource spikes from triggering probe failures. However, this reduces detection sensitivity for actual unhealthy states, so tune conservatively.
- Set proper CPU limits on service containers and system add-ons. Otherwise, time-slice starvation can keep kernel locks held and block exec probes across all pods on the node.
Specific Parameter:
- Command: the command executed inside the container to check its status. Enter multiple commands on separate lines.
NOTE:
- Before using this method, you must package the required programs and tools in the container image. The cluster executes commands directly in the container. Host filesystems and other containers' filesystems are inaccessible. If the dependent programs or tools (such as curl, nc, or custom scripts) are not included in the image, a "Command not found" error is reported.
- If a shell script is executed, you must specify a script interpreter. The cluster does not provide an interactive terminal, so you cannot execute scripts directly. You must use the interpreter to invoke the script. For example, if the script is located in /data/scripts/health_check.sh, you need to execute sh /data/scripts/health_check.sh.
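Check Method: gRPC Check (grpc)
Description: This method applies to gRPC applications that implement the standard gRPC health checking API. The health check calls this API on the specified port, so no HTTP port or external script is required.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.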
- Configure other parameters and click Create Workload in the lower right corner. If the workload is in the Running state, the health check is successful.
- Use kubectl to access the cluster. For details, see Accessing a Cluster Using kubectl.
- Create a YAML file for configuring a workload. In this example, the file name is health_check.yaml. You can change it as needed.
vim health_check.yaml
The following uses an HTTP request as an example. For details about other health check methods, see Configuring Liveness, Readiness and Startup Probes. File content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
        - name: container-1
          image: nginx:latest
          livenessProbe:            # The liveness probe
            httpGet:                # HTTP requests are used to check container health.
              path: /               # The HTTP health check path
              port: 80              # The health check port is 80.
              host: ''              # The host address, which defaults to the pod IP address
              scheme: HTTP          # The health check protocol
              httpHeaders:          # (Optional) The request header name is Custom-Header, and the value is Awesome.
                - name: Custom-Header
                  value: Awesome
            initialDelaySeconds: 3  # The grace period for container startup before health checks begin, in seconds
            timeoutSeconds: 1       # The probe timeout, in seconds
            periodSeconds: 3        # The probe check period, in seconds
            successThreshold: 1     # The minimum consecutive successful probes required to mark a container healthy after a failure
            failureThreshold: 3     # The minimum consecutive probe failures before marking a container unhealthy
      imagePullSecrets:
        - name: default-secret
- Create the workload.
kubectl create -f health_check.yaml
If information similar to the following is displayed, the workload is being created:
deployment.apps/nginx created
- Check the workload pod.
kubectl get pod
If the pod status is Running, the workload has been created.
NAME                     READY   STATUS    RESTARTS   AGE
nginx-58cdd4f48d-jcsqn   1/1     Running   0          4m19s
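The Command (exec) check method from Table 1 could look like the following in a manifest. This is a sketch under stated assumptions: the pod name, the busybox image, and the generated script are hypothetical and exist only to make the example self-contained. The script path /data/scripts/health_check.sh mirrors the path used in the note on script interpreters. A real image would normally ship its own health check script.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec           # Hypothetical name, for illustration only
spec:
  containers:
  - name: container-1
    image: busybox:latest       # Assumption: any image that provides sh would work
    command: ["sh", "-c"]
    args:
    - |
      # Create a trivial health check script so that the example is self-contained.
      mkdir -p /data/scripts
      printf '#!/bin/sh\ntest -f /tmp/healthy\n' > /data/scripts/health_check.sh
      touch /tmp/healthy
      sleep 3600
    livenessProbe:
      exec:
        command:                # The interpreter (sh) is named explicitly, followed by the script path.
        - sh
        - /data/scripts/health_check.sh
      initialDelaySeconds: 5    # Assumed value: gives the container time to create the script
      periodSeconds: 5          # Assumed value
      timeoutSeconds: 1
      failureThreshold: 3
```

The script exits with code 0 while /tmp/healthy exists. Deleting that file inside the container makes the probe fail, and after three consecutive failures the container is restarted.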
Configuring Readiness Probes
A readiness probe determines whether a pod's containers are ready to receive traffic. A pod joins Service endpoints to receive traffic only after all its containers are ready.
- Log in to the CCE console.
- When creating a workload, select Health Check in Container Information.
- Enable the readiness probe.
Figure 2 Readiness probe status setting
Table 2 Probe parameters
Columns: Check Method, Description, Specific Parameter, and Common Parameter. The common parameters apply to every check method.
Check Method: HTTP (httpGet)
Description: This method applies to containers that expose HTTP or HTTPS services. The cluster periodically sends an HTTP or HTTPS GET request to the target container's endpoint. If the response status code is within the range 200–399, the probe is considered successful. If the response status code is outside this range, the probe fails. For a readiness probe, the pod is then marked unready and removed from Service endpoints, but the container is not restarted.
Specific Parameter:
- Path: the HTTP or HTTPS path to check, as defined by the container image. It must be an absolute path starting with a slash (/).
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
- Host Address: the target host IP address for the request. If unspecified, it defaults to the pod IP address.
- Protocol: the protocol used for the request. It must match the protocol exposed by the container service.
- Request Header: the HTTP or HTTPS header to include in the request, specified as a key-value pair.
Common Parameter:
- Period (periodSeconds): the interval between probe checks, in seconds. For example, if this parameter is set to 30, the health check is performed every 30 seconds.
- Delay (initialDelaySeconds): the grace period for container startup before health checks begin, in seconds. For example, if this parameter is set to 30, the health check starts 30 seconds after the container starts.
- Timeout (timeoutSeconds): the maximum time to wait for a probe response, in seconds. If it is set to 0 or left blank, the default value (1) is used. For example, if this parameter is set to 10, the probe fails if the response takes longer than 10 seconds.
- Success Threshold (successThreshold): the minimum number of consecutive successful probes required to mark a container healthy after a failure. The default value is 1, which is also the minimum value. This parameter must be set to 1 for liveness and startup probes. For example, if this parameter is set to 1, the container recovers after one successful probe following a failure.
- Failure Threshold (failureThreshold): the number of consecutive probe failures required to mark a container unhealthy. The default value is 3, and the minimum value is 1. For liveness probes, if the number of consecutive failures reaches this threshold, the container is marked unhealthy, and the kubelet restarts the container. For readiness probes, if the number of consecutive failures reaches this threshold, the pod is marked unready and removed from Service endpoints. In this case, the container receives no new traffic but is not restarted.
Check Method: TCP (tcpSocket)
Description: This method applies to containers that expose TCP services (such as databases, caches, and custom TCP applications). The cluster periodically attempts to set up a TCP connection with the target container. If the connection is established, the probe succeeds. Otherwise, the probe fails. For a readiness probe, the pod is then marked unready and removed from Service endpoints, but the container is not restarted.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
Check Method: Command (exec)
Description: Specify an executable command for the cluster to run periodically inside the container. If the command exits with code 0, the probe is successful. Otherwise, the probe fails. For a readiness probe, the pod is then marked unready and removed from Service endpoints, but the container is not restarted.
CAUTION: Avoid command-based probes in high-load environments because they consume system resources. If system resources are insufficient, such as high CPU usage or a locked filesystem, probes may time out and fail. If you must use them, follow these guidelines:
- Increase the failure threshold and timeout to prevent transient resource spikes from triggering probe failures. However, this reduces detection sensitivity for actual unhealthy states, so tune conservatively.
- Set proper CPU limits on service containers and system add-ons. Otherwise, time-slice starvation can keep kernel locks held and block exec probes across all pods on the node.
Specific Parameter:
- Command: the command executed inside the container to check its status. Enter multiple commands on separate lines.
NOTE:
- Before using this method, you must package the required programs and tools in the container image. The cluster executes commands directly in the container. Host filesystems and other containers' filesystems are inaccessible. If the dependent programs or tools (such as curl, nc, or custom scripts) are not included in the image, a "Command not found" error is reported.
- If a shell script is executed, you must specify a script interpreter. The cluster does not provide an interactive terminal, so you cannot execute scripts directly. You must use the interpreter to invoke the script. For example, if the script is located in /data/scripts/health_check.sh, you need to execute sh /data/scripts/health_check.sh.
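Check Method: gRPC Check (grpc)
Description: This method applies to gRPC applications that implement the standard gRPC health checking API. The health check calls this API on the specified port, so no HTTP port or external script is required.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.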
- Configure other parameters and click Create Workload in the lower right corner. If the workload is in the Running state, the health check is successful.
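For reference, the gRPC check method in Table 2 corresponds to the grpc field of a probe. The following sketch assumes a cluster version that supports gRPC probes and a server that implements the gRPC health checking API. The pod name, the etcd image and tag, and the timing values are assumptions chosen for illustration; etcd is used only because it serves the health checking API on its client port.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-grpc          # Hypothetical name, for illustration only
spec:
  containers:
  - name: etcd
    image: registry.k8s.io/etcd:3.5.1-0   # Assumption: replace with your own gRPC server image
    command:
    - /usr/local/bin/etcd
    - --data-dir=/var/lib/etcd
    - --listen-client-urls=http://0.0.0.0:2379
    - --advertise-client-urls=http://127.0.0.1:2379
    ports:
    - containerPort: 2379
    readinessProbe:
      grpc:
        port: 2379              # The port that serves the gRPC health checking API
      initialDelaySeconds: 10   # Assumed value
      periodSeconds: 5          # Assumed value
      failureThreshold: 3
```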
- Use kubectl to access the cluster. For details, see Accessing a Cluster Using kubectl.
- Create a YAML file for configuring a workload. In this example, the file name is health_check.yaml. You can change it as needed.
vim health_check.yaml
The following uses an HTTP request as an example. For details about other health check methods, see Configuring Liveness, Readiness and Startup Probes. File content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
        - name: container-1
          image: nginx:latest
          readinessProbe:           # The readiness probe
            httpGet:                # HTTP requests are used to check container health.
              path: /               # The HTTP health check path
              port: 80              # The health check port is 80.
              host: ''              # The host address, which defaults to the pod IP address
              scheme: HTTP          # The health check protocol
              httpHeaders:          # (Optional) The request header name is Custom-Header, and the value is Awesome.
                - name: Custom-Header
                  value: Awesome
            initialDelaySeconds: 3  # The grace period for container startup before health checks begin, in seconds
            timeoutSeconds: 1       # The probe timeout, in seconds
            periodSeconds: 3        # The probe check period, in seconds
            successThreshold: 1     # The minimum consecutive successful probes required to mark a container healthy after a failure
            failureThreshold: 3     # The minimum consecutive probe failures before marking a container unhealthy
      imagePullSecrets:
        - name: default-secret
- Create the workload.
kubectl create -f health_check.yaml
If information similar to the following is displayed, the workload is being created:
deployment.apps/nginx created
- Check the workload pod.
kubectl get pod
If the pod status is Running and the READY column shows 1/1, the workload has been created and the readiness probe has succeeded.
NAME                     READY   STATUS    RESTARTS   AGE
nginx-58cdd4f48d-jcsqn   1/1     Running   0          4m19s
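As an alternative to the HTTP example, a readiness probe can use the TCP (tcpSocket) check method. The sketch below is illustrative only: the pod name and the timing values are assumptions, and the nginx image is reused because it listens on port 80. It also sets successThreshold to 2, which is allowed for readiness probes but not for liveness or startup probes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-tcp           # Hypothetical name, for illustration only
spec:
  containers:
  - name: container-1
    image: nginx:latest
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80                # The probe succeeds if a TCP connection to this port can be opened.
      initialDelaySeconds: 3
      periodSeconds: 3
      timeoutSeconds: 1
      successThreshold: 2       # Assumed value: two consecutive successes are needed to become ready again
      failureThreshold: 3
```

A higher success threshold keeps a pod that has just recovered out of the Service endpoints until it has passed more than one check, at the cost of a slower return to service.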
Configuring Startup Probes
Startup probes run at container startup to verify initialization. They are used for slow-starting applications.
- Log in to the CCE console.
- When creating a workload, select Health Check in Container Information.
- Enable the startup probe.
Figure 3 Startup probe status setting
Table 3 Probe parameters
Columns: Check Method, Description, Specific Parameter, and Common Parameter. The common parameters apply to every check method.
Check Method: HTTP (httpGet)
Description: This method applies to containers that expose HTTP or HTTPS services. The cluster periodically sends an HTTP or HTTPS GET request to the target container's endpoint. If the response status code is within the range 200–399, the probe is considered successful. If the response status code is outside this range, the probe fails, and the kubelet stops and restarts the container.
Specific Parameter:
- Path: the HTTP or HTTPS path to check, as defined by the container image. It must be an absolute path starting with a slash (/).
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
- Host Address: the target host IP address for the request. If unspecified, it defaults to the pod IP address.
- Protocol: the protocol used for the request. It must match the protocol exposed by the container service.
- Request Header: the HTTP or HTTPS header to include in the request, specified as a key-value pair.
Common Parameter:
- Period (periodSeconds): the interval between probe checks, in seconds. For example, if this parameter is set to 30, the health check is performed every 30 seconds.
- Delay (initialDelaySeconds): the grace period for container startup before health checks begin, in seconds. For example, if this parameter is set to 30, the health check starts 30 seconds after the container starts.
- Timeout (timeoutSeconds): the maximum time to wait for a probe response, in seconds. If it is set to 0 or left blank, the default value (1) is used. For example, if this parameter is set to 10, the probe fails if the response takes longer than 10 seconds.
- Success Threshold (successThreshold): the minimum number of consecutive successful probes required to mark a container healthy after a failure. The default value is 1, which is also the minimum value. This parameter must be set to 1 for liveness and startup probes. For example, if this parameter is set to 1, the container recovers after one successful probe following a failure.
- Failure Threshold (failureThreshold): the number of consecutive probe failures required to mark a container unhealthy. The default value is 3, and the minimum value is 1. For liveness probes, if the number of consecutive failures reaches this threshold, the container is marked unhealthy, and the kubelet restarts the container. For readiness probes, if the number of consecutive failures reaches this threshold, the pod is marked unready and removed from Service endpoints. In this case, the container receives no new traffic but is not restarted. For startup probes, if the number of consecutive failures reaches this threshold, the container is considered to have failed to start, and the kubelet restarts it.
Check Method: TCP (tcpSocket)
Description: This method applies to containers that expose TCP services (such as databases, caches, and custom TCP applications). The cluster periodically attempts to set up a TCP connection with the target container. If the connection is established, the probe succeeds. Otherwise, the probe fails, and the kubelet stops and restarts the container.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.
Check Method: Command (exec)
Description: Specify an executable command for the cluster to run periodically inside the container. If the command exits with code 0, the probe is successful. Otherwise, the probe fails, and the kubelet stops and restarts the container.
CAUTION: Avoid command-based probes in high-load environments because they consume system resources. If system resources are insufficient, such as high CPU usage or a locked filesystem, probes may time out and fail. If you must use them, follow these guidelines:
- Increase the failure threshold and timeout to prevent transient resource spikes from triggering probe failures. However, this reduces detection sensitivity for actual unhealthy states, so tune conservatively.
- Set proper CPU limits on service containers and system add-ons. Otherwise, time-slice starvation can keep kernel locks held and block exec probes across all pods on the node.
Specific Parameter:
- Command: the command executed inside the container to check its status. Enter multiple commands on separate lines.
NOTE:
- Before using this method, you must package the required programs and tools in the container image. The cluster executes commands directly in the container. Host filesystems and other containers' filesystems are inaccessible. If the dependent programs or tools (such as curl, nc, or custom scripts) are not included in the image, a "Command not found" error is reported.
- If a shell script is executed, you must specify a script interpreter. The cluster does not provide an interactive terminal, so you cannot execute scripts directly. You must use the interpreter to invoke the script. For example, if the script is located in /data/scripts/health_check.sh, you need to execute sh /data/scripts/health_check.sh.
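Check Method: gRPC Check (grpc)
Description: This method applies to gRPC applications that implement the standard gRPC health checking API. The health check calls this API on the specified port, so no HTTP port or external script is required.
Specific Parameter:
- Port: the container port exposed for health checks. Valid range: 1 to 65535.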
- Configure other parameters and click Create Workload in the lower right corner. If the workload is in the Running state, the health check is successful.
- Use kubectl to access the cluster. For details, see Accessing a Cluster Using kubectl.
- Create a YAML file for configuring a workload. In this example, the file name is health_check.yaml. You can change it as needed.
vim health_check.yaml
The following uses an HTTP request as an example. For details about other health check methods, see Configuring Liveness, Readiness and Startup Probes. File content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
        - name: container-1
          image: nginx:latest
          startupProbe:             # The startup probe
            httpGet:                # HTTP requests are used to check container health.
              path: /               # The HTTP health check path
              port: 80              # The health check port is 80.
              host: ''              # The host address, which defaults to the pod IP address
              scheme: HTTP          # The health check protocol
              httpHeaders:          # (Optional) The request header name is Custom-Header, and the value is Awesome.
                - name: Custom-Header
                  value: Awesome
            initialDelaySeconds: 3  # The grace period for container startup before health checks begin, in seconds
            timeoutSeconds: 1       # The probe timeout, in seconds
            periodSeconds: 3        # The probe check period, in seconds
            successThreshold: 1     # The minimum consecutive successful probes required to mark a container healthy after a failure
            failureThreshold: 3     # The minimum consecutive probe failures before marking a container unhealthy
      imagePullSecrets:
        - name: default-secret
- Create the workload.
kubectl create -f health_check.yaml
If information similar to the following is displayed, the workload is being created:
deployment.apps/nginx created
- Check the workload pod.
kubectl get pod
If the pod status is Running, the workload has been created.
NAME                     READY   STATUS    RESTARTS   AGE
nginx-58cdd4f48d-jcsqn   1/1     Running   0          4m19s
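The time that a startup probe allows for initialization is roughly failureThreshold multiplied by periodSeconds, counted after the initial delay. The example above, with failureThreshold set to 3 and periodSeconds set to 3, therefore gives the application only about 9 seconds, which suits nginx but not a slow-starting application. The following sketch shows a larger budget. The pod name and all timing values are assumptions chosen for illustration, not recommended settings.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: startup-budget          # Hypothetical name, for illustration only
spec:
  containers:
  - name: container-1
    image: nginx:latest
    ports:
    - containerPort: 80
    startupProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 10         # Assumed value
      failureThreshold: 30      # Assumed value: about 30 x 10 s = 300 s allowed for startup
    livenessProbe:              # Starts only after the startup probe has succeeded
      httpGet:
        path: /
        port: 80
      periodSeconds: 10         # Assumed value
      failureThreshold: 3       # A hung container is restarted after about 3 x 10 s = 30 s
```

Separating the two probes lets the liveness probe keep a short failure window for detecting hangs without shortening the time available for startup.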
Helpful Links
- Learn more about workload parameters. For details, see Creating a Workload.
- After a workload is created, you can upgrade it, edit its YAML file, and view its logs. For details, see Managing Workloads.
- If a workload fails to be created, rectify the fault by referring to Workload Exception Troubleshooting.