Liveness Probe
Overview
Kubernetes provides a self-healing capability: it can detect that a container has crashed and restart it. However, some failures are not crashes. For example, a Java program may leak memory until it can no longer do useful work, while its JVM process keeps running. For such cases, Kubernetes provides the liveness probe mechanism, which checks whether the container responds normally and uses the result to decide whether to restart the container.
A liveness probe should be defined for each pod. Otherwise, Kubernetes cannot detect whether the pod is running properly.
CCI supports the following detection mechanisms:
- HTTP GET: An HTTP GET request is sent to the container. If the probe receives 2xx or 3xx, the container is healthy.
NOTE:
For timeoutSeconds to take effect, you need to configure the following annotation for the pod:
cci.io/httpget-probe-timeout-enable: "true"
For details, see the example in Advanced Configuration of Liveness Probe.
- Exec: The probe runs a command in the container and checks the exit status code. If the exit status code is 0, the probe is healthy.
HTTP GET
HTTP GET is the most common detection method. The mechanism is to send an HTTP GET request to the container. If the probe receives 2xx or 3xx, the container is healthy. The method is defined as follows:
apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:       # liveness probe
      httpGet:           # HTTP GET definition
        path: /healthz
        port: 8080
Create a pod.
$ kubectl create -f liveness-http.yaml -n $namespace_name
pod/liveness-http created
As shown above, the probe sends an HTTP GET request to port 8080 of the container. The sample program returns status code 500 beginning with the fifth request, so Kubernetes restarts the container.
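To make the restart behavior concrete, the sketch below is a hypothetical Python stand-in for the sample image (not the actual k8s.gcr.io/liveness source): it answers /healthz with 200 for the first four checks and 500 afterwards, as if the application had hung.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthzHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in: healthy at first, then failing."""
    request_count = 0  # class attribute, shared across requests

    def do_GET(self):
        cls = type(self)
        cls.request_count += 1
        # Healthy for the first four checks, then 500 from the fifth on.
        status = 200 if cls.request_count < 5 else 500
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):  # keep the demo quiet
        pass

def probe(url):
    """Return the HTTP status code, like an httpGet liveness check."""
    try:
        return urllib.request.urlopen(url).status
    except urllib.error.HTTPError as exc:
        return exc.code

server = HTTPServer(("127.0.0.1", 0), HealthzHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/healthz" % server.server_address[1]
codes = [probe(url) for _ in range(6)]
server.shutdown()
print(codes)  # [200, 200, 200, 200, 500, 500]
```

Once three consecutive checks return 500 (the default failureThreshold), the kubelet kills and recreates the container, which is exactly what the pod events below show.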
View pod details.
$ kubectl describe po liveness-http -n $namespace_name
Name:          liveness-http
......
Containers:
  container-0:
    ......
    State:          Running
      Started:      Mon, 12 Nov 2018 22:57:28 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Mon, 12 Nov 2018 22:55:40 +0800
      Finished:     Mon, 12 Nov 2018 22:57:27 +0800
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:8080/ delay=0s timeout=1s period=10s #success=1 #failure=3
    ......
Events:
  Type    Reason     Age                 From               Message
  ----    ------     ----                ----               -------
  Normal  Scheduled  3m5s                default-scheduler  Successfully assigned default/pod-liveness to node2
  Normal  Pulling    74s (x2 over 3m4s)  kubelet, node2     pulling image "pod-liveness"
  Normal  Killing    74s                 kubelet, node2     Killing container with id docker://container-0:Container failed liveness probe.. Container will be killed and recreated.
As shown, the pod is in the Running state, its Last State is Terminated with exit code 137, and the Restart Count is 1, indicating that the pod has been restarted once. The event message "Killing container with id docker://container-0: Container failed liveness probe. Container will be killed and recreated." confirms that the restart was triggered by the failed liveness probe.
After the container is killed, a new container is created.
Exec
The Exec probe runs a specific command in the container and checks the command's exit status code. If the exit status code is 0, the container is considered healthy. The method is defined as follows:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:       # liveness probe
      exec:              # Exec definition
        command:
        - cat
        - /tmp/healthy
The probe runs cat /tmp/healthy in the container. In this example, /tmp/healthy exists for the first 30 seconds, so the command exits with 0 and the container is considered healthy. After the file is deleted, the command fails, and the container is restarted.
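The exit-code logic the Exec probe relies on can be simulated outside the cluster. This sketch runs the same cat command against a temporary file and inspects the exit code, the way the probe does:

```python
import os
import subprocess
import tempfile

# Simulate what the Exec probe does: run the probe command and
# inspect its exit code (0 means healthy).
path = os.path.join(tempfile.mkdtemp(), "healthy")
open(path, "w").close()                                   # file exists, as in the first 30s
healthy = subprocess.run(["cat", path], capture_output=True).returncode
print(healthy)   # 0 -> probe succeeds

os.remove(path)                                           # file removed, as after 30s
broken = subprocess.run(["cat", path], capture_output=True).returncode
print(broken)    # non-zero -> probe fails, container is restarted
```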
Advanced Configuration of Liveness Probe
In the output of the $ kubectl describe po liveness-http command, the following line is displayed:
Liveness: http-get http://:8080/ delay=0s timeout=1s period=10s #success=1 #failure=3
This line indicates the parameter configuration of the liveness probe. The meanings of the parameters are as follows:
- delay=0s indicates that the probe starts immediately after the container is started.
- timeout=1s indicates that the container must respond to the probe within 1s. Otherwise, the detection fails.
- period=10s indicates that the detection is performed every 10s.
- #success=1 indicates that one successful check marks the container as healthy.
- #failure=3 indicates that the container is restarted after three consecutive failed checks.
These are set by default when the probe is created. You can also manually configure the parameters as follows:
apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
  annotations:
    cci.io/httpget-probe-timeout-enable: "true"   # required for timeoutSeconds to take effect
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        path: /
        port: 8080
      initialDelaySeconds: 10   # wait 10s after the container starts before the first check
      timeoutSeconds: 2         # the container must respond within 2s, or the check fails
      periodSeconds: 30         # a check runs every 30s
      successThreshold: 1       # one successful check marks the container as healthy
      failureThreshold: 3       # three consecutive failures restart the container
Generally, initialDelaySeconds should be greater than 0. Even after a container starts successfully, the application inside usually needs some time before it is ready to respond, and probing too early causes spurious failures.
In addition, set failureThreshold greater than 1 so that a single transient failure does not immediately restart the container.
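Together, these parameters bound how long a dead container can keep running before it is restarted. The helper below is a rough back-of-envelope calculation (a hypothetical function, not part of any Kubernetes API), assuming every failed check uses its full timeout:

```python
def max_time_to_restart(initial_delay, period, timeout, failure_threshold):
    """Rough worst-case bound, in seconds, on how long an unresponsive
    container can keep running before the liveness probe restarts it
    (every check is assumed to use its full timeout)."""
    return initial_delay + failure_threshold * (period + timeout)

# With the values from the manifest above: 10 + 3 * (30 + 2) = 106 seconds.
print(max_time_to_restart(10, 30, 2, 3))     # 106
# With the defaults (delay=0s, period=10s, timeout=1s, #failure=3): 33 seconds.
print(max_time_to_restart(0, 10, 1, 3))      # 33
```

When tuning the parameters, check that this bound is acceptable for your service: a longer period or higher failure threshold tolerates more transient failures but also delays recovery.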
Configuring an Effective Liveness Probe
- What should a liveness probe detect?
A liveness probe should check whether the key parts of an application are healthy, typically through a dedicated URL such as /health: when /health is accessed, the application runs its checks and returns the result. The endpoint must not require authentication; otherwise, the probe will keep failing and the container will keep being restarted.
In addition, the check should cover only the application itself, not its external dependencies. For example, if a frontend web server cannot connect to the database, the web server itself should not be considered unhealthy.
- A liveness probe must be lightweight.
A liveness probe must not consume too many resources or take too long to run; otherwise, the periodic health checks themselves waste resources. For Java applications, the HTTP GET method is recommended; if the Exec method is used, starting a JVM for each check consumes too many resources.
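The guidance above (a dedicated unauthenticated URL that aggregates only internal checks) can be sketched as a hypothetical health aggregation function; the check names (worker_pool, request_queue) are illustrative assumptions, not part of any real API:

```python
# Hypothetical internal checks; the real ones depend on the application.
def worker_pool_alive():
    return True

def request_queue_backlog_ok():
    return True

def health_status():
    """Aggregate only the application's own key parts.
    No authentication, and no external dependencies such as the database."""
    checks = {
        "worker_pool": worker_pool_alive(),
        "request_queue": request_queue_backlog_ok(),
    }
    status = 200 if all(checks.values()) else 500
    return status, checks

print(health_status())  # (200, {'worker_pool': True, 'request_queue': True})
```

A /health handler would call health_status() and return the status code, keeping the check itself cheap enough to run every probe period.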