
What Should I Do If a Workload Exception Occurs Due to a Storage Volume Mount Failure?

Symptom

A workload remains in the creating state, and an alarm indicating that a storage volume failed to be mounted is reported. The event is as follows:

AttachVolume.Attach failed for volume "pvc-***" : rpc error: code = Internal desc = [***][disk.csi.everest.io] attaching volume *** to node *** failed: failed to send request of attaching disk(id=***) to node(id=***): error statuscode 400 for posting request, response is {"badRequest": {"message": "Maximum number of scsi disk exceeded", "code": 400}}, request is {"volumeAttachment":{"volumeId":"***","device":"","id":"","serverId":"","bus":"","pciAddress":"","VolumeWwn":"","VolumeMultiAttach":false,"VolumeMetadata":null}}, url is: ......

Possible Causes

This alarm indicates that the number of EVS disks attached to a node has reached the limit. In this case, if a workload pod with an EVS disk attached is scheduled to this node, the disk attachment will fail. As a result, the workload cannot run properly.

For example, if a maximum of 20 EVS disks can be attached to a node and the node already has one system disk and one data disk attached, up to 18 additional EVS disks can be attached to it. If two more raw disks are attached to the node through the ECS console to create a local storage pool, only 16 additional EVS data disks can be attached. In this case, if 18 workload pods, each with one EVS disk attached, are scheduled to the node, two of those pods will encounter the preceding error because the disk attachment limit is exceeded.
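The following is a minimal sketch of this calculation (the 20-disk limit and the attached disks are taken from the example above; the function name is illustrative only):

```python
# Illustrative calculation of how many more EVS disks a node can accept.
def remaining_attach_slots(max_disks_per_node: int, attached_disks: int) -> int:
    """Return the number of additional EVS disks that can still be attached."""
    return max(max_disks_per_node - attached_disks, 0)

MAX_DISKS_PER_NODE = 20  # assumed per-node attachment limit from the example

print(remaining_attach_slots(MAX_DISKS_PER_NODE, 2))  # 18: system disk + data disk attached
print(remaining_attach_slots(MAX_DISKS_PER_NODE, 4))  # 16: two more raw disks for a local storage pool

# Scheduling 18 pods that each require one EVS disk onto this node leaves
# 18 - 16 = 2 pods whose disks cannot be attached, producing the error above.
```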

Solution

CCE Container Storage (Everest) 2.3.11 or later supports the number_of_reserved_disks parameter, which specifies the number of disks reserved on a node. By configuring this parameter, you can reserve disk attachment slots for your nodes. Note that any modification of this parameter applies to all nodes in the cluster.

After number_of_reserved_disks is configured, the number of additional EVS disks that can be attached to a node is calculated as follows:

Number of additional EVS disks that can be attached to a node = Maximum number of EVS disks that can be attached to the node - Value of number_of_reserved_disks

If the maximum number of EVS disks that can be attached to a node is 20 and number_of_reserved_disks is set to 6, the number of additional EVS disks that can be attached to the node is 14 (20 – 6 = 14) when workloads with EVS disks attached need to be scheduled. The six reserved slots cover the system disk and the data disk that are already attached to the node, leaving room for four more EVS disks that you can attach as additional data disks or as raw disks for a local storage pool. In this scenario, if 18 workload pods, each with one EVS disk attached, need to be scheduled in the cluster, the node can accept at most 14 of them. The remaining four workload pods will be scheduled to other nodes in the cluster. In this way, the storage volume mount failure does not occur.
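The following sketch applies the formula above to this example (the 20-disk limit, the value 6 for number_of_reserved_disks, and the 18 pods each requiring one EVS disk come from the scenario described; the function name is illustrative only):

```python
# Illustrative application of the formula:
#   additional EVS disks = maximum disks per node - number_of_reserved_disks
def schedulable_pods_on_node(max_disks_per_node: int,
                             number_of_reserved_disks: int,
                             pods_needing_one_disk: int) -> tuple[int, int]:
    """Return (pods this node can accept, pods that must be scheduled elsewhere)."""
    additional_disks = max_disks_per_node - number_of_reserved_disks
    accepted = min(pods_needing_one_disk, additional_disks)
    return accepted, pods_needing_one_disk - accepted

accepted, overflow = schedulable_pods_on_node(20, 6, 18)
print(accepted)  # 14: pods this node can accept
print(overflow)  # 4: pods scheduled to other nodes in the cluster
```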