Creating an NPU-accelerated Application
Prerequisites
- If you want to create the application by running commands, use kubectl to connect to the cluster first. For details, see Connecting to a Cluster Using kubectl.
Constraints
- An NPU can be shared by multiple containers.
Creating an NPU-accelerated Application on the Console
The following uses a Deployment as an example to describe how to create an NPU-accelerated application on the console.
- Log in to the UCS console, choose Fleets, and click the cluster name to access the cluster console.
- In the navigation pane, choose Workloads. On the displayed page, click the Deployments tab. In the upper right corner, click Create from Image.
- Configure the workload parameters. In Basic Info under Container Settings, select NPU for Heterogeneous Resource and set NPU quota.
- Configure other parameters and click Create Workload. You can view the Deployment status in the Deployment list.
If the Deployment is in the Running state, the Deployment is successfully created.
Creating an NPU-accelerated Application Using kubectl
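The following uses a Deployment as an example to describe how to create an NPU-accelerated application using kubectl. In the manifest, each pod requests one NPU through the huawei.com/ascend-1980 resource, and Volcano is specified as the scheduler.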
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    description: ''
  labels:
    appgroup: ''
    version: v1
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
      version: v1
  template:
    metadata:
      labels:
        app: demo
        version: v1
    spec:
      containers:
        - name: container-1
          image: swr.cn-north-7.myhuaweicloud.com/ief-ies/demo:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: PAAS_APP_NAME
              value: demo
            - name: PAAS_NAMESPACE
              value: default
            - name: PAAS_PROJECT_ID
              value: 0aa612a71f80d4322fe0c010beb80e8a
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'   ## The number of NPUs to be used
            limits:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'   ## The number of NPUs to be used
      terminationGracePeriodSeconds: 30
      schedulerName: volcano   ## Volcano is specified as the scheduler.
      tolerations:
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
      initContainers: []
      volumes: []
  replicas: 2
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  progressDeadlineSeconds: 600
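Once the manifest above is saved to a file, it can be applied and checked with standard kubectl commands. The following is a minimal sketch, not an official procedure. It assumes the manifest is stored as npu-demo.yaml (a hypothetical file name) and that kubectl is already connected to the cluster. The helper function name deploy_npu_demo is also illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical file name; change it to wherever you saved the manifest.
MANIFEST="${MANIFEST:-npu-demo.yaml}"

deploy_npu_demo() {
  # List the allocatable NPUs per node (resource name taken from the manifest).
  kubectl get nodes \
    -o "custom-columns=NAME:.metadata.name,NPU:.status.allocatable.huawei\.com/ascend-1980"

  # Create or update the Deployment.
  kubectl apply -f "$MANIFEST"

  # Wait for the rollout; 600s mirrors progressDeadlineSeconds in the manifest.
  kubectl rollout status deployment/demo -n default --timeout=600s

  # Show the pods created by the Deployment and the nodes they run on.
  kubectl get pods -n default -l app=demo -o wide
}
```

Source the script and run deploy_npu_demo. If the pods stay in the Pending state, one possible cause is that no node has enough free huawei.com/ascend-1980 resources. Running kubectl describe pod on a pending pod shows the scheduling events.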