Creating an NPU-accelerated Application
Prerequisites
- If you want to create an NPU-accelerated application by running commands, use kubectl to connect to the cluster. For details, see Connecting to a Cluster Using kubectl.
Constraints
- An NPU can be shared by multiple containers.
Creating an NPU-accelerated Application on the Console
The following uses a Deployment as an example to describe how to create an NPU-accelerated application on the console.
- Log in to the UCS console, choose Fleets, and click the cluster name to access the cluster console.
- In the navigation pane, choose Workloads. On the displayed page, click the Deployments tab. In the upper right corner, click Create from Image.
- Configure the workload parameters. In Basic Info under Container Settings, select NPU for Heterogeneous Resource and set NPU quota.
- Configure other parameters and click Create Workload. You can view the Deployment status in the Deployment list.
If the Deployment is in the Running state, the Deployment is successfully created.
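Outside the console, a comparable readiness check can be scripted against the Deployment's status fields. The sketch below is illustrative only: it assumes kubectl is already connected to the cluster, the helper names deployment_is_available and fetch_deployment are hypothetical, and treating "all desired replicas updated and available" as equivalent to the console's Running state is an assumption rather than something this page states.

```python
import json
import subprocess


def deployment_is_available(deployment: dict) -> bool:
    """Return True when every desired replica is updated and available."""
    # Kubernetes defaults spec.replicas to 1 when it is omitted.
    desired = deployment.get("spec", {}).get("replicas", 1)
    status = deployment.get("status", {})
    return (
        status.get("updatedReplicas", 0) == desired
        and status.get("availableReplicas", 0) == desired
    )


def fetch_deployment(name: str, namespace: str = "default") -> dict:
    """Read a Deployment as JSON through kubectl (needs a configured kubectl)."""
    result = subprocess.run(
        ["kubectl", "get", "deployment", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Hypothetical status snippet, shaped like the output of `kubectl get deployment -o json`.
sample = {
    "spec": {"replicas": 2},
    "status": {"updatedReplicas": 2, "availableReplicas": 2},
}
print(deployment_is_available(sample))  # prints True
```

Against a live cluster, the two helpers would be combined as deployment_is_available(fetch_deployment("<deployment_name>")), with the placeholder replaced by the actual Deployment name.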
Creating an NPU-accelerated Application Using kubectl
The following uses a Deployment as an example to describe how to create an NPU-accelerated application using kubectl.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    description: ''
  labels:
    appgroup: ''
    version: v1
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
      version: v1
  template:
    metadata:
      labels:
        app: demo
        version: v1
    spec:
      containers:
        - name: container-1
          image: <your_image_address>  # Replace it with the actual image address.
          imagePullPolicy: IfNotPresent
          env:
            - name: PAAS_APP_NAME
              value: demo
            - name: PAAS_NAMESPACE
              value: default
            - name: PAAS_PROJECT_ID
              value: 0aa612a71f80d4322fe0c010beb80e8a
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'  # The number of NPUs to be used
            limits:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'  # The number of NPUs to be used
      terminationGracePeriodSeconds: 30
      schedulerName: volcano  # Volcano is specified as the scheduler.
      tolerations:
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
      initContainers: []
      volumes: []
  replicas: 2
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  progressDeadlineSeconds: 600
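Before applying a manifest like the one above, it can help to sanity-check the NPU entries. The following is a minimal sketch, not part of the product tooling: the function name npu_resource_errors is hypothetical, and the checks rely on the general Kubernetes rules for extended resources (whole-number quantities, a limit must be present, and the request must equal the limit when both are set). The resource name huawei.com/ascend-1980 is taken from the manifest.

```python
from typing import List

NPU_RESOURCE = "huawei.com/ascend-1980"  # resource name used in the manifest above


def npu_resource_errors(container: dict, resource: str = NPU_RESOURCE) -> List[str]:
    """Return a list of problems with a container's NPU request/limit (empty if none)."""
    resources = container.get("resources", {})
    request = resources.get("requests", {}).get(resource)
    limit = resources.get("limits", {}).get(resource)
    errors = []

    # An extended resource cannot be requested without a limit.
    if limit is None:
        errors.append(f"{resource} is missing from limits")

    # Extended resources are counted in whole units; values such as "0.5" are invalid.
    for field, value in (("requests", request), ("limits", limit)):
        if value is not None and not str(value).isdigit():
            errors.append(f"{field}.{resource} must be a non-negative integer, got {value!r}")

    # Extended resources cannot be overcommitted, so request and limit must match.
    if request is not None and limit is not None and str(request) != str(limit):
        errors.append(f"{resource} request ({request}) must equal limit ({limit})")

    return errors


# The container section of the manifest above, written as a Python dict.
container = {
    "name": "container-1",
    "resources": {
        "requests": {"cpu": "250m", "memory": "512Mi", NPU_RESOURCE: "1"},
        "limits": {"cpu": "250m", "memory": "512Mi", NPU_RESOURCE: "1"},
    },
}
print(npu_resource_errors(container))  # prints []
```

An empty list means the NPU request and limit are consistent. It does not confirm that the cluster has a node with enough free NPUs to schedule the pod.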