Updated on 2024-12-18 GMT+08:00

Creating an NPU-accelerated Application

Prerequisites

Constraints

  • An NPU can be shared by multiple containers.

Creating an NPU-accelerated Application on the Console

The following uses a Deployment as an example to describe how to create an NPU-accelerated application on the console.

  1. Log in to the UCS console, choose Fleets, and click the cluster name to access the cluster console.
  2. In the navigation pane, choose Workloads. On the displayed page, click the Deployments tab. In the upper right corner, click Create from Image.
  3. Configure the workload parameters. In Basic Info under Container Settings, select NPU for Heterogeneous Resource and set the NPU quota.

  4. Configure other parameters and click Create Workload. You can view the Deployment status in the Deployment list.

    If the Deployment is in the Running state, it has been created successfully.

Creating an NPU-accelerated Application Using kubectl

The following uses a Deployment as an example to describe how to create an NPU-accelerated application using kubectl.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    description: ''
  labels:
    appgroup: ''
    version: v1
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
      version: v1
  template:
    metadata:
      labels:
        app: demo
        version: v1
    spec:
      containers:
        - name: container-1
          image: swr.cn-north-7.myhuaweicloud.com/ief-ies/demo:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: PAAS_APP_NAME
              value: demo
            - name: PAAS_NAMESPACE
              value: default
            - name: PAAS_PROJECT_ID
              value: 0aa612a71f80d4322fe0c010beb80e8a
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'          ##The number of NPUs to be used
            limits:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'          ##The number of NPUs to be used
      terminationGracePeriodSeconds: 30
      schedulerName: volcano                           ## Volcano is specified as the scheduler.
      tolerations:
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
      initContainers: []
      volumes: []
  replicas: 2
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  progressDeadlineSeconds: 600
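
The manifest above sets huawei.com/ascend-1980 to the same value under both requests and limits. Kubernetes treats such vendor-prefixed names as extended resources, which must be whole numbers and, when both a request and a limit are given, must be equal. The following is a minimal sketch of a pre-submission check for that rule, not part of UCS or kubectl. The helper name check_npu_resources is hypothetical, and the container is written as a plain Python dictionary mirroring the manifest so that no YAML library is needed.

```python
# Hypothetical helper for illustration only; it is not a UCS or kubectl API.
NPU_RESOURCE = "huawei.com/ascend-1980"  # resource name taken from the manifest above


def check_npu_resources(container, resource=NPU_RESOURCE):
    """Return a list of problems with a container's NPU request/limit settings."""
    resources = container.get("resources", {})
    request = resources.get("requests", {}).get(resource)
    limit = resources.get("limits", {}).get(resource)
    problems = []
    if request is None and limit is None:
        problems.append(f"{resource} is not set")
        return problems
    # Extended resources are counted in whole units, so reject fractions.
    for label, value in (("request", request), ("limit", limit)):
        if value is not None and not str(value).isdigit():
            problems.append(f"{label} {value!r} is not a whole number")
    # When both are present, the request and the limit must be equal.
    if request is not None and limit is not None and str(request) != str(limit):
        problems.append(f"request {request} != limit {limit}")
    return problems


# The container section of the example Deployment, as a dictionary.
container = {
    "name": "container-1",
    "resources": {
        "requests": {"cpu": "250m", "memory": "512Mi", NPU_RESOURCE: "1"},
        "limits": {"cpu": "250m", "memory": "512Mi", NPU_RESOURCE: "1"},
    },
}

print(check_npu_resources(container))  # prints []
```

An empty list means the NPU settings are consistent. The check covers only the NPU resource; it does not validate the rest of the manifest or confirm that the cluster has NPU nodes available.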