Updated: 2024-10-25 GMT+08:00

Creating an NPU Application

Constraints and Limitations

  • NPU workloads do not currently support multiple containers.

Creating an NPU Application Using kubectl

This section uses a stateless workload (Deployment) as an example to show how to create a training job with kubectl.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    description: ''
  labels:
    appgroup: ''
    version: v1
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
      version: v1
  template:
    metadata:
      labels:
        app: demo
        version: v1
    spec:
      containers:
        - name: container-1
          image: swr.cn-north-7.myhuaweicloud.com/ief-ies/demo:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: PAAS_APP_NAME
              value: demo
            - name: PAAS_NAMESPACE
              value: default
            - name: PAAS_PROJECT_ID
              value: 0aa612a71f80d4322fe0c010beb80e8a
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'          # Number of NPU cards to use
            limits:
              cpu: 250m
              memory: 512Mi
              huawei.com/ascend-1980: '1'          # Number of NPU cards to use
      terminationGracePeriodSeconds: 30
      schedulerName: volcano                           # Use volcano as the scheduler
      tolerations:
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 300
      initContainers: []
      volumes: []
  replicas: 2
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  progressDeadlineSeconds: 600
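
The manifest above can be submitted and checked with standard kubectl commands. The following is a minimal sketch, not an official procedure: the file name `npu-demo.yaml` and the helper name `deploy_npu_demo` are assumptions made for illustration, and the 600-second timeout simply mirrors `progressDeadlineSeconds` in the manifest.

```shell
# Hypothetical helper: applies the manifest, waits for the rollout,
# then lists the resulting pods. Assumes kubectl is already configured
# for the target cluster and the manifest is saved locally.
deploy_npu_demo() {
  manifest="$1"   # e.g. npu-demo.yaml (assumed file name)

  # Create or update the Deployment defined in the manifest.
  kubectl apply -f "$manifest" &&
  # Wait until the Deployment named "demo" in "default" finishes rolling out.
  kubectl -n default rollout status deployment/demo --timeout=600s &&
  # Show the pods and the nodes they were scheduled onto.
  kubectl -n default get pods -l app=demo -o wide
}
```

Usage: run `deploy_npu_demo npu-demo.yaml`. If pods stay in Pending, `kubectl -n default describe pod <pod-name>` shows the scheduling events, which typically indicate whether enough `huawei.com/ascend-1980` resources are available.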
