ModelArts

      • Service Overview
        • Infographics
          • What Is ModelArts?
        • What Is ModelArts?
        • Functions
        • Basic Knowledge
          • Introduction to the AI Development Lifecycle
          • Basic Concepts of AI Development
          • Common ModelArts Concepts
          • Introduction to Development Tools
          • Model Training
          • Model Deployment
        • Related Services
        • How Do I Access ModelArts?
        • Permissions Management
        • Security
          • Shared Responsibilities
          • Asset Identification and Management
          • Identity Authentication and Access Control
          • Data Protection
          • Auditing and Logging
          • Service Resilience
          • Fault Recovery
          • Update Management
          • Certificates
          • Security Boundary
        • Quotas
      • Preparations
        • Creating a HUAWEI ID and Enabling Huawei Cloud Services
        • Logging In to the ModelArts Management Console
        • Configuring Access Authorization (Global Configuration)
        • Creating an OBS Bucket
        • Enabling ModelArts Resources
          • ModelArts Resources
          • Pay-per-Use
      • DevEnviron
        • Introduction to DevEnviron
        • Application Scenarios
        • Managing Notebook Instances
          • Creating a Notebook Instance
          • Accessing a Notebook Instance
          • Searching for, Starting, Stopping, or Deleting a Notebook Instance
          • Changing a Notebook Instance Image
          • Changing the Flavor of a Notebook Instance
          • Selecting Storage in DevEnviron
          • Dynamically Mounting an OBS Parallel File System
          • Dynamically Expanding EVS Disk Capacity
          • Modifying the SSH Configuration for a Notebook Instance
          • Viewing the Notebook Instances of All IAM Users Under a Tenant Account
          • Viewing Notebook Events
          • Notebook Cache Directory Alarm Reporting
        • JupyterLab
          • Operation Process in JupyterLab
          • JupyterLab Overview and Common Operations
          • Code Parameterization Plug-in
          • Using the ModelArts SDK
          • Using the Git Plug-in
          • Visualized Model Training
            • Introduction to Training Job Visualization
            • MindInsight Visualization Jobs
            • TensorBoard Visualization Jobs
          • Uploading and Downloading Data in Notebook
            • Uploading Files to JupyterLab
              • Scenarios
              • Uploading Files from a Local Path to JupyterLab
                • Scenarios and Upload Entries
                • Uploading a Local File Smaller Than 100 MB to JupyterLab
                • Uploading a Local File Ranging from 100 MB to 5 GB to JupyterLab
                • Uploading a Local File Larger Than 5 GB to JupyterLab
              • Cloning an Open-Source Repository from GitHub
              • Uploading OBS Files to JupyterLab
              • Uploading Remote Files to JupyterLab
            • Downloading a File from JupyterLab to a Local Path
        • Local IDE
          • Operation Process in a Local IDE
          • Local IDE (PyCharm)
            • Connecting to a Notebook Instance Through PyCharm Toolkit
              • PyCharm Toolkit
              • Downloading and Installing PyCharm Toolkit
              • Connecting to a Notebook Instance Through PyCharm Toolkit
            • Manually Connecting to a Notebook Instance Through PyCharm
            • Submitting a Training Job Using PyCharm Toolkit
              • Submitting a Training Job (New Version)
              • Stopping a Training Job
              • Viewing Training Logs
            • Uploading Data to a Notebook Instance Using PyCharm
          • Local IDE (VS Code)
            • Connecting to a Notebook Instance Through VS Code
            • Installing VS Code
            • Connecting to a Notebook Instance Through the VS Code Toolkit
            • Manually Connecting to a Notebook Instance Through VS Code
            • Remote Debugging in VS Code
            • Uploading and Downloading Files in VS Code
          • Local IDE (Accessed Using SSH)
        • ModelArts CLI Command Reference
          • ModelArts CLI Overview
          • (Optional) Installing ma-cli Locally
          • Autocompletion for ma-cli Commands
          • ma-cli Authentication
          • ma-cli Image Building Commands
            • ma-cli Image Building Commands
            • Obtaining an Image Building Template
            • Loading an Image Building Template
            • Obtaining Registered ModelArts Images
            • Building an Image in ModelArts Notebook
            • Obtaining Image Building Caches in ModelArts Notebook
            • Clearing Image Building Caches in ModelArts Notebook
            • Registering SWR Images with ModelArts Image Management
            • Deregistering a Registered Image from ModelArts Image Management
            • Debugging an SWR Image on an ECS
          • Using the ma-cli ma-job Command to Submit a ModelArts Training Job
            • ma-cli ma-job Command Overview
            • Obtaining ModelArts Training Jobs
            • Submitting a ModelArts Training Job
            • Obtaining ModelArts Training Job Logs
            • Obtaining ModelArts Training Job Events
            • Obtaining ModelArts AI Engines for Training
            • Obtaining ModelArts Resource Specifications for Training
            • Stopping a ModelArts Training Job
          • Using the ma-cli dli-job Command to Submit a DLI Spark Job
            • Overview
            • Querying DLI Spark Jobs
            • Submitting a DLI Spark Job
            • Querying DLI Spark Run Logs
            • Querying DLI Queues
            • Obtaining DLI Group Resources
            • Uploading Local Files or OBS Files to a DLI Group
            • Stopping a DLI Spark Job
          • Using ma-cli to Copy OBS Data
      • Resource Management
        • Resource Pool
        • Elastic Cluster
          • Comprehensive Upgrades to ModelArts Resource Pool Management Functions
          • Creating a Resource Pool
          • Viewing Details About a Resource Pool
          • Resizing a Resource Pool
          • Setting a Renewal Policy
          • Modifying the Expiration Policy
          • Migrating the Workspace
          • Changing Job Types Supported by a Resource Pool
          • Upgrading a Resource Pool Driver
          • Deleting a Resource Pool
          • Abnormal Status of a Dedicated Resource Pool
          • ModelArts Network
          • ModelArts Nodes
        • Audit Logs
          • Key Operations Recorded by CTS
          • Viewing Audit Logs
        • Resource Monitoring
          • Overview
          • Using Grafana to View AOM Monitoring Metrics
            • Procedure
            • Installing and Configuring Grafana
              • Installing and Configuring Grafana on Windows
              • Installing and Configuring Grafana on Linux
              • Installing and Configuring Grafana on a Notebook Instance
            • Configuring a Grafana Data Source
            • Using Grafana to Configure Dashboards and View Metric Data
          • Viewing All ModelArts Monitoring Metrics on the AOM Console
      • Docker Containers with ModelArts
        • Image Management
        • Using a Preset Image
          • Preset Images in Notebook
            • Notebook Base Images
            • List of Notebook Base Images
            • Notebook Base Image with PyTorch (x86)
            • Notebook Base Image with TensorFlow (x86)
            • Notebook Base Image with MindSpore (x86)
            • Notebook Base Image with a Custom Dedicated Image (x86)
          • Training Base Images
            • Available Training Base Images
            • Training Base Image (PyTorch)
            • Training Base Image (TensorFlow)
            • Training Base Image (Horovod)
            • Training Base Image (MPI)
            • Starting Training with a Preset Image
              • PyTorch
              • TensorFlow
              • Horovod/MPI/MindSpore-GPU
          • Inference Base Images
            • Available Inference Base Images
            • Inference Base Images with TensorFlow (CPU/GPU)
            • Inference Base Images with PyTorch (CPU/GPU)
            • Inference Base Images with MindSpore (CPU/GPU)
        • Using Custom Images in Notebook Instances
          • Registering an Image with ModelArts
          • Creating a Custom Image
          • Saving a Notebook Instance as a Custom Image
            • Saving a Notebook Environment Image
            • Using a Custom Image to Create a Notebook Instance
          • Creating and Using a Custom Image in Notebook
            • Application Scenarios and Process
            • Step 1: Create a Custom Image
            • Step 2: Register the New Image
            • Step 3: Use the New Image to Create a Development Environment
          • Creating a Custom Image on an ECS and Using It in Notebook
            • Application Scenarios and Process
            • Step 1: Prepare a Docker Server and Configure the Environment
            • Step 2: Create a Custom Image
            • Step 3: Register the New Image
            • Step 4: Create and Start a Development Environment
        • Using a Custom Image to Train Models (Model Training)
          • Overview
          • Example: Creating a Custom Image for Training
            • Example: Creating a Custom Image for Training (PyTorch + CPU/GPU)
            • Example: Creating a Custom Image for Training (MPI + CPU/GPU)
            • Example: Creating a Custom Image for Training (Horovod-PyTorch and GPUs)
            • Example: Creating a Custom Image for Training (MindSpore and GPUs)
            • Example: Creating a Custom Image for Training (TensorFlow and GPUs)
          • Preparing a Training Image
            • Specifications for Custom Images Used in Training Jobs
            • Migrating an Image to ModelArts Training
            • Using a Base Image to Create a Training Image
            • Installing MLNX_OFED in a Container Image
          • Creating an Algorithm Using a Custom Image
          • Using a Custom Image to Create a CPU- or GPU-based Training Job
          • Troubleshooting Process
        • Using a Custom Image to Create AI Applications for Inference Deployment
          • Custom Image Specifications for Creating AI Applications
          • Creating a Custom Image and Using It to Create an AI Application
        • FAQs
          • How Do I Access SWR and Upload Images to It?
          • How Do I Configure Environment Variables for an Image?
          • How Do I Use Docker to Start an Image Saved from a Notebook Instance?
          • How Do I Configure a Conda Source in a Notebook Development Environment?
          • Which Software Versions Are Supported for a Custom Image?
        • Change History
      • Best Practices
      • What's New
      • Function Overview
      • Product Bulletin
        • Product Bulletin
      • Billing
        • Overview
        • Billing Modes
          • Overview
          • Yearly/Monthly
          • Pay-per-Use
        • Billing Item (ModelArts Standard)
          • Data Management
          • Development Environment
          • Model Training
          • Model Management
          • Inference Deployment
          • Dedicated Resource Pool
        • Billing Item (ModelArts Studio)
          • Model Inference Billing Items
        • Billing Examples
        • Changing the Billing Mode
        • Renewal
          • Overview
          • Manual Renewal
          • Auto-Renewal
        • Bills
        • About Arrears
        • Stopping Billing
        • Cost Management
        • Billing FAQs
          • How Do I View the ModelArts Jobs Being Billed?
          • How Do I View ModelArts Expenditure Details?
          • How Do I Stop Billing If I Do Not Use ModelArts?
          • Billing FAQs About ModelArts Standard Data Management
          • What Should I Do to Avoid Unnecessary Billing After I Label Datasets and Exit?
          • How Are Training Jobs Billed?
          • Why Does Billing Continue After All Projects Are Deleted?
        • Historical Documents to Be Brought Offline
          • Package
      • Getting Started
        • How to Use ModelArts
        • Building a Handwritten Digit Recognition Model with ModelArts Standard
        • Practices for Beginners
      • ModelArts User Guide (Standard)
        • ModelArts Standard Usage
        • ModelArts Standard Preparations
          • Configuring Access Authorization for ModelArts Standard
            • Configuring Agency Authorization for ModelArts with One Click
            • Creating an IAM User and Granting ModelArts Permissions
          • Creating and Managing a Workspace
          • Creating an OBS Bucket for ModelArts to Store Data
        • ModelArts Standard Resource Management
          • About ModelArts Standard Resource Pools
          • Creating a Standard Dedicated Resource Pool
          • Managing Standard Dedicated Resource Pools
            • Viewing Details About a Standard Dedicated Resource Pool
            • Resizing a Standard Dedicated Resource Pool
            • Upgrading the Standard Dedicated Resource Pool Driver
            • Rectifying a Faulty Node in a Standard Dedicated Resource Pool
            • Modifying the Job Types Supported by a Standard Dedicated Resource Pool
            • Migrating Standard Dedicated Resource Pools and Networks to Other Workspaces
            • Configuring the Standard Dedicated Resource Pool to Access the Internet
            • Using TMS Tags to Manage Resources by Group
            • Managing Logical Subpools of a Standard Dedicated Resource Pool
            • Releasing Standard Dedicated Resource Pools and Deleting the Network
          • Managing Standard Dedicated Resource Pool Plug-ins
            • Overview
            • Node Fault Detection (ModelArts Node Agent)
            • ModelArts Metric Collector
            • AI Suite (NV GPU)
            • AI Suite (ModelArts Device Plugin)
            • Volcano Scheduler
            • NodeLocal DNSCache
            • Cloud Native Log Collection
            • kube-prometheus-stack
        • Using Workflows for Low-Code AI Development
          • What Is Workflow?
          • Managing a Workflow
            • Searching for a Workflow
            • Viewing the Running Records of a Workflow
            • Managing a Workflow
            • Retrying, Stopping, or Running a Workflow Phase
          • Workflow Development Command Reference
            • Core Concepts of Workflow Development
            • Configuring Workflow Parameters
            • Configuring the Input and Output Paths of a Workflow
            • Creating Workflow Phases
              • Creating a Dataset Phase
              • Creating a Dataset Labeling Phase
              • Creating a Dataset Import Phase
              • Creating a Dataset Release Phase
              • Creating a Training Job Phase
              • Creating a Model Registration Phase
              • Creating a Service Deployment Phase
            • Creating a Multi-Branch Workflow
              • Multi-Branch Workflow
              • Creating a Condition Phase to Control Branch Execution
              • Configuring Phase Parameters to Control Branch Execution
              • Configuring Multi-Branch Phase Data
            • Creating a Workflow
            • Publishing a Workflow
              • Publishing a Workflow to ModelArts
              • Publishing a Workflow to AI Gallery
            • Advanced Workflow Capabilities
              • Using Big Data Capabilities (MRS) in a Workflow
              • Specifying Certain Phases to Run in a Workflow
        • Using Notebook for AI Development and Debugging
          • Application Scenarios
          • Creating a Notebook Instance (Default Page)
          • Creating a Notebook Instance (New Page)
          • Managing Notebook Instances
            • Searching for a Notebook Instance
            • Updating a Notebook Instance
            • Starting, Stopping, or Deleting a Notebook Instance
            • Saving a Notebook Instance
            • Dynamically Expanding EVS Disk Capacity
            • Dynamically Mounting an OBS Parallel File System
            • Viewing Notebook Events
            • Notebook Cache Directory Alarm Reporting
          • Using a Notebook Instance for AI Development Through JupyterLab
            • Using JupyterLab to Develop and Debug Code Online
            • Common Functions of JupyterLab
            • Using Git to Clone the Code Repository in JupyterLab
            • Creating a Scheduled Job in JupyterLab
            • Uploading Files to JupyterLab
              • Uploading Files from a Local Path to JupyterLab
              • Cloning GitHub Open-Source Repository Files to JupyterLab
              • Uploading OBS Files to JupyterLab
              • Uploading Remote Files to JupyterLab
            • Downloading a File from JupyterLab to a Local PC
            • Using MindInsight Visualization Jobs in JupyterLab
            • Using TensorBoard Visualization Jobs in JupyterLab
          • Using Notebook Instances Remotely Through PyCharm
            • Connecting to a Notebook Instance Through PyCharm Toolkit
            • Manually Connecting to a Notebook Instance Through PyCharm
            • Uploading Data to a Notebook Instance Through PyCharm
          • Using Notebook Instances Remotely Through VS Code
            • Connecting to a Notebook Instance Through VS Code
            • Connecting to a Notebook Instance Through VS Code Toolkit
            • Manually Connecting to a Notebook Instance Through VS Code
            • Uploading and Downloading Files in VS Code
          • Using a Notebook Instance Remotely with SSH
          • ModelArts CLI Command Reference
            • ModelArts CLI Commands
            • (Optional) Installing ma-cli Locally
            • Autocompletion for ma-cli Commands
            • ma-cli Authentication
            • ma-cli image Commands for Building Images
            • ma-cli ma-job Commands for Training Jobs
            • ma-cli dli-job Commands for Submitting DLI Spark Jobs
            • Using ma-cli to Copy OBS Data
          • Using MoXing Commands in a Notebook Instance
            • MoXing Framework Functions
            • Using MoXing in Notebook
            • Introducing MoXing Framework
            • Mapping Between mox.file and Local APIs and Switchover
            • Sample Code for Common Operations
            • Sample Code for Advanced MoXing Usage
        • Preparing and Processing Data
          • Preparing Data
          • Creating a ModelArts Dataset
          • Importing Data to a ModelArts Dataset
            • Introduction to Data Importing
            • Importing Data from OBS
              • Introduction to Importing Data from OBS
              • Importing Data from an OBS Path to ModelArts
              • Specifications for Importing Data from an OBS Directory
              • Importing a Manifest File to ModelArts
              • Specifications for Importing a Manifest File
            • Importing Data from MRS to ModelArts
            • Importing Data from Local Files
          • Labeling ModelArts Data
            • Scenarios
            • Manual Labeling
              • Creating a Manual Labeling Job
              • Labeling Images
              • Labeling Text
              • Labeling Audio
              • Labeling Video
              • Managing Labeling Data
            • Auto Labeling
              • Creating an Auto Labeling Job
              • Hard Examples of an Auto Labeling Job
              • Auto Grouping for Labeling Jobs
            • Team Labeling
              • Using Team Labeling
              • Creating and Managing Teams
              • Creating a Team Labeling Job
              • Reviewing and Accepting Team Labeling Results
              • Managing Teams and Team Members
            • Managing Labeling Jobs
          • Publishing ModelArts Data
          • Analyzing ModelArts Data Characteristics
          • Exporting Data from a ModelArts Dataset
            • Exporting Data from ModelArts to OBS
            • Exporting Data as a New Dataset
          • Getting Started: Creating an Object Detection Dataset
        • Using ModelArts Standard to Train Models
          • Model Training Process
          • Preparing Model Training Code
            • Starting Training Using a Preset Image's Boot File
            • Developing Code for Training Using a Preset Image
            • Developing Code for Training Using a Custom Image
            • Configuring Password-free SSH Mutual Trust Between Instances for a Training Job Created Using a Custom Image
          • Preparing a Model Training Image
          • Creating an Algorithm
          • Creating a Production Training Job
          • Creating a Production Training Job (New Version)
          • Distributed Model Training
            • Overview
            • Creating a Single-Node Multi-PU Distributed Training Job (DataParallel)
            • Creating a Multiple-Node Multi-PU Distributed Training Job (DistributedDataParallel)
            • Example: Creating a DDP Distributed Training Job (PyTorch + GPU)
            • Example: Creating a DDP Distributed Training Job (PyTorch + NPU)
          • Enabling Dynamic Route Acceleration for Training Jobs
          • Incremental Model Training
          • Automatic Model Tuning (AutoSearch)
            • Overview
            • Creating a Training Job for Automatic Model Tuning
          • High Model Training Reliability
            • Training Job Fault Tolerance Check
            • Training Log Failure Analysis
            • Detecting Training Job Suspension
            • Training Job Restart Upon Suspension
            • Resumable Training
            • Enabling Unconditional Auto Restart
          • Configuring Supernode Affinity Group Instances
          • Managing Model Training Jobs
            • Viewing Training Job Details
            • Visualizing the Training Job Process
            • Viewing the Resource Usage of a Training Job
            • Viewing the Model Evaluation Result
            • Viewing Training Job Events
            • Viewing Training Job Logs
            • Priority of a Training Job
            • Using Cloud Shell to Debug a Production Training Job
            • Saving the Image of a Debug Training Job
            • Copying, Stopping, or Deleting a Training Job
            • Managing Environment Variables of a Training Container
            • Viewing Training Job Tags
            • Managing Training Experiments
            • Viewing Monitoring Metrics of a Training Job
        • Using ModelArts Standard to Deploy Models for Inference and Prediction
          • Overview
          • Creating a Model
            • Creation Methods
            • Importing a Meta Model from a Training Job
            • Importing a Meta Model from OBS
            • Importing a Meta Model from a Container Image
          • Model Creation Specifications
            • Model Package Structure
            • Specifications for Editing a Model Configuration File
            • Specifications for Writing a Model Inference Code File
            • Specifications for Using a Custom Engine to Create a Model
            • Examples of Custom Scripts
          • Deploying a Model as Real-Time Inference Jobs
            • Deploying and Using Real-Time Inference
            • Deploying a Model as a Real-Time Service
            • Authentication Methods for Accessing Real-Time Services
              • Accessing a Real-Time Service Through Token-based Authentication
              • Accessing a Real-Time Service Through AK/SK-based Authentication
              • Accessing a Real-Time Service Through App Authentication
            • Accessing a Real-Time Service Through Different Channels
              • Accessing a Real-Time Service Through a Public Network
              • Accessing a Real-Time Service Through a VPC Channel
              • Accessing a Real-Time Service Through a VPC High-Speed Channel
            • Accessing a Real-Time Service Using Different Protocols
              • Accessing a Real-Time Service Using WebSocket
              • Accessing a Real-Time Service Using Server-Sent Events
          • Deploying a Model as a Batch Inference Service
          • Managing ModelArts Models
            • Viewing ModelArts Model Details
            • Viewing ModelArts Model Events
            • Managing ModelArts Model Versions
          • Managing a Synchronous Real-Time Service
            • Viewing Details About a Real-Time Service
            • Viewing Events of a Real-Time Service
            • Managing the Lifecycle of a Real-Time Service
            • Modifying a Real-Time Service
            • Viewing Performance Metrics of a Real-Time Service on Cloud Eye
            • Integrating a Real-Time Service API into the Production Environment
            • Configuring Auto Restart upon a Real-Time Service Fault
          • Managing Batch Inference Jobs
            • Viewing Details About a Batch Service
            • Viewing Events of a Batch Service
            • Managing the Lifecycle of a Batch Service
            • Modifying a Batch Service
        • Creating a Custom Image for ModelArts Standard
          • Applications of Custom Images
          • Preset Images Supported by ModelArts
            • ModelArts Preset Image Updates
            • ModelArts Unified Images
            • Preset Dedicated Images in Notebook Instances
            • Preset Dedicated Images for Training
            • Preset Dedicated Images for Inference
          • Creating a Custom Image for a Notebook Instance
            • Creating a Custom Image
            • Creating a Custom Image on ECS and Using It
            • Creating a Custom Image Using Dockerfile
            • Creating a Custom Image Using the Image Saving Function
          • Creating a Custom Image for Model Training
            • Creating a Custom Training Image
            • Creating a Custom Training Image Using a Preset Image
            • Migrating Existing Images to ModelArts
            • Creating a Custom Training Image (PyTorch + Ascend)
            • Creating a Custom Training Image (PyTorch + CPU/GPU)
            • Creating a Custom Training Image (MPI + CPU/GPU)
            • Creating a Custom Training Image (TensorFlow + GPU)
            • Creating a Custom Training Image (MindSpore + Ascend)
          • Creating a Custom Image for Inference
            • Creating a Custom Image for a Model
            • Creating a Custom Image on ECS
        • Monitoring ModelArts Standard Resources
          • Overview
          • Viewing Monitoring Metrics on the ModelArts Console
          • Viewing All ModelArts Monitoring Metrics on the AOM Console
          • Using Grafana to View AOM Monitoring Metrics
            • Installing and Configuring Grafana
              • Installing and Configuring Grafana on Windows
              • Installing and Configuring Grafana on Linux
              • Installing and Configuring Grafana on a Notebook Instance
            • Configuring a Grafana Data Source
            • Configuring a Dashboard to View Metric Data
        • Using CTS to Audit ModelArts Standard
          • ModelArts Standard Key Operations Traced by CTS
          • Viewing ModelArts Standard Audit Logs
      • ModelArts Studio (MaaS) User Guide
        • ModelArts Studio (MaaS) Usage
        • Configuring ModelArts Studio (MaaS) Access Authorization
          • Creating an IAM User and Granting ModelArts Studio (MaaS) Permissions
          • Configuring ModelArts Agency Authorization for Using ModelArts Studio (MaaS)
          • Configuring the Missing ModelArts Studio (MaaS) Permissions
        • Preparing ModelArts Studio (MaaS) Resources
        • ModelArts Studio (MaaS) Real-Time Inference Services
          • Viewing a Built-in Model in ModelArts Studio (MaaS)
          • Deploying a Model Service in ModelArts Studio (MaaS)
          • Managing My Services in ModelArts Studio (MaaS)
            • Starting, Stopping, Periodically Starting or Stopping, or Deleting a Service in ModelArts Studio (MaaS)
            • Scaling Model Service Instances in ModelArts Studio (MaaS)
            • Modifying the QPS of a Model Service in ModelArts Studio (MaaS)
            • Upgrading a Model Service in ModelArts Studio (MaaS)
          • Calling a Model Service in ModelArts Studio (MaaS)
          • ModelArts Studio (MaaS) API Call Specifications
            • Sending a Chat Request (Chat/POST)
            • Obtaining the Model List (Models/GET)
            • Error Codes
          • Creating a Multi-Turn Dialogue in ModelArts Studio (MaaS)
        • ModelArts Studio (MaaS) Management and Statistics
          • Managing API Keys in ModelArts Studio (MaaS)
      • ModelArts User Guide (Lite Server)
        • Before You Start
          • Using Lite Server
          • High-Risk Operations
          • Mapping Between Compute Resources and Image Versions
        • Provisioning Lite Server Resources (Old Version)
        • Provisioning Lite Server Resources (New Version)
        • Configuring Lite Server Resources
          • Configuration Process
          • Configuring the Network
          • Configuring the Storage
          • Configuring the Software Environment
            • Configuring the Software Environment on the NPU Server
        • Using Lite Server Resources
          • Collecting and Uploading NPU Logs
          • Collecting and Uploading GPU Logs
        • Managing Lite Server Resources
          • Viewing Lite Server Details
          • Starting or Stopping the Lite Server
          • Synchronizing the Lite Server Status
          • Changing or Resetting the Lite Server OS
          • Creating a Lite Server OS
          • Lite Server Hot Standby Nodes
          • Modifying a Lite Server Name
          • Authorizing the Repair of Lite Server Nodes
          • Releasing Lite Server Resources
        • Lite Server Plug-in Management
          • Managing Lite Server AI Plug-ins
          • Upgrading the Ascend Driver and Firmware Version on Lite Server
          • Lite Server Node Fault Diagnosis
          • One-Click Pressure Test for Lite Server Nodes
        • Managing Lite Server Supernodes
          • Expanding and Reducing Lite Server Supernodes
          • Periodic Stress Test on Lite Server Supernodes
          • Enabling HCCL Communication Operator-Level Re-execution for Supernodes
        • Monitoring Lite Server Resources
          • Using Cloud Eye to Monitor NPU Resources of a Single Lite Server Node
          • Using Cloud Eye to Monitor the Health Status of Snt9B23 Supernodes
        • Managing CloudPond NPU Resources for Lite Server
        • Using CTS to Audit Lite Server Operations
      • ModelArts User Guide (Lite Cluster)
        • Before You Start
          • Using Lite Cluster
          • High-Risk Operations
          • Software Versions Required by Different Models
        • Enabling Lite Cluster Resources
        • Configuring Lite Cluster Resources
          • Configuring the Lite Cluster Environment
          • Configuring the Lite Cluster Network
          • Configuring kubectl
          • Configuring Lite Cluster Storage
          • (Optional) Configuring the Driver
          • (Optional) Configuring Image Pre-provisioning
        • Using Lite Cluster Resources
          • Using Snt9B for Distributed Training in a Lite Cluster Resource Pool
          • Performing PyTorch NPU Distributed Training in a ModelArts Lite Resource Pool Using Ranktable-based Route Planning
          • Using Snt9B for Inference in a Lite Cluster Resource Pool
          • Using Ascend FaultDiag to Diagnose Logs in the ModelArts Lite Cluster Resource Pool
          • Mounting an SFS Turbo File System to a Lite Cluster
        • Managing Lite Cluster Resources
          • Managing Lite Cluster Resources
          • Managing Lite Cluster Resource Pools
          • Managing Lite Cluster Node Pools
          • Managing Lite Cluster Nodes
          • Resizing a Lite Cluster Resource Pool
          • Upgrading the Lite Cluster Resource Pool Driver
          • Upgrading the Driver of a Lite Cluster Resource Pool Node
          • Monitoring Lite Cluster Resources
            • Viewing Lite Cluster Metrics on AOM
            • Viewing Lite Cluster Metrics Using Prometheus
          • Releasing Lite Cluster Resources
        • Lite Cluster Plug-in Management
          • Overview
          • Node Fault Detection (ModelArts Node Agent)
          • ModelArts Metric Collector
          • AI Suite (ModelArts Device Plugin)
          • Volcano Scheduler
          • Cluster Autoscaler
      • ModelArts User Guide (AI Gallery)
        • AI Gallery
        • Free Assets
        • My Gallery
        • Subscription & Use
          • Searching for and Adding an Asset to Favorites
          • Subscribing to Free Algorithms
          • Subscribing to a Workflow
        • Publish & Share
          • Publishing a Free Algorithm
          • Publishing a Free Model
      • API Reference
        • Before You Start
        • API Overview
        • Calling APIs
          • Making an API Request
          • Authentication
          • Response
        • Development Environment Management
          • Creating a Notebook Instance
          • Querying Notebook Instances
          • Querying All Notebook Instances
          • Querying Details of a Notebook Instance
          • Updating a Notebook Instance
          • Deleting a Notebook Instance
          • Saving a Running Instance as a Container Image
          • Querying the List of Valid Specifications Supported by Notebook Instances
          • Querying the List of Switchable Specifications Supported by Notebook Instances
          • Querying the Available Duration of a Running Notebook Instance
          • Prolonging a Notebook Instance
          • Starting a Notebook Instance
          • Stopping a Notebook Instance
          • Obtaining the Notebook Instances with OBS Storage Mounted
          • OBS Storage Mounting
          • Obtaining Details About a Notebook Instance with OBS Storage Mounted
          • Unmounting OBS Storage from a Notebook Instance
          • Querying Supported Images
          • Registering a Custom Image
          • Querying the User Image List
          • Obtaining Details of an Image
          • Deleting an Image
        • Training Management
          • Creating an Algorithm
          • Querying the Algorithm List
          • Querying Algorithm Details
          • Modifying an Algorithm
          • Deleting an Algorithm
          • Creating a Training Job
          • Querying the Details About a Training Job
          • Modifying the Description of a Training Job
          • Deleting a Training Job
          • Terminating a Training Job
          • Querying the Logs of a Specified Task in a Given Training Job (Preview)
          • Querying the Logs of a Specified Task in a Training Job (OBS Link)
          • Querying the Running Metrics of a Specified Task in a Training Job
          • Querying a Training Job List
          • Obtaining the Events of a Training Job
          • Obtaining the General Specifications Supported by a Training Job
          • Obtaining the Preset AI Frameworks Supported by a Training Job
        • App Authentication Management
          • Obtaining the App List
          • Creating Apps
          • Obtaining App Details
          • Deleting an App
          • Adding an App Code
          • Resetting an App Code
          • Deleting an App Code
          • Resetting an AppSecret
          • Obtaining the List of APIs Bound to an App
          • Registering an API and Authorizing the API to an App
          • Deleting an API
          • Authorizing an API to an App
          • Updating API Authorization
          • Canceling the Authorization of an API to an App
          • Obtaining API Authorization Relationships
          • Creating an API
          • Querying an API
          • Querying APIs and Apps
          • Checking Whether an App Exists
        • Service Management
          • Updating the Service Through the Patch Operation
          • Obtaining Service Monitoring
          • Obtaining Services
          • Deploying Services
          • Obtaining Supported Service Deployment Specifications
          • Obtaining Service Details
          • Updating Service Configurations
          • Deleting a Service
          • Updating a Single Property of a Model Service
          • Obtaining Service Event Logs
          • Obtaining Service Update Logs
          • Adding a Resource Tag
          • Deleting Resource Tags
          • Obtaining Inference Service Tags
          • Obtaining an Inference VPC Access Channel
        • Resource Management
          • Querying OS Configuration Parameters
          • Querying a Plug-in Template
          • Obtaining Nodes in a Resource Pool
          • Deleting Nodes in Batches
          • Querying a Trace List
          • Creating Network Resources
          • Obtaining Network Resources
          • Obtaining a Network Resource
          • Deleting a Network Resource
          • Updating a Network Resource
          • Querying the Real-Time Resource Usage
          • Creating Resource Pools
          • Obtaining Resource Pools
          • Obtaining a Resource Pool
          • Deleting a Resource Pool
          • Updating a Resource Pool
          • Monitoring a Resource Pool
          • Resource Pool Statistics
          • Obtaining Resource Specifications
          • Obtaining Jobs in a Resource Pool
          • Querying Dedicated Resource Pool Job Statistics
        • DevServer Management
          • Obtaining All DevServer Instances of a User
          • Creating a DevServer Instance
          • Obtaining DevServer Instance Details
          • Deleting DevServer Instances
          • Synchronizing the Status of All DevServer Instances of a User in Real Time
          • Starting DevServer Instances
          • Stopping DevServer Instances
          • Creating DevServer Supernode Tags
          • Deleting DevServer Supernode Tags
          • Obtaining the DevServer Supernode Tags
          • Reinstalling the OS Image of the DevServer Server
          • Changing the OS Image of the DevServer Server
          • Changing the OS Image of the DevServer Supernode Server
          • Obtaining Details About All Supernode Instances of a User
          • Deleting a DevServer Supernode Instance
          • Restarting a DevServer Instance
          • Starting a DevServer Supernode Server
          • Stopping a DevServer Supernode Server
        • Authorization Management
          • Viewing an Authorization List
          • Configuring Authorization
          • Deleting Authorization
          • Creating a ModelArts Agency
        • Workspace Management
          • Querying Details About a Workspace
          • Modifying a Workspace
          • Deleting a Workspace
          • Querying a Workspace Quota
          • Modifying a Workspace Quota
          • Querying a Workspace List
          • Creating a Workspace
        • Quota Management
          • Obtaining OS Quotas
        • Resource Tag Management
          • Obtaining All Tags of Resource Pools
          • Obtaining Tags of a Resource Pool
        • Node Pool Management
          • Obtaining Node Pools
          • Creating a Node Pool
          • Obtaining Details About a Specified Node Pool
          • Updating a Node Pool
          • Deleting a Node Pool
          • Obtaining Nodes in a Node Pool
        • Node Management
          • Locking Node Functions in Batches
          • Unlocking Node Functions in Batches
          • Changing the Node Specifications
        • AI Application Management
          • Obtaining the Model Runtime
          • Querying the AI Application List
          • Creating an AI Application
          • Obtaining Details About an AI Application
          • Deleting an AI Application
        • Application Authentication Management
          • Querying the API Authentication Information of an Application
        • Use Cases
          • Creating a Development Environment Instance
          • Using PyTorch to Create a Training Job (New-Version Training)
          • Managing ModelArts Authorization
        • Permissions Policies and Supported Actions
          • Introduction
          • Data Management Permissions
          • DevEnviron Permissions
          • Training Job Permissions
          • Model Management Permissions
          • Service Management Permissions
        • Appendix
          • Status Code
          • Error Codes
          • Obtaining a Project ID and Name
          • Obtaining an Account Name and ID
          • Obtaining a Username and ID
        • Historical APIs
          • Data Management (Old Version)
            • Querying the Dataset List
            • Creating a Dataset
            • Querying Details About a Dataset
            • Modifying a Dataset
            • Deleting a Dataset
            • Obtaining Dataset Statistics
            • Querying the Monitoring Data of a Dataset
            • Querying the Dataset Version List
            • Creating a Dataset Labeling Version
            • Querying Details About a Dataset Version
            • Deleting a Dataset Labeling Version
            • Obtaining a Sample List
            • Adding Samples in Batches
            • Deleting Samples in Batches
            • Obtaining Details About a Sample
            • Obtaining Sample Search Condition
            • Obtaining a Sample List of a Team Labeling Task by Page
            • Obtaining Details About a Team Labeling Sample
            • Querying the Dataset Label List
            • Creating a Dataset Label
            • Modifying Labels in Batches
            • Deleting Labels in Batches
            • Updating a Label by Label Names
            • Deleting a Label and the Files that Only Contain the Label
            • Updating Sample Labels in Batches
            • Querying the Team Labeling Task List of a Dataset
            • Creating a Team Labeling Task
            • Querying Details About a Team Labeling Task
            • Starting a Team Labeling Task
            • Updating a Team Labeling Task
            • Deleting a Team Labeling Task
            • Creating a Team Labeling Acceptance Task
            • Querying the Report of a Team Labeling Acceptance Task
            • Updating Status of a Team Labeling Acceptance Task
            • Querying Details About Team Labeling Task Statistics
            • Querying Details About the Progress of a Team Labeling Task Member
            • Querying the Team Labeling Task List by a Team Member
            • Submitting Sample Review Comments of an Acceptance Task
            • Reviewing Team Labeling Results
            • Updating Labels of Team Labeling Samples in Batches
            • Querying the Labeling Team List
            • Creating a Labeling Team
            • Querying Details About a Labeling Team
            • Updating a Labeling Team
            • Deleting a Labeling Team
            • Sending an Email to a Labeling Team Member
            • Querying the List of All Labeling Team Members
            • Querying the List of Labeling Team Members
            • Creating a Labeling Team Member
            • Deleting Labeling Team Members in Batches
            • Querying Details About Labeling Team Members
            • Updating a Labeling Team Member
            • Deleting a Labeling Team Member
            • Querying the Dataset Import Task List
            • Creating an Import Task
            • Querying Details About a Dataset Import Task
            • Querying the Dataset Export Task List
            • Creating a Dataset Export Task
            • Querying the Status of a Dataset Export Task
            • Synchronizing a Dataset
            • Querying the Status of a Dataset Synchronization Task
            • Obtaining an Auto Labeling Sample List
            • Querying Details About an Auto Labeling Sample
            • Obtaining an Auto Labeling Task List by Page
            • Starting Intelligent Tasks
            • Obtaining Details About an Auto Labeling Task
            • Stopping an Intelligent Task
            • Querying the List of a Processing Task
            • Creating a Processing Task
            • Querying Details About a Processing Task
            • Updating a Processing Task
            • Deleting a Processing Task
          • DevEnviron (Old Version)
            • Creating a Development Environment Instance
            • Obtaining Development Environment Instances
            • Obtaining Details About a Development Environment Instance
            • Modifying the Description of a Development Environment Instance
            • Deleting a Development Environment Instance
            • Managing a Development Environment Instance
          • Training Management (Old Version)
            • Training Jobs
              • Creating a Training Job
              • Querying a Training Job List
              • Querying the Details About a Training Job Version
              • Deleting a Version of a Training Job
              • Obtaining Training Job Versions
              • Creating a Version of a Training Job
              • Stopping a Training Job
              • Modifying the Description of a Training Job
              • Deleting a Training Job
              • Obtaining the Name of a Training Job Log File
              • Querying a Built-in Algorithm
              • Querying Training Job Logs
            • Training Job Parameter Configuration
              • Creating a Training Job Configuration
              • Querying a List of Training Job Configurations
              • Modifying a Training Job Configuration
              • Deleting a Training Job Configuration
              • Querying the Details About a Training Job Configuration
            • Visualization Jobs
              • Creating a Visualization Job
              • Querying a Visualization Job List
              • Querying the Details About a Visualization Job
              • Modifying the Description of a Visualization Job
              • Deleting a Visualization Job
              • Stopping a Visualization Job
              • Restarting a Visualization Job
            • Resource and Engine Specifications
              • Querying Job Resource Specifications
              • Querying Job Engine Specifications
            • Job Statuses
      • SDK Reference
        • Before You Start
        • SDK Overview
        • Getting Started
        • (Optional) Installing the ModelArts SDK Locally
        • Session Authentication
          • (Optional) Session Authentication
          • Authentication Using the Username and Password
          • AK/SK-based Authentication
        • OBS Management
          • Overview of OBS Management
          • Transferring Files (Recommended)
          • Uploading a File to OBS
          • Uploading a Folder to OBS
          • Downloading a File from OBS
          • Downloading a Folder from OBS
        • Data Management
          • Managing Datasets
            • Querying a Dataset List
            • Creating a Dataset
            • Querying Details About a Dataset
            • Modifying a Dataset
            • Deleting a Dataset
          • Managing Dataset Versions
            • Obtaining a Dataset Version List
            • Creating a Dataset Version
            • Querying Details About a Dataset Version
            • Deleting a Dataset Version
          • Managing Samples
            • Querying a Sample List
            • Querying Details About a Sample
            • Deleting Samples in a Batch
          • Managing Dataset Import Tasks
            • Querying a Dataset Import Task List
            • Creating a Dataset Import Task
            • Querying the Status of a Dataset Import Task
          • Managing Export Tasks
            • Querying a Dataset Export Task List
            • Creating a Dataset Export Task
            • Querying the Status of a Dataset Export Task
          • Managing Manifest Files
            • Overview of Manifest Management
            • Parsing a Manifest File
            • Creating and Saving a Manifest File
            • Parsing a Pascal VOC File
            • Creating and Saving a Pascal VOC File
          • Managing Labeling Jobs
            • Creating a Labeling Job
            • Obtaining the Labeling Job List of a Dataset
            • Obtaining Details About a Labeling Job
        • Training Management (New Version)
          • Training Jobs
            • Creating a Training Job
            • Debugging a Training Job
              • Using the SDK to Debug a Multi-Node Distributed Training Job
              • Using the SDK to Debug a Single-Node Training Job
            • Obtaining Training Jobs
            • Obtaining the Details About a Training Job
            • Modifying the Description of a Training Job
            • Deleting a Training Job
            • Terminating a Training Job
            • Obtaining Training Logs
            • Obtaining the Runtime Metrics of a Training Job
          • APIs for Resources and Engine Specifications
            • Obtaining Resource Flavors
            • Obtaining Engine Types
        • Training Management (Old Version)
          • Training Jobs
            • Creating a Training Job
            • Debugging a Training Job
            • Querying the List of Training Jobs
            • Querying the Details About a Training Job
            • Modifying the Description of a Training Job
            • Obtaining the Name of a Training Job Log File
            • Querying Training Job Logs
            • Deleting a Training Job
          • Training Job Versions
            • Creating a Training Job Version
            • Querying the List of Training Job Versions
            • Querying the Details About a Training Job Version
            • Stopping a Training Job Version
            • Deleting a Training Job Version
          • Training Job Parameter Configuration
            • Creating a Training Job Configuration
            • Querying the List of Training Job Parameter Configuration Objects
            • Querying the List of Training Job Configurations
            • Querying the Details About a Training Job Configuration
            • Modifying a Training Job Configuration
            • Deleting a Training Job Configuration
          • Visualization Jobs
            • Creating a Visualization Job
            • Querying the List of Visualization Job Objects
            • Querying the List of Visualization Jobs
            • Querying the Details About a Visualization Job
            • Modifying the Description of a Visualization Job
            • Stopping a Visualization Job
            • Restarting a Visualization Job
            • Deleting a Visualization Job
          • Resource and Engine Specifications
            • Querying a Built-in Algorithm
            • Querying the List of Resource Flavors
            • Querying the List of Engine Types
          • Job Statuses
        • Model Management
          • Debugging a Model
          • Importing a Model
          • Obtaining Models
          • Obtaining Model Objects
          • Obtaining Details About a Model
          • Deleting a Model
        • Service Management
          • Service Management Overview
          • Deploying a Local Service for Debugging
          • Deploying a Real-Time Service
          • Obtaining Details About a Service
          • Testing an Inference Service
          • Obtaining Services
          • Obtaining Service Objects
          • Updating Service Configurations
          • Obtaining Service Monitoring Information
          • Obtaining Service Logs
          • Deleting a Service
        • Change History
      • FAQs
        • Permissions
          1. What Do I Do If a Message Indicating Insufficient Permissions Is Displayed When I Use ModelArts?
          2. How Do I Isolate IAM Users on a Notebook Instance?
          3. How Do I Obtain an Access Key?
        • Storage
          1. How Do I View All Files Stored in OBS on ModelArts?
        • Standard Workflow
          1. How Do I Locate Workflow Running Errors?
        • ModelArts Standard Data Preparation
          1. Is There a File Size Limit for Images to Be Added to a ModelArts Dataset?
          2. How Do I Import Local Labeled Data to ModelArts?
          3. Where Are the Data Labeling Results Stored in ModelArts?
          4. How Do I Download Labeling Results from ModelArts to a Local PC?
          5. Why Can't Team Members Receive Emails for a Team Labeling Task in ModelArts?
          6. How Is Data Distributed Between Team Members During Team Labeling in ModelArts?
          7. How Do I Merge Two Datasets in ModelArts?
          8. Why Are Images Displayed in Different Angles Under the Same Account in ModelArts?
          9. Do I Need to Train the Model Again Using the Data Newly Added After Auto Labeling Is Complete in ModelArts?
          10. How Do I Split an Image Dataset into Training and Validation Sets in ModelArts?
          11. Can I Customize Labels During Object Detection Labeling in ModelArts?
          12. What Should I Do If I Can't Find a New Dataset Version in ModelArts?
          13. How Do I Split a Dataset in ModelArts?
          14. How Do I Delete Images from a Dataset in ModelArts?
        • ModelArts Standard Notebook
          1. Is the Keras Engine Supported by ModelArts Notebook Instances?
          2. How Do I Upload a File from a Notebook Instance to OBS or Download a File from OBS to a Notebook Instance in ModelArts?
          3. Where Is Data Uploaded from a ModelArts Notebook Instance?
          4. How Do I Copy Data from Notebook A to Notebook B in ModelArts?
          5. How Do I Rename an OBS File on a ModelArts Notebook Instance?
          6. How Do I Use the pandas Library to Process Data in OBS Buckets on a ModelArts Notebook Instance?
          7. How Do I Access the OBS Bucket of Another Account from a ModelArts Notebook Instance?
          8. What Is the Default Working Directory of JupyterLab on ModelArts Notebook Instances?
          9. How Do I Check the CUDA Version Used by a ModelArts Notebook Instance?
          10. How Do I Obtain the External IP Address of the Local Host from a ModelArts Notebook Instance?
          11. Is There a Proxy for ModelArts Notebook Instances? How Do I Disable It?
          12. How Do I Customize Engine IPython Kernel If the Built-in Engines of ModelArts Notebook Instances Do Not Meet My Requirements?
          13. What Should I Do If It Is Unstable to Install the Remote Plug-in on a ModelArts Notebook Instance?
          14. How Do I Connect to a Restarted ModelArts Notebook Instance?
          15. What Should I Do If the Source Code Cannot Be Accessed When I Use VS Code to Debug Code on a ModelArts Notebook Instance?
          16. How Do I View Remote Logs Using VS Code on a ModelArts Notebook Instance?
          17. How Do I Open the VS Code Configuration File settings.json on a ModelArts Notebook Instance?
          18. How Do I Set the Background Color of VS Code to Bean Green on a ModelArts Notebook Instance?
          19. How Do I Configure the Default Plug-in Remotely Installed for VS Code on a ModelArts Notebook Instance?
          20. How Do I Install a Local Plug-in Remotely or a Remote Plug-in Locally in ModelArts VS Code?
          21. How Do I Use Multiple Ascend Cards for Debugging on a ModelArts Notebook Instance?
          22. Why Are the Training Speeds Similar When Different Resource Flavors Are Used for Training on ModelArts Notebook Instances?
          23. How Do I Perform Incremental Training When Using MoXing on a ModelArts Notebook Instance?
          24. How Do I View the GPU Usage on a ModelArts Notebook Instance?
          25. How Can I Print the GPU Usage in Code on a ModelArts Notebook Instance?
          26. What Are the Relationships Among JupyterLab Directories, Terminal Files, and OBS Files on ModelArts Notebook Instances?
          27. How Do I Use ModelArts Datasets on a ModelArts Notebook Instance?
          28. pip and Common Commands
          29. What Are the Sizes of the /cache Directories for Resources with Varying Specifications on ModelArts Notebook Instances?
          30. What Is the Impact of Resource Overcommitment on ModelArts Notebook Instances?
          31. How Do I Install External Libraries in a Notebook Instance?
          32. How Do I Handle Unstable Internet Access Speed in ModelArts Notebook?
          33. Can I Use GDB in a Notebook Instance?
        • ModelArts Standard Model Training
          1. What Should I Do If the Model Trained in ModelArts Is Underfitting?
          2. How Do I Obtain a Trained Model in ModelArts?
          3. How Do I Obtain RANK_TABLE_FILE for Distributed Training in ModelArts?
          4. How Do I Configure Input and Output Data for Model Training in ModelArts?
          5. How Do I Improve Training Efficiency While Reducing Interaction with OBS in ModelArts?
          6. How Do I Define Path Variables When Using MoXing to Copy Data in ModelArts?
          7. How Do I Create a Training Job That References a Third-Party Dependency Package in ModelArts?
          8. How Do I Install C++ Dependent Libraries During ModelArts Training?
          9. How Do I Check Whether a Folder Copy Is Complete During Job Training in ModelArts?
          10. How Do I Load Some Well Trained Parameters During Job Training in ModelArts?
          11. What Should I Do If I Cannot Access the Folder Using os.system ('cd xxx') During Training in ModelArts?
          12. How Do I Obtain the Dependency File Path from Training Code in ModelArts?
          13. How Do I Obtain the Actual File Path in a Training Container in ModelArts?
          14. What Are the Sizes of the /cache Directories for Resources with Varying Specifications in Training Jobs in ModelArts?
          15. Why Do Training Jobs Have Two Hyperparameter Directories /work and /ma-user in ModelArts?
          16. How Do I View the Resource Usage of a Training Job in ModelArts?
          17. How Do I Download a Well Trained Model in ModelArts or Migrate It to Another Account?
          18. What Should I Do If RuntimeError: Socket Timeout Is Displayed During Distributed Process Group Initialization using torchrun?
          19. What Should I Do If an Error Is Reported Indicating that the .so File in the $ANACONDA_DIR/envs/$DEFAULT_CONDA_ENV_NAME/lib Directory Cannot Be Found During Training?
        • ModelArts Standard Inference Deployment
          1. How Do I Import a Keras .h5 Model to ModelArts?
          2. How Do I Edit the Installation Package Dependency Parameters in the Model Configuration File When Importing a Model to ModelArts?
          3. How Do I Change the Default Port When I Create a Real-Time Service Using a Custom Image in ModelArts?
          4. Does ModelArts Support Multi-Model Import?
          5. What Are the Restrictions on the Image Size for Importing AI Applications to ModelArts?
          6. What Are the Differences Between Real-Time Services and Batch Services in ModelArts?
          7. Why Can't I Select Ascend Snt3 Resources When Deploying Models in ModelArts?
          8. Can I Locally Deploy Models Trained on ModelArts?
          9. What Is the Maximum Size of a ModelArts Real-Time Service Prediction Request Body?
          10. How Do I Prevent Python Dependency Package Conflicts in a Custom Prediction Script When Deploying a Real-Time Service in ModelArts?
          11. How Do I Speed Up Real-Time Service Prediction in ModelArts?
          12. Can a New-Version AI Application Still Use the Original API in ModelArts?
          13. What Is the Format of a Real-Time Service API in ModelArts?
          14. How Do I Fill in the Request Header and Request Body When a ModelArts Real-Time Service Is Running?
        • ModelArts Standard Images
          1. How Do I Use the Image Customized by a User Under a Different Tenant Account to Create a Notebook Instance?
          2. How Do I Log In to SWR and Upload Images to It?
          3. How Do I Configure Environment Variables for an Image in a Dockerfile?
          4. How Do I Start a Container Using a Docker Image?
          5. How Do I Configure a Conda Source on a ModelArts Notebook Instance?
          6. What Are the Software Version Requirements for a Custom Image?
          7. Why Is an Image Reported as Larger Than 35 GB When I'm Saving It But Its Size Is Displayed as 13 GB in SWR?
          8. How Do I Prevent the Save Failure of a Custom Image Larger Than 35 GB?
          9. How Do I Reduce the Size of the Target Image Created on the Local PC or ECS?
          10. Will an Oversized Image Become Smaller If I Uninstall and Reinstall Its Packages?
          11. What Do I Do If Error "ModelArts.6787" Is Reported When I Register an Image in ModelArts?
          12. How Do I Set the Default Kernel?
        • ModelArts Standard Dedicated Resource Pools
          1. Can I Use ECSs to Create a Dedicated Resource Pool for ModelArts?
          2. Can I Deploy Multiple Services on One Dedicated Resource Pool Node in ModelArts?
          3. What Are the Differences Between Public Resource Pools and Dedicated Resource Pools in ModelArts?
          4. Why Does a Job in ModelArts Stay in the Pending State?
          5. Why Can I View the Deleted Dedicated Resource Pools That Failed to Be Created on the ModelArts Console?
          6. How Do I Add a VPC Peering Connection Between a Dedicated Resource Pool and an SFS in ModelArts?
        • ModelArts Studio (MaaS)
          1. How Long Does It Take for an API Key to Become Valid After It Is Created in MaaS?
          2. Can I Use a MaaS API Key Across Regions?
          3. What Are the Format Requirements for Configuring the Model Service API URL in MaaS?
          4. How Do I Obtain the Model Name in MaaS?
        • API/SDK
          1. Can ModelArts APIs or SDKs Be Used to Download Models to a Local PC?
          2. Does ModelArts Use the OBS API to Access OBS Files over an Intranet or the Internet?
        • History
          1. How Do I Upload Data to OBS?
          2. Which AI Frameworks Does ModelArts Support?
          3. How Does ModelArts Use Tags to Manage Resources by Group?
          4. How Do I View ModelArts Expenditure Details?
          5. What Do I Do If the VS Code Window Is Not Displayed?
          6. What Do I Do If a Remote Connection Failed After VS Code Is Opened?
          7. What Do I Do If Error Message "Could not establish connection to xxx" Is Displayed During a Remote Connection?
          8. What Do I Do If Error Message "Bad owner or permissions on C:\Users\Administrator/.ssh/config" or "Connection permission denied (publickey)" Is Displayed?
          9. What Do I Do If Error Message "ssh: connect to host xxx.pem port xxxxx: Connection refused" Is Displayed?
          10. What Do I Do If Error Message "no such identity: C:/Users/xx /test.pem: No such file or directory" Is Displayed?
          11. What Are the Precautions for Switching Training Jobs from the Old Version to the New Version?
      • Troubleshooting
        • General Issues
          • OBS Errors on ModelArts
          • ModelArts.7211: Restricted Account
        • DevEnviron
          • Environment Configuration Faults
            • Disk Space Used Up
            • An Error Is Reported When Conda Is Used to Install Keras 2.3.1 in Notebook
            • Error "HTTP error 404 while getting xxx" Is Reported During Dependency Installation in a Notebook
            • The numba Library Has Been Installed in a Notebook Instance and Error "import numba ModuleNotFoundError: No module named 'numba'" Is Reported
            • Failed to Save Files in JupyterLab
            • "Server Connection Error" Is Displayed After the Kernelgateway Process Is Stopped
            • SSH Access Is Occasionally Denied, and the Error Message "Not allowed at this time" Is Displayed
          • Instance Faults
            • Failed to Create a Notebook Instance and JupyterProcessKilled Is Displayed in Events
            • Failed to Access a Notebook Instance
            • An Error Is Displayed Indicating No Space Left After the pip install Command Is Executed
            • Code Can Be Run But Cannot Be Saved and Error Message "save error" Is Displayed
            • A Request Timeout Error Is Reported When the Open Button of a Notebook Instance Is Clicked
            • ModelArts.6333 Error Occurs
            • What Can I Do If a Message Is Displayed Indicating that the Token Does Not Exist or Is Lost When I Open a Notebook Instance?
          • Code Running Failures
            • An Error Occurs When You Run Code on a Notebook Instance Because No File Is Found in /tmp
            • Notebook Instance Failed to Run Code
            • "dead kernel" Is Displayed and the Instance Breaks Down When Training Code Is Run
            • cudaCheckError Occurs During Training
            • What Do I Do If Insufficient Space Is Displayed in DevEnviron?
            • Notebook Instance Breaks Down When opencv.imshow Is Used
            • Path of a Text File Generated in the Windows OS Cannot Be Found on a Notebook Instance
            • What Do I Do If No Kernel Is Displayed After a Notebook File Is Created?
          • JupyterLab Plug-in Faults
            • Invalid Git Plug-in Password
          • Failures to Access the Development Environment Through VS Code
            • VS Code Window Is Not Displayed
            • Remote Connection Failed After VS Code Is Opened
            • Failed to Connect to the Development Environment Via VS Code
            • Error Message "Could not establish connection to xxx" Is Displayed During a Remote Connection
            • Connection to a Remote Development Environment Remains in the "Setting up SSH Host xxx: Downloading VS Code Server locally" State for More Than 10 Minutes
            • What Do I Do If the Connection to a Remote Development Environment Remains in the State of "Setting up SSH Host xxx: Copying VS Code Server to host with scp" for More Than 10 Minutes?
            • Connection to a Remote Development Environment Remains in the State of "ModelArts Remote Connect: Connecting to instance xxx..." for More Than 10 Minutes
            • Remote Connection Is in the Retry State
            • Error Message "The VS Code Server failed to start" Is Displayed
            • Error Message "Permissions for 'x:/xxx.pem' are too open" Is Displayed
            • Error Message "Bad owner or permissions on C:\Users\Administrator/.ssh/config" Is Displayed
            • Error Message "Connection permission denied (publickey)" Is Displayed
            • What Do I Do If Error Message "ssh: connect to host xxx.pem port xxxxx: Connection refused" Is Displayed?
            • What Do I Do If Error Message "ssh: connect to host ModelArts-xxx port xxx: Connection timed out" Is Displayed?
            • Error Message "Load key "C:/Users/xx/test1/xxx.pem": invalid format" Is Displayed
            • Error Message "An SSH installation couldn't be found" or "Could not establish connection to instance xxx: 'ssh' ..." Is Displayed
            • Error Message "no such identity: C:/Users/xx /test.pem: No such file or directory" Is Displayed
            • Error Message "Host key verification failed" or "Port forwarding is disabled" Is Displayed
            • Error Message "Failed to install the VS Code Server" or "tar: Error is not recoverable: exiting now" Is Displayed
            • Error Message "XHR failed" Is Displayed During VS Code's Connection to a Remote Notebook Instance
            • VS Code Connection Automatically Disconnected If No Operation Is Performed for a Long Time
            • Remote Connection Takes a Long Time After VS Code Is Automatically Upgraded
            • Error Message "Connection reset" Is Displayed During an SSH Connection
            • Notebook Instance Is Frequently Disconnected or Stuck After It Is Connected with MobaXterm Using SSH
            • Error Message "Missing GLIBC, Missing required dependencies" Is Displayed When VS Code Is Used to Connect to a Development Environment
            • Error Message Is Displayed Indicating That ms-vscode-remote.remot-sdh Is Uninstalled Due to a Reported Issue When VSCode-huawei Is Used
            • Instance Directory in VS Code Does Not Match That on the Cloud When VS Code Is Used to Connect to an Instance
          • Custom Image Faults
            • Faults of Custom Images on Notebook Instances
            • What If the Error Message "there are processes in 'D' status, please check process status using'ps -aux' and kill all the 'D' status processes" or "Buildimge,False,Error response from daemon,Cannot pause container xxx" Is Displayed When I Save an Image?
            • What Do I Do If Error "container size %dG is greater than threshold %dG" Is Displayed When I Save an Image?
            • What Do I Do If Error "too many layers in your image" Is Displayed When I Save an Image?
            • What Do I Do If Error "The container size (xG) is greater than the threshold (25G)" Is Reported When I Save an Image?
            • Error Message "BuildImage,True,Commit successfully|PushImage,False,Task is running." Is Displayed When an Image Is Saved
            • No Kernel Is Displayed After a Notebook Instance Created Using a Custom Image Is Started
            • Some Extra Packages Are Found in the Conda Environment Built Using a Custom Image
            • Failed to Create a Custom Image Using ma-cli and an Error Is Displayed Indicating that the File Does Not Exist
            • Error Message "Unexpected error from cudaGetDeviceCount" Is Displayed When Torch Is Used
            • Unable to Access a Notebook Instance Created Using an Old Image
          • Other Faults
            • Failed to Open the checkpoints Folder in Notebook
            • Failed to Use a Purchased Dedicated Resource Pool to Create New-Version Notebook Instances
            • Error Message "Permission denied" Is Displayed When the tensorboard Command Is Used to Open a Log File on a Notebook Instance
        • Training Jobs
          • OBS Operation Issues
            • Failed to Read Files
            • Error Message Is Displayed Repeatedly When a TensorFlow-1.8 Job Is Connected to OBS
            • TensorFlow Stops Writing TensorBoard to OBS When the Size of Written Data Reaches 5 GB
            • Error "Unable to connect to endpoint" Occurs When a Model Is Saved
            • Error Message "BrokenPipeError: Broken pipe" Is Displayed When OBS Data Is Copied
            • Error Message "ValueError: Invalid endpoint: obs.xxxx.com" Is Displayed in Logs
            • Error Message "errorMessage:The specified key does not exist" Displayed in Logs
          • In-Cloud Migration Adaptation Issues
            • Failed to Import a Module
            • Error Message "No module named .*" Is Displayed in Training Job Logs
            • Failed to Install a Third-Party Package
            • Failed to Download the Code Directory
            • Error Message "No such file or directory" Is Printed in Training Job Logs
            • Failed to Find the .so File During Training
            • ModelArts Training Job Failed to Parse Parameters and an Error Is Displayed in the Log
            • Training Output Path Is Used by Another Job
            • Error Message "RuntimeError: std:exception" Is Displayed for a PyTorch 1.0 Engine
            • Error Message "retCode=0x91, [the model stream execute failed]" Displayed in MindSpore Logs
            • Error Occurred When Pandas Reads Data from an OBS File If MoXing Is Used to Adapt to an OBS Path
            • Error Message "Please upgrade numpy to >= xxx to use this pandas version" Is Displayed in Logs
            • Reinstalled CUDA Version Does Not Match the One in the Target Image
            • Error ModelArts.2763 Occurred During Training Job Creation
            • Error Message "AttributeError: module '***' has no attribute '***'" Is Displayed in Training Job Logs
            • System Container Exits Unexpectedly
          • Hard Faults Due to Space Limit
            • Downloading Files Timed Out or No Space Left for Reading Data
            • Insufficient Container Space for Copying Data
            • Error Message "No space left" Displayed When a TensorFlow Multi-node Job Downloads Data to /cache
            • Size of the Log File Has Reached the Limit
            • Error Message "write line error" Is Displayed in Logs
            • Error Message "No space left on device" Is Displayed in Logs
            • Training Job Failed Due to OOM
            • Insufficient Disk Space
          • Internet Access Issues
            • Error Message "Network is unreachable" Is Displayed in Logs
            • URL Connection Timed Out in a Running Training Job
          • Permission Issues
            • Error "stat:403 reason:Forbidden" Is Displayed in Logs When a Training Job Accesses OBS
            • Error Message "Permission denied" Is Displayed in Logs
          • GPU Issues
            • Error Message "No CUDA-capable device is detected" Is Displayed in Logs
            • Error Message "RuntimeError: connect() timed out" Is Displayed in Logs
            • Error Message "cuda runtime error (10) : invalid device ordinal at xxx" Is Displayed in Logs
            • Error Message "RuntimeError: Cannot re-initialize CUDA in forked subprocess" Is Displayed in Logs
            • No GPU Detected in a Training Job
          • Service Code Issues
            • Error Message "pandas.errors.ParserError: Error tokenizing data. C error: Expected .* fields" Is Displayed in Logs
            • Error Message "max_pool2d_with_indices_out_cuda_frame failed with error code 0" Is Displayed in Logs
            • Training Job Failed with Error Code 139
            • Debugging Training Code in a Development Environment
            • Error Message "'(slice(0, 13184, None), slice(None, None, None))' is an invalid key" Is Displayed in Logs
            • Error Message "DataFrame.dtypes for data must be int, float or bool" Is Displayed in Logs
            • Error Message "CUDNN_STATUS_NOT_SUPPORTED" Is Displayed in Logs
            • Error Message "Out of bounds nanosecond timestamp" Is Displayed in Logs
            • Error Message "Unexpected keyword argument passed to optimizer" Is Displayed in Logs
            • Error Message "no socket interface found" Is Displayed in Logs
            • Error Message "Runtimeerror: Dataloader worker (pid 46212) is killed by signal: Killed BP" Displayed in Logs
            • Error Message "AttributeError: 'NoneType' object has no attribute 'dtype'" Displayed in Logs
            • Error Message "No module name 'unidecode'" Is Displayed in Logs
            • Distributed TensorFlow Cannot Use tf.variable
            • When MXNet Creates kvstore, the Program Is Blocked and No Error Is Reported
            • ECC Error Occurs in the Log, Causing Training Job Failure
            • Training Job Failed Because the Maximum Recursion Depth Is Exceeded
            • Training Using a Built-in Algorithm Failed Due to a bndbox Error
            • Training Job Status Is Reviewing Job Initialization
            • Training Job Process Exits Unexpectedly
            • Stopped Training Job Process
          • Training Job Suspensions
            • Locating Training Job Suspension
            • Data Replication Suspension
            • Suspension Before Training
            • Suspension During Training
            • Suspension in the Last Training Epoch
          • Running a Training Job Failed
            • Troubleshooting a Training Job Failure
            • An NCCL Error Occurs When a Training Job Fails to Be Executed
            • Troubleshooting Process
            • A Training Job Created Using a Custom Image Is Always in the Running State
            • Failed to Find the Boot File When a Training Job Is Created Using a Custom Image
            • Running a Job Failed Due to Persistently Rising Memory Usage
          • Training Jobs Created in a Dedicated Resource Pool
            • No Cloud Storage Name or Mount Path Displayed on the Page for Creating a Training Job
            • Storage Volume Failed to Be Mounted to the Pod During Training Job Creation
          • Training Performance Issues
            • Training Performance Deteriorated
        • Inference Deployment
          • Model Management
            • Failed to Create a Model
            • Suspended Account or Insufficient Permission to Import Models
            • Failed to Build an Image or Import a File During Model Creation
            • Failed to Obtain the Directory Structure in the Target Image When Creating a Model Through OBS
            • Failed to Obtain Certain Logs on the ModelArts Log Query Page
            • Failed to Download a pip Package When a Model Is Created Using OBS
            • Failed to Use a Custom Image to Create a Model
            • Insufficient Disk Space Is Displayed When a Service Is Deployed After a Model Is Imported
            • Error Occurred When a Created Model Is Deployed as a Service
            • Invalid Runtime Dependency Configured in an Imported Custom Image
            • Garbled Characters Displayed in a Model Name Returned When Model Details Are Obtained Through an API
            • Failed to Import a Model Due to Oversized Model or Image
            • A Single Model File to Be Imported Exceeds the Size Limit (5 GB)
            • Creating a Model Failed Due to Image Building Timeout
          • Service Deployment
            • Error Occurred When a Custom Image Model Is Deployed as a Real-Time Service
            • Alarm Status of a Deployed Real-Time Service
            • Failed to Start a Service
            • Failed to Pull an Image When a Service Is Deployed, Started, Upgraded, or Modified
            • Image Restarts Repeatedly When a Service Is Deployed, Started, Upgraded, or Modified
            • Container Health Check Fails When a Service Is Deployed, Started, Upgraded, or Modified
            • Resources Are Insufficient When a Service Is Deployed, Started, Upgraded, or Modified
            • Error Occurred When a CV2 Model Package Is Used to Deploy a Real-Time Service
            • Service Is Consistently Being Deployed
            • A Started Service Is Intermittently in the Alarm State
            • Failed to Deploy a Service and Error "No Module named XXX" Occurred
            • Insufficient Permission to or Unavailable Input/Output OBS Path of a Batch Service
            • Error "No CUDA runtime is found" Occurred When a Real-Time Service Is Deployed
            • What Can I Do If the Memory Is Insufficient?
            • ModelArts.3520 The Number of Real-Time Services Cannot Exceed 11
            • "pod has unbound immediate PersistentVolumeClaims" Is Displayed During Service Deployment
          • Service Prediction
            • Service Prediction Failed
            • Error "APIG.XXXX" Occurred in a Prediction Failure
            • Error ModelArts.4206 Occurred in Real-Time Service Prediction
            • Error ModelArts.4302 Occurred in Real-Time Service Prediction
            • Error ModelArts.4503 Occurred in Real-Time Service Prediction
            • Error MR.0105 Occurred in Real-Time Service Prediction
            • Method Not Allowed
            • Request Timed Out
            • Error Occurred When an API Is Called for Deploying a Model Created Using a Custom Image
            • Error "DL.0105" Occurred During Real-Time Inference
        • MoXing
          • Error Occurs When MoXing Is Used to Copy Data
          • How Do I Disable the Warmup Function of the Mox?
          • PyTorch Mox Logs Are Repeatedly Generated
          • Failed to Perform Local Fine Tuning on the Checkpoint Generated by moxing.tensorflow
          • Copying Data Using MoXing Is Slow and the Log Is Repeatedly Printed in a Training Job
          • Failed to Access a Folder Using MoXing and Read the Folder Size Using get_size
        • APIs or SDKs
          • "ERROR: Could not install packages due to an OSError" Occurred During ModelArts SDK Installation
          • Error Occurred During Service Deployment After the Target Path to a File Downloaded Through a ModelArts SDK Is Set to a File Name
          • A Training Job Created Using an API Is Abnormal
          • Execution of a huaweicloud.com API Times Out
        • Resource Pool
          • Failed to Create a Resource Pool
          • Faulty Nodes in a Standard Resource Pool
        • Lite Cluster
          • Failed to Create a Resource Pool
          • How Do I Locate and Rectify a Node Fault in a Cluster Resource Pool?
          • All Privilege Pool Data Is Displayed as 0%
          • A Reset Node Cannot Be Used
          • How Do I Automatically Restore Services When Cluster Node Faults Occur?
      • Videos
      • More Documents
        • Preparations (To Be Offline)
          • Creating a Huawei ID and Enabling Huawei Cloud Services
          • Logging In to the ModelArts Management Console
          • Configuring Access Authorization (Global Configuration)
          • Creating an OBS Bucket
          • Enabling ModelArts Resources
            • ModelArts Resources
            • Pay-Per-Use
        • DevEnviron
          • Introduction to DevEnviron
          • Application Scenarios
          • Managing Notebook Instances
            • Creating a Notebook Instance
            • Accessing a Notebook Instance
            • Searching for, Starting, Stopping, or Deleting a Notebook Instance
            • Changing a Notebook Instance Image
            • Changing the Flavor of a Notebook Instance
            • Selecting Storage in DevEnviron
            • Dynamically Mounting an OBS Parallel File System
            • Dynamically Expanding EVS Disk Capacity
            • Modifying the SSH Configuration for a Notebook Instance
            • Viewing the Notebook Instances of All IAM Users Under One Tenant Account
            • Viewing Notebook Events
            • Notebook Cache Directory Alarm Reporting
          • JupyterLab
            • Operation Process in JupyterLab
            • JupyterLab Overview and Common Operations
            • Code Parametrization Plug-in
            • Using ModelArts SDK
            • Using the Git Plug-in
            • Visualized Model Training
              • Introduction to Training Job Visualization
              • MindInsight Visualization Jobs
              • TensorBoard Visualization Jobs
            • Uploading and Downloading Data in Notebook
              • Uploading Files to JupyterLab
                • Scenarios
                • Uploading Files from a Local Path to JupyterLab
                  • Upload Scenarios and Entries
                  • Uploading a Local File Less Than 100 MB to JupyterLab
                  • Uploading a Local File with a Size Ranging from 100 MB to 5 GB to JupyterLab
                  • Uploading a Local File Larger Than 5 GB to JupyterLab
                • Cloning an Open-Source Repository in GitHub
                • Uploading OBS Files to JupyterLab
                • Uploading Remote Files to JupyterLab
              • Downloading a File from JupyterLab to a Local Path
          • Local IDE
            • Operation Process in a Local IDE
            • Local IDE (PyCharm)
              • Connecting to a Notebook Instance Through PyCharm Toolkit
                • PyCharm Toolkit
                • Downloading and Installing PyCharm Toolkit
                • Connecting to a Notebook Instance Through PyCharm Toolkit
              • Manually Connecting to a Notebook Instance Through PyCharm
              • Submitting a Training Job Using PyCharm Toolkit
                • Submitting a Training Job (New Version)
                • Stopping a Training Job
                • Viewing Training Logs
              • Uploading Data to a Notebook Instance Using PyCharm
            • Local IDE (VS Code)
              • Connecting to a Notebook Instance Through VS Code
              • Installing VS Code
              • Connecting to a Notebook Instance Through VS Code Toolkit
              • Manually Connecting to a Notebook Instance Through VS Code
              • Remotely Debugging in VS Code
              • Uploading and Downloading Files in VS Code
            • Local IDE (Accessed Using SSH)
          • ModelArts CLI Command Reference
            • ModelArts CLI Overview
            • (Optional) Installing ma-cli Locally
            • Autocompletion for ma-cli Commands
            • ma-cli Authentication
            • ma-cli Image Building Command
              • ma-cli Image Building Command
              • Obtaining an Image Creation Template
              • Loading an Image Creation Template
              • Obtaining Registered ModelArts Images
              • Creating an Image in ModelArts Notebook
              • Obtaining Image Creation Caches in ModelArts Notebook
              • Clearing Image Creation Caches in ModelArts Notebook
              • Registering SWR Images with ModelArts Image Management
              • Deregistering a Registered Image from ModelArts Image Management
              • Debugging an SWR Image on an ECS
            • Using the ma-cli ma-job Command to Submit a ModelArts Training Job
              • ma-cli ma-job Command Overview
              • Obtaining ModelArts Training Jobs
              • Submitting a ModelArts Training Job
              • Obtaining ModelArts Training Job Logs
              • Obtaining ModelArts Training Job Events
              • Obtaining ModelArts AI Engines for Training
              • Obtaining ModelArts Resource Specifications for Training
              • Stopping a ModelArts Training Job
            • Using the ma-cli dli-job Command to Submit a DLI Spark Job
              • Overview
              • Querying DLI Spark Jobs
              • Submitting a DLI Spark Job
              • Querying DLI Spark Run Logs
              • Querying DLI Queues
              • Obtaining DLI Group Resources
              • Uploading Local Files or OBS Files to a DLI Group
              • Stopping a DLI Spark Job
            • Using ma-cli to Copy OBS Data
        • Model Development (To Be Offline)
          • Introduction to Model Development
          • Preparing Data
          • Preparing Algorithms
            • Introduction to Algorithm Preparation
            • Using a Preset Image (Custom Script)
              • Overview
              • Developing a Custom Script
              • Creating an Algorithm
            • Using Custom Images
            • Viewing Algorithm Details
            • Searching for an Algorithm
            • Deleting an Algorithm
          • Performing a Training
            • Creating a Training Job
            • Viewing Training Job Details
            • Viewing Training Job Events
            • Training Job Logs
              • Introduction to Training Job Logs
              • Common Logs
              • Viewing Training Job Logs
              • Locating Faults by Analyzing Training Logs
            • Cloud Shell
              • Logging In to a Training Container Using Cloud Shell
              • Keeping a Training Job Running
              • Preventing Cloud Shell Session from Disconnection
            • Viewing the Resource Usage of a Training Job
            • Evaluation Results
            • Viewing Training Tags
            • Viewing Fault Recovery Details
            • Viewing Environment Variables of a Training Container
            • Stopping, Rebuilding, or Searching for a Training Job
            • Releasing Training Job Resources
          • Advanced Training Operations
            • Automatic Recovery from a Training Fault
              • Training Fault Tolerance Check
              • Unconditional Auto Restart
            • Resumable Training and Incremental Training
            • Detecting Training Job Suspension
            • Priority of a Training Job
            • Permission to Set the Highest Job Priority
          • Distributed Training
            • Distributed Training Functions
            • Single-Node Multi-Card Training Using DataParallel
            • Multi-Node Multi-Card Training Using DistributedDataParallel
            • Distributed Debugging Adaptation and Code Example
            • Sample Code of Distributed Training
            • Example of Starting PyTorch DDP Training Based on a Training Job
          • Automatic Model Tuning (AutoSearch)
            • Introduction to Hyperparameter Search
            • Search Algorithm
              • Bayesian Optimization (SMAC)
              • TPE Algorithm
              • Simulated Annealing Algorithm
            • Creating a Hyperparameter Search Job
        • Image Management
          • Image Management
          • Using a Preset Image
            • Unified Mirroring
            • Images Preset in Notebook
              • Notebook Base Images
              • Notebook Base Image List
              • PyTorch (x86)-powered Notebook Base Image
              • TensorFlow (x86)-powered Notebook Base Image
              • MindSpore (x86)-powered Notebook Base Image
              • Custom Dedicated Image (x86)-powered Notebook Base Image
            • Training Base Image
              • Available Training Base Images
              • Training Base Image (PyTorch)
              • Training Base Image (TensorFlow)
              • Training Base Image (Horovod)
              • Training Base Image (MPI)
              • Starting Training with a Preset Image
                • PyTorch
                • TensorFlow
                • Horovod/MPI/MindSpore-GPU
            • Inference Base Images
              • Available Inference Base Images
              • TensorFlow (CPU/GPU)-powered Inference Base Images
              • PyTorch (CPU/GPU)-powered Inference Base Images
              • MindSpore (CPU/GPU)-powered Inference Base Images
          • Using Custom Images in Notebook Instances
            • Constraints on Custom Images in Notebook Instances
            • Registering an Image in ModelArts
            • Creating a Custom Image
            • Saving a Notebook Instance as a Custom Image
              • Saving a Notebook Environment Image
              • Using a Custom Image to Create a Notebook Instance
            • Creating and Using a Custom Image in Notebook
              • Application Scenarios and Process
              • Step 1 Creating a Custom Image
              • Step 2 Registering a New Image
              • Step 3 Using a New Image to Create a Development Environment
            • Creating a Custom Image on an ECS and Using It in Notebook
              • Application Scenarios and Process
              • Step 1 Preparing a Docker Server and Configuring an Environment
              • Step 2 Creating a Custom Image
              • Step 3 Registering a New Image
              • Step 4 Creating and Starting a Development Environment
            • Troubleshooting for Custom Images in Notebook Instances
          • Using a Custom Image to Train Models (Model Training)
            • Overview
            • Example: Creating a Custom Image for Training
              • Example: Creating a Custom Image for Training (PyTorch + CPU/GPU)
              • Example: Creating a Custom Image for Training (MPI + CPU/GPU)
              • Example: Creating a Custom Image for Training (Horovod-PyTorch and GPUs)
              • Example: Creating a Custom Image for Training (MindSpore and GPUs)
              • Example: Creating a Custom Image for Training (TensorFlow and GPUs)
            • Preparing a Training Image
              • Specifications for Custom Images for Training Jobs
              • Migrating an Image to ModelArts Training
              • Using a Base Image to Create a Training Image
              • Installing MLNX_OFED in a Container Image
            • Creating an Algorithm Using a Custom Image
            • Using a Custom Image to Create a CPU- or GPU-based Training Job
            • Troubleshooting Process
          • Using a Custom Image to Create AI Applications for Inference Deployment
            • Custom Image Specifications for Creating AI Applications
            • Creating a Custom Image and Using It to Create an AI Application
          • FAQs
            • How Can I Log In to SWR and Upload Images to It?
            • How Do I Configure Environment Variables for an Image?
            • How Do I Use Docker to Start an Image Saved Using a Notebook Instance?
            • How Do I Configure a Conda Source in a Notebook Development Environment?
            • What Are Supported Software Versions for a Custom Image?
            • Why Does an Error Occur When I Try to Save an Image That Is Reported as Larger Than 35 GB, Even Though It Is Only Displayed as 13 GB in SWR?
            • How Do I Ensure That an Image Can Be Saved Correctly Without Being Too Large?
            • How Do I Reduce the Size of an Image Created Locally or on ECS?
            • Will an Image Be Smaller If I Uninstall and Repackage It or Simply Delete Existing Datasets from the Image?
            • What Do I Do If Error "ModelArts.6787" Is Reported When I Register an Image on ModelArts?
          • Modification History
        • Model Inference (To Be Offline)
          • Introduction to Inference
          • Managing AI Applications
            • Introduction to AI Application Management
            • Creating an AI Application
              • Importing a Meta Model from a Training Job
              • Importing a Meta Model from a Template
              • Importing a Meta Model from OBS
              • Importing a Meta Model from a Container Image
            • Viewing the AI Application List
            • Viewing Details About an AI Application
            • Managing AI Application Versions
            • Viewing Events of an AI Application
          • Deploying an AI Application as a Service
            • Deploying AI Applications as Real-Time Services
              • Deploying as a Real-Time Service
              • Viewing Service Details
              • Testing the Deployed Service
              • Accessing Real-Time Services
                • Accessing a Real-Time Service
                • Authentication Mode
                  • Access Authenticated Using a Token
                  • Access Authenticated Using an AK/SK
                  • Access Authenticated Using an Application
                • Access Mode
                  • Accessing a Real-Time Service (Public Network Channel)
                  • Accessing a Real-Time Service (VPC High-Speed Channel)
                • Accessing a Real-Time Service Through WebSocket
                • Server-Sent Events
              • Integrating a Real-Time Service
              • Cloud Shell
            • Deploying AI Applications as Batch Services
              • Deploying as a Batch Service
              • Viewing Details About a Batch Service
              • Viewing the Batch Service Prediction Result
            • Upgrading a Service
            • Starting, Stopping, Deleting, or Restarting a Service
            • Viewing Service Events
          • Inference Specifications
            • Model Package Specifications
              • Introduction to Model Package Specifications
              • Specifications for Editing a Model Configuration File
              • Specifications for Writing Model Inference Code
            • Model Templates
              • Introduction to Model Templates
              • Templates
                • TensorFlow-based Image Classification Template
                • TensorFlow-py27 General Template
                • TensorFlow-py36 General Template
                • MXNet-py27 General Template
                • MXNet-py36 General Template
                • PyTorch-py27 General Template
                • PyTorch-py36 General Template
                • Caffe-CPU-py27 General Template
                • Caffe-GPU-py27 General Template
                • Caffe-CPU-py36 General Template
                • Caffe-GPU-py36 General Template
                • Arm-Ascend Template
              • Input and Output Modes
                • Built-in Object Detection Mode
                • Built-in Image Processing Mode
                • Built-in Predictive Analytics Mode
                • Undefined Mode
            • Examples of Custom Scripts
              • TensorFlow
              • TensorFlow 2.1
              • PyTorch
              • Caffe
              • XGBoost
              • PySpark
              • Scikit-learn
          • ModelArts Monitoring on Cloud Eye
            • ModelArts Metrics
            • Setting Alarm Rules
            • Viewing Monitoring Metrics
        • Resource Management
          • Resource Pool
          • Elastic Cluster
            • Comprehensive Upgrades to ModelArts Resource Pool Management Functions
            • Creating a Resource Pool
            • Viewing Details About a Resource Pool
            • Resizing a Resource Pool
            • Setting a Renewal Policy
            • Modifying the Expiration Policy
            • Migrating the Workspace
            • Changing Job Types Supported by a Resource Pool
            • Upgrading a Resource Pool Driver
            • Deleting a Resource Pool
            • Abnormal Status of a Dedicated Resource Pool
            • ModelArts Network
            • ModelArts Nodes
          • Audit Logs
            • Key Operations Recorded by CTS
            • Viewing Audit Logs
          • Monitoring Resources
            • Overview
            • Using Grafana to View AOM Monitoring Metrics
              • Procedure
              • Installing and Configuring Grafana
                • Installing and Configuring Grafana on Windows
                • Installing and Configuring Grafana on Linux
                • Installing and Configuring Grafana on a Notebook Instance
              • Configuring a Grafana Data Source
              • Using Grafana to Configure Dashboards and View Metric Data
            • Viewing All ModelArts Monitoring Metrics on the AOM Console
        • Data Preparation and Analytics
          • Introduction to Data Preparation
          • Getting Started
          • Creating a Dataset
            • Dataset Overview
            • Creating a Dataset
            • Modifying a Dataset
          • Importing Data
            • Introduction to Data Importing
            • Importing Data from OBS
              • Introduction to Importing Data from OBS
              • Importing Data from an OBS Path
              • Specifications for Importing Data from an OBS Directory
              • Importing a Manifest File
              • Specifications for Importing a Manifest File
            • Importing Data from DLI
            • Importing Data from MRS
            • Importing Data from DWS
            • Importing Data from Local Files
          • Data Analysis and Preview
            • Auto Grouping
            • Data Filtering
            • Data Feature Analysis
          • Labeling Data
          • Publishing Data
            • Introduction to Data Publishing
            • Publishing a Data Version
            • Managing Data Versions
          • Exporting Data
            • Introduction to Exporting Data
            • Exporting Data to a New Dataset
            • Exporting Data to OBS
        • Data Labeling (To Be Offline)
          • Introduction to Data Labeling
          • Manual Labeling
            • Creating a Labeling Job
            • Image Labeling
              • Image Classification
              • Object Detection
              • Image Segmentation
            • Text Labeling
              • Text Classification
              • Named Entity Recognition
              • Text Triplet
            • Audio Labeling
              • Sound Classification
              • Speech Labeling
              • Speech Paragraph Labeling
            • Video Labeling
            • Viewing Labeling Jobs
              • Viewing My Created Labeling Jobs
              • Viewing My Participated Labeling Jobs
          • Auto Labeling
            • Creating an Auto Labeling Job
            • Confirming Hard Examples
          • Team Labeling
            • Team Labeling Overview
            • Creating and Managing Teams
              • Managing Teams
              • Managing Team Members
            • Creating a Team Labeling Job
            • Logging In to ModelArts
            • Starting a Team Labeling Job
            • Reviewing Team Labeling Results
            • Accepting Team Labeling Results