- Visão geral de serviço
- Primeiros passos
-
Guia de usuário
- Operações de alto risco e soluções
-
Clusters
- Visão geral de cluster
- Compra de um cluster
- Conexão a um cluster
-
Atualização de um cluster
- Visão geral de atualização
- Antes de começar
- Atualização in-loco
- Execução de verificação pós-atualização
- Migração de serviços em clusters de versões diferentes
-
Solução de problemas para exceções de verificação de pré-atualização
- Verificação de pré-atualização
- Restrições de nó
- Gerenciamento de atualização
- Complementos
- Gráficos do Helm
- Conectividade SSH de nós principais
- Pools de nós
- Grupos de segurança
- Restrições de nó Arm
- Nós a serem migrados
- Recursos do Kubernetes descartados
- Riscos de compatibilidade
- Versões do agente do CCE de nó
- Uso da CPU do nó
- CRDs
- Discos de nó
- DNS do nó
- Permissões de arquivo de diretório principal de nó
- Kubelet
- Memória do nó
- Servidor de sincronização de relógio de nó
- Sistema operacional do nó
- CPUs do nó
- Comandos Python do nó
- Versão do ASM
- Prontidão do nó
- Nó journald
- containerd.sock
- Erros internos
- Pontos de montagem do nó
- Manchas de nós do Kubernetes
- Restrições de everest
- Restrições de cce-hpa-controller
- Políticas de CPU aprimorada
- Integridade dos componentes do nó de trabalho
- Integridade dos componentes do nó principal
- Limite de recursos de memória dos componentes do Kubernetes
- APIs do Kubernetes descartadas
- Capacidades de IPv6 de um cluster do CCE Turbo
- NetworkManager de nó
- Arquivo de ID do nó
- Consistência da configuração do nó
- Arquivo da configuração de nó
- Consistência da configuração de CoreDNS
- Comandos sudo de um nó
- Comandos principais dos nós
- A montagem do arquivo sock em um nó
- Consistência do certificado do balanceador de carga de HTTPS
- Montagem do nó
- Permissões de logon de usuário paas em um nó
- Endereços IPv4 privados de balanceadores de carga
- Registros históricos de atualização
- Bloco CIDR do plano de gerenciamento do cluster
- Complemento da GPU
- Configurações de parâmetro do sistema dos nós
- Versões de pacotes residuais
- Comandos do nó
- Troca de nó
-
Gerenciamento de um cluster
- Gerenciamento de configuração de cluster
- Controle de sobrecarga do cluster
- Alteração de escala do cluster
- Alteração do grupo de segurança padrão de um nó
- Exclusão de um cluster
- Hibernação e despertar de um cluster (pagamento por uso)
- Renovação de um cluster de cobrança anual/mensal
- Alteração do modo de cobrança de pagamento por uso para anual/mensal
-
Nós
- Visão geral do nó
- Mecanismo de contêiner
- Sistema operacional do nó
- Criação de um nó
- Adição de nós para gerenciamento
- Logon em um nó
- Nós de gerenciamento
- O&M do nó
- Pools de nós
-
Cargas de trabalho
- Visão geral
- Criação de uma carga de trabalho
-
Configuração de um contêiner
- Configuração da sincronização de fuso horário
- Configuração de uma política de extração de imagem
- Uso de imagens de terceiros
- Configuração de especificações do contêiner
- Definição dos parâmetros do ciclo de vida do contêiner
- Configuração da verificação de integridade para um contêiner
- Configuração de uma variável de ambiente
- Configuração de configurações de APM para análise de gargalo de desempenho
- Configuração da política de atualização da carga de trabalho
- Política de agendamento (afinidade/antiafinidade)
- Manchas e tolerâncias
- Rótulos e anotações
- Acesso de um contêiner
- Gerenciamento de cargas de trabalho e tarefas
- Tempo de execução do Kata e tempo de execução comum
- Agendamento
-
Rede
- Visão geral
- Modelos de rede de contêineres
-
Serviço
- Visão geral
- ClusterIP
- NodePort
-
LoadBalancer
- Criação de um Serviço LoadBalancer
- Uso de anotações para configurar o balanceamento de carga
- Serviço usando HTTP
- Configuração da verificação de integridade para várias portas
- Configuração do status pronto para o pod por meio da verificação de integridade do ELB
- Configuração do tempo limite para um Serviço LoadBalancer
- Configuração de rede de passagem para um Serviço de LoadBalancer
- Ativação de regras de grupo de segurança ICMP
- DNAT
- Serviços headless
-
Ingresses
- Visão geral
-
Ingresses do ELB
- Criação de um ingress do ELB no console
- Uso do kubectl para criar um ingress do ELB
- Configuração de ingresses do ELB usando anotações
- Configuração de certificados HTTPS para ingresses do ELB
- Configuração da Indicação de nome do servidor (SNI) para ingresses de ELB
- Roteamento de ingresses do ELB para vários Serviços
- Ingresses do ELB usando HTTP/2
- Interconexão de ingresses do ELB com serviços de back-end HTTPS
- Configuração do tempo limite para um ingress do ELB
-
Ingresses de Nginx
- Criação de ingresses de Nginx no console
- Uso do kubectl para criar um ingress de Nginx
- Configuração de certificados HTTPS para ingresses do Nginx
- Configuração de regras de reescrita de URL para ingresses de Nginx
- Interconexão de ingresses do Nginx com serviços de back-end HTTPS
- Ingresses de Nginx usar hashing consistente para balanceamento de carga
- Configuração de ingresses de Nginx usando anotações
- DNS
- Configurações de rede de contêiner
- Configurações da rede do cluster
- Configuração do acesso dentro da VPC
- Acesso a redes públicas a partir de um contêiner
- Armazenamento
- Observabilidade
- Namespaces
- ConfigMaps e segredos
-
Auto Scaling
- Visão geral
-
Dimensionamento de uma carga de trabalho
- Mecanismos de dimensionamento da carga de trabalho
- Criação de uma política HPA para dimensionamento automático da carga de trabalho
- Políticas CronHPA
- Criação de uma política CustomedHPA para o dimensionamento automático da carga de trabalho
- Gerenciamento de políticas de dimensionamento de carga de trabalho
- Dimensionamento de um nó
- Uso de HPA e CA para dimensionamento automático de cargas de trabalho e nós
-
Complementos
- Visão geral
- CoreDNS
- Armazenamento do contêiner do CCE (Everest)
- Detector de problema de nó do CCE
- Kubernetes Dashboard
- Autoscaler de cluster do CCE
- Nginx Ingress controller
- Kubernetes Metrics Server
- HPA de CCE avançado
- Mecanismo de estouro de nuvem do CCE para CCI
- Suíte IA do CCE (GPU NVIDIA)
- Suíte de IA do CCE (Ascend NPU)
- Volcano scheduler
- Secrets Manager do CCE para DEW
- Exportador de métricas de rede do CCE
- NodeLocal DNSCache
- Monitoramento de cluster da nuvem nativa
- Registro de logs da nuvem nativa
- e-backup (EOM)
- web-terminal (EOM)
- Prometheus (EOM)
- FlexVolume (preterido)
- Gráfico do Helm
-
Permissões
- Visão geral de permissões
- Permissões de cluster (baseadas no IAM)
- Permissões de namespace (com base no RBAC do Kubernetes)
- Exemplo: projetar e configurar permissões para usuários em um departamento
- Dependência de permissão do console do CCE
- Segurança de pod
- Melhoria da segurança do token da conta de serviço
- Descrição da atribuição do sistema
-
Gerenciamento do armazenamento: FlexVolume (preterido)
- Visão geral de FlexVolume
- Alteração da classe de armazenamento usada por um cluster de v1.15 de FlexVolume para CSI Everest
- Uso de discos EVS como volumes de armazenamento
- Usar sistemas de arquivos do SFS Turbo como volumes de armazenamento
- Uso de buckets do OBS como volumes de armazenamento
- Usar sistemas de arquivos do SFS como volumes de armazenamento
-
Referência de API
- Antes de começar
- Visão geral de API
- Chamada das APIs
-
APIs
- URL da API
-
Gerenciamento de cluster
- Criação de um cluster
- Leitura de um cluster especificado
- Listagem de clusters em um projeto especificado
- Atualização de um cluster especificado
- Exclusão de um cluster
- Hibernação de um cluster
- Despertar de um cluster
- Obtenção de um certificado de cluster
- Modificação de especificações do cluster
- Consulta de uma tarefa
- Vinculação/desvinculação endereço do servidor da API público
- Obtenção de endereço de acesso ao cluster
- Gerenciamento de nó
- Gerenciamento de pool de nós
- Gerenciamento de armazenamento
- Gerenciamento de complementos
- Gerenciamento de cota
- Gerenciamento de tags
- Atualização de cluster
- Versões de API
- Descrição de parâmetros de instância do complemento
- APIs do Kubernetes
- APIs desatualizadas
- Políticas de permissões e ações suportadas
-
Apêndice
- Código de status
- Códigos de erro
- Obtenção de um ID de projeto
- Obtenção de um ID de conta
- Especificação dos complementos a serem instalados durante a criação do cluster
- Como obter parâmetros no URI da API
- Criação de uma VPC e uma sub-rede
- Criação de um par de chaves
- Descrição de flavor de nó
- Adição de um sal no campo de senha ao criar um nó
- Número máximo de pods que podem ser criados em um nó
- Sistema operacional do nó
- Alocação de espaço em disco de dados
- Anexação de discos a um nó
- No momento, o conteúdo não está disponível no seu idioma selecionado. Consulte a versão em inglês.
- What's New
- Function Overview
-
Product Bulletin
- Latest Notices
-
Product Change Notices
- EOM of CentOS
- Billing Changes for Huawei Cloud CCE Autopilot Data Plane
- CCE Autopilot for Commercial Use on September 30, 2024, 00:00 GMT+08:00
- Reliability Hardening for Cluster Networks and Storage Functions
- Support for Docker
- Service Account Token Security Improvement
- Upgrade of Helm v2 to Helm v3
- Problems Caused by conn_reuse_mode Settings in the IPVS Forwarding Mode of CCE Clusters
- Optimized Key Authentication of the everest Add-on
-
Cluster Version Release Notes
- End of Maintenance for Clusters 1.23
- End of Maintenance for Clusters 1.21
- End of Maintenance for Clusters 1.19
- End of Maintenance for Clusters 1.17
- End of Maintenance for Clusters 1.15
- End of Maintenance for Clusters 1.13
- Creation of CCE Clusters 1.13 and Earlier Not Supported
- Upgrade for Kubernetes Clusters 1.9
-
Vulnerability Notices
- Vulnerability Fixing Policies
- Notice of Container Escape Vulnerability in NVIDIA Container Toolkit (CVE-2024-0132)
- Notice of Linux Remote Code Execution Vulnerability in CUPS (CVE-2024-47076, CVE-2024-47175, CVE-2024-47176, and CVE-2024-47177)
- Notice of the NGINX Ingress Controller Vulnerability That Allows Attackers to Bypass Annotation Validation (CVE-2024-7646)
- Notice of Docker Engine Vulnerability That Allows Attackers to Bypass AuthZ (CVE-2024-41110)
- Notice of Linux Kernel Privilege Escalation Vulnerability (CVE-2024-1086)
- Notice of OpenSSH Remote Code Execution Vulnerability (CVE-2024-6387)
- Notice of Fluent Bit Memory Corruption Vulnerability (CVE-2024-4323)
- Notice of runC systemd Attribute Injection Vulnerability (CVE-2024-3154)
- Notice of the Impact of runC Vulnerability (CVE-2024-21626)
- Notice on the Kubernetes Security Vulnerability (CVE-2022-3172)
- Privilege Escalation Vulnerability in Linux Kernel openvswitch Module (CVE-2022-2639)
- Notice on nginx-ingress Add-On Security Vulnerability (CVE-2021-25748)
- Notice on nginx-ingress Security Vulnerabilities (CVE-2021-25745 and CVE-2021-25746)
- Notice on the containerd Process Privilege Escalation Vulnerability (CVE-2022-24769)
- Notice on CRI-O Container Runtime Engine Arbitrary Code Execution Vulnerability (CVE-2022-0811)
- Notice on the Container Escape Vulnerability Caused by the Linux Kernel (CVE-2022-0492)
- Notice on the Non-Security Handling Vulnerability of containerd Image Volumes (CVE-2022-23648)
- Linux Kernel Integer Overflow Vulnerability (CVE-2022-0185)
- Linux Polkit Privilege Escalation Vulnerability (CVE-2021-4034)
- Notice on the Vulnerability of Kubernetes subPath Symlink Exchange (CVE-2021-25741)
- Notice of runC Vulnerability That Allows a Container Filesystem Breakout via Directory Traversal (CVE-2021-30465)
- Notice on the Docker Resource Management Vulnerability (CVE-2021-21285)
- Notice of NVIDIA GPU Driver Vulnerability (CVE-2021-1056)
- Notice on the Sudo Buffer Vulnerability (CVE-2021-3156)
- Notice on the Kubernetes Security Vulnerability (CVE-2020-8554)
- Notice of Apache containerd Security Vulnerability (CVE-2020-15257)
- Notice on the Docker Engine Input Verification Vulnerability (CVE-2020-13401)
- Notice of Kubernetes kube-apiserver Input Verification Vulnerability (CVE-2020-8559)
- Notice on the Kubernetes kubelet Resource Management Vulnerability (CVE-2020-8557)
- Notice on the Kubernetes kubelet and kube-proxy Authorization Vulnerability (CVE-2020-8558)
- Notice on Fixing Kubernetes HTTP/2 Vulnerability
- Notice on Fixing Linux Kernel SACK Vulnerabilities
- Notice on Fixing the Docker Command Injection Vulnerability (CVE-2019-5736)
- Notice on Fixing the Kubernetes Permission and Access Control Vulnerability (CVE-2018-1002105)
- Notice of Fixing the Kubernetes Dashboard Security Vulnerability (CVE-2018-18264)
-
Product Release Notes
-
Cluster Versions
- Kubernetes Version Policy
-
Kubernetes Version Release Notes
- Kubernetes 1.30 Release Notes
- Kubernetes 1.29 Release Notes
- Kubernetes 1.28 Release Notes
- Kubernetes 1.27 Release Notes
- Kubernetes 1.25 Release Notes
- Kubernetes 1.23 Release Notes
- Kubernetes 1.21 (EOM) Release Notes
- Kubernetes 1.19 (EOM) Release Notes
- Kubernetes 1.17 (EOM) Release Notes
- Kubernetes 1.15 (EOM) Release Notes
- Kubernetes 1.13 (EOM) Release Notes
- Kubernetes 1.11 (EOM) Release Notes
- Kubernetes 1.9 (EOM) and Earlier Versions Release Notes
- Patch Versions
- OS Images
-
Add-on Versions
- CoreDNS Release History
- CCE Container Storage (Everest) Release History
- CCE Node Problem Detector Release History
- Kubernetes Dashboard Release History
- CCE Cluster Autoscaler Release History
- NGINX Ingress Controller Release History
- Kubernetes Metrics Server Release History
- CCE Advanced HPA Release History
- CCE Cloud Bursting Engine for CCI Release History
- CCE AI Suite (NVIDIA GPU) Release History
- CCE AI Suite (Ascend NPU) Release History
- Volcano Scheduler Release History
- CCE Secrets Manager for DEW Release History
- CCE Network Metrics Exporter Release History
- NodeLocal DNSCache Release History
- Cloud Native Cluster Monitoring Release History
- Cloud Native Logging Release History
- Container Image Signature Verification Release History
- Grafana Release History
- OpenKruise Release History
- Gatekeeper Release History
- Vertical Pod Autoscaler Release History
- CCE Cluster Backup & Recovery (End of Maintenance) Release History
- Kubernetes Web Terminal (End of Maintenance) Release History
- Prometheus (End of Maintenance) Release History
-
Cluster Versions
- Billing
- Kubernetes Basics
-
Best Practices
- Checklist for Deploying Containerized Applications in the Cloud
- Containerization
- Migration
- DevOps
- Disaster Recovery
-
Security
- Overview
- Configuration Suggestions on CCE Cluster Security
- Configuration Suggestions on CCE Node Security
- Configuration Suggestions on CCE Container Runtime Security
- Configuration Suggestions on CCE Container Security
- Configuration Suggestions on CCE Container Image Security
- Configuration Suggestions on CCE Secret Security
- Configuration Suggestions on CCE Workload Identity Security
- Auto Scaling
- Monitoring
-
Cluster
- Suggestions on CCE Cluster Selection
- Creating an IPv4/IPv6 Dual-Stack Cluster in CCE
- Creating a Custom CCE Node Image
- Executing the Pre- or Post-installation Commands During Node Creation
- Using OBS Buckets to Implement Custom Script Injection During Node Creation
- Connecting to Multiple Clusters Using kubectl
- Selecting a Data Disk for the Node
- Implementing Cost Visualization for a CCE Cluster
- Creating a CCE Turbo Cluster Using a Shared VPC
- Protecting a CCE Cluster Against Overload
- Managing Costs for a Cluster
-
Networking
- Planning CIDR Blocks for a Cluster
- Selecting a Network Model
- Enabling Cross-VPC Network Communications Between CCE Clusters
- Implementing Network Communications Between Containers and IDCs Using VPC and Direct Connect
- Enabling a CCE Cluster to Resolve Domain Names on Both On-Premises IDCs and HUAWEI CLOUD
- Implementing Sticky Session Through Load Balancing
- Obtaining the Client Source IP Address for a Container
- Increasing the Listening Queue Length by Configuring Container Kernel Parameters
- Configuring Passthrough Networking for a LoadBalancer Service
- Accessing an External Network from a Pod
- Deploying Nginx Ingress Controllers Using a Chart
- CoreDNS Configuration Optimization
- Pre-Binding Container ENI for CCE Turbo Clusters
- Connecting a Cluster to the Peer VPC Through an Enterprise Router
- Accessing an IP Address Outside a Cluster That Uses a VPC Network Using Source Pod IP Addresses in the Cluster
-
Storage
- Expanding the Storage Space
- Mounting Object Storage Across Accounts
- Dynamically Creating an SFS Turbo Subdirectory Using StorageClass
- Changing the Storage Class Used by a Cluster of v1.15 from FlexVolume to CSI Everest
- Using Custom Storage Classes
- Scheduling EVS Disks Across AZs Using csi-disk-topology
- Automatically Collecting JVM Dump Files That Exit Unexpectedly Using SFS 3.0
- Deploying Storage Volumes in Multiple AZs
-
Container
- Recommended Configurations for Workloads
- Properly Allocating Container Computing Resources
- Upgrading Pods Without Interrupting Services
- Modifying Kernel Parameters Using a Privileged Container
- Using Init Containers to Initialize an Application
- Setting Time Zone Synchronization
- Configuration Suggestions on Container Network Bandwidth Limit
- Configuring the /etc/hosts File of a Pod Using hostAliases
- Configuring Domain Name Resolution for CCE Containers
- Using Dual-Architecture Images (x86 and Arm) in CCE
- Locating Container Faults Using the Core Dump File
- Configuring Parameters to Delay the Pod Startup in a CCE Turbo Cluster
- Automatically Updating a Workload Version Using SWR Triggers
- Permission
- Release
- Batch Computing
- SDK Reference
-
FAQs
- Common FAQ
-
Billing
- How Is CCE Billed?
- How Do I Change the Billing Mode of a CCE Cluster from Pay-per-Use to Yearly/Monthly?
- Can I Change the Billing Mode of CCE Nodes from Pay-per-Use to Yearly/Monthly?
- Which Invoice Modes Are Supported by Huawei Cloud?
- Will I Be Notified When My Balance Is Insufficient?
- Will I Be Notified When My Account Balance Changes?
- Can I Delete a Yearly/Monthly-Billed CCE Cluster Directly When It Expires?
- How Do I Unsubscribe From CCE?
-
Cluster
- Cluster Creation
-
Cluster Running
- How Do I Locate the Fault When a Cluster Is Unavailable?
- How Do I Reset or Reinstall a CCE Cluster?
- How Do I Check Whether a Cluster Is in Multi-Master Mode?
- Can I Directly Connect to the Master Node of a CCE Cluster?
- How Do I Retrieve Data After a CCE Cluster Is Deleted?
- Why Does CCE Display Node Disk Usage Inconsistently with Cloud Eye?
- How Do I Change the Name of a CCE Cluster?
- Cluster Deletion
- Cluster Upgrade
-
Node
- Node Creation
-
Node Running
- What Should I Do If a Cluster Is Available But Some Nodes Are Unavailable?
- How Do I Troubleshoot the Failure to Remotely Log In to a Node in a CCE Cluster?
- How Do I Log In to a Node Using a Password and Reset the Password?
- How Do I Collect Logs of Nodes in a CCE Cluster?
- What Can I Do If the Container Network Becomes Unavailable After yum update Is Used to Upgrade the OS?
- What Should I Do If the vdb Disk of a Node Is Damaged and the Node Cannot Be Recovered After Reset?
- Which Ports Are Used to Install kubelet on CCE Cluster Nodes?
- How Do I Configure a Pod to Use the Acceleration Capability of a GPU Node?
- What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
- What Should I Do If Excessive Docker Audit Logs Affect the Disk I/O?
- How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?
- Where Can I Get the Listening Ports of CCE Worker Nodes?
- How Do I Rectify Failures When the NVIDIA Driver Is Used to Start Containers on GPU Nodes?
- What Can I Do If the Time of CCE Nodes Is Not Synchronized with the NTP Server?
- What Should I Do If the Data Disk Usage Is High Because a Large Volume of Data Is Written Into the Log File?
- Why Does My Node Memory Usage Obtained by Running the kubelet top node Command Exceed 100%?
- What Should I Do If "Failed to reclaim image" Is Displayed in the Node Events?
- Specification Change
-
OSs
- What Can I Do If cgroup kmem Leakage Occasionally Occurs When an Application Is Repeatedly Created or Deleted on a Node Running CentOS with an Earlier Kernel Version?
- What Should I Do If There Is a Service Access Failure After a Backend Service Upgrade or a 1-Second Latency When a Service Accesses a CCE Cluster?
- Why Are Pods Evicted by kubelet Due to Abnormal cgroup Statistics?
- When Container OOM Occurs on the CentOS Node with an Earlier Kernel Version, the Ext4 File System Is Occasionally Suspended
- What Should I Do If a DNS Resolution Failure Occurs Due to a Defect in IPVS?
- What Should I Do If the Number of ARP Entries Exceeds the Upper Limit?
- What Should I Do If a VM Is Suspended Due to an EulerOS 2.9 Kernel Defect?
-
Node Pool
- What Should I Do If a Node Pool Is Abnormal?
- What Should I Do If No Node Creation Record Is Displayed When the Node Pool Is Being Expanding?
- What Should I Do If a Node Pool Scale-Out Fails?
- What Should I Do If Some Kubernetes Events Fail to Display After Nodes Were Added to or Deleted from a Node Pool in Batches?
- How Do I Modify ECS Configurations When an ECS Cannot Be Managed by a Node Pool?
-
Workload
-
Workload Exception Troubleshooting
- How Can I Find the Fault for an Abnormal Workload?
- What Should I Do If Pod Scheduling Fails?
- What Should I Do If a Pod Fails to Pull the Image?
- What Should I Do If Container Startup Fails?
- What Should I Do If a Pod Fails to Be Evicted?
- What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?
- What Should I Do If a Workload Remains in the Creating State?
- What Should I Do If a Pod Remains in the Terminating State?
- What Should I Do If a Workload Is Stopped Caused by Pod Deletion?
- What Should I Do If an Error Occurs When I Deploy a Service on the GPU Node?
- What Should I Do If a Workload Exception Occurs Due to a Storage Volume Mount Failure?
- Why Does Pod Fail to Write Data?
- Why Is Pod Creation or Deletion Suspended on a Node Where File Storage Is Mounted?
- How Can I Locate Faults Using an Exit Code?
- Container Configuration
- Monitoring Log
-
Scheduling Policies
- How Do I Evenly Distribute Multiple Pods to Each Node?
- How Do I Prevent a Container on a Node from Being Evicted?
- Why Are Pods Not Evenly Distributed on Nodes?
- How Do I Evict All Pods on a Node?
- How Do I Check Whether a Pod Is Bound with vCPUs?
- What Should I Do If Pods Cannot Be Rescheduled After the Node Is Stopped?
- How Do I Prevent a Non-GPU or Non-NPU Workload from Being Scheduled to a GPU or NPU Node?
- Why Cannot a Pod Be Scheduled to a Node?
-
Others
- What Should I Do If a Cron Job Cannot Be Restarted After Being Stopped for a Period of Time?
- What Is a Headless Service When I Create a StatefulSet?
- What Should I Do If Error Message "Auth is empty" Is Displayed When a Private Image Is Pulled?
- What Is the Image Pull Policy for Containers in a CCE Cluster?
- Why Is the Mount Point of a Docker Container in the Kunpeng Cluster Uninstalled?
- What Can I Do If a Layer Is Missing During Image Pull?
- Why the File Permission and User in the Container Are Question Marks?
-
Workload Exception Troubleshooting
-
Networking
-
Network Exception Troubleshooting
- How Do I Locate a Workload Networking Fault?
- Why the ELB Address Cannot Be used to Access Workloads in a Cluster?
- Why the Ingress Cannot Be Accessed Outside the Cluster?
- Why Does the Browser Return Error Code 404 When I Access a Deployed Application?
- What Should I Do If a Container Fails to Access the Internet?
- What Can I Do If a VPC Subnet Cannot Be Deleted?
- How Do I Restore a Faulty Container NIC?
- What Should I Do If a Node Fails to Connect to the Internet (Public Network)?
- How Do I Resolve a Conflict Between the VPC CIDR Block and the Container CIDR Block?
- What Should I Do If the Java Error "Connection reset by peer" Is Reported During Layer-4 ELB Health Check
- How Do I Locate the Service Event Indicating That No Node Is Available for Binding?
- Why Does "Dead loop on virtual device gw_11cbf51a, fix it urgently" Intermittently Occur When I Log In to a VM using VNC?
- Why Does a Panic Occasionally Occur When I Use Network Policies on a Cluster Node?
- Why Are Lots of source ip_type Logs Generated on the VNC?
- What Should I Do If Status Code 308 Is Displayed When the Nginx Ingress Controller Is Accessed Using the Internet Explorer?
- What Should I Do If Nginx Ingress Access in the Cluster Is Abnormal After the NGINX Ingress Controller Add-on Is Upgraded?
- What Should I Do If An Error Occurred During a LoadBalancer Update?
- What Could Cause Access Exceptions After Configuring an HTTPS Certificate for a LoadBalancer Ingress?
-
Network Planning
- What Is the Relationship Between Clusters, VPCs, and Subnets?
- How Do I View the VPC CIDR Block?
- How Do I Set the VPC CIDR Block and Subnet CIDR Block for a CCE Cluster?
- How Do I Set a Container CIDR Block for a CCE Cluster?
- When Should I Use Cloud Native Network 2.0?
- What Is an ENI?
- How Can I Configure a Security Group Rule in a Cluster?
- How Do I Configure the IPv6 Service CIDR Block When Creating a CCE Turbo Cluster?
- Can Multiple NICs Be Bound to a Node in a CCE Cluster?
- Security Hardening
-
Network Configuration
- How Does CCE Communicate with Other Huawei Cloud Services over an Intranet?
- How Do I Set the Port When Configuring the Workload Access Mode on CCE?
- How Can I Achieve Compatibility Between Ingress's property and Kubernetes client-go?
- How Do I Obtain the Actual Source IP Address of a Client After a Service Is Added into Istio?
- Why Cannot an Ingress Be Created After the Namespace Is Changed?
- Why Is the Backend Server Group of an ELB Automatically Deleted After a Service Is Published to the ELB?
- How Can Container IP Addresses Survive a Container Restart?
- How Can I Check Whether an ENI Is Used by a Cluster?
- How Can I Delete a Security Group Rule Associated with a Deleted Subnet?
- How Can I Synchronize Certificates When Multiple Ingresses in Different Namespaces Share a Listener?
- How Can I Determine Which Ingress the Listener Settings Have Been Applied To?
-
Network Exception Troubleshooting
-
Storage
- How Do I Expand the Storage Capacity of a Container?
- What Are the Differences Among CCE Storage Classes in Terms of Persistent Storage and Multi-Node Mounting?
- Can I Create a CCE Node Without Adding a Data Disk to the Node?
- Can EVS Volumes in a CCE Cluster Be Restored After They Are Deleted or Expired?
- What Should I Do If the Host Cannot Be Found When Files Need to Be Uploaded to OBS During the Access to the CCE Service from a Public Network?
- How Can I Achieve Compatibility Between ExtendPathMode and Kubernetes client-go?
- What Can I Do If a Storage Volume Fails to Be Created?
- Can CCE PVCs Detect Underlying Storage Faults?
- An Error Is Reported When the Owner Group and Permissions of the Mount Point of the SFS 3.0 File System in the OS Are Modified
- Why Cannot I Delete a PV or PVC Using the kubectl delete Command?
- What Should I Do If "target is busy" Is Displayed When a Pod with Cloud Storage Mounted Is Being Deleted?
- What Should I Do If a Yearly/Monthly EVS Disk Cannot Be Automatically Created?
- Namespace
-
Chart and Add-on
- What Should I Do If the nginx-ingress Add-on Fails to Be Installed on a Cluster and Remains in the Creating State?
- What Should I Do If Residual Process Resources Exist Due to an Earlier npd Add-on Version?
- What Should I Do If a Chart Release Cannot Be Deleted Because the Chart Format Is Incorrect?
- Does CCE Support nginx-ingress?
- What Should I Do If Installation of an Add-on Fails and "The release name is already exist" Is Displayed?
- What Should I Do If a Release Creation or Upgrade Fails and "rendered manifests contain a resource that already exists" Is Displayed?
- What Can I Do If the kube-prometheus-stack Add-on Instance Fails to Be Scheduled?
- What Can I Do If a Chart Fails to Be Uploaded?
- How Do I Configure the Add-on Resource Quotas Based on Cluster Scale?
- How Can I Clean Up Residual Resources After the NGINX Ingress Controller Add-on in the Unknown State Is Deleted?
- Why TLS v1.0 and v1.1 Cannot Be Used After the NGINX Ingress Controller Add-on Is Upgraded?
-
API & kubectl FAQs
- How Can I Access a Cluster API Server?
- Can the Resources Created Using APIs or kubectl Be Displayed on the CCE Console?
- How Do I Download kubeconfig for Connecting to a Cluster Using kubectl?
- How Do I Rectify the Error Reported When Running the kubectl top node Command?
- Why Is "Error from server (Forbidden)" Displayed When I Use kubectl?
-
DNS FAQs
- What Should I Do If Domain Name Resolution Fails in a CCE Cluster?
- Why Does a Container in a CCE Cluster Fail to Perform DNS Resolution?
- Why Cannot the Domain Name of the Tenant Zone Be Resolved After the Subnet DNS Configuration Is Modified?
- How Do I Optimize the Configuration If the External Domain Name Resolution Is Slow or Times Out?
- How Do I Configure a DNS Policy for a Container?
- Image Repository FAQs
- Permissions
- Related Services
- Videos
-
More Documents
-
User Guide (ME-Abu Dhabi Region)
- Service Overview
- Getting Started
- High-Risk Operations and Solutions
-
Clusters
- Cluster Overview
- Buying a Cluster
- Connecting to a Cluster
-
Upgrading a Cluster
- Upgrade Overview
- Before You Start
- Performing an In-place Upgrade
- Performing Post-Upgrade Verification
- Migrating Services Across Clusters of Different Versions
-
Troubleshooting for Pre-upgrade Check Exceptions
- Pre-upgrade Check
- Node Restrictions
- Upgrade Management
- Add-ons
- Helm Charts
- SSH Connectivity of Master Nodes
- Node Pools
- Security Groups
- Arm Node Restrictions
- To-Be-Migrated Nodes
- Discarded Kubernetes Resources
- Compatibility Risks
- Node CCE Agent Versions
- Node CPU Usage
- CRDs
- Node Disks
- Node DNS
- Node Key Directory File Permissions
- Kubelet
- Node Memory
- Node Clock Synchronization Server
- Node OS
- Node CPUs
- Node Python Commands
- ASM Version
- Node Readiness
- Node journald
- containerd.sock
- Internal Errors
- Node Mount Points
- Kubernetes Node Taints
- Everest Restrictions
- cce-hpa-controller Restrictions
- Enhanced CPU Policies
- Health of Worker Node Components
- Health of Master Node Components
- Memory Resource Limit of Kubernetes Components
- Discarded Kubernetes APIs
- IPv6 Capabilities of a CCE Turbo Cluster
- Node NetworkManager
- Node ID File
- Node Configuration Consistency
- Node Configuration File
- CoreDNS Configuration Consistency
- sudo Commands of a Node
- Key Commands of Nodes
- Mounting of a Sock File on a Node
- HTTPS Load Balancer Certificate Consistency
- Node Mounting
- Login Permissions of User paas on a Node
- Private IPv4 Addresses of Load Balancers
- Historical Upgrade Records
- CIDR Block of the Cluster Management Plane
- GPU Add-on
- Nodes' System Parameter Settings
- Residual Package Versions
- Node Commands
- Node Swap
- nginx-ingress Upgrade
- Managing a Cluster
- Nodes
- Node Pools
-
Workloads
- Overview
- Creating a Workload
-
Configuring a Container
- Configuring Time Zone Synchronization
- Configuring an Image Pull Policy
- Using Third-Party Images
- Configuring Container Specifications
- Configuring Container Lifecycle Parameters
- Configuring Container Health Check
- Configuring Environment Variables
- Configuring APM Settings for Performance Bottleneck Analysis
- Workload Upgrade Policies
- Scheduling Policies (Affinity/Anti-affinity)
- Taints and Tolerations
- Labels and Annotations
- Accessing a Container
- Managing Workloads and Jobs
- Kata Runtime and Common Runtime
- Scheduling
-
Network
- Overview
- Container Network Models
-
Service
- Overview
- ClusterIP
- NodePort
-
LoadBalancer
- Creating a LoadBalancer Service
- Using Annotations to Configure Load Balancing
- Service Using HTTP or HTTPS
- Configuring Health Check for Multiple Ports
- Setting the Pod Ready Status Through the ELB Health Check
- Configuring Timeout for a LoadBalancer Service
- Enabling Passthrough Networking for LoadBalancer Services
- Enabling ICMP Security Group Rules
- DNAT
- Headless Service
-
Ingresses
- Overview
-
ELB Ingresses
- Creating an ELB Ingress on the Console
- Using kubectl to Create an ELB Ingress
- Configuring ELB Ingresses Using Annotations
- Configuring HTTPS Certificates for ELB Ingresses
- Configuring the Server Name Indication (SNI) for ELB Ingresses
- ELB Ingresses Routing to Multiple Services
- ELB Ingresses Using HTTP/2
- Interconnecting ELB Ingresses with HTTPS Backend Services
- Configuring Timeout for an ELB Ingress
-
Nginx Ingresses
- Creating Nginx Ingresses on the Console
- Using kubectl to Create an Nginx Ingress
- Configuring Nginx Ingresses Using Annotations
- Configuring HTTPS Certificates for Nginx Ingresses
- Configuring URL Rewriting Rules for Nginx Ingresses
- Interconnecting Nginx Ingresses with HTTPS Backend Services
- Nginx Ingresses Using Consistent Hashing for Load Balancing
- DNS
- Container Network Settings
- Cluster Network Settings
- Configuring Intra-VPC Access
- Accessing Public Networks from a Container
- Storage
- Observability
- Namespaces
- ConfigMaps and Secrets
- Auto Scaling
-
Add-ons
- Overview
- CoreDNS
- CCE Container Storage (Everest)
- CCE Node Problem Detector
- Kubernetes Dashboard
- CCE Cluster Autoscaler
- Nginx Ingress Controller
- Kubernetes Metrics Server
- CCE Advanced HPA
- CCE AI Suite (NVIDIA GPU)
- CCE AI Suite (Ascend NPU)
- Volcano Scheduler
- CCE Secrets Manager for DEW
- CCE Network Metrics Exporter
- NodeLocal DNSCache
- Prometheus
- Helm Chart
- Permissions
-
Best Practices
- Checklist for Deploying Containerized Applications in the Cloud
- Containerization
- Disaster Recovery
- Security
- Auto Scaling
- Monitoring
- Cluster
- Networking
- Storage
- Container
- Permission
- Release
-
FAQs
- Common Questions
- Cluster
-
Node
- Node Creation
-
Node Running
- What Should I Do If a Cluster Is Available But Some Nodes Are Unavailable?
- How Do I Log In to a Node Using a Password and Reset the Password?
- How Do I Collect Logs of Nodes in a CCE Cluster?
- What Should I Do If the vdb Disk of a Node Is Damaged and the Node Cannot Be Recovered After Reset?
- What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
- How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?
- How Do I Rectify Failures When the NVIDIA Driver Is Used to Start Containers on GPU Nodes?
- Specification Change
- Node Pool
-
Workload
-
Workload Abnormalities
- How Do I Use Events to Fix Abnormal Workloads?
- What Should I Do If Pod Scheduling Fails?
- What Should I Do If a Pod Fails to Pull the Image?
- What Should I Do If Container Startup Fails?
- What Should I Do If a Pod Fails to Be Evicted?
- What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?
- What Should I Do If a Workload Remains in the Creating State?
- What Should I Do If Pods in the Terminating State Cannot Be Deleted?
- What Should I Do If a Workload Is Stopped Caused by Pod Deletion?
- What Should I Do If an Error Occurs When Deploying a Service on the GPU Node?
-
Container Configuration
- When Is Pre-stop Processing Used?
- How Do I Set an FQDN for Accessing a Specified Container in the Same Namespace?
- What Should I Do If Health Check Probes Occasionally Fail?
- How Do I Set the umask Value for a Container?
- What Can I Do If an Error Is Reported When a Deployed Container Is Started After the JVM Startup Heap Memory Parameter Is Specified for ENTRYPOINT in Dockerfile?
- What Is the Retry Mechanism When CCE Fails to Start a Pod?
- Scheduling Policies
-
Others
- What Should I Do If a Scheduled Task Cannot Be Restarted After Being Stopped for a Period of Time?
- What Is a Headless Service When I Create a StatefulSet?
- What Should I Do If Error Message "Auth is empty" Is Displayed When a Private Image Is Pulled?
- Why Cannot a Pod Be Scheduled to a Node?
- What Is the Image Pull Policy for Containers in a CCE Cluster?
- What Can I Do If a Layer Is Missing During Image Pull?
-
Workload Abnormalities
- Networking
-
Storage
- What Are the Differences Among CCE Storage Classes in Terms of Persistent Storage and Multi-node Mounting?
- Can I Add a Node Without a Data Disk?
- What Should I Do If the Host Cannot Be Found When Files Need to Be Uploaded to OBS During the Access to the CCE Service from a Public Network?
- How Can I Achieve Compatibility Between ExtendPathMode and Kubernetes client-go?
- Can CCE PVCs Detect Underlying Storage Faults?
- Namespace
- Chart and Add-on
-
API & kubectl FAQs
- How Can I Access a CCE Cluster?
- Can the Resources Created Using APIs or kubectl Be Displayed on the CCE Console?
- How Do I Download kubeconfig for Connecting to a Cluster Using kubectl?
- How Do I Rectify the Error Reported When Running the kubectl top node Command?
- Why Is "Error from server (Forbidden)" Displayed When I Use kubectl?
- DNS FAQs
- Image Repository FAQs
- Permissions
- Reference
-
API Reference (ME-Abu Dhabi Region)
- Before You Start
- API Overview
- Calling APIs
-
APIs
- API URL
-
Cluster Management
- Creating a Cluster
- Reading a Specified Cluster
- Listing Clusters in a Specified Project
- Updating a Specified Cluster
- Deleting a Cluster
- Hibernating a Cluster
- Waking Up a Cluster
- Obtaining a Cluster Certificate
- Querying a Job
- Binding/Unbinding Public API Server Address
- Obtaining Cluster Access Address
- Node Management
- Node Pool Management
- Storage Management
- Add-on Management
- Quota Management
- API Versions
- Kubernetes APIs
- Permissions Policies and Supported Actions
-
Appendix
- Status Code
- Error Codes
- Obtaining a Project ID
- Obtaining the Account ID
- Specifying Add-ons to Be Installed During Cluster Creation
- How to Obtain Parameters in the API URI
- Creating a VPC and Subnet
- Creating a Key Pair
- Node Flavor Description
- Adding a Salt in the password Field When Creating a Node
- Maximum Number of Pods That Can Be Created on a Node
- Node OS
- Data Disk Space Allocation
- Attaching Disks to a Node
- Change History
-
User Guide (Paris Regions)
- Service Overview
-
Product Bulletin
- Risky Operations on Cluster Nodes
- CCE Security Guide
- Cluster Node OS Patch Notes
-
Vulnerability Notice
- Notice on the Kubernetes Security Vulnerability (CVE-2022-3172)
- Privilege Escalation Vulnerability in Linux openvswitch Kernel Module (CVE-2022-2639)
- Notice on CRI-O Container Runtime Engine Arbitrary Code Execution Vulnerability (CVE-2022-0811)
- Notice on the Container Escape Vulnerability Caused by the Linux Kernel (CVE-2022-0492)
- Linux Kernel Integer Overflow Vulnerability (CVE-2022-0185)
- Kubernetes Basics
- Getting Started
- High-Risk Operations and Solutions
-
Clusters
- Cluster Overview
- Creating a Cluster
- Connecting to a Cluster
-
Upgrading a Cluster
- Upgrade Overview
- Before You Start
- Performing In-place Upgrade
- Performing Post-Upgrade Verification
- Migrating Services Across Clusters of Different Versions
-
Troubleshooting for Pre-upgrade Check Exceptions
- Pre-upgrade Check
- Node Restrictions
- Upgrade Management
- Add-ons
- Helm Charts
- SSH Connectivity of Master Nodes
- Node Pools
- Security Groups
- Arm Node Restrictions
- To-Be-Migrated Nodes
- Discarded Kubernetes Resources
- Compatibility Risks
- Node CCE Agent Versions
- Node CPU Usage
- CRDs
- Node Disks
- Node DNS
- Node Key Directory File Permissions
- Kubelet
- Node Memory
- Node Clock Synchronization Server
- Node OS
- Node CPUs
- Node Python Commands
- ASM Version
- Node Readiness
- Node journald
- containerd.sock
- Internal Errors
- Node Mount Points
- Kubernetes Node Taints
- everest Restrictions
- cce-hpa-controller Restrictions
- Enhanced CPU Policies
- Health of Worker Node Components
- Health of Master Node Components
- Memory Resource Limit of Kubernetes Components
- Discarded Kubernetes APIs
- IPv6 Capabilities of a CCE Turbo Cluster
- Node NetworkManager
- Node ID File
- Node Configuration Consistency
- Node Configuration File
- CoreDNS Configuration Consistency
- sudo Commands of a Node
- Key Commands of Nodes
- Mounting of a Sock File on a Node
- HTTPS Load Balancer Certificate Consistency
- Node Mounting
- Login Permissions of User paas on a Node
- Private IPv4 Addresses of Load Balancers
- Historical Upgrade Records
- CIDR Block of the Cluster Management Plane
- GPU Add-on
- Nodes' System Parameter Settings
- Residual Package Versions
- Node Commands
- Node Swap
- nginx-ingress Upgrade
- Managing a Cluster
- Nodes
- Node Pools
-
Workloads
- Overview
- Creating a Workload
-
Configuring a Container
- Configuring Time Zone Synchronization
- Configuring an Image Pull Policy
- Using Third-Party Images
- Setting Container Specifications
- Setting Container Lifecycle Parameters
- Setting Health Check for a Container
- Setting an Environment Variable
- Configuring the Workload Upgrade Policy
- Scheduling Policy (Affinity/Anti-affinity)
- Taints and Tolerations
- Labels and Annotations
- Accessing a Container
- Managing Workloads and Jobs
- Scheduling
-
Network
- Overview
- Container Network Models
- Service
-
Ingresses
- Overview
-
ELB Ingresses
- Creating an ELB Ingress on the Console
- Using kubectl to Create an ELB Ingress
- Configuring ELB Ingresses Using Annotations
- Configuring HTTPS Certificates for ELB Ingresses
- Configuring the Server Name Indication (SNI) for ELB Ingresses
- ELB Ingresses Routing to Multiple Services
- ELB Ingresses Using HTTP/2
- Interconnecting ELB Ingresses with HTTPS Backend Services
- Configuring Timeout for an ELB Ingress
-
Nginx Ingresses
- Creating Nginx Ingresses on the Console
- Using kubectl to Create an Nginx Ingress
- Configuring HTTPS Certificates for Nginx Ingresses
- Configuring URL Rewriting Rules for Nginx Ingresses
- Interconnecting Nginx Ingresses with HTTPS Backend Services
- Nginx Ingresses Using Consistent Hashing for Load Balancing
- Configuring Nginx Ingresses Using Annotations
- DNS
- Container Network Settings
- Cluster Network Settings
- Configuring Intra-VPC Access
- Accessing Public Networks from a Container
- Storage
- Observability
- Namespaces
- ConfigMaps and Secrets
- Auto Scaling
- Add-ons
- Helm Chart
- Permissions
-
FAQs
- Common Questions
- Billing
- Cluster
-
Node
- Node Creation
-
Node Running
- What Should I Do If a Cluster Is Available But Some Nodes Are Unavailable?
- How Do I Log In to a Node Using a Password and Reset the Password?
- How Do I Collect Logs of Nodes in a CCE Cluster?
- What Should I Do If the vdb Disk of a Node Is Damaged and the Node Cannot Be Recovered After Reset?
- What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
- How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?
- How Do I Rectify Failures When the NVIDIA Driver Is Used to Start Containers on GPU Nodes?
- Specification Change
- Node Pool
-
Workload
-
Workload Abnormalities
- How Do I Use Events to Fix Abnormal Workloads?
- What Should I Do If Pod Scheduling Fails?
- What Should I Do If a Pod Fails to Pull the Image?
- What Should I Do If Container Startup Fails?
- What Should I Do If a Pod Fails to Be Evicted?
- What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?
- What Should I Do If a Workload Remains in the Creating State?
- What Should I Do If Pods in the Terminating State Cannot Be Deleted?
- What Should I Do If a Workload Is Stopped Caused by Pod Deletion?
- What Should I Do If an Error Occurs When Deploying a Service on the GPU Node?
- What Should I Do If Sandbox-Related Errors Are Reported When the Pod Remains in the Creating State?
-
Container Configuration
- When Is Pre-stop Processing Used?
- How Do I Set an FQDN for Accessing a Specified Container in the Same Namespace?
- What Should I Do If Health Check Probes Occasionally Fail?
- How Do I Set the umask Value for a Container?
- What Can I Do If an Error Is Reported When a Deployed Container Is Started After the JVM Startup Heap Memory Parameter Is Specified for ENTRYPOINT in Dockerfile?
- What Is the Retry Mechanism When CCE Fails to Start a Pod?
- Scheduling Policies
-
Others
- What Should I Do If a Scheduled Task Cannot Be Restarted After Being Stopped for a Period of Time?
- What Is a Headless Service When I Create a StatefulSet?
- What Should I Do If Error Message "Auth is empty" Is Displayed When a Private Image Is Pulled?
- Why Cannot a Pod Be Scheduled to a Node?
- What Is the Image Pull Policy for Containers in a CCE Cluster?
- What Can I Do If a Layer Is Missing During Image Pull?
-
Workload Abnormalities
-
Networking
- Network Planning
-
Network Fault
- How Do I Locate a Workload Networking Fault?
- Why Does the Browser Return Error Code 404 When I Access a Deployed Application?
- What Should I Do If a Container Fails to Access the Internet?
- What Should I Do If a Node Fails to Connect to the Internet (Public Network)?
- What Should I Do If an Nginx Ingress Access in the Cluster Is Abnormal After the Add-on Is Upgraded?
-
Storage
- What Are the Differences Among CCE Storage Classes in Terms of Persistent Storage and Multi-node Mounting?
- Can I Add a Node Without a Data Disk?
- What Should I Do If the Host Cannot Be Found When Files Need to Be Uploaded to OBS During the Access to the CCE Service from a Public Network?
- How Can I Achieve Compatibility Between ExtendPathMode and Kubernetes client-go?
- Can CCE PVCs Detect Underlying Storage Faults?
- Namespace
- Chart and Add-on
-
API & kubectl FAQs
- How Can I Access a Cluster API Server?
- Can the Resources Created Using APIs or kubectl Be Displayed on the CCE Console?
- How Do I Download kubeconfig for Connecting to a Cluster Using kubectl?
- How Do I Rectify the Error Reported When Running the kubectl top node Command?
- Why Is "Error from server (Forbidden)" Displayed When I Use kubectl?
- DNS FAQs
- Image Repository FAQs
- Permissions
- Reference
-
Best Practices
- Checklist for Deploying Containerized Applications in the Cloud
- Containerization
- Disaster Recovery
- Security
- Auto Scaling
- Monitoring
- Cluster
- Networking
-
Storage
- Expanding the Storage Space
- Mounting an Object Storage Bucket of a Third-Party Tenant
- Dynamically Creating and Mounting Subdirectories of an SFS Turbo File System
- How Do I Change the Storage Class Used by a Cluster of v1.15 from FlexVolume to CSI Everest?
- Custom Storage Classes
- Enabling Automatic Topology for EVS Disks When Nodes Are Deployed in Different AZs (csi-disk-topology)
- Container
- Permission
- Release
- Migrating Data from CCE 1.0 to CCE 2.0
-
API Reference (Paris Regions)
- Before You Start
- API Overview
- Calling APIs
-
APIs
- API URL
-
Cluster Management
- Creating a Cluster
- Reading a Specified Cluster
- Listing Clusters in a Specified Project
- Updating a Specified Cluster
- Deleting a Cluster
- Hibernating a Cluster
- Waking Up a Cluster
- Obtaining a Cluster Certificate
- Querying a Job
- Binding/Unbinding Public API Server Address
- Obtaining Cluster Access Address
- Node Management
- Node Pool Management
- Add-on Management
- Quota Management
- API Versions
- Kubernetes APIs
- Permissions Policies and Supported Actions
-
Appendix
- Status Code
- Error Codes
- Obtaining a Project ID
- Obtaining the Account ID
- Specifying Add-ons to Be Installed During Cluster Creation
- How to Obtain Parameters in the API URI
- Creating a VPC and Subnet
- Creating a Key Pair
- Node Flavor Description
- Adding a Salt in the password Field When Creating a Node
- Maximum Number of Pods That Can Be Created on a Node
- Node OS
- Data Disk Space Allocation
- Attaching Disks to a Node
- Change History
-
User Guide (Kuala Lumpur Region)
- Service Overview
- CCE Console Upgrade
- Getting Started
- High-Risk Operations
-
Clusters
- Cluster Overview
- Buying a Cluster
- Connecting to a Cluster
- Managing a Cluster
-
Upgrading a Cluster
- Process and Method of Upgrading a Cluster
- Before You Start
- Performing Post-Upgrade Verification
- Migrating Services Across Clusters of Different Versions
-
Troubleshooting for Pre-upgrade Check Exceptions
- Pre-upgrade Check
- Node Restrictions
- Upgrade Management
- Add-ons
- Helm Charts
- SSH Connectivity of Master Nodes
- Node Pools
- Security Groups
- Arm Node Restrictions
- Residual Nodes
- Discarded Kubernetes Resources
- Compatibility Risks
- CCE Agent Versions
- Node CPU Usage
- CRDs
- Node Disks
- Node DNS
- Node Key Directory File Permissions
- kubelet
- Node Memory
- Node Clock Synchronization Server
- Node OS
- Node CPUs
- Node Python Commands
- ASM Version
- Node Readiness
- Node journald
- containerd.sock
- Internal Errors
- Node Mount Points
- Kubernetes Node Taints
- Everest Restrictions
- cce-hpa-controller Restrictions
- Enhanced CPU Policies
- Health of Worker Node Components
- Health of Master Node Components
- Memory Resource Limit of Kubernetes Components
- Discarded Kubernetes APIs
- NetworkManager
- Node ID File
- Node Configuration Consistency
- Node Configuration File
- CoreDNS Configuration Consistency
- sudo
- Key Node Commands
- Mounting of a Sock File on a Node
- HTTPS Load Balancer Certificate Consistency
- Node Mounting
- Login Permissions of User paas on a Node
- Private IPv4 Addresses of Load Balancers
- Historical Upgrade Records
- CIDR Block of the Cluster Management Plane
- GPU Add-on
- Nodes' System Parameters
- Residual Package Version Data
- Node Commands
- Node Swap
- nginx-ingress Upgrade
- Upgrade of Cloud Native Cluster Monitoring
- containerd Pod Restart Risks
- Key GPU Add-on Parameters
- GPU or NPU Pod Rebuild Risks
- ELB Listener Access Control
- Master Node Flavor
- Subnet Quota of Master Nodes
- Node Runtime
- Node Pool Runtime
- Number of Node Images
- Nodes
- Node Pools
-
Workloads
- Overview
- Creating a Workload
-
Configuring a Workload
- Configuring Time Zone Synchronization
- Configuring an Image Pull Policy
- Using Third-Party Images
- Configuring Container Specifications
- Configuring Container Lifecycle Parameters
- Configuring Container Health Check
- Configuring Environment Variables
- Configuring Workload Upgrade Policies
- Scheduling Policies (Affinity/Anti-affinity)
- Configuring Tolerance Policies
- Configuring Labels and Annotations
- Logging In to a Container
- Managing Workloads
- Managing Custom Resources
- Pod Security
- Scheduling
-
Network
- Overview
- Container Network
-
Service
- Overview
- ClusterIP
- NodePort
-
LoadBalancer
- Creating a LoadBalancer Service
- Using Annotations to Balance Load
- Configuring HTTP/HTTPS for a LoadBalancer Service
- Configuring SNI for a LoadBalancer Service
- Configuring HTTP/2 for a LoadBalancer Service
- Configuring Timeout for a LoadBalancer Service
- Configuring Health Check on Multiple Ports of a LoadBalancer Service
- Configuring Passthrough Networking for a LoadBalancer Service
- Enabling ICMP Security Group Rules
- DNAT
- Headless Services
-
Ingresses
- Overview
-
LoadBalancer Ingresses
- Creating a LoadBalancer Ingress on the Console
- Using kubectl to Create a LoadBalancer Ingress
- Configuring a LoadBalancer Ingress Using Annotations
- Configuring an HTTPS Certificate for a LoadBalancer Ingress
- Configuring SNI for a LoadBalancer Ingress
- Routing a LoadBalancer Ingress to Multiple Services
- Configuring HTTP/2 for a LoadBalancer Ingress
- Configuring HTTPS Backend Services for a LoadBalancer Ingress
- Configuring Timeout for a LoadBalancer Ingress
-
Nginx Ingresses
- Creating Nginx Ingresses on the Console
- Using kubectl to Create an Nginx Ingress
- Configuring Nginx Ingresses Using Annotations
- Configuring an HTTPS Certificate for an Nginx Ingress
- Configuring HTTPS Backend Services for an Nginx Ingress
- Configuring Consistent Hashing for Load Balancing of an Nginx Ingress
- DNS
- Configuring Intra-VPC Access
- Accessing the Internet from a Container
- Storage
- Observability
- Auto Scaling
- Namespaces
- ConfigMaps and Secrets
- Add-ons
- Helm Chart
- Permissions
-
Best Practices
- Checklist for Deploying Containerized Applications in the Cloud
- Containerization
- Disaster Recovery
- Security
- Auto Scaling
- Monitoring
- Cluster
- Networking
- Storage
- Container
- Permission
- Release
-
FAQs
- Common Questions
- Cluster
-
Node
- Node Creation
-
Node Running
- What Should I Do If a Cluster Is Available But Some Nodes Are Unavailable?
- How Do I Log In to a Node Using a Password and Reset the Password?
- How Do I Collect Logs of Nodes in a CCE Cluster?
- What Should I Do If the vdb Disk of a Node Is Damaged and the Node Cannot Be Recovered After Reset?
- What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
- How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?
- How Do I Rectify Failures When the NVIDIA Driver Is Used to Start Containers on GPU Nodes?
- Specification Change
- OSs
- Node Pool
-
Workload
-
Workload Abnormalities
- How Do I Use Events to Fix Abnormal Workloads?
- What Should I Do If Pod Scheduling Fails?
- What Should I Do If a Pod Fails to Pull the Image?
- What Should I Do If Container Startup Fails?
- What Should I Do If a Pod Fails to Be Evicted?
- What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?
- What Should I Do If a Workload Remains in the Creating State?
- What Should I Do If Pods in the Terminating State Cannot Be Deleted?
- What Should I Do If a Workload Is Stopped Caused by Pod Deletion?
- What Should I Do If an Error Occurs When Deploying a Service on the GPU Node?
- Container Configuration
- Scheduling Policies
-
Others
- What Should I Do If a Scheduled Task Cannot Be Restarted After Being Stopped for a Period of Time?
- What Is a Headless Service When I Create a StatefulSet?
- What Should I Do If Error Message "Auth is empty" Is Displayed When a Private Image Is Pulled?
- What Is the Image Pull Policy for Containers in a CCE Cluster?
- What Can I Do If a Layer Is Missing During Image Pull?
-
Workload Abnormalities
-
Networking
- Network Planning
-
Network Fault
- How Do I Locate a Workload Networking Fault?
- Why Does the Browser Return Error Code 404 When I Access a Deployed Application?
- What Should I Do If a Container Fails to Access the Internet?
- What Should I Do If a Node Fails to Connect to the Internet (Public Network)?
- What Should I Do If an Nginx Ingress Access in the Cluster Is Abnormal After the Add-on Is Upgraded?
- Security Hardening
- Network Configuration
-
Storage
- How Do I Expand the Storage Capacity of a Container?
- What Are the Differences Among CCE Storage Classes in Terms of Persistent Storage and Multi-node Mounting?
- Can I Create a CCE Node Without Adding a Data Disk to the Node?
- What Should I Do If the Host Cannot Be Found When Files Need to Be Uploaded to OBS During the Access to the CCE Service from a Public Network?
- How Can I Achieve Compatibility Between ExtendPathMode and Kubernetes client-go?
- Can CCE PVCs Detect Underlying Storage Faults?
- Namespace
-
Chart and Add-on
- What Should I Do If Installation of an Add-on Fails and "The release name is already exist" Is Displayed?
- How Do I Configure the Add-on Resource Quotas Based on Cluster Scale?
- How Can I Clean Up Residual Resources After the NGINX Ingress Controller Add-on in the Unknown State Is Deleted?
- Why TLS v1.0 and v1.1 Cannot Be Used After the NGINX Ingress Controller Add-on Is Upgraded?
-
API & kubectl FAQs
- How Can I Access a Cluster API Server?
- Can the Resources Created Using APIs or kubectl Be Displayed on the CCE Console?
- How Do I Download kubeconfig for Connecting to a Cluster Using kubectl?
- How Do I Rectify the Error Reported When Running the kubectl top node Command?
- Why Is "Error from server (Forbidden)" Displayed When I Use kubectl?
- DNS FAQs
- Image Repository FAQs
- Permissions
-
API Reference (Kuala Lumpur Region)
- Before You Start
- API Overview
- Calling APIs
-
APIs
- API URL
-
Cluster Management
- Creating a Cluster
- Reading a Specified Cluster
- Listing Clusters in a Specified Project
- Updating a Specified Cluster
- Deleting a Cluster
- Hibernating a Cluster
- Waking Up a Cluster
- Obtaining a Cluster Certificate
- Modifying Cluster Specifications
- Querying a Job
- Binding/Unbinding Public API Server Address
- Node Management
- Node Pool Management
- Storage Management
- Add-on Management
- Tag Management
- Configuration Management
-
Chart Management
- Uploading a Chart
- Obtaining a Chart List
- Obtaining a Release List
- Updating a Chart
- Creating a Release
- Deleting a Chart
- Updating a Release
- Obtaining a Chart
- Deleting a Release
- Downloading a Chart
- Obtaining a Release
- Obtaining Chart Values
- Obtaining Historical Records of a Release
- Obtaining the Quota of a User Chart
- Kubernetes APIs
- Permissions Policies and Supported Actions
-
Appendix
- Status Code
- Error Codes
- Obtaining a Project ID
- Obtaining a Domain ID
- Specifying Add-ons to Be Installed During Cluster Creation
- How to Obtain Parameters in the API URI
- Creating a VPC and Subnet
- Creating a Key Pair
- Node Flavor Description
- Adding a Salt in the password Field When Creating a Node
- Maximum Number of Pods That Can Be Created on a Node
- Node OS
- Data Disk Space Allocation
- Attaching Disks to a Node
-
User Guide (Ankara Region)
- Service Overview
- Product Bulletin
- Getting Started
- High-Risk Operations and Solutions
-
Clusters
- Cluster Overview
- Creating a Cluster
- Connecting to a Cluster
-
Upgrading a Cluster
- Upgrade Overview
- Before You Start
- Performing Post-Upgrade Verification
- Migrating Services Across Clusters of Different Versions
-
Troubleshooting for Pre-upgrade Check Exceptions
- Pre-upgrade Check
- Node Restrictions
- Upgrade Management
- Add-ons
- Helm Charts
- SSH Connectivity of Master Nodes
- Node Pools
- Security Groups
- Arm Node Restrictions
- To-Be-Migrated Nodes
- Discarded Kubernetes Resources
- Compatibility Risks
- CCE Agent Versions
- Node CPU Usage
- CRDs
- Node Disks
- Node DNS
- Node Key Directory File Permissions
- Kubelet
- Node Memory
- Node Clock Synchronization Server
- Node OS
- Node CPU Cores
- Node Python Commands
- ASM Version
- Node Readiness
- Node journald
- containerd.sock
- Internal Error
- Node Mount Points
- Kubernetes Node Taints
- Everest Restrictions
- cce-hpa-controller Limitations
- Enhanced CPU Policies
- Health of Worker Node Components
- Health of Master Node Components
- Memory Resource Limit of Kubernetes Components
- Discarded Kubernetes APIs
- Node NetworkManager
- Node ID File
- Node Configuration Consistency
- Node Configuration File
- CoreDNS Configuration Consistency
- sudo Commands of a Node
- Key Commands of Nodes
- Mounting of a Sock File on a Node
- HTTPS Load Balancer Certificate Consistency
- Node Mounting
- Login Permissions of User paas on a Node
- Private IPv4 Addresses of Load Balancers
- Historical Upgrade Records
- CIDR Block of the Cluster Management Plane
- GPU Add-on
- Nodes' System Parameters
- Residual Package Versions
- Node Commands
- Node Swap
- nginx-ingress Upgrade
- Upgrade of Cloud Native Cluster Monitoring
- containerd Pod Restart Risks
- Key GPU Add-on Parameters
- GPU or NPU Pod Rebuild Risks
- ELB Listener Access Control
- Master Node Flavor
- Subnet Quota of Master Nodes
- Node Runtime
- Node Pool Runtime
- Number of Node Images
- Managing a Cluster
- Nodes
- Node Pools
-
Workloads
- Overview
- Creating a Workload
-
Configuring a Container
- Configuring Time Zone Synchronization
- Configuring an Image Pull Policy
- Using Third-Party Images
- Configuring Container Specifications
- Configuring Container Lifecycle Parameters
- Configuring Container Health Check
- Configuring Environment Variables
- Workload Upgrade Policies
- Scheduling Policies (Affinity/Anti-affinity)
- Taints and Tolerations
- Labels and Annotations
- Accessing a Container
- Managing Workloads and Jobs
- Managing Custom Resources
- Scheduling
-
Network
- Overview
- Container Network Models
-
Service
- Overview
- ClusterIP
- NodePort
-
LoadBalancer
- Creating a LoadBalancer Service
- Using Annotations to Balance Load
- Configuring an HTTP or HTTPS Service
- Configuring SNI for a Service
- Configuring HTTP/2 for a Service
- Configuring Timeout for a Service
- Configuring Health Check on Multiple Service Ports
- Enabling Passthrough Networking for LoadBalancer Services
- Enabling ICMP Security Group Rules
- Headless Services
-
Ingresses
- Overview
-
LoadBalancer Ingresses
- Creating a LoadBalancer Ingress on the Console
- Using kubectl to Create a LoadBalancer Ingress
- Configuring a LoadBalancer Ingress Using Annotations
- Configuring an HTTPS Certificate for a LoadBalancer Ingress
- Configuring SNI for a LoadBalancer Ingress
- LoadBalancer Ingresses to Multiple Services
- Configuring HTTP/2 for a LoadBalancer Ingress
- Configuring URL Redirection for a LoadBalancer Ingress
- Configuring URL Rewriting for a LoadBalancer Ingress
- Configuring Timeout for a LoadBalancer Ingress
- Configuring a Custom Header Forwarding Policy for a LoadBalancer Ingress
-
Nginx Ingresses
- Creating Nginx Ingresses on the Console
- Using kubectl to Create an Nginx Ingress
- Configuring Nginx Ingresses Using Annotations
- Configuring HTTPS Certificates for Nginx Ingresses
- Configuring Redirection Rules for an Nginx Ingress
- Configuring URL Rewriting Rules for Nginx Ingresses
- Nginx Ingresses Using Consistent Hashing for Load Balancing
- DNS
- Container Network Settings
- Cluster Network Settings
- Configuring Intra-VPC Access
- Accessing the Internet from a Container
- Storage
- Observability
- Namespaces
- ConfigMaps and Secrets
- Auto Scaling
-
Add-ons
- Overview
- CoreDNS
- CCE Container Storage (Everest)
- CCE Node Problem Detector
- Kubernetes Dashboard
- CCE Cluster Autoscaler
- Nginx Ingress Controller
- Kubernetes Metrics Server
- CCE Advanced HPA
- CCE AI Suite (NVIDIA GPU)
- CCE AI Suite (Ascend NPU)
- Volcano Scheduler
- NodeLocal DNSCache
- Cloud Native Cluster Monitoring
- Cloud Native Logging
- Grafana
- Prometheus
- Helm Chart
- Permissions
-
FAQs
- Common Questions
- Cluster
-
Node
- Node Creation
-
Node Running
- What Should I Do If a Cluster Is Available But Some Nodes Are Unavailable?
- How Do I Log In to a Node Using a Password and Reset the Password?
- How Do I Collect Logs of Nodes in a CCE Cluster?
- What Should I Do If the vdb Disk of a Node Is Damaged and the Node Cannot Be Recovered After Reset?
- What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
- How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?
- How Do I Rectify Failures When the NVIDIA Driver Is Used to Start Containers on GPU Nodes?
- Specification Change
- OSs
- Node Pool
-
Workload
-
Workload Abnormalities
- How Do I Use Events to Fix Abnormal Workloads?
- What Should I Do If Pod Scheduling Fails?
- What Should I Do If a Pod Fails to Pull the Image?
- What Should I Do If Container Startup Fails?
- What Should I Do If a Pod Fails to Be Evicted?
- What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?
- What Should I Do If a Workload Remains in the Creating State?
- What Should I Do If Pods in the Terminating State Cannot Be Deleted?
- What Should I Do If a Workload Is Stopped Caused by Pod Deletion?
- What Should I Do If an Error Occurs When Deploying a Service on the GPU Node?
- Container Configuration
- Scheduling Policies
-
Others
- What Should I Do If a Scheduled Task Cannot Be Restarted After Being Stopped for a Period of Time?
- What Is a Headless Service When I Create a StatefulSet?
- What Should I Do If Error Message "Auth is empty" Is Displayed When a Private Image Is Pulled?
- Why Cannot a Pod Be Scheduled to a Node?
- What Is the Image Pull Policy for Containers in a CCE Cluster?
- What Can I Do If a Layer Is Missing During Image Pull?
-
Workload Abnormalities
-
Networking
- Network Planning
-
Network Fault
- How Do I Locate a Workload Networking Fault?
- Why Does the Browser Return Error Code 404 When I Access a Deployed Application?
- What Should I Do If a Container Fails to Access the Internet?
- What Should I Do If a Node Fails to Connect to the Internet (Public Network)?
- What Should I Do If an Nginx Ingress Access in the Cluster Is Abnormal After the Add-on Is Upgraded?
- Security Hardening
- Others
-
Storage
- What Are the Differences Among CCE Storage Classes in Terms of Persistent Storage and Multi-node Mounting?
- Can I Add a Node Without a Data Disk?
- What Should I Do If the Host Cannot Be Found When Files Need to Be Uploaded to OBS During the Access to the CCE Service from a Public Network?
- How Can I Achieve Compatibility Between ExtendPathMode and Kubernetes client-go?
- Can CCE PVCs Detect Underlying Storage Faults?
- Namespace
-
Chart and Add-on
- Why Does Add-on Installation Fail and Prompt "The release name is already exist"?
- How Do I Configure the Add-on Resource Quotas Based on Cluster Scale?
- What Should I Do If the Helm Chart Uploaded Before the Tenant Account Name Is Changed Is Abnormal?
- How Can I Clean Up Residual Resources After the NGINX Ingress Controller Add-on in the Unknown State Is Deleted?
- Why TLS v1.0 and v1.1 Cannot Be Used After the NGINX Ingress Controller Add-on Is Upgraded?
-
API & kubectl FAQs
- How Can I Access a Cluster API Server?
- Can the Resources Created Using APIs or kubectl Be Displayed on the CCE Console?
- How Do I Download kubeconfig for Connecting to a Cluster Using kubectl?
- How Do I Rectify the Error Reported When Running the kubectl top node Command?
- Why Is "Error from server (Forbidden)" Displayed When I Use kubectl?
- DNS FAQs
- Permissions
- Reference
-
Best Practices
- Checklist for Deploying Containerized Applications in the Cloud
- Containerization
- Disaster Recovery
- Security
- Auto Scaling
- Monitoring
- Cluster
- Networking
- Storage
- Container
- Permission
- Release
-
API Reference (Ankara Region)
- Before You Start
- API Overview
- Calling APIs
-
APIs
- API URL
-
Cluster Management
- Creating a Cluster
- Reading a Specified Cluster
- Listing Clusters in a Specified Project
- Updating a Specified Cluster
- Deleting a Cluster
- Hibernating a Cluster
- Waking Up a Cluster
- Obtaining a Cluster Certificate
- Modifying Cluster Specifications
- Querying a Job
- Binding/Unbinding Public API Server Address
- Obtaining Cluster Access Address
- Node Management
- Node Pool Management
- Storage Management
- Add-on Management
- Quota Management
- API Versions
- Tag Management
- Configuration Management
-
Chart Management
- Uploading a Chart
- Obtaining a Chart List
- Obtaining a Release List
- Updating a Chart
- Creating a Release
- Deleting a Chart
- Updating a Release
- Obtaining a Chart
- Deleting a Release
- Downloading a Chart
- Obtaining a Release
- Obtaining Chart Values
- Obtaining Historical Records of a Release
- Obtaining the Quota of a User Chart
- Kubernetes APIs
- Permissions and Supported Actions
-
Appendix
- Status Code
- Error Codes
- Obtaining a Project ID
- Obtaining an Account ID
- Specifying Add-ons to Be Installed During Cluster Creation
- How to Obtain Parameters in the API URI
- Creating a VPC and Subnet
- Creating a Key Pair
- Node Flavor Description
- Adding a Salt in the password Field When Creating a Node
- Maximum Number of Pods That Can Be Created on a Node
- Node OS
- Data Disk Space Allocation
- Attaching Disks to a Node
-
User Guide (ME-Abu Dhabi Region)
Copiado.
Autoscaler de cluster do CCE
Introdução
O Autoscaler é um controlador importante do Kubernetes. Ele suporta escalabilidade de microsserviços e é fundamental para o design sem servidor.
Quando o uso da CPU ou da memória de um microsserviço é muito alto, o escalonamento automático de pods horizontais é acionado para adicionar pods e reduzir a carga. Esses pods podem ser reduzidos automaticamente quando a carga é baixa, permitindo que o microsserviço seja executado da forma mais eficiente possível.
O CCE simplifica a criação, a atualização e o dimensionamento manual de clusters do Kubernetes, nos quais as cargas de tráfego mudam ao longo do tempo. Para equilibrar o uso de recursos e o desempenho da carga de trabalho dos nós, o Kubernetes introduz o complemento Autoscaler para ajustar automaticamente o número de nós de um cluster com base no uso de recursos necessário para cargas de trabalho implementadas no cluster. Para mais detalhes, consulte Criação de uma política de dimensionamento de nós.
Comunidade de código aberto: https://github.com/kubernetes/autoscaler
Como funciona o complemento
Autoscaler controla expansão e redução automáticas.
- Expansão automática
Você pode escolher um dos seguintes métodos:
- Se os pods em um cluster não puderem ser agendados devido a nós de trabalho insuficientes, o dimensionamento do cluster será acionado para adicionar nós. Os nós a serem adicionados têm a mesma especificação configurada para o pool de nós ao qual os nós pertencem.
Expansão automática será realizada quando:
- Os recursos de nó são insuficientes.
- Nenhuma política de afinidade de nó é definida na configuração de agendamento do pod. Se um nó tiver sido configurado como um nó de afinidade para pods, nenhum nó não será adicionado automaticamente quando os pods não puderem ser agendados. Para obter detalhes sobre como configurar a política de afinidade de nó, consulte Política de agendamento (afinidade/antiafinidade).
- Quando o cluster atende à política de dimensionamento do nó, a expansão de cluster também é acionada. Para mais detalhes, consulte Criação de uma política de dimensionamento de nós.
O complemento segue a política "No Less, No More". Por exemplo, se três núcleos forem necessários para criar um pod e o sistema oferecer suporte a nós de quatro e oito núcleos, o Autoscaler criará preferencialmente um nó de quatro núcleos.
- Se os pods em um cluster não puderem ser agendados devido a nós de trabalho insuficientes, o dimensionamento do cluster será acionado para adicionar nós. Os nós a serem adicionados têm a mesma especificação configurada para o pool de nós ao qual os nós pertencem.
- Redução automática
Quando um nó de cluster fica ocioso por um período (10 minutos por padrão), o dimensionamento de cluster é acionado e o nó é excluído automaticamente. No entanto, um nó não pode ser excluído de um cluster se os seguintes pods existirem:
- Pods que não atendem aos requisitos específicos definidos nos Orçamentos de interrupção de pods (PodDisruptionBudget)
- Pods que não podem ser programados para outros nós devido a restrições, como políticas de afinidade e antiafinidade
- Pods que têm a anotação cluster-autoscaler.kubernetes.io/safe-to-evict: 'false'
- Pods (exceto aqueles criados por DaemonSets no namespace do kube-system) que existem no namespace do kube-system no nó
- Pods que não são criados pelo controlador (Implementação/ReplicaSet/tarefa/StatefulSet)
Quando um nó atende às condições de dimensionamento, o Autoscaler adiciona a mancha DeletionCandidateOfClusterAutoscaler ao nó com antecedência para evitar que pods sejam programados para o nó. Após a desinstalação do complemento Autoscaler, se a mancha ainda existir no nó, exclua-a manualmente.
Restrições
- Verifique se há recursos suficientes para instalar o complemento.
- Somente os nós de VM com pagamento por uso podem ser adicionados ou removidos pelo Autoscaler.
- O pool de nós padrão não oferece suporte ao dimensionamento automático. Para mais detalhes, consulte Descrição de DefaultPool.
- Quando o Autoscaler é usado, algumas manchas ou anotações podem afetar o dimensionamento automático. Portanto, não use as seguintes manchas ou anotações em clusters:
- ignore-taint.cluster-autoscaler.kubernetes.io: a mancha funciona em nós. O Autoscaler do Kubernetes nativo suporta proteção contra expansões anormais e avalia periodicamente a proporção de nós disponíveis no cluster. Quando a proporção de nós não prontos exceder 45%, a proteção será acionada. Nesse caso, todos os nós com a mancha ignore-taint.cluster-autoscaler.kubernetes.io no cluster são filtrados do modelo do Autoscaler e registrados como nós não prontos, que afeta o dimensionamento do cluster.
- cluster-autoscaler.kubernetes.io/enable-ds-eviction: a anotação funciona em pods, o que determina se os pods de DaemonSet podem ser removidos pelo Autoscaler. Para mais detalhes, veja Rótulos, anotações e manchas bem conhecidos.
Instalar o complemento
- Efetue logon no console do CCE e clique no nome do cluster para acessar o console do cluster. Escolha Add-ons no painel de navegação, localize o CCE Cluster Autoscaler à direita e clique em Install.
- Na página Install Add-on, configure as especificações.
Tabela 1 Configuração de especificações Parâmetro
Descrição
Pods
Número de pods para o complemento.
A alta disponibilidade não é possível com um único pod. Se ocorrer um erro no nó em que o pod do complemento é executado, o complemento falhará.
Containers
Ajuste o número dos pods do Autoscaler e suas cotas de CPU e memória com base na escala de cluster. Para mais detalhes, consulte Tabela 2.
- Configure os parâmetros do complemento.
Tabela 3 Parâmetros Parâmetro
Descrição
Total Nodes
Número máximo de nós que podem ser gerenciados pelo cluster, dentro dos quais o dimensionamento do cluster é realizado.
Total CPUs
Soma máxima de núcleos de CPU de todos os nós em um cluster, dentro do qual a escalabilidade do cluster é executada.
Total Memory (GB)
A soma máxima da memória de todos os nós em um cluster, dentro do qual a expansão de cluster é executado.
- Configure as políticas de agendamento para o complemento.
- As políticas de agendamento não entram em vigor em instâncias complementares do tipo DaemonSet.
- Ao configurar a implementação de várias AZs ou a afinidade de nó, verifique se há nós que atendem à política de agendamento e se os recursos são suficientes no cluster. Caso contrário, o complemento não pode ser executado.
Tabela 4 Configurações para programação de complementos Parâmetro
Descrição
Multi-AZ Deployment
- Preferred: os pods de Implementação do complemento serão agendados preferencialmente para nós em diferentes AZs. Se todos os nós no cluster forem implementados na mesma AZ, os pods serão agendados para essa AZ.
- Forcible: os pods de Implementação do complemento serão forçosamente agendados para nós em diferentes AZs. Se houver menos AZs do que pods, os pods extras não funcionarão.
Node Affinity
- Incompatibility: a afinidade de nó está desabilitada para o complemento.
- Node Affinity: especifique os nós em que o complemento é implementado. Se você não especificar os nós, o complemento será agendado aleatoriamente com base na política de agendamento de cluster padrão.
- Specified Node Pool Scheduling: especifique o pool de nós em que o complemento é implementado. Se você não especificar o pool de nós, o complemento será agendado aleatoriamente com base na política de agendamento de cluster padrão.
- Custom Policies: insira os rótulos dos nós em que o complemento será implementado para políticas de agendamento mais flexíveis. Se você não especificar rótulos de nó, o complemento será agendado aleatoriamente com base na política de agendamento de cluster padrão.
Se várias políticas de afinidade personalizadas estiverem configuradas, certifique-se de que existem nós que atendam a todas as políticas de afinidade no cluster. Caso contrário, o complemento não pode ser executado.
Taints and Tolerations
O uso de manchas e tolerâncias permite (não forçosamente) que Implementação do complemento seja agendada para um nó com as manchas correspondentes e controla as políticas de despejo de Implementação depois que o nó onde a Implementação está localizada é contaminado.
O complemento adiciona a política de tolerância padrão para as manchas node.kubernetes.io/not-ready e node.kubernetes.io/unreachable, respectivamente. A janela de tempo de tolerância é 60s.
Para mais detalhes, consulte Manchas e tolerâncias.
- Após a conclusão da configuração, clique em Install.
Componentes
Componente |
Descrição |
Tipo de recurso |
---|---|---|
Autoscaler |
Dimensionamento automático para clusters do Kubernetes |
Implementação |
Período de resfriamento de redução
Os intervalos de resfriamento de redução podem ser configurados nas configurações do pool de nós e nas configurações do complemento Autoscaler.
Intervalo de resfriamento de redução configurado em um pool de nós
Esse intervalo indica o período durante o qual os nós adicionados ao pool de nós atual após uma operação de expansão não podem ser excluídos. Esse intervalo entra em vigor no nível do pool de nós.
Intervalo de resfriamento de redução configurado no complemento Autoscaler
O intervalo após uma expansão indica o período durante o qual todo o cluster não pode ser dimensionado após o complemento Autoscaler disparar a expansão (devido aos pods, métricas e políticas de dimensionamento não programáveis). Esse intervalo entra em vigor no nível do cluster.
O intervalo depois que um nó é excluído indica o período durante o qual o cluster não pode ser dimensionado após o complemento Autoscaler disparar a redução. Esse intervalo entra em vigor em todo o cluster.
O intervalo após uma falha de redução indica o período durante o qual o cluster não pode ser dimensionado após o complemento Autoscaler disparar a redução. Esse intervalo entra em vigor em todo o cluster.