Updated on 2023-12-07 GMT+08:00

Upgrading a Real-Time Service

For a deployed service, you can change the AI application version to upgrade it.

Services can be upgraded in three modes: full upgrade, rolling upgrade (increase instances), and rolling upgrade (decrease instances). For details, see Figure 1.

  • Full upgrade

    All new-version instances are created at once, which requires twice the resources that the service currently uses.

  • Rolling upgrade (increase instances)

    Extra resources will be used for a rolling upgrade. The more instances you add, the faster the upgrade.

  • Rolling upgrade (decrease instances)

    Certain nodes that were reserved to run services will be used for a rolling upgrade. The more instances you use for the upgrade, the faster the upgrade, but the higher the probability of service interruption.

    Figure 1 Upgrade process
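
The resource trade-off between the three modes can be sketched as follows. This is a minimal illustration, not ModelArts code: the names UpgradeMode, peak_instances, and min_serving_instances are hypothetical, and the sketch assumes that every instance uses the same specifications, so the instance count stands in for resource usage.

```python
from enum import Enum


class UpgradeMode(Enum):
    # Hypothetical identifiers for the three modes described above.
    FULL = "full"
    ROLLING_INCREASE = "rolling_increase"
    ROLLING_DECREASE = "rolling_decrease"


def peak_instances(mode: UpgradeMode, current: int, step: int = 1) -> int:
    """Estimate the most instances that exist at once during an upgrade.

    Assumption: all instances use the same specifications.
    """
    if current < 1 or step < 1:
        raise ValueError("current and step must be positive")
    if mode is UpgradeMode.FULL:
        # A full set of new-version instances runs next to the old ones.
        return 2 * current
    if mode is UpgradeMode.ROLLING_INCREASE:
        # Up to `step` extra instances are added on top of the running ones.
        return current + min(step, current)
    # ROLLING_DECREASE: existing capacity is reused, so nothing is added.
    return current


def min_serving_instances(mode: UpgradeMode, current: int, step: int = 1) -> int:
    """Estimate the fewest instances left serving requests (assumed model)."""
    if current < 1 or step < 1:
        raise ValueError("current and step must be positive")
    if mode is UpgradeMode.ROLLING_DECREASE:
        # Instances taken out for the upgrade no longer serve requests.
        return max(current - step, 0)
    return current
```

Under these assumptions, a larger step shortens a rolling upgrade, but in the decrease-instances mode it also lowers the number of instances that keep serving requests, which is why the probability of interruption rises.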

Prerequisites

You can upgrade a service in the Running, Abnormal, Alarm, or Stop status.
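
If you script pre-checks around the console workflow, the status prerequisite can be expressed as a simple membership test. The sketch below is illustrative only: the status strings and the function name can_upgrade are assumptions, and the values returned by a real API may be spelled differently.

```python
# Statuses in which a service can be upgraded, taken from the prerequisite above.
# Assumption: statuses are compared case-insensitively as plain strings.
UPGRADABLE_STATUSES = {"running", "abnormal", "alarm", "stop"}


def can_upgrade(status: str) -> bool:
    """Return True if a service in the given status can be upgraded."""
    return status.strip().lower() in UPGRADABLE_STATUSES
```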

Constraints

  • Improper upgrade operations will interrupt services during the upgrade.
  • ModelArts supports hitless rolling upgrade of real-time services in some scenarios. Prepare for and fully verify the upgrade.

    For details about scenarios that support hitless upgrade of real-time services, see Table 1.

    Table 1 Scenarios for hitless upgrade

    Meta Model Source for Creating an AI Application | Public Resource Pool | Dedicated Resource Pool
    Training job | Not supported | Not supported
    Template | Not supported | Not supported
    Container image | Not supported | Supported (see the conditions below)
    OBS | Not supported | Not supported

    Conditions for container images in a dedicated resource pool: The image must meet custom image specifications for creating AI applications.

    NOTICE:

    Hitless rolling upgrade is not supported if any of the following applies to an AI application version:

    • Health check is not configured.
    • The protocol has been changed. For example, HTTP has been changed to HTTPS.
    • The port of the model has been changed.
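
The rules in Table 1 can be summarized as a single eligibility check. This is a minimal sketch, not a ModelArts API: VersionInfo, its fields, and supports_hitless_upgrade are hypothetical names. It also assumes the conservative reading that both the current and the new version must be container images with a health check configured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionInfo:
    # Hypothetical description of one AI application version.
    source: str  # "training_job", "template", "container_image", or "obs"
    health_check_configured: bool
    protocol: str  # for example "http" or "https"
    port: int


def supports_hitless_upgrade(
    pool_type: str, current: VersionInfo, target: VersionInfo
) -> bool:
    """Return True if Table 1 and its notice allow a hitless rolling upgrade."""
    # Only dedicated resource pools support hitless upgrade.
    if pool_type != "dedicated":
        return False
    for version in (current, target):
        # Only AI applications created from container images qualify.
        if version.source != "container_image":
            return False
        # A missing health check rules out hitless upgrade.
        if not version.health_check_configured:
            return False
    # The protocol and the port must stay the same across versions.
    return current.protocol == target.protocol and current.port == target.port
```

The check does not verify that the image meets the custom image specifications, which still has to be confirmed separately.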

Procedure

  1. Log in to the ModelArts management console. In the navigation pane, choose Service Deployment > Real-Time Services. The list is sorted by Update.
  2. Click the down arrow on the left of the target service name to show all AI application versions, and then click Upgrade in the Operation column of the target version.
  3. On the Upgrade Version page, set parameters. For details, see Table 2.
    There are three upgrade scenarios: upgrade using the same public resource pool, upgrade using the same dedicated resource pool, and upgrade from a dedicated resource pool to another.

    Services cannot be upgraded from a public resource pool to a dedicated resource pool, and vice versa.

    Table 2 Parameters

    Parameter | Description
    Service Name | Name of the real-time service, which cannot be modified.
    Current AI Application | Current AI application version, which cannot be modified.
    Resource Pool | This parameter is available for real-time services deployed in a dedicated resource pool.
    Resource Pool Spec | Select a dedicated resource pool for running the service.
    AI Application and Version | New version of the AI application. Only the version can be selected.
    Specifications | Select available specifications based on the list displayed on the console. The specifications in gray cannot be used in the current environment.
    Environment Variable | Set environment variables and inject them into the container instance. To ensure data security, do not enter sensitive information in environment variables.
    Upgrade Mode | This parameter is available for real-time services deployed in a dedicated resource pool.

    • Full upgrade: Twice the resources required by the service will be used for a one-time full upgrade.
    • Rolling upgrade (increase instances): Extra resources will be used for a rolling upgrade. The more instances you add, the faster the upgrade. You can increase instances by quantity or proportion.
      • By quantity: The rolling upgrade is performed based on the specified number of new instances.
      • By proportion: The rolling upgrade is performed based on the specified proportion of new instances (rounded up).
    • Rolling upgrade (decrease instances): Certain nodes that were reserved to run services will be used for a rolling upgrade. The more instances you use for the upgrade, the faster the upgrade, but the higher the probability of service interruption. You can decrease instances by quantity or proportion.
      • By quantity: The rolling upgrade is performed based on the specified number of instances that were reserved to run services.
      • By proportion: The rolling upgrade is performed based on the specified proportion of instances that were reserved to run services (rounded up).
      NOTE:

      The rolling upgrade (increase instances) and rolling upgrade (decrease instances) modes are supported only when services are upgraded in the same dedicated resource pool.

  4. Click Next. Then, confirm the information and click Submit. If the dedicated resource pool resources are insufficient, a message is displayed in the upper right corner of the page, prompting you to expand the capacity of the dedicated resource pool. If the resources are sufficient, the task is submitted and the real-time service list page is displayed.
  5. View the service upgrade status in the real-time service list. During the service upgrade, the upgrade status and upgrade progress are displayed in the Status column. If the service upgrade fails, the upgrade status and rollback progress are displayed in the Status column. The upgrade failure time is displayed in the lower part of the status bar.
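
The parameter rules in step 3 can be checked before you submit an upgrade. The following is a minimal sketch under stated assumptions, not ModelArts code: Pool, check_upgrade_plan, and batch_size are hypothetical names, the mode strings are illustrative, and the proportion is assumed to be entered as a whole-number percentage.

```python
from dataclasses import dataclass
from typing import Optional

ROLLING_MODES = {"rolling_increase", "rolling_decrease"}


@dataclass(frozen=True)
class Pool:
    kind: str  # "public" or "dedicated"
    pool_id: str


def check_upgrade_plan(current: Pool, target: Pool, mode: str) -> Optional[str]:
    """Return None if the plan follows the rules in step 3, else a reason."""
    if current.kind != target.kind:
        return "cannot move between public and dedicated resource pools"
    same_pool = current.pool_id == target.pool_id
    if current.kind == "public" and not same_pool:
        # Assumption: a public-pool service stays in its public pool.
        return "a public-pool upgrade must use the same public resource pool"
    if mode in ROLLING_MODES and not (current.kind == "dedicated" and same_pool):
        return "rolling upgrades require the same dedicated resource pool"
    return None


def batch_size(
    total: int, quantity: Optional[int] = None, percent: Optional[int] = None
) -> int:
    """Resolve how many instances are upgraded per rolling step."""
    if total < 1:
        raise ValueError("total must be positive")
    if (quantity is None) == (percent is None):
        raise ValueError("specify exactly one of quantity or percent")
    if quantity is not None:
        if not 1 <= quantity <= total:
            raise ValueError("quantity must be between 1 and total")
        return quantity
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Integer ceiling division implements the "rounded up" rule exactly.
    return (total * percent + 99) // 100
```

For example, with 10 instances and a proportion of 25 percent, the sketch resolves to 3 instances per step because 2.5 is rounded up. Integer arithmetic is used instead of floating-point multiplication to avoid rounding surprises.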