Updated on 2025-07-08 GMT+08:00

Creating a Ray Cluster

Ray is a high-performance distributed execution framework. Its architecture and distributed computing abstractions differ from those of traditional distributed computing systems.

Ray clusters are fully managed and dedicated to your exclusive use, so you do not need to manage the underlying resources. They run Ray-based distributed jobs and are fully compatible with open-source Ray, so you can use them without complex script adaptation. Ray clusters also provide a built-in, user-friendly dashboard. Compared with open-source Ray, DataArtsFabric adds security hardening measures to protect user data, such as gRPC channel encryption and authenticated dashboard access.
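
To illustrate what compatibility with open-source Ray means in practice, the following is a minimal sketch of an ordinary open-source Ray script. It uses only the standard ray.init, ray.remote, and ray.get calls. The function names square and run_on_ray are hypothetical, and how a script is submitted to or connected with a DataArtsFabric Ray cluster is not covered in this section, so the connection step is an assumption.

```python
def square(x: int) -> int:
    """Plain Python function; Ray turns it into a remote task below."""
    return x * x


def run_on_ray(values):
    """Run square() on each value as a Ray task and return the results in order."""
    # Imported here so this module can still be loaded where Ray is not installed.
    import ray

    # Assumption: called with no address, ray.init() attaches to the cluster the
    # script runs in (or to the one named by RAY_ADDRESS). If none is found,
    # open-source Ray starts a local instance instead.
    ray.init(ignore_reinit_error=True)

    # Wrap the plain function as a remote task and launch one task per value.
    remote_square = ray.remote(square)
    futures = [remote_square.remote(v) for v in values]

    # ray.get blocks until all tasks finish and preserves the input order.
    return ray.get(futures)
```

With Ray installed and a cluster reachable, calling run_on_ray(range(5)) returns the squares of 0 through 4 as a list.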

Prerequisites

  • You have a valid Huawei Cloud account.
  • You have at least one workspace available.
  • You have purchased the required Ray resources.

Procedure

  1. Log in to the Workspace Management Console.
  2. Select the created workspace and click Access Workspace. In the navigation pane on the left, choose Resources and Assets > Ray Clusters. Click Create Ray Cluster in the upper right corner.
  3. On the displayed page, set Head Specifications and Worker Specifications as required by referring to Table 1. Then, click Create Now.

    Table 1 Parameters for creating a Ray cluster

    | Parameter | Description |
    |---|---|
    | Cluster Name | Name of the Ray cluster to be created. |
    | Ray Image Package Type | Select a public Ray image package. |
    | Ray Image Package | Ray version to use. Select a version as required. Version numbers are the same as those of the Ray community. |
    | Head Specifications | Specifications of the head node of the Ray cluster to be created. The specification list displays all specifications. Select one that is the same as or smaller than the Ray resource you purchased, because a large resource specification can be split into multiple smaller ones. For example, if you purchased the fabric.ray.dpu.d4x resource, you can select fabric.ray.dpu.d1x, fabric.ray.dpu.d2x, or fabric.ray.dpu.d4x for Head Specifications. |
    | Worker Specifications | Specifications of the worker groups of the Ray cluster to be created. Multiple worker groups can be created. Select a specification from the list for the worker nodes, and set the minimum and maximum number of worker nodes. The minimum must be at least 1, and the maximum can be set based on workloads. When the Ray cluster is initialized, the minimum number of worker nodes is created. The number of worker nodes then scales dynamically, up to the maximum, based on workloads. You can also add worker nodes of different specifications. Each worker node specification must also be the same as or smaller than the purchased resource. For example, if the purchased Ray resource is fabric.ray.dpu.d4x and fabric.ray.dpu.d1x is selected for Head Specifications, you can also select fabric.ray.dpu.d1x for Worker Specifications and set the maximum number of worker nodes to 3. |
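
The sizing example in Table 1 can be checked with a short calculation. The following is a minimal sketch, not a DataArtsFabric API. It assumes that the number in a specification name (d1x, d2x, d4x) is its relative size, and that the head node plus all worker nodes at their maximum count must fit within the purchased resource. Both assumptions are inferred from the example in the table. The names spec_units, WorkerGroup, and fits_purchased_resource are hypothetical.

```python
import re
from dataclasses import dataclass


def spec_units(spec: str) -> int:
    """Relative size of a specification, read from its dNx suffix (assumed)."""
    match = re.search(r"\.d(\d+)x$", spec)
    if match is None:
        raise ValueError(f"unrecognized specification name: {spec!r}")
    return int(match.group(1))


@dataclass
class WorkerGroup:
    spec: str       # worker node specification
    min_nodes: int  # nodes created when the cluster is initialized
    max_nodes: int  # upper bound for dynamic scaling


def fits_purchased_resource(purchased: str, head: str, groups: list[WorkerGroup]) -> bool:
    """Return True if the head and worker selections fit the purchased resource."""
    budget = spec_units(purchased)
    for group in groups:
        if group.min_nodes < 1:
            raise ValueError("the minimum number of worker nodes must be at least 1")
        if group.max_nodes < group.min_nodes:
            raise ValueError("the maximum must not be smaller than the minimum")
    # Every selected specification must be no larger than the purchased one.
    if any(spec_units(s) > budget for s in [head, *(g.spec for g in groups)]):
        return False
    # Assumed rule: the head plus all workers at maximum scale must fit the budget.
    peak = spec_units(head) + sum(spec_units(g.spec) * g.max_nodes for g in groups)
    return peak <= budget


# The example from Table 1: a d4x resource, a d1x head, and up to three d1x workers.
workers = [WorkerGroup("fabric.ray.dpu.d1x", min_nodes=1, max_nodes=3)]
print(fits_purchased_resource("fabric.ray.dpu.d4x", "fabric.ray.dpu.d1x", workers))  # True
```

Under these assumptions, raising the maximum number of worker nodes to 4 in the same example would exceed the purchased resource, and the check would return False.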

You can manually refresh the page to monitor the Ray cluster creation progress. The creation takes approximately 3 to 5 minutes.

If a Ray cluster fails to be created, delete the failed cluster before creating another one. This prevents the failed cluster from occupying resources.