Updated on 2025-07-08 GMT+08:00

Creating a Ray Service

Prerequisites

Creating a Ray Service

  1. Log in to the Workspace Management Console.
  2. Select the created workspace and click Access Workspace. In the navigation pane, choose Resources and Assets > Ray Services, and click Create Ray Service in the upper right corner.
  3. On the displayed page, set required parameters, including Basic Settings, Log Settings, Ray Cluster Settings, Data, and Ray Serve Settings.

    For details, see Table 1.
    Table 1 Parameters for creating a Ray service

    Parameter

    Description

    Basic Settings

    Ray Service Name

    Name of the Ray service to be created.

    Add Description

    Click Add Description and enter the introduction to the Ray service in the text box. It can contain a maximum of 1,000 characters.

    Image Package Source

    The options are Public Ray image package and My Ray service image package. Select My Ray service image package.

    • Public Ray image package: public image packages provided by DataArtsFabric. These are open-source Ray images that support enhanced DataArtsFabric features such as channel encryption, secure dashboard access, and key encryption and decryption.
    • My Ray service image package: custom Ray images that tenants create and deploy as required, using the image package management provided by DataArtsFabric.

    Image Package Name

    Name of the service image package to be used.

    Image Package Version

    Select a version of the image package as required.

    Log Settings

    Enabling LTS

    Whether to store Ray service runtime logs in the log service provided by Huawei Cloud LTS.

    After this function is enabled, logs in the following paths are collected:

    • /tmp/ray/session_latest/logs/**/*
    • /var/log/service-log/**/*

    Log Group

    Select a log group of Huawei Cloud LTS. You can create a log group on the LTS console. For details, see Creating a Log Group.

    Log Stream

    Select a log stream of Huawei Cloud LTS. You can create a log stream on the LTS console. For details, see Creating a Log Stream.
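    Once LTS is enabled, any file written under the collected paths above is picked up. A minimal sketch of directing application logs into /var/log/service-log/ so they are collected (the log_dir parameter is for illustration and testing only, not a DataArtsFabric setting):

```python
import logging
import os


def get_service_logger(name: str, log_dir: str = "/var/log/service-log") -> logging.Logger:
    """Return a logger writing under a collected path, so LTS picks up its output.

    The log_dir override exists only for local testing; in the Ray service
    container the default path matches one of the collected log paths.
    """
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.FileHandler(os.path.join(log_dir, f"{name}.log"))
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```
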

    Ray Cluster Settings

    Head Specifications

    Head node specifications of the Ray cluster to be created. Set this parameter as required.

    All available specifications are displayed in the specification list. The selected specification must be downward compatible with the created Ray resource; that is, a large resource specification can be split into multiple smaller resource specifications. For example, if the fabric.ray.dpu.d4x resource is created, you can select fabric.ray.dpu.d1x, fabric.ray.dpu.d2x, or fabric.ray.dpu.d4x for Head Specifications.

    Worker Specifications

    Worker group specifications of the Ray cluster to be created. You can click Add Worker Group to create multiple worker groups of different specifications.

    Select a specification from the resource specification list for worker node deployment, and set the minimum and maximum number of worker nodes. The minimum number must be at least 1, and the maximum number can be set based on workloads.

    When the Ray cluster is initialized, the minimum number of worker nodes is created. The cluster then dynamically scales the number of worker nodes up to the maximum based on workloads.

    The selected worker node specification must also be downward compatible with the existing resource. For example, if the purchased Ray resource is fabric.ray.dpu.d4x and fabric.ray.dpu.d1x is selected for Head Specifications, you can also select fabric.ray.dpu.d1x for Worker Specifications and set the maximum number of worker nodes to 3.

    Data

    Data Input

    Model path used for running the inference service. After the Ray service is created, the model files in this path are copied to the Ray service cluster.

    Ray Serve Settings

    Add Application

    You can click Add Application to configure and customize deployment files, running environments, and scheduling parameters. A maximum of five applications can be added.

    Application Name

    Name of the application to be created.

    Code Directory

    Code directory required for inference. You can select OBS, Image Path, or Other.

    Deployment File Path

    Path of the inference instance in the code.

    Routing Prefix

    Routing prefix for inference. The routing prefix of each application must be unique.

    Environment Variables

    Select Environment Variables as required and click Add to configure environment variables. For details, see Managing Environment Variables of a Training Container.

    Deployment

    Inference instance corresponding to the application. Select Deployment and set this parameter based on the specifications of each application.

    Multiple deployments can be created in a single application. Configurations of the Ray actor, automatic scaling, and inference can be customized for each deployment.

    Resources required by the Ray actor of each deployment can be configured separately. However, the total resources required by all deployments in a single application cannot exceed the worker specifications configured in the basic settings.

    You can configure either a fixed number of replicas or an automatic scaling range with a maximum number of replicas for a deployment. If a fixed number of replicas has been configured for a deployment, automatic scaling cannot be configured.

Viewing Ray Service Details

  1. Log in to the Workspace Management Console.
  2. Select the created workspace and click Access Workspace. In the navigation pane on the left, choose Resources and Assets > Ray Services.
  3. On the displayed page, click the name of the target Ray service to access its details page.

    On the details page, you can view the Ray service overview and the Ray Serve settings. For details, see Table 2 and Table 3.

    Table 2 Parameters on the Overview tab

    Parameter

    Description

    Ray Service Name

    User-defined Ray service name.

    Ray Service ID

    Unique ID of the Ray service.

    Status

    Status of the current Ray service.

    Description

    Custom description of the Ray service.

    Created By

    Creator of the Ray service.

    Created

    Time when the Ray service is created.

    Image Package Version

    Version of the Ray service image required in the current Ray service deployment.

    Head Specifications

    Resource specifications and quantity required by head nodes in the Ray service deployment.

    Worker Specifications

    Resource specifications and quantity required by worker nodes in the Ray service deployment.

    Dashboard

    Link for accessing the Ray dashboard.

    Data

    Path and environment variables generated based on the user-defined input path.

    Log Transfer to LTS

    You can select Yes or No. If you enabled LTS in the log settings when creating the Ray service, this parameter is set to Yes.

    View LTS Logs

    If Log Transfer to LTS is enabled, you can click the link to go to the LTS log stream to view logs.

    Table 3 Parameters on the Ray Serve Settings tab

    Parameter

    Description

    Application Name

    Name of the created application.

    Inference Address

    Address for calling the inference service. For details, see Running an Inference Service.

    Code Directory

    Directory of the code required for inference.

    Deployment File Path

    Path of the inference instance in the code.

    Routing Prefix

    Routing prefix for inference. The routing prefix of each application must be unique.

    Environment Variables

    Environment variables in the container, which are generated based on the code directory and model directory.

    Deployment

    Inference instance corresponding to the application.

    Multiple deployments can be created in a single application. Configurations of the Ray actor, automatic scaling, and inference can be customized for each deployment.

    Resources required by the Ray actor of each deployment can be configured separately. However, the total resources required by all deployments in a single application cannot exceed the worker specifications configured in the basic settings.

    You can configure either a fixed number of replicas or an automatic scaling range with a maximum number of replicas for a deployment. If a fixed number of replicas has been configured for a deployment, automatic scaling cannot be configured.
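    Once an application is running, requests go to its inference address combined with its routing prefix. A minimal sketch of composing and issuing such a call; the address, prefix, and request payload below are placeholders, so substitute the values shown on the Ray Serve Settings tab:

```python
import json
import urllib.request


def invoke_url(inference_address: str, route_prefix: str) -> str:
    """Join the inference address and the application's routing prefix."""
    return inference_address.rstrip("/") + "/" + route_prefix.strip("/")


if __name__ == "__main__":
    # Placeholder endpoint and prefix for illustration only.
    url = invoke_url("https://example-ray-service.endpoint", "app1")
    req = urllib.request.Request(
        url,
        data=json.dumps({"prompt": "hello"}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        print(resp.status, resp.read().decode())
```
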