Creating a Ray Service
Prerequisites
- You have a valid Huawei Cloud account. For details, see Creating an IAM User and Assigning Permissions to Use DataArtsFabric and Configuring DataArtsFabric Service Agency Permissions.
- You have at least one workspace available. For details, see Creating a Workspace.
- You have purchased the required Ray resources. For details, see Purchasing a Ray Resource.
- You have created a Ray service image package version and an inference deployment file. For details, see Creating an Image Package.
- If you use Log Tank Service (LTS) to view logs, you need to obtain the LTS permissions. For details, see Granting LTS Permissions to IAM Users.
Creating a Ray Service
- Log in to the Workspace Management Console.
- Select the created workspace and click Access Workspace. In the navigation pane, choose Resources and Assets > Ray Services, and click Create Ray Service in the upper right corner.
- On the displayed page, set required parameters, including Basic Settings, Log Settings, Ray Cluster Settings, Data, and Ray Serve Settings.
For details, see Table 1.
Table 1 Parameters for creating a Ray service
Basic Settings
Ray Service Name
Name of the Ray service to be created.
Add Description
Click Add Description and enter a description of the Ray service in the text box. The description can contain a maximum of 1,000 characters.
Image Package Source
The options are Public Ray image package and My Ray service image package. Select My Ray service image package.
- Public Ray image package: Public image packages provided by DataArtsFabric. They are open-source Ray images with enhanced DataArtsFabric features such as channel encryption, secure dashboard access, and key encryption and decryption.
- My Ray service image package: Custom Ray images that tenants create and deploy as required using the image package management provided by DataArtsFabric.
Image Package Name
Name of the service image package to be used.
Image Package Version
Select an image package version as required.
Log Settings
Enabling LTS
Whether to store Ray service run logs in the log service provided by Huawei Cloud LTS.
After this function is enabled, logs in the following paths are collected:
- /tmp/ray/session_latest/logs/**/*
- /var/log/service-log/**/*
Log Group
Select a log group of Huawei Cloud LTS. You can create a log group on the LTS console. For details, see Creating a Log Group.
Log Stream
Select a log stream of Huawei Cloud LTS. You can create a log stream on the LTS console. For details, see Creating a Log Stream.
Ray Cluster Settings
Head Specifications
Head node specifications of the Ray cluster to be created. Set this parameter as required.
All specifications are displayed in the specification list. The selected specification must be downward compatible with the purchased Ray resource; that is, a larger resource specification can be split into multiple smaller ones. For example, if the fabric.ray.dpu.d4x resource has been purchased, you can select fabric.ray.dpu.d1x, fabric.ray.dpu.d2x, or fabric.ray.dpu.d4x for Head Specifications.
Worker Specifications
Worker group specifications of the Ray cluster to be created. You can click Add Worker Group to create multiple worker groups of different specifications.
Select a specification from the resource specification list for worker node deployment, and set the minimum and maximum number of worker nodes. The minimum number must be at least 1, and the maximum number can be set based on workloads.
When the Ray cluster is initialized, the minimum number of worker nodes is created. The number of worker nodes is then dynamically scaled up to the maximum based on workloads.
The selected worker node specification must also be downward compatible with the purchased resource. For example, if the purchased Ray resource is fabric.ray.dpu.d4x and fabric.ray.dpu.d1x is selected for Head Specifications, you can also select fabric.ray.dpu.d1x for Worker Specifications and set the maximum number of worker nodes to 3.
Data
Data Input
Model path used for running the inference service. After the Ray service is created, the model files in this path are copied to the Ray service cluster.
Ray Serve Settings
Add Application
You can click Add Application to configure and customize deployment files, running environments, and scheduling parameters. A maximum of five applications can be added.
Application Name
Name of the application to be created.
Code Directory
Code directory required for inference. You can select OBS, Image Path, or Other.
Deployment File Path
Path of the inference instance in the code.
Routing Prefix
Routing prefix for inference. The routing prefix of each application must be unique.
Environment Variables
Select Environment Variables as required and click Add to configure environment variables. For details, see Managing Environment Variables of a Training Container.
Deployment
Inference instance corresponding to the application. Select Deployment and set this parameter based on the requirements of each application.
Multiple deployments can be created in a single application. Configurations of the Ray actor, automatic scaling, and inference can be customized for each deployment.
Resources required by the Ray actor of each deployment can be configured separately. However, the total number of resources required for deployments in a single application cannot exceed the worker specifications in the basic settings.
You can configure either a fixed number of replicas or an automatic scaling range (minimum and maximum number of replicas) for a deployment. If a fixed number of replicas has been configured for a deployment, automatic scaling cannot be configured. For a sample deployment file, see the sketch after this table.
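The deployment file referenced by Deployment File Path is a standard Ray Serve application module. The following is a minimal sketch of what such a file might look like; the file name serve_app.py, the class name Translator, and the resource and replica values are placeholders for illustration only, not values from this documentation.

```python
# serve_app.py -- minimal sketch of a Ray Serve deployment file (placeholder names).
from ray import serve
from starlette.requests import Request


@serve.deployment(
    num_replicas=2,                     # fixed number of replicas for this deployment
    ray_actor_options={"num_cpus": 1},  # resources per Ray actor; the total across all
                                        # deployments must fit the worker specifications
)
class Translator:
    def __init__(self):
        # Load the model here, for example from the path configured under Data.
        self.model = None

    async def __call__(self, request: Request) -> dict:
        payload = await request.json()
        # Replace with real inference logic.
        return {"result": payload}


# Alternative: configure automatic scaling instead of a fixed replica count, e.g.
# @serve.deployment(autoscaling_config={"min_replicas": 1, "max_replicas": 3},
#                   ray_actor_options={"num_cpus": 1})

# The bound application object that the deployment file path refers to.
app = Translator.bind()
```

How the path to the bound object is expressed depends on the code directory you configure; the resources declared in ray_actor_options count against the worker specifications selected in Ray Cluster Settings.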
Viewing Ray Service Details
- Log in to the Workspace Management Console.
- Select the created workspace and click Access Workspace. In the navigation pane on the left, choose Resources and Assets > Ray Services.
- On the displayed page, click the name of the target Ray service to access its details page.
On the displayed page, you can view the Ray service overview and Ray Serve settings. For details, see Table 2 and Table 3.
Table 2 Parameters on the Overview tab
Ray Service Name
User-defined Ray service name.
Ray Service ID
Unique ID of the Ray service.
Status
Status of the current Ray service.
Description
Custom description of the Ray service.
Created By
Creator of the Ray service.
Created
Time when the Ray service is created.
Image Package Version
Version of the Ray service image required in the current Ray service deployment.
Head Specifications
Resource specifications and quantity required by head nodes in the Ray service deployment.
Worker Specifications
Resource specifications and quantity required by worker nodes in the Ray service deployment.
Dashboard
Link for accessing the Ray dashboard.
Data
Path and environment variables generated based on the user-defined input path.
Log Transfer to LTS
Indicates whether Ray service run logs are transferred to LTS (Yes or No). If you enabled LTS in the log settings when creating the Ray service, this parameter is Yes.
View LTS Logs
If Log Transfer to LTS is enabled, you can click the link to go to the LTS log stream to view logs.
Table 3 Parameters on the Ray Serve Settings tab
Application Name
Name of the created application.
Inference Address
Address for calling the inference service. For details, see Running an Inference Service. A sample request is shown after this table.
Code Directory
Directory of the code required for inference.
Deployment File Path
Path of the inference instance in the code.
Routing Prefix
Routing prefix for inference. The routing prefix of each application must be unique.
Environment Variables
Environment variables in the container, which are generated based on the code directory and model directory.
Deployment
Inference instance corresponding to the application.
Multiple deployments can be created in a single application. Configurations of the Ray actor, automatic scaling, and inference can be customized for each deployment.
Resources required by the Ray actor of each deployment can be configured separately. However, the total number of resources required for deployments in a single application cannot exceed the worker specifications in the basic settings.
You can configure either a fixed number of replicas or an automatic scaling range (minimum and maximum number of replicas) for a deployment. If a fixed number of replicas has been configured for a deployment, automatic scaling cannot be configured.
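The inference address and routing prefix together form the URL of each application. The following is a minimal sketch of calling a deployed application from Python; the address, prefix, and request body are placeholders, and any authentication required by Running an Inference Service is not shown.

```python
# call_service.py -- minimal sketch of calling a deployed application (placeholder values).
import requests

# Replace with the inference address and routing prefix shown on the Ray Serve Settings tab.
INFERENCE_ADDRESS = "https://<inference-address>"
ROUTING_PREFIX = "/my-app"

response = requests.post(
    f"{INFERENCE_ADDRESS}{ROUTING_PREFIX}",
    json={"text": "hello"},  # request body expected by your deployment
    timeout=30,
)
print(response.status_code, response.json())
```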