Updated on 2025-06-12 GMT+08:00

Ray and XDS Restrictions

Large Model License Restrictions

Open-source large models are subject to varying license restrictions. Refer to the table below for details.

Table 1 Large model license restrictions

Model Name                    License Address
Llama 3 8B Chinese Instruct   https://github.com/meta-llama/llama/blob/main/LICENSE
Llama 3 70B                   https://github.com/meta-llama/llama/blob/main/LICENSE
Llama 3.1 8B Chinese Chat     https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE
Llama 3.1 70B                 https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE
Qwen 2 72B Instruct           https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
Glm 4 9B Chat                 https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE

Restrictions on Public Inference Services

  • Token quota: Each public inference service includes a free token quota. Once the quota is used up, the service becomes unavailable, and additional tokens cannot be purchased. The quota is shared by all workspaces of the current user in the current region.
  • Validity period: Each service is valid for 90 days and becomes unavailable after it expires. If the same inference service is enabled in multiple workspaces, the 90-day period starts from the first activation.
  • Context length: The maximum context length varies by model; see the client-side sketch after this list for one way to stay within a model's limit.
  • SLA: Service level agreements (SLAs) are not guaranteed for public inference services. If you need guaranteed performance, deploy a dedicated inference service.
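Because requests that exceed a model's context window are typically rejected, a client can check and trim its prompt before calling the service. The sketch below is a minimal illustration only: the per-model limits in MODEL_CONTEXT_LIMITS, the model keys, and the count_tokens heuristic are assumptions for demonstration, not values or APIs published for these services; use each model's documented limit and matching tokenizer in practice.

```python
# Minimal sketch: trim a prompt to a model-specific context window before
# sending it to an inference service. The limits below are illustrative
# assumptions, not published values.

MODEL_CONTEXT_LIMITS = {
    "llama-3-8b-chinese-instruct": 8192,   # assumed for illustration
    "llama-3.1-70b": 32768,                # assumed for illustration
    "qwen2-72b-instruct": 32768,           # assumed for illustration
}


def count_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token).

    A real client should use the tokenizer that matches the target model.
    """
    return max(1, len(text) // 4)


def fit_to_context(prompt: str, model: str, reserve_for_output: int = 512) -> str:
    """Truncate the prompt so prompt + expected output fits the model's window."""
    limit = MODEL_CONTEXT_LIMITS.get(model)
    if limit is None:
        raise ValueError(f"Unknown context limit for model: {model}")
    budget = limit - reserve_for_output
    if count_tokens(prompt) <= budget:
        return prompt
    # Drop text from the start of the prompt until it fits; keeping the most
    # recent context is usually what chat-style prompts need.
    approx_chars = budget * 4
    return prompt[-approx_chars:]


if __name__ == "__main__":
    long_prompt = "User question ... " * 10000
    trimmed = fit_to_context(long_prompt, "qwen2-72b-instruct")
    print(count_tokens(trimmed))  # stays within the assumed 32768-token window
```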