Updated on 2025-06-12 GMT+08:00

Ray and XDS Restrictions

Large Model License Restrictions

Open-source large models are subject to varying license restrictions. Refer to the table below for details.

Table 1 Large model license restrictions

Model Name                    License Address
Llama 3 8B Chinese Instruct   https://github.com/meta-llama/llama/blob/main/LICENSE
Llama 3 70B                   https://github.com/meta-llama/llama/blob/main/LICENSE
Llama 3.1 8B Chinese Chat     https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE
Llama 3.1 70B                 https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE
Qwen 2 72B Instruct           https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
Glm 4 9B Chat                 https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE

Restrictions on Public Inference Services

  • Token quota: Each public inference service includes a free token quota. Once the quota is used up, the service becomes unavailable, and additional tokens cannot be purchased. The quota is shared by all workspaces of the current user in the current region.
  • Validity period: Each service is valid for 90 days and becomes unavailable after it expires. If the same inference service is enabled in multiple workspaces, the 90-day period starts from the first activation.
  • Context length: The maximum context length varies by model; see the client-side sketch after this list for one way to stay within a model's limit.
  • SLA: Service level agreements (SLAs) are not guaranteed for public inference services. If you need guaranteed performance, deploy a dedicated inference service.
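Because requests that exceed a model's context window are typically rejected, a client can check and trim its prompt before calling the service. The sketch below is a minimal illustration only: the per-model limits in MODEL_CONTEXT_LIMITS, the model keys, and the count_tokens heuristic are assumptions for demonstration, not values or APIs published for these services; use each model's documented limit and matching tokenizer in practice.

```python
# Minimal sketch: trim a prompt to a model-specific context window before
# sending it to an inference service. The limits below are illustrative
# assumptions, not published values.

MODEL_CONTEXT_LIMITS = {
    "llama-3-8b-chinese-instruct": 8192,   # assumed for illustration
    "llama-3.1-70b": 32768,                # assumed for illustration
    "qwen2-72b-instruct": 32768,           # assumed for illustration
}


def count_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token).

    A real client should use the tokenizer that matches the target model.
    """
    return max(1, len(text) // 4)


def fit_to_context(prompt: str, model: str, reserve_for_output: int = 512) -> str:
    """Truncate the prompt so prompt + expected output fits the model's window."""
    limit = MODEL_CONTEXT_LIMITS.get(model)
    if limit is None:
        raise ValueError(f"Unknown context limit for model: {model}")
    budget = limit - reserve_for_output
    if count_tokens(prompt) <= budget:
        return prompt
    # Drop text from the start of the prompt until it fits; keeping the most
    # recent context is usually what chat-style prompts need.
    approx_chars = budget * 4
    return prompt[-approx_chars:]


if __name__ == "__main__":
    long_prompt = "User question ... " * 10000
    trimmed = fit_to_context(long_prompt, "qwen2-72b-instruct")
    print(count_tokens(trimmed))  # stays within the assumed 32768-token window
```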