Creating an Inference Service

Overview

This section describes how to create an inference service by calling APIs.

This process assumes that the tenant has been authorized on the console to use the DataArts Fabric service. For details about how to call APIs, see Calling APIs.

Preparations

hostname: Obtain the value from Regions and Endpoints.

Procedure

  1. Call the API for creating a workspace to create a workspace and record the workspace ID returned by the API.

    Example request

    POST https://{hostname}/v1/workspaces
    Body:
    {
      "name": "apieworkspace",
      "description": "apie test workspace"
    }
    Example response
    {
      "id": "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",
      "name": "apieworkspace",
      "description": "apie test workspace",
      "create_time": "2023-05-30T12:24:30.401Z",
      "create_domain_name": "admin",
      "create_user_name": "user",
      "metastore_id": "2180518f-42b8-4947-b20b-adfc53981a25",
      "access_url": "https://test.fabric.com/",
      "enterprise_project_id": "01049549-82cd-4b2b-9733-ddb94350c125"
    }
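
    For reference, the same request can be issued programmatically. The following is a minimal Python sketch using the requests library; it assumes token-based authentication through an X-Auth-Token header (see Calling APIs), and the hostname and token values are placeholders.

    import requests

    HOSTNAME = "<hostname>"      # obtain from Regions and Endpoints
    TOKEN = "<your-auth-token>"  # assumption: X-Auth-Token authentication

    resp = requests.post(
        f"https://{HOSTNAME}/v1/workspaces",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        json={"name": "apieworkspace", "description": "apie test workspace"},
    )
    resp.raise_for_status()
    workspace_id = resp.json()["id"]  # record the workspace ID for later steps
    print(workspace_id)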
  2. Call the API for creating an endpoint to create an inference endpoint and record the endpoint ID returned by the API.

    Example request

    POST https://{hostname}/v1/workspaces/{workspace_id}/endpoints

    workspace_id: the workspace ID recorded in step 1.

    Body:
    {
      "name": "apie_test",
      "description": "apie test endpoint",
      "type": "inference",
      "reserved_resource": {
        "mu": {
          "spec_code": "mu.llama3.8b",
          "min": 0,
          "max": 1
        }
      }
    }
    Example response
    {
      "visibility": "PRIVATE",
      "id": "0b5633ba2b904511ad514346f4d23d4b",
      "name": "endpoint1",
      "type": "inference",
      "status": "CREATING",
      "description": "description",
      "create_time": "2023-05-30T12:24:30.401Z",
      "update_time": "2023-05-30T12:24:30.401Z",
      "owner": {
        "domain_name": "string",
        "domain_id": "xxx",
        "user_name": "string",
        "user_id": "xxx"
      },
      "reserved_resource": {
        "mu": {
          "spec_code": "mu.llama3.8b",
          "min": 0,
          "max": 1
        }
      }
    }
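
    For reference, a minimal Python sketch of the same call, under the same authentication assumption as in step 1 (the hostname, token, and workspace ID are placeholders):

    import requests

    HOSTNAME = "<hostname>"                      # obtain from Regions and Endpoints
    TOKEN = "<your-auth-token>"                  # assumption: X-Auth-Token authentication
    workspace_id = "<workspace-id-from-step-1>"

    resp = requests.post(
        f"https://{HOSTNAME}/v1/workspaces/{workspace_id}/endpoints",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        json={
            "name": "apie_test",
            "description": "apie test endpoint",
            "type": "inference",
            "reserved_resource": {
                "mu": {"spec_code": "mu.llama3.8b", "min": 0, "max": 1}
            },
        },
    )
    resp.raise_for_status()
    endpoint_id = resp.json()["id"]  # record the endpoint ID for step 4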
  3. Call the API for creating a model to create a private model and record the model ID returned by the API.

    Example request

    POST https://{hostname}/v1/workspaces/{workspace_id}/models

    workspace_id: the workspace ID recorded in step 1.

    Body:

    {
      "name": "LLama3-8b",
      "description": "this is a apie test model",
      "type": "LLM_MODEL",
      "version": {
        "name": "v1",
        "description": "test description",
        "config": {
          "llm_model_config": {
            "base_model_type": "",
            "model_path": ""
          }
        }
      }
    }

    Example response

    {
      "id": "ac8111bf-3601-4905-8ddd-b41d3e636a4e"
    }
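
    For reference, a minimal Python sketch of the same call (authentication as in step 1; base_model_type and model_path are left empty here, as in the example body, and must be set for your model):

    import requests

    HOSTNAME = "<hostname>"                      # obtain from Regions and Endpoints
    TOKEN = "<your-auth-token>"                  # assumption: X-Auth-Token authentication
    workspace_id = "<workspace-id-from-step-1>"

    resp = requests.post(
        f"https://{HOSTNAME}/v1/workspaces/{workspace_id}/models",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        json={
            "name": "LLama3-8b",
            "description": "this is an apie test model",
            "type": "LLM_MODEL",
            "version": {
                "name": "v1",
                "description": "test description",
                "config": {
                    "llm_model_config": {
                        "base_model_type": "",  # set to your base model type
                        "model_path": ""        # set to your model path
                    }
                },
            },
        },
    )
    resp.raise_for_status()
    model_id = resp.json()["id"]  # record the model ID for step 4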
  4. Call the API for creating an inference service to create an inference service and record the inference service ID returned by the API.

    Example request

    POST https://{hostname}/v1/workspaces/{workspace_id}/services/instances

    workspace_id: the workspace ID recorded in step 1.

    Body:

    {
      "source": {
        "id": ""
      },
      "name": "test_serviceInstanceName",
      "description": "description",
      "endpoint_id": ""
    }
    • id: the model ID returned by the API in step 3.
    • endpoint_id: the inference endpoint ID returned by the API in step 2.

    Example response

    {
      "id": "b935d0ef-f4eb-4b95-aff1-9d33ae9f57b6"
    }
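
    For reference, a minimal Python sketch of the same call (authentication as in step 1; the IDs are the values recorded in steps 1 to 3):

    import requests

    HOSTNAME = "<hostname>"                      # obtain from Regions and Endpoints
    TOKEN = "<your-auth-token>"                  # assumption: X-Auth-Token authentication
    workspace_id = "<workspace-id-from-step-1>"
    model_id = "<model-id-from-step-3>"
    endpoint_id = "<endpoint-id-from-step-2>"

    resp = requests.post(
        f"https://{HOSTNAME}/v1/workspaces/{workspace_id}/services/instances",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        json={
            "source": {"id": model_id},
            "name": "test_serviceInstanceName",
            "description": "description",
            "endpoint_id": endpoint_id,
        },
    )
    resp.raise_for_status()
    instance_id = resp.json()["id"]  # record the inference service ID for step 5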
  5. Call the inference API to initiate an inference request.

    Example request

    POST https://{hostname}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations
    • workspace_id: the workspace ID recorded in step 1.
    • instance_id: the inference service ID recorded in step 4.

    Body:

    {
      "messages": [
        {
          "role": "user",
          "content": "hello"
        }
      ]
    }

    Example response. The inference API returns results in streaming mode.

    {
      "id": "chatcmpl-62dda7304f53451c9477e0",
      "object": "chat.completion.chunk",
      "created": 1730120529,
      "model": "ada1d67d-f2a1-4e77-838f-0d8688d756f4",
      "choices": [
        {
          "index": 0,
          "delta": {
            "role": "assistant",
            "content": "\n\nHello! LLM stands for Large Language Model. It refers to artificial intelligence models, like myself,"
          },
          "finish_reason": null
        }
      ],
      "system_fingerprint": null,
      "usage": null
    }
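
    For reference, a minimal Python sketch that sends the request and consumes the streamed chunks. The chunk framing is an assumption: the sketch accepts both plain newline-delimited JSON and SSE-style lines prefixed with "data: ".

    import json
    import requests

    HOSTNAME = "<hostname>"                      # obtain from Regions and Endpoints
    TOKEN = "<your-auth-token>"                  # assumption: X-Auth-Token authentication
    workspace_id = "<workspace-id-from-step-1>"
    instance_id = "<service-id-from-step-4>"

    resp = requests.post(
        f"https://{HOSTNAME}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        json={"messages": [{"role": "user", "content": "hello"}]},
        stream=True,  # the API returns results in streaming mode
    )
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line:
            continue
        if line.startswith("data: "):   # strip SSE framing if present
            line = line[len("data: "):]
        if line.strip() == "[DONE]":    # common end-of-stream sentinel
            break
        chunk = json.loads(line)
        delta = chunk["choices"][0]["delta"].get("content") or ""
        print(delta, end="", flush=True)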