Deploying an AI Application as a Batch Inference Service

After an AI application is prepared, you can deploy it as a batch service. The Model Deployment > Batch Services page lists all batch services.

Prerequisites

  • An AI application in the Normal state is available in ModelArts.
  • Data to be processed in batches has been uploaded to OBS.
  • At least one empty folder has been created in OBS for storing the output.

Context

  • You can create up to 1,000 batch services.
  • The parameters you enter depend on the input type (JSON or file) defined by the AI application. If the AI application input is in JSON format, a mapping is generated from the configuration file. If the AI application input is a file, no mapping is required.
  • Batch services can be deployed only in a public resource pool, not in a dedicated resource pool.

Procedure

  1. Log in to the ModelArts console. In the navigation pane on the left, choose Model Deployment > Batch Services.
  2. Click Deploy in the upper left corner.
  3. Configure parameters.
    1. Enter basic information, including Name and Description. A name is generated by default, for example, service-bc0d, which you can modify.
    2. Configure other parameters, including the resource pool and AI application configurations.
      Table 1 Parameters

      AI Application Source

      Choose My AI Applications or My Subscriptions as needed.

      AI Application and Version

      Select an AI application and version that are running properly.

      Input Path

      Select the OBS directory where the uploaded data is stored: either a folder or a .manifest file. For details about the specifications of the .manifest file, see Manifest File Specifications.

      NOTE:
      • If the input data is an image, ensure that the size of a single image is less than 12 MB.
      • If the input data is in CSV format, ensure that no Chinese character is included.
      • If the input data is in CSV format, ensure that the file size does not exceed 12 MB.
      • If an image or CSV file is larger than 12 MB, an error is reported. In this case, resize the file or contact technical support to adjust the file size limit. A local pre-check sketch is provided after the procedure.

      Request Path

      URL used for calling the AI application API in a batch service, and also the request path of the AI application service. Its value is obtained from the url field of apis in the AI application configuration file.

      Mapping Relationship

      If the AI application input is in JSON format, the system automatically generates the mapping based on the configuration file of the AI application. If the AI application input is a file, no mapping is required.

      The mapping is generated automatically. Enter the index of the CSV field that corresponds to each parameter. Indexes start from 0.

      Mapping rule: The mapping rule comes from the input parameter (request) in the model configuration file config.json. When type is set to string, number, integer, or boolean, you must set the index parameter. For details about the mapping rule, see Mapping Example.

      The index must be an integer starting from 0. If the value of index does not comply with this rule, the parameter is ignored in the request. After the mapping rule is configured, the CSV data must be separated by commas (,).

      Output Path

      The path for storing the batch prediction results. You can select an empty folder you created.

      Instance Flavor

      The system provides available compute resources matching your AI application. Select an available resource from the drop-down list.

      For example, if the model comes from an ExeML project, the compute resources are automatically associated with the ExeML specifications for use.

      Compute Nodes

      Number of instances for the current AI application version. If you set the number of nodes to 1, the standalone computing mode is used. If you set the number of nodes to a value greater than 1, the distributed computing mode is used. Select a computing mode based on your actual needs.

      Environment Variable

      Set environment variables and inject them to the pod. To ensure data security, do not enter sensitive information, such as plaintext passwords, in environment variables.

      Timeout

      Timeout of a single model, including both the deployment and startup time. The default value is 20 minutes. The value must range from 3 to 120 minutes.

      Runtime Log Output

      This feature is disabled by default. The runtime logs of batch services are stored only in the ModelArts log system. You can query the runtime logs in the Logs tab of the service details page.

      If this feature is enabled, the runtime logs of batch services will be exported and stored in Log Tank Service (LTS). LTS automatically creates log groups and log streams, and by default retains runtime logs generated within the last seven days. For details about LTS log management, see Log Tank Service.

      NOTE:
      • This cannot be disabled once it is enabled.
      • You will be billed for log query and storage features provided by LTS. For details, see LTS Pricing Details.
      • Do not print unnecessary audio log files. Otherwise, system logs may fail to be displayed, and the error message "Failed to load audio" may be displayed.
  4. Confirm the configurations and complete the service deployment as prompted. Deploying a service generally takes some time, from several minutes to tens of minutes, depending on the amount of data and the resources you select.

    Once a batch service is deployed, it will start immediately. You will be billed for the chosen resources while it is running.

    You can go to the batch service list to view the basic information about the batch service. When the status of the newly deployed service changes from Deploying to Running, the service is deployed.
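
Before uploading the input data, you can check it locally against the size and character constraints listed in Table 1. The following is a minimal Python sketch, assuming the files sit in a local directory named data (a placeholder); it is not part of ModelArts and only mirrors the documented limits.

# A minimal local pre-check sketch for the input constraints in Table 1.
# The "data" directory is a placeholder; adjust the path before running.
import pathlib
import re

MAX_SIZE = 12 * 1024 * 1024  # single images and CSV files must stay within 12 MB

def check_file(path: pathlib.Path) -> list:
    problems = []
    if path.stat().st_size > MAX_SIZE:
        problems.append("file exceeds the 12 MB limit")
    if path.suffix.lower() == ".csv":
        text = path.read_text(encoding="utf-8", errors="ignore")
        if re.search(r"[\u4e00-\u9fff]", text):  # CSV input must not contain Chinese characters
            problems.append("CSV contains Chinese characters")
    return problems

for f in sorted(pathlib.Path("data").glob("*")):
    for problem in check_file(f):
        print(f"{f}: {problem}")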

Manifest File Specifications

ModelArts batch services support manifest files, which describe data input and output.

Example input manifest file
  • File name: test.manifest
  • File content:
    {"source": "obs://test/data/1.jpg"}
    {"source": "s3://test/data/2.jpg"}
    {"source": "https://infers-data.obs.cn-north-1.myhuaweicloud.com:443/xgboosterdata/data.csv?AccessKeyId=2Q0V0TQ461N26DDL18RB&Expires=1550611914&Signature=wZBttZj5QZrReDhz1uDzwve8GpY%3D&x-obs-security-token=gQpzb3V0aGNoaW5hixvY8V9a1SnsxmGoHYmB1SArYMyqnQT-ZaMSxHvl68kKLAy5feYvLDM..."}
  • Requirements on the file (a sketch for generating such a file follows this list):
    1. The file name extension must be .manifest.
    2. The file content must be in JSON format. Each line describes a piece of input data, which must be a specific file instead of a folder.
    3. JSON content requires a source field, which must be an OBS file address in either of the following formats:
      1. Bucket path, in the format <obs path>{{Bucket name}}/{{Object name}}/File name, which is used to access your own OBS data. You can obtain the path by accessing the OBS object. <obs path> can be obs:// or s3://.
      2. Share link generated by OBS, including signature information. It is used to access OBS data of other users. The link has a validity period; perform operations within that period.
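
The requirements above amount to writing one JSON object per line, each with a source field pointing to a single OBS file. The following is a minimal Python sketch for generating such a file; the bucket and object paths are the placeholders from the example above, not real data.

# A minimal sketch for generating an input manifest file (test.manifest).
# The OBS paths below are placeholders taken from the example above.
import json

obs_paths = [
    "obs://test/data/1.jpg",
    "obs://test/data/2.jpg",
]

with open("test.manifest", "w", encoding="utf-8") as f:
    for path in obs_paths:
        # One JSON object per line; "source" must point to a file, not a folder.
        f.write(json.dumps({"source": path}) + "\n")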

Example output manifest file

A manifest file will be generated in the output directory of the batch service.
  • Assume that the output path is //test-bucket/test/. The result is stored in the following path:
    OBS bucket or directory name
    ├── test-bucket
    │   ├── test
    │   │   ├── infer-result-{{task_id}}.manifest
    │   │   ├── infer-result
    │   │   │   ├── 1.jpg_result.txt
    │   │   │   ├── 2.jpg_result.txt
  • Content of the infer-result-0.manifest file:
    {"source": "obs://obs-data-bucket/test/data/1.jpg","result":"SUCCESSFUL","inference-loc": "obs://test-bucket/test/infer-result/1.jpg_result.txt"}
    {"source": "s3://obs-data-bucket/test/data/2.jpg","result":"FAILED","error_message": "Download file failed."}
    {"source ": "https://infers-data.obs.xxx.com:443/xgboosterdata/2.jpg?AccessKeyId=2Q0V0TQ461N26DDL18RB&Expires=1550611914&Signature=wZBttZj5QZrReDhz1uDzwve8GpY%3D&x-obs-security-token=gQpzb3V0aGNoaW5hixvY8V9a1SnsxmGoHYmB1SArYMyqnQT-ZaMSxHvl68kKLAy5feYvLDMNZWxzhBZ6Q-3HcoZMh9gISwQOVBwm4ZytB_m8sg1fL6isU7T3CnoL9jmvDGgT9VBC7dC1EyfSJrUcqfB_N0ykCsfrA1Tt_IQYZFDu_HyqVk-GunUcTVdDfWlCV3TrYcpmznZjliAnYUO89kAwCYGeRZsCsC0ePu4PHMsBvYV9gWmN9AUZIDn1sfRL4voBpwQnp6tnAgHW49y5a6hP2hCAoQ-95SpUriJ434QlymoeKfTHVMKOeZxZea-JxOvevOCGI5CcGehEJaz48sgH81UiHzl21zocNB_hpPfus2jY6KPglEJxMv6Kwmro-ZBXWuSJUDOnSYXI-3ciYjg9-h10b8W3sW1mOTFCWNGoWsd74it7l_5-7UUhoIeyPByO_REwkur2FOJsuMpGlRaPyglZxXm_jfdLFXobYtzZhbul4yWXga6oxTOkfcwykTOYH0NPoPRt5MYGYweOXXxFs3d5w2rd0y7p0QYhyTzIkk5CIz7FlWNapFISL7zdhsl8RfchTqESq94KgkeqatSF_iIvnYMW2r8P8x2k_eb6NJ7U_q5ztMbO9oWEcfr0D2f7n7Bl_nb2HIB_H9tjzKvqwngaimYhBbMRPfibvttW86GiwVP8vrC27FOn39Be9z2hSfJ_8pHej0yMlyNqZ481FQ5vWT_vFV3JHM-7I1ZB0_hIdaHfItm-J69cTfHSEOzt7DGaMIES1o7U3w%3D%3D","result":"SUCCESSFUL","inference-loc": "obs://test-bucket/test/infer-result/2.jpg_result.txt"}
  • File format (a parsing sketch follows this list):
    1. The file name is infer-result-{{task_id}}.manifest, where task_id is the batch task ID, which is unique for a batch service.
    2. If a large number of files need to be processed, multiple manifest files may be generated. They share the .manifest extension and are distinguished by a numeric suffix, for example, infer-result-{{task_id}}_1.manifest.
    3. The infer-result-{{task_id}} directory is created in the manifest directory to store the file processing result.
    4. The file content is in JSON format. Each line describes the output result of a piece of input data.
    5. The JSON file contains multiple fields:
      1. source: input data description, which is the same as that of the input manifest file
      2. result: file processing result, which can be SUCCESSFUL or FAILED
      3. inference-loc: output result path. This field is available when result is SUCCESSFUL. The format is obs://{{Bucket name}}/{{Object name}}.
      4. error_message: error information. This field is available when the result is FAILED.
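
The following is a minimal Python sketch for parsing an output manifest, assuming the file (named infer-result-0.manifest here, as in the example above) has already been downloaded from OBS to the local working directory. It separates successful and failed records using the fields listed above.

# A minimal sketch for parsing an output manifest downloaded from OBS.
# The local file name is a placeholder based on the example above.
import json

succeeded, failed = [], []

with open("infer-result-0.manifest", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        if record.get("result") == "SUCCESSFUL":
            # inference-loc points to the OBS result file for this input.
            succeeded.append((record.get("source"), record.get("inference-loc")))
        else:
            failed.append((record.get("source"), record.get("error_message", "")))

print(f"{len(succeeded)} succeeded, {len(failed)} failed")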

Mapping Example

The following example shows the relationship between the configuration file, mapping rule, CSV data, and inference request.

The following configuration file is used as an example:

[
    {
        "method": "post",
        "url": "/",
        "request": {
            "Content-type": "multipart/form-data",
            "data": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "req_data": {
                                "type": "array",
                                "items": [
                                    {
                                        "type": "object",
                                        "properties": {
                                            "input_1": {
                                                "type": "number"
                                            },
                                            "input_2": {
                                                "type": "number"
                                            },
                                            "input_3": {
                                                "type": "number"
                                            },
                                            "input_4": {
                                                "type": "number"
                                            }
                                        }
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        }
    }
]

The ModelArts console automatically resolves the mapping relationship from the configuration file, as shown below. If you call the ModelArts API directly, configure the mapping yourself by following this rule.

{
    "type": "object",
    "properties": {
        "data": {
            "type": "object",
            "properties": {
                "req_data": {
                    "type": "array",
                    "items": [
                        {
                            "type": "object",
                            "properties": {
                                "input_1": {
                                    "type": "number",
                                    "index": 0
                                },
                                "input_2": {
                                    "type": "number",
                                    "index": 1
                                },
                                "input_3": {
                                    "type": "number",
                                    "index": 2
                                },
                                "input_4": {
                                    "type": "number",
                                    "index": 3
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
}

The following shows the format of the CSV data for inference. The data must be separated by commas (,).

5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2

Based on the defined mapping, each CSV row is converted into the inference request shown below, whose format is similar to that of a real-time service request. A short sketch of this conversion follows the example.

{
	"data": {
		"req_data": [{
			"input_1": 5.1,
			"input_2": 3.5,
			"input_3": 1.4,
			"input_4": 0.2
		}]
	}
}
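
The following is a minimal Python sketch of that conversion, using the index values resolved above (input_1 through input_4 mapped to CSV columns 0 through 3). The CSV file name data.csv is a placeholder; the sketch only illustrates how the indexes select fields from each comma-separated row.

# A minimal sketch: build one inference request per CSV row using the mapping indexes.
# "data.csv" is a placeholder file containing rows like 5.1,3.5,1.4,0.2.
import csv
import json

# index -> parameter name, as resolved from the mapping above
mapping = {0: "input_1", 1: "input_2", 2: "input_3", 3: "input_4"}

with open("data.csv", newline="") as f:
    for row in csv.reader(f):  # rows must be comma-separated
        req_data = {name: float(row[index]) for index, name in mapping.items()}
        request = {"data": {"req_data": [req_data]}}
        print(json.dumps(request))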

Viewing the Batch Service Prediction Result

When deploying a batch service, you can select the location of the output data directory. You can view the running result of the batch service that is in the Completed state.

  1. Log in to the ModelArts console and choose Model Deployment > Batch Services.
  2. Click the name of the target service in the Completed status. The service details page is displayed.

    • You can view the service name, status, ID, input path, output path, and description.
    • You can click the edit icon next to Description to edit the description.

  3. Obtain the OBS path next to Output Path, go to that path, and obtain the batch service prediction results, including the prediction result file and the AI application prediction result.

    If the prediction is successful, the directory contains the prediction result file and AI application prediction result. Otherwise, the directory contains only the prediction result file.

    • Prediction result file: The file is in the xxx.manifest format and contains the file path and prediction result.
    • AI application prediction result:
      • If images are input, a result file is generated for each image in the Image name_result.txt format, for example, IMG_20180919_115016.jpg_result.txt.
      • If audio files are input, a result file is generated for each audio file in the Audio file name_result.txt format, for example, 1-36929-A-47.wav_result.txt.
      • If table data is input, the result file is generated in the Table name_result.txt format, for example, train.csv_result.txt.