Specifications for Writing a Model Inference Code File
This section describes the general method of editing model inference code in ModelArts. For details about the custom script examples (including inference code examples) of mainstream AI engines, see Examples of Custom Scripts. This section also provides an inference code example for the TensorFlow engine and an example of customizing the inference logic in the inference script.
Because of an API Gateway limitation, a single prediction in ModelArts cannot take longer than 40 seconds. Keep the model inference code logically clear and concise so that inference performance remains satisfactory.
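While developing an inference script, it can help to time each stage of a request against the 40-second limit. The following is a minimal sketch using only the Python standard library. The names timed_stage, check_budget, and PREDICTION_BUDGET_SECONDS, as well as the 80% warning threshold, are hypothetical and are not part of ModelArts; the sleep calls stand in for real work.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)

# 40 s is the per-prediction limit stated above.
# The 80% warning ratio is an arbitrary, illustrative choice.
PREDICTION_BUDGET_SECONDS = 40.0
WARN_RATIO = 0.8


@contextmanager
def timed_stage(name, timings):
    """Record how long the enclosed block takes under timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def check_budget(timings, budget=PREDICTION_BUDGET_SECONDS):
    """Return the total time and log a warning when it nears the budget."""
    total = sum(timings.values())
    if total > budget * WARN_RATIO:
        logger.warning("prediction took %.2fs of a %.0fs budget: %s",
                       total, budget, timings)
    return total


timings = {}
with timed_stage("preprocess", timings):
    time.sleep(0.01)  # stand-in for real preprocessing
with timed_stage("inference", timings):
    time.sleep(0.02)  # stand-in for real inference
total = check_budget(timings)
```

In a real script, the same context manager could wrap the bodies of _preprocess and _postprocess so that slow stages show up in the service logs.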
Specifications for Writing Inference Code
- In the model inference code file customize_service.py, add a child model class. This child model class inherits properties from its parent model class. For details about the import statements of different types of parent model classes, see Table 1. The ModelArts environment has already configured the necessary Python packages for import statements, so you do not need to install them separately.
Table 1 Parent class and import statement of each model type
| Model Type | Parent Class | Import Statement |
| --- | --- | --- |
| TensorFlow | TfServingBaseService | from model_service.tfserving_model_service import TfServingBaseService |
| PyTorch | PTServingBaseService | from model_service.pytorch_model_service import PTServingBaseService |
| MindSpore | SingleNodeService | from model_service.model_service import SingleNodeService |
- The following methods can be overridden.
Table 2 Methods to be overridden
| Method | Description |
| --- | --- |
| __init__(self, model_name, model_path) | Initialization method, which is suitable for models created based on deep learning frameworks. Models and labels are loaded using this method. To implement model loading logic, override this method for PyTorch and Caffe-based models. |
| __init__(self, model_path) | Initialization method, which is suitable for models created based on machine learning frameworks. This method initializes the model path (self.model_path). In Spark_MLlib, this method also initializes SparkSession (self.spark). |
| _preprocess(self, data) | Preprocess method, which is called before an inference request and converts API request data into the model's expected input format. |
| _inference(self, data) | Inference request method. You are advised not to override this method, as it will replace the built-in inference process in ModelArts with your custom logic. |
| _postprocess(self, data) | Postprocess method, which is called after an inference request is complete and converts the model output to the API output. |
- You can override the preprocess and postprocess methods for preprocessing the API input and postprocessing the inference output.
- Overriding the init method of the parent model class may cause a model to run abnormally.
- The attribute that can be used is the local path to the model. The attribute name is self.model_path. Additionally, PySpark-based models can use self.spark to obtain the SparkSession object in customize_service.py.
The inference code requires an absolute file path for reading files. You can obtain the local path to the model from the self.model_path attribute.
- When TensorFlow, Caffe, or MXNet is used, self.model_path indicates the path to the directory that contains the model file, so a file name can be joined to it directly. The following provides an example:
    # Read the label.json file in the model directory.
    with open(os.path.join(self.model_path, 'label.json')) as f:
        self.label = json.load(f)
- When PyTorch, Scikit_Learn, or PySpark is used, self.model_path indicates the path to the model file itself, so take its parent directory before joining a file name. The following provides an example:
    # Read the label.json file in the model directory.
    dir_path = os.path.dirname(os.path.realpath(self.model_path))
    with open(os.path.join(dir_path, 'label.json')) as f:
        self.label = json.load(f)
- The API accepts data in either multipart/form-data or application/json format for pre-processing, actual inference, and post-processing.
- multipart/form-data request
    curl -X POST \
      <modelarts-inference-endpoint> \
      -F image1=@cat.jpg \
      -F image2=@horse.jpg
The input data is as follows:
    [
        {
            "image1": {
                "cat.jpg": "<cat.jpg file io>"
            }
        },
        {
            "image2": {
                "horse.jpg": "<horse.jpg file io>"
            }
        }
    ]
- application/json request
    curl -X POST \
      <modelarts-inference-endpoint> \
      -d '{ "images":"base64 encode image" }'
The input data is a Python dict:
    {
        "images": "base64 encode image"
    }
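Because a request can arrive in either format, a preprocessing step may need to tell the two apart. The following sketch shows one possible way to do this with the standard library only. It assumes the dict-shaped input that the example scripts in this section iterate over (request key mapped to either a {file name: file object} dict or a Base64 string); extract_image_bytes is a hypothetical helper, not a ModelArts API.

```python
import base64
import io


def extract_image_bytes(data):
    """Return {request_key: raw image bytes} for either request format.

    Assumed input shapes (taken from the examples in this section):
      form-data: {"image1": {"cat.jpg": <file-like object>}}
      JSON:      {"images": "<Base64-encoded image>"}
    """
    images = {}
    for key, value in data.items():
        if isinstance(value, dict):
            # multipart/form-data: each value maps a file name to a file object.
            # If several files share one key, the last one wins.
            for file_name, file_obj in value.items():
                images[key] = file_obj.read()
        elif isinstance(value, str):
            # application/json: the value is a Base64-encoded string.
            images[key] = base64.b64decode(value)
        else:
            raise ValueError("unsupported value for key %r: %s"
                             % (key, type(value).__name__))
    return images


# Simulated requests; io.BytesIO stands in for an uploaded file.
form_request = {"image1": {"cat.jpg": io.BytesIO(b"fake-jpeg-bytes")}}
json_request = {"images": base64.b64encode(b"fake-jpeg-bytes").decode("ascii")}

form_images = extract_image_bytes(form_request)
json_images = extract_image_bytes(json_request)
```

The returned bytes could then be wrapped in io.BytesIO and opened with Pillow inside _preprocess.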
TensorFlow Inference Script Example
- Inference code
    from PIL import Image
    import numpy as np
    from model_service.tfserving_model_service import TfServingBaseService


    class MnistService(TfServingBaseService):

        def _preprocess(self, data):
            preprocessed_data = {}

            # data maps each request key to {file name: file object}.
            for k, v in data.items():
                for file_name, file_content in v.items():
                    image1 = Image.open(file_content)
                    image1 = np.array(image1, dtype=np.float32)
                    # Flatten the image to the 1 x 784 shape expected by the model.
                    image1.resize((1, 784))
                    preprocessed_data[k] = image1

            return preprocessed_data

        def _postprocess(self, data):
            infer_output = {}

            # Return the index of the highest score as the predicted digit.
            for output_name, result in data.items():
                infer_output["mnist_result"] = result[0].index(max(result[0]))

            return infer_output
- Request
    curl -X POST \
      <real-time service address> \
      -F images=@test.jpg
- Response
{"mnist_result": 7}
The preceding sample code resizes the image uploaded through the form to match the model input shape. The 28×28 image (784 pixels) is read using the Pillow library and resized to 1×784 to match the model input. In postprocessing, the model output is converted into a result that the RESTful API can return, here the index of the highest score.
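The shape handling and class selection described above can be tried in isolation. The following sketch deliberately uses plain Python lists instead of Pillow and NumPy so that it runs without extra packages. The helpers flatten_image and pick_class, the output name "scores", and the score values are all hypothetical and serve only as an illustration.

```python
def flatten_image(rows):
    """Flatten a 28x28 grid of pixel values into a single 1x784 batch."""
    flat = [float(pixel) for row in rows for pixel in row]
    if len(flat) != 784:
        raise ValueError("expected 784 pixels, got %d" % len(flat))
    return [flat]


def pick_class(model_output):
    """Mirror the sample _postprocess: index of the highest score of the first sample."""
    infer_output = {}
    for output_name, result in model_output.items():
        infer_output["mnist_result"] = result[0].index(max(result[0]))
    return infer_output


# A blank 28x28 image stands in for a decoded picture.
image = [[0] * 28 for _ in range(28)]
batch = flatten_image(image)

# Made-up scores for the ten digit classes; class 7 has the highest score.
scores = [0.0] * 10
scores[7] = 0.9
prediction = pick_class({"scores": [scores]})
```

Note that list.index only works when the model output is a nested Python list; for NumPy arrays, np.argmax (as used in the custom inference example in this section) is the usual choice.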
Inference Script Example of Custom Inference Logic
Customize a dependency package in the configuration file by referring to Example of the Model Configuration File Using a Custom Dependency Package. Then, use the following code example to load the model in saved_model format for inference.
By default, Python logging in the inference base images displays only logs at the WARNING level or higher. To view INFO logs, set the log level to INFO in the code.
    # -*- coding: utf-8 -*-
    import json
    import os
    import threading

    import numpy as np
    import tensorflow as tf
    from PIL import Image

    from model_service.tfserving_model_service import TfServingBaseService
    import logging

    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    logger = logging.getLogger(__name__)


    class MnistService(TfServingBaseService):

        def __init__(self, model_name, model_path):
            self.model_name = model_name
            self.model_path = model_path
            self.model_inputs = {}
            self.model_outputs = {}

            # The label file can be loaded here and used in the post-processing function.
            # Directories for storing the label.txt file on OBS and in the model package
            # with open(os.path.join(self.model_path, 'label.txt')) as f:
            #     self.label = json.load(f)

            # Load the model in saved_model format in non-blocking mode to prevent blocking timeout.
            thread = threading.Thread(target=self.get_tf_sess)
            thread.start()

        def get_tf_sess(self):
            # Load the model in saved_model format.
            # The session will be reused. Do not use the with statement.
            sess = tf.Session(graph=tf.Graph())
            meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], self.model_path)
            signature_defs = meta_graph_def.signature_def
            self.sess = sess
            signature = []

            # only one signature allowed
            for signature_def in signature_defs:
                signature.append(signature_def)
            if len(signature) == 1:
                model_signature = signature[0]
            else:
                logger.warning("signatures more than one, use serving_default signature")
                model_signature = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY

            logger.info("model signature: %s", model_signature)

            for signature_name in meta_graph_def.signature_def[model_signature].inputs:
                tensorinfo = meta_graph_def.signature_def[model_signature].inputs[signature_name]
                name = tensorinfo.name
                op = self.sess.graph.get_tensor_by_name(name)
                self.model_inputs[signature_name] = op

            logger.info("model inputs: %s", self.model_inputs)

            for signature_name in meta_graph_def.signature_def[model_signature].outputs:
                tensorinfo = meta_graph_def.signature_def[model_signature].outputs[signature_name]
                name = tensorinfo.name
                op = self.sess.graph.get_tensor_by_name(name)
                self.model_outputs[signature_name] = op

            logger.info("model outputs: %s", self.model_outputs)

        def _preprocess(self, data):
            # Two HTTPS request formats
            # 1. Request in form-data format: data = {"Request key value":{"File name":<File io>}}
            # 2. Request in JSON format: data = json.loads("JSON body passed in the API")
            preprocessed_data = {}

            for k, v in data.items():
                for file_name, file_content in v.items():
                    image1 = Image.open(file_content)
                    image1 = np.array(image1, dtype=np.float32)
                    image1.resize((1, 28, 28))
                    preprocessed_data[k] = image1

            return preprocessed_data

        def _inference(self, data):
            feed_dict = {}
            for k, v in data.items():
                if k not in self.model_inputs.keys():
                    logger.error("input key %s is not in model inputs %s", k, list(self.model_inputs.keys()))
                    raise Exception("input key %s is not in model inputs %s" % (k, list(self.model_inputs.keys())))
                feed_dict[self.model_inputs[k]] = v

            result = self.sess.run(self.model_outputs, feed_dict=feed_dict)
            logger.info('predict result : ' + str(result))

            return result

        def _postprocess(self, data):
            infer_output = {"mnist_result": []}
            for output_name, results in data.items():
                for result in results:
                    infer_output["mnist_result"].append(np.argmax(result))

            return infer_output

        def __del__(self):
            self.sess.close()
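Because the script above loads the model in a background thread, a request that arrives very early could reach _inference before self.sess has been assigned. One possible guard, sketched here with the standard library only, is to signal readiness with a threading.Event. BackgroundLoader and its methods are hypothetical names and are not provided by ModelArts; the slow_load function stands in for the real loading logic such as get_tf_sess.

```python
import threading
import time


class BackgroundLoader:
    """Load a resource in a background thread and let callers wait for it."""

    def __init__(self, load_fn):
        self._load_fn = load_fn
        self._ready = threading.Event()
        self._resource = None
        self._error = None
        # daemon=True so a stuck load does not keep the process alive.
        threading.Thread(target=self._load, daemon=True).start()

    def _load(self):
        try:
            self._resource = self._load_fn()
        except Exception as exc:  # keep the error so callers can see it
            self._error = exc
        finally:
            self._ready.set()

    def get(self, timeout=30.0):
        """Return the loaded resource, waiting up to timeout seconds."""
        if not self._ready.wait(timeout):
            raise TimeoutError("model is still loading")
        if self._error is not None:
            raise RuntimeError("model loading failed") from self._error
        return self._resource


def slow_load():
    time.sleep(0.05)  # stand-in for loading a saved_model
    return "session"


loader = BackgroundLoader(slow_load)
session = loader.get(timeout=5.0)
```

With such a guard, _inference would call loader.get() instead of reading self.sess directly. The timeout should stay well below the 40-second prediction limit.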
To load multiple models or models that are not supported by ModelArts, specify the loading path using the __init__ method. Example code:
    # -*- coding: utf-8 -*-
    import os
    from model_service.tfserving_model_service import TfServingBaseService


    class MnistService(TfServingBaseService):
        def __init__(self, model_name, model_path):
            # Obtain the path to the model folder.
            root = os.path.dirname(os.path.abspath(__file__))
            # test.onnx is the name of the model file to be loaded and must be stored in the model folder.
            self.model_path = os.path.join(root, 'test.onnx')

            # Load multiple models, for example, test2.onnx.
            # self.model_path2 = os.path.join(root, 'test2.onnx')
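When several model files are loaded this way, checking that each file exists before loading can make a missing file easier to diagnose. The following is a minimal sketch under that assumption. resolve_model_files is a hypothetical helper, not a ModelArts API, and the file names are the ones used in the example above.

```python
import os


def resolve_model_files(root, file_names):
    """Map each file name to its path under root, failing fast if one is missing."""
    paths = {}
    for name in file_names:
        path = os.path.join(root, name)
        if not os.path.isfile(path):
            raise FileNotFoundError("model file not found: %s" % path)
        paths[name] = path
    return paths


# Possible use inside __init__ (shown as comments because it depends on
# the files actually being present in the model folder):
# root = os.path.dirname(os.path.abspath(__file__))
# paths = resolve_model_files(root, ["test.onnx", "test2.onnx"])
# self.model_path = paths["test.onnx"]
# self.model_path2 = paths["test2.onnx"]
```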