Deploying a Local Service in the Development Environment for Debugging
You can deploy a local service for debugging: after importing or debugging a model (see Importing a Model or Debugging a Model), deploy a predictor in a notebook instance to run inference locally.
A local service can be deployed only in a ModelArts notebook.
- Local service predictor and real-time service predictor in the development environment
- Deploying a local service predictor deploys the model files in the development environment itself. The runtime specifications are those of the notebook's resource flavor. For example, if you deploy a local predictor in a modelarts.vm.cpu.2u notebook, the runtime environment is cpu.2u.
- Deploying a predictor as described in Deploying a Real-Time Service deploys the model files stored in OBS to a container provided by the Service Deployment module. The environment specifications (such as CPU and GPU specifications) are determined by the configs parameter of the predictor (see the configs parameters table below).
- Deploying a predictor as described in Deploying a Real-Time Service requires creating a container based on the AI engine, which is time-consuming, whereas deploying a local service predictor takes at most 10 seconds. Local service predictors are suitable for testing models but are not recommended for production use.
- In this version, the following AI engines can be used to deploy a local service predictor: XGBoost, Scikit_Learn, PyTorch, TensorFlow, and Spark_MLlib. For details about the version, see Supported AI Engines for ModelArts Inference.
Sample Code
In ModelArts Notebook, you do not need to enter authentication parameters for session authentication. For details about session authentication of other development environments, see Session Authentication.
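If you run the code outside a ModelArts notebook, the session must be authenticated explicitly. Below is a minimal sketch assuming AK/SK-based authentication as described in Session Authentication; all values are placeholders to replace with your own credentials.

```python
from modelarts.session import Session

# Hedged sketch: AK/SK session authentication for environments outside
# ModelArts notebook (see Session Authentication). All values are placeholders.
session = Session(access_key="YOUR_AK",
                  secret_key="YOUR_SK",
                  project_id="YOUR_PROJECT_ID",
                  region_name="YOUR_REGION")
```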
Sample code of local TensorFlow 1.8 inference
Configure tensorflow_model_server in the environment. You can call the SDK API to configure it quickly; see the sample code below.
- In a CPU-based environment, call Model.configure_tf_infer_environ(device_type="CPU"). This configuration needs to be performed only once per environment.
- In a GPU-based environment, call Model.configure_tf_infer_environ(device_type="GPU"). This configuration needs to be performed only once per environment.
```python
from modelarts.session import Session
from modelarts.model import Model
from modelarts.config.model_config import ServiceConfig

session = Session()

# GPU-based environment inference configuration
Model.configure_tf_infer_environ(device_type="GPU")
# CPU-based environment inference configuration
# Model.configure_tf_infer_environ(device_type="CPU")

model_instance = Model(
    session,
    model_name="input_model_name",            # Model name
    model_version="1.0.0",                    # Model version
    source_location=model_location,           # Model file path
    model_type="TensorFlow",                  # Model type (TensorFlow for this sample)
    model_algorithm="image_classification",   # Model algorithm
    execution_code="OBS_PATH",
    input_params=input_params,                # See the input_params format description.
    output_params=output_params,              # See the output_params format description.
    dependencies=dependencies,                # See the dependencies format description.
    apis=apis)

configs = [ServiceConfig(model_id=model_instance.get_model_id(),
                         weight="100",
                         instance_count=1,
                         specification="local")]
predictor_instance = model_instance.deploy_predictor(configs=configs)

if predictor_instance is not None:
    # Local inference. data can be raw data or a file path;
    # data_type can be JSON, files, or images.
    predict_result = predictor_instance.predict(data="your_raw_data_or_data_path",
                                                data_type="your_data_type")
    print(predict_result)
```
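As a usage sketch, once the predictor is deployed you can run a quick prediction against a local file. The file path below is a placeholder, and the image-classification use case follows the sample above:

```python
# Usage sketch: predict on a local image file. "./test.jpg" is a placeholder;
# per the sample above, data_type can be JSON, files, or images.
predict_result = predictor_instance.predict(data="./test.jpg", data_type="images")
print(predict_result)
```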
Parameters
deploy_predictor parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| service_name | No | String | Service name, consisting of 1 to 64 characters and starting with a letter. Only letters, digits, underscores (_), and hyphens (-) are allowed. |
| configs | Yes | JSON Array | Local service configurations |
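For example, the optional service_name can be passed together with the mandatory configs. A sketch, reusing model_instance and ServiceConfig from the sample above; the service name is a placeholder:

```python
# Sketch: deploy_predictor with an optional service_name and mandatory configs.
configs = [ServiceConfig(model_id=model_instance.get_model_id(),
                         weight="100",
                         instance_count=1,
                         specification="local")]
predictor_instance = model_instance.deploy_predictor(
    service_name="local-debug-service",  # placeholder: 1-64 chars, starts with a letter
    configs=configs)
```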
configs parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| model_id | Yes | String | Model ID. Obtain the value by calling the API described in Obtaining Models or from the ModelArts management console. |
| weight | Yes | Integer | Traffic weight allocated to the model. Set this parameter to 100 when deploying a local service predictor. |
| specification | Yes | String | Resource specification. Set this parameter to local when deploying a local service. |
| instance_count | Yes | Integer | Number of instances deployed for the model. The maximum is 128. Set this parameter to 1 when deploying a local service predictor. |
| envs | No | Map<String, String> | Environment variable key-value pairs required for running the model. Left blank by default. |
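To illustrate these fields together, a configs entry for a local deployment might also set the optional envs map. A sketch under the assumptions above; the environment variable is hypothetical:

```python
# Sketch: configs entry for a local service predictor, with an optional
# (hypothetical) environment variable passed to the model runtime.
config = ServiceConfig(model_id=model_instance.get_model_id(),
                       weight="100",            # must be 100 for a local predictor
                       specification="local",   # "local" selects local deployment
                       instance_count=1,        # must be 1 for a local predictor
                       envs={"LOG_LEVEL": "debug"})  # hypothetical env var
```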
Return value

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| predictor | Yes | Predictor object | Predictor object, which contains only the attributes described in Testing an Inference Service |