
Updated: 2025-02-25 GMT+08:00


Deploying a Real-Time Service

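Deploying a service covers the following:

Initializing a Predictor for a real-time service that has already been deployed.
Deploying a real-time service (predictor).
Deploying a batch service (transformer).

A deployment returns a Predictor service object, whose attributes cover all the functions described in the Service Management section.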

Sample Code

In a ModelArts notebook, Session authentication requires no authentication parameters. For Session authentication on other platforms, see Session Authentication.

Method 1: Initialize a Predictor for an already deployed real-time service

from modelarts.session import Session
from modelarts.model import Predictor

session = Session()
predictor_instance = Predictor(session, service_id="your_service_id")

Method 2: Deploy a real-time service (predictor)

Deploy the service to a public resource pool

from modelarts.session import Session
from modelarts.model import Model
from modelarts.config.model_config import ServiceConfig, Schedule

session = Session()
model_instance = Model(session, model_id='your_model_id')
vpc_id = None                # (Optional) ID of the VPC in which the real-time service instance is deployed. Empty by default.
subnet_network_id = None     # (Optional) Network ID of the subnet. Empty by default.
security_group_id = None     # (Optional) Security group. Empty by default.
configs = [ServiceConfig(model_id=model_instance.model_id,
                         weight="100",
                         instance_count=1,
                         specification="modelarts.vm.cpu.2u")]  # See the specification field in Table 3.
predictor_instance = model_instance.deploy_predictor(
    service_name="service_predictor_name",
    infer_type="real-time",
    vpc_id=vpc_id,
    subnet_network_id=subnet_network_id,
    security_group_id=security_group_id,
    configs=configs,         # Predictor configuration. See the configs format description below.
    schedule=[Schedule(op_type='stop', time_unit='HOURS', duration=1)]  # (Optional) How long the real-time service runs.
)

The model_id parameter identifies the model to deploy as a real-time service. You can obtain it by querying the model list or from the ModelArts management console.

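Deploy the service to a dedicated resource pool

from modelarts.config.model_config import ServiceConfig

# model_instance is the Model object created as in the public resource pool example.
configs = [ServiceConfig(model_id=model_instance.model_id,
                         weight="100",
                         instance_count=1,
                         specification="modelarts.vm.cpu.2u")]
predictor_instance = model_instance.deploy_predictor(
    service_name="your_service_name",
    infer_type="real-time",
    configs=configs,
    cluster_id="your dedicated pool id"   # ID of the old-version dedicated resource pool (see Table 2).
)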

Format of the configs parameter: the SDK defines it with the ServiceConfig class. configs is a list whose elements are ServiceConfig objects. It is defined as follows:

from modelarts.config.model_config import ServiceConfig

configs = []
envs = {"model_name": "mxnet-model-1", "load_epoch": "0"}

service_config1 = ServiceConfig(
    model_id="model_id1",                 # model_id1 and model_id2 must be the model IDs of different versions of the same model.
    weight="70",
    specification="modelarts.vm.cpu.2u",  # See the specification field in Table 3.
    instance_count=2,
    envs=envs)                            # (Optional) Environment variables, for example the envs dict above.
service_config2 = ServiceConfig(
    model_id='model_id2',
    weight="30",
    specification="modelarts.vm.cpu.2u",  # See the specification field in Table 3.
    instance_count=2,
    envs=envs)                            # (Optional) Environment variables, for example the envs dict above.
configs.append(service_config1)
configs.append(service_config2)

Method 3: Deploy a batch service (transformer)

from modelarts.session import Session
from modelarts.model import Model
from modelarts.config.model_config import TransformerConfig

session = Session()
model_instance = Model(session, model_id='your_model_id')
vpc_id = None                # (Optional) ID of the VPC in which the batch service instance is deployed. Empty by default.
subnet_network_id = None     # (Optional) Network ID of the subnet. Empty by default.
security_group_id = None     # (Optional) Security group. Empty by default.

# configs must be built first, as shown in the configs format description below.
transformer = model_instance.deploy_transformer(
    service_name="service_transformer_name",
    infer_type="batch",
    vpc_id=vpc_id,
    subnet_network_id=subnet_network_id,
    security_group_id=security_group_id,
    configs=configs          # Transformer configuration. See the configs format description below.
)

Format of the configs parameter: the SDK defines it with the TransformerConfig class. configs is a list whose elements are TransformerConfig objects. It is defined as follows:

from modelarts.config.model_config import TransformerConfig

configs = []
mapping_rule = None          # (Optional) Mapping between input parameters and CSV data.
mapping_type = "file"        # "file" or "csv"
envs = {"model_name": "mxnet-model-1", "load_epoch": "0"}

transformer_config1 = TransformerConfig(
    model_id="model_id",
    specification="modelarts.vm.cpu.2u",   # See the specification field in Table 4.
    instance_count=2,
    src_path="/shp-cn4/sdk-demo/",         # OBS path of the batch task input data, for example "/your_obs_bucket/src_path".
    dest_path="/shp-cn4/data-out/",        # OBS path of the batch task output, for example "/your_obs_bucket/dest_path".
    req_uri="/",
    mapping_type=mapping_type,
    mapping_rule=mapping_rule,
    envs=envs)                             # (Optional) Environment variables, for example the envs dict above.
configs.append(transformer_config1)

Parameters

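Table 1 Parameters for initializing a Predictor
Parameter	Mandatory	Type	Description
service_id	Yes	String	Service ID, which can be obtained from the real-time service page on the ModelArts console.
session	Yes	Object	Session object. For how to initialize it, see Session Authentication.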

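Table 2 Parameters for deploying a predictor or transformer
Parameter	Mandatory	Type	Description
service_name	No	String	Service name, 1 to 64 visible characters (Chinese characters allowed). It must start with an English letter (upper or lower case) or a Chinese character and may contain letters, Chinese characters, digits, hyphens (-), and underscores (_).
description	No	String	Service description. Empty by default. A maximum of 100 characters.
infer_type	No	String	Inference mode. The value can be real-time, batch, or edge. The default is real-time. real-time indicates a real-time service: the model is deployed as a web service with an online test UI and monitoring, and the service keeps running. batch indicates a batch service, which runs inference on batch data and stops automatically once data processing is complete. edge indicates an edge service: the model is deployed as a web service on an edge node through Huawei Cloud Intelligent EdgeFabric (IEF), and the node must be created in IEF beforehand.
vpc_id	No	String	ID of the VPC in which the real-time service instance is deployed. Empty by default, in which case ModelArts allocates a dedicated VPC to each user and users are isolated from each other. To access other service components in a VPC under your account from the service instance, set this parameter to the ID of that VPC. Once configured, the VPC cannot be changed. If both vpc_id and cluster_id are configured, only the dedicated cluster parameter takes effect.
subnet_network_id	No	String	Network ID of the subnet. Empty by default. Mandatory when vpc_id is configured. Enter the Network ID shown on the subnet details page of the VPC console. A subnet provides dedicated network resources that are isolated from other networks.
security_group_id	No	String	Security group. Empty by default. Mandatory when vpc_id is configured. A security group acts as a virtual firewall and provides secure network access control policies for service instances. It must contain at least one inbound rule that allows requests with protocol TCP, source address 0.0.0.0/0, and port 8080.
configs	Yes	predictor configs structure or transformer configs structure	Model running configuration. When infer_type is batch or edge, only one model can be configured. When infer_type is real-time, multiple models can be configured and assigned weights as required, but the models must have different version numbers.
schedule	No	Array of schedule structures	Service scheduling configuration. Only real-time services support it. Not used by default, in which case the service runs continuously. See Table 6.
cluster_id	No	String	ID of an old-version dedicated resource pool. Empty by default. If configured, the service is deployed to that old-version dedicated resource pool.
pool_name	No	String	Name of a new-version dedicated resource pool.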
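
The constraints in Table 2 can be checked locally before calling the SDK. The following is a minimal sketch, not part of the ModelArts SDK: the helper name check_deploy_args is hypothetical, and "Chinese characters" is approximated by the CJK Unified Ideographs block, which is an assumption. The actual validation performed by the service may differ.

```python
import re

# Assumption: "Chinese characters" means the CJK Unified Ideographs block (U+4E00 to U+9FFF).
_NAME_RE = re.compile(r"[A-Za-z\u4e00-\u9fff][A-Za-z0-9\u4e00-\u9fff_-]{0,63}")
_INFER_TYPES = ("real-time", "batch", "edge")


def check_deploy_args(service_name, infer_type="real-time", description="",
                      vpc_id=None, subnet_network_id=None, security_group_id=None):
    """Return a list of problems; an empty list means none was found (hypothetical helper)."""
    problems = []
    # 1 to 64 characters, starting with a letter or Chinese character.
    if not _NAME_RE.fullmatch(service_name):
        problems.append("service_name violates the naming rule")
    if len(description) > 100:
        problems.append("description exceeds 100 characters")
    if infer_type not in _INFER_TYPES:
        problems.append("infer_type must be real-time, batch, or edge")
    # subnet_network_id and security_group_id become mandatory once vpc_id is set.
    if vpc_id is not None and (subnet_network_id is None or security_group_id is None):
        problems.append("vpc_id requires subnet_network_id and security_group_id")
    return problems


print(check_deploy_args("service_predictor_name"))
print(check_deploy_args("1bad_name", vpc_id="your_vpc_id"))
```

Returning a list rather than raising on the first problem lets a caller report every violation at once.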

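Table 3 predictor configs structure
Parameter	Mandatory	Type	Description
model_id	Yes	String	Model ID. You can obtain it by querying the model list or from the ModelArts management console.
weight	Yes	Integer	Weight percentage: the share of traffic routed to this model. Required only when infer_type is real-time. The weights of all models must add up to 100. When a real-time service is configured with multiple model versions and different traffic weights, ModelArts forwards prediction requests to the corresponding model version instances according to these weights as the prediction API is called continuously. Example: { "service_name": "mnist", "description": "mnist service", "infer_type": "real-time", "config": [ { "model_id": "xxxmodel-idxxx", "weight": "70", "specification": "modelarts.vm.cpu.2u", "instance_count": 1, "envs": { "model_name": "mxnet-model-1", "load_epoch": "0" } }, { "model_id": "xxxxxx", "weight": "30", "specification": "modelarts.vm.cpu.2u", "instance_count": 1 } ] }
specification	Yes	String	Resource specification. The current version supports modelarts.vm.cpu.2u, modelarts.vm.gpu.p4 (application required), and modelarts.vm.ai1.a310 (application required). To use a specification that requires application, submit a service ticket on Huawei Cloud so that ModelArts O&M engineers can grant the permission.
instance_count	Yes	Integer	Number of instances on which the model is deployed. The current maximum is 128. To use more instances, submit a service ticket.
envs	No	Map<String, String>	Key-value pairs of environment variables required to run the model. Optional. Empty by default.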
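
The weight rule above (weights of a real-time service must add up to 100) and the single-model rule for batch and edge services from Table 2 can be sketched as a local pre-check. This is an illustration only: check_configs is a hypothetical helper, and plain dicts stand in for ServiceConfig objects so the sketch runs without the SDK.

```python
def check_configs(infer_type, configs):
    """Return a list of problems found in configs; an empty list means none.

    configs is a list of plain dicts that mirror ServiceConfig fields (an assumption
    made to keep this sketch independent of the SDK).
    """
    if not configs:
        return ["configs must contain at least one model"]
    problems = []
    if infer_type in ("batch", "edge") and len(configs) != 1:
        problems.append(f"{infer_type} supports exactly one model, got {len(configs)}")
    if infer_type == "real-time":
        # The samples pass weights as strings ("70"), so convert before summing.
        total = sum(int(c["weight"]) for c in configs)
        if total != 100:
            problems.append(f"weights must add up to 100, got {total}")
    return problems


configs = [
    {"model_id": "model_id1", "weight": "70"},
    {"model_id": "model_id2", "weight": "30"},
]
print(check_configs("real-time", configs))  # []
print(check_configs("batch", configs))      # one problem: two models for a batch service
```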

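Table 4 transformer configs structure
Parameter	Mandatory	Type	Description
model_id	Yes	String	Model ID.
specification	Yes	String	Resource specification. The current version supports modelarts.vm.cpu.2u and modelarts.vm.gpu.p4.
instance_count	Yes	Integer	Number of instances on which the model is deployed. During the invitational test phase, the value range is [1, 2].
envs	No	Map<String, String>	Key-value pairs of environment variables required to run the model. Optional. Empty by default.
src_path	Yes	String	OBS path of the batch task input data.
dest_path	Yes	String	OBS path of the batch task output.
req_uri	Yes	String	Inference API called in the batch task, that is, a REST API exposed by the model image. Select one API path from the model's config.json file for this inference. If you use a preset inference image provided by ModelArts, the API is "/".
mapping_type	Yes	String	Mapping type of the input data. The value can be file or csv. file: each inference request corresponds to one file in the input data directory. In this mode, the req_uri of the model can have only one input parameter, and its type must be file. csv: each inference request corresponds to one row of data in a CSV file. In this mode, files in the input data directory must have the .csv extension, and mapping_rule must be configured to specify the CSV index of each parameter in the inference request body. Example of creating a batch service with the file mapping type: { "service_name": "batchservicetest", "description": "", "infer_type": "batch", "config": [{ "model_id": "598b913a-af3e-41ba-a1b5-bf065320f1e2", "specification": "modelarts.vm.cpu.2u", "instance_count": 1, "src_path": "https://infers-data.obs.example.com/xgboosterdata/", "dest_path": "https://infers-data.obs.example.com/output/", "req_uri": "/", "mapping_type": "file" }] } Example of creating a batch service with the csv mapping type: { "service_name": "batchservicetest", "description": "", "infer_type": "batch", "config": [{ "model_id": "598b913a-af3e-41ba-a1b5-bf065320f1e2", "specification": "modelarts.vm.cpu.2u", "instance_count": 1, "src_path": "https://infers-data.obs.example.com/xgboosterdata/", "dest_path": "https://infers-data.obs.example.com/output/", "req_uri": "/", "mapping_type": "csv", "mapping_rule": { "type": "object", "properties": { "data": { "type": "object", "properties": { "req_data": { "type": "array", "items": [{ "type": "object", "properties": { "input5": { "type": "number", "index": 0 }, "input4": { "type": "number", "index": 1 }, "input3": { "type": "number", "index": 2 }, "input2": { "type": "number", "index": 3 }, "input1": { "type": "number", "index": 4 } } }] } } } } } }] }
mapping_rule	No	Map	Mapping between input parameters and CSV data. Required only when mapping_type is csv. The rule is defined in a way similar to the input parameters in the model configuration file config.json: under each parameter of a basic type (string, number, integer, or boolean), configure an index parameter so that the CSV value at that index is used as the parameter value in the inference request. CSV data must be separated by commas (,). index starts from 0. As a special case, a parameter whose index is -1 is ignored. For details, see the csv sample in the transformer deployment examples. The inference request body described by mapping_rule in that sample has the following format: { "data": { "req_data": [{ "input1": 1, "input2": 2, "input3": 3, "input4": 4, "input5": 5 }] } }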
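
To make the index semantics of mapping_rule concrete, the sketch below applies the sample rule to one CSV row and builds the request body it describes. It only illustrates the mapping as documented above and is not the service's implementation: build_request is a hypothetical helper, and the boolean parsing is an assumption.

```python
import csv
import io


def build_request(rule, row):
    """Recursively walk a mapping_rule and fill leaf parameters from one CSV row."""
    kind = rule["type"]
    if kind == "object":
        body = {}
        for name, sub_rule in rule["properties"].items():
            if sub_rule.get("index") == -1:
                continue  # index -1 means the parameter is ignored
            body[name] = build_request(sub_rule, row)
        return body
    if kind == "array":
        return [build_request(item, row) for item in rule["items"]]
    raw = row[rule["index"]]  # index is 0-based
    if kind == "number":
        return float(raw)
    if kind == "integer":
        return int(raw)
    if kind == "boolean":
        return raw.strip().lower() == "true"  # assumption: "true"/"false" literals
    return raw  # string


leaf_indexes = {"input5": 0, "input4": 1, "input3": 2, "input2": 3, "input1": 4}
mapping_rule = {
    "type": "object",
    "properties": {"data": {"type": "object", "properties": {"req_data": {
        "type": "array",
        "items": [{"type": "object", "properties": {
            name: {"type": "number", "index": idx} for name, idx in leaf_indexes.items()
        }}],
    }}}},
}

row = next(csv.reader(io.StringIO("5,4,3,2,1")))
request_body = build_request(mapping_rule, row)
print(request_body)
```

With the row 5,4,3,2,1, column 0 feeds input5 and column 4 feeds input1, which yields the request body shown in the mapping_rule description (as floating-point numbers in this sketch).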

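Table 5 Return values of deploying a predictor or transformer
Parameter	Mandatory	Type	Description
predictor	Yes	Predictor object	Predictor object, whose attributes cover all functions described in the Service Management section.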

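Table 6 schedule structure
Parameter	Mandatory	Type	Description
op_type	Yes	String	Scheduling type. Currently, only stop is supported.
time_unit	Yes	String	Scheduling time unit. The value can be DAYS, HOURS, or MINUTES.
duration	Yes	Integer	Value in the given time unit. For example, to stop the service after 2 hours, set time_unit to HOURS and duration to 2.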
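
As a rough illustration of how a schedule entry translates into a running time, the sketch below converts time_unit and duration into a timedelta. The helper schedule_delay is hypothetical, and the start time is an arbitrary example: the table does not state from which moment the service counts the duration.

```python
from datetime import datetime, timedelta

# Maps the documented time_unit values to timedelta keyword arguments.
_UNIT_TO_KWARG = {"DAYS": "days", "HOURS": "hours", "MINUTES": "minutes"}


def schedule_delay(op_type, time_unit, duration):
    """Return how long the service runs before the scheduled operation (hypothetical helper)."""
    if op_type != "stop":
        raise ValueError("only op_type 'stop' is documented")
    if time_unit not in _UNIT_TO_KWARG:
        raise ValueError("time_unit must be DAYS, HOURS, or MINUTES")
    return timedelta(**{_UNIT_TO_KWARG[time_unit]: duration})


# Assumption: the countdown starts at this example timestamp.
started_at = datetime(2025, 2, 25, 10, 0)
stop_at = started_at + schedule_delay("stop", "HOURS", 2)
print(stop_at)  # 2025-02-25 12:00:00
```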

The following example deploys a real-time predictor in the MXNet handwritten digit recognition project:

from modelarts.session import Session
from modelarts.model import Model
from modelarts.config.model_config import ServiceConfig

session = Session()
model_instance = Model(session, model_id="your_model_id")
configs = []
config1 = ServiceConfig(model_id="your_model_id",
                        weight="100",
                        instance_count=1,
                        specification="modelarts.vm.cpu.2u",
                        envs={"input_data_name": "images",
                              "input_data_shape": "0,1,28,28",
                              "output_data_shape": "0,10"})
configs.append(config1)
# infer_type is omitted, so the default real-time mode is used.
predictor = model_instance.deploy_predictor(service_name="DigitRecognition", configs=configs)

The following example deploys a transformer (batch inference) in the MXNet handwritten digit recognition project:

from modelarts.session import Session
from modelarts.model import Model
from modelarts.config.model_config import TransformerConfig

session = Session()
model_instance = Model(session, model_id="your_model_id")
configs = []
config1 = TransformerConfig(model_id="your_model_id",
                            specification="modelarts.vm.cpu.2u",
                            instance_count=1,
                            envs={"input_data_name": "images",
                                  "input_data_shape": "0,1,28,28",
                                  "output_data_shape": "0,10"},
                            src_path="/w0403/testdigitrecognition/inferimages/",
                            dest_path="/w0403/testdigitrecognition/",
                            req_uri="/",
                            mapping_type="file")
configs.append(config1)
transformer = model_instance.deploy_transformer(service_name="DigitRecognition", infer_type="batch", configs=configs)
