Obtaining the List of Metrics Supported by Each Model Type
Function
This API returns the list of monitoring metrics supported by each service model type.
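Constraints
- Region restriction: available only in the CN Southwest-Guiyang1 region.
- API rate limit: the total number of requests to this API from all users must not exceed 1,000 per minute.
- User rate limit: the number of requests to this API from a single user must not exceed 200 per minute.
- Throttling response: when a rate limit is exceeded, the API returns HTTP status code 429 (Too Many Requests).
- Retry recommendation: when throttled, wait 60 seconds before retrying.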
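The throttling and retry guidance above can be sketched as a small helper. This is a minimal sketch, not an official client: `send` is a hypothetical callable that performs the request and returns a `(status_code, body)` tuple, and `max_attempts` is an assumed cap that this document does not specify. Only the 429 status code and the 60-second wait come from the constraints above.

```python
import time
from typing import Callable, Tuple


def call_with_rate_limit_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,          # assumed cap; not specified by the API
    wait_seconds: float = 60.0,     # wait recommended in the constraints
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry while it reports HTTP 429.

    send is a hypothetical callable returning (status_code, body).
    The last response is returned even if it is still a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = send()
    attempts = 1
    while status == 429 and attempts < max_attempts:
        sleep(wait_seconds)         # back off before the next attempt
        status, body = send()
        attempts += 1
    return status, body
```

Passing `sleep` as a parameter keeps the helper testable without real delays; callers that only need the default behavior can omit it.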
URI
GET /v1/{project_id}/maas/monitoring/generation-supported-metrics
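| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Definition: Project ID. For details about how to obtain the project ID, see Obtaining a Project ID and Name. Constraints: N/A. Value range: lowercase letters and digits only; exactly 32 characters. Default value: N/A. |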
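Request Parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| x-auth-token | Yes | String | Definition: User token. Obtain it by calling the IAM API for obtaining a user token (the value of X-Subject-Token in the response header). For details, see Authentication. Constraints: N/A. Value range: N/A. Default value: N/A. |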
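The URI and header described above can be assembled as follows. This is a minimal sketch using only the Python standard library. The `endpoint` argument is an assumption, because this document does not state the service host, and the function name is hypothetical. The project ID check mirrors the documented value range (32 lowercase letters or digits).

```python
import re
import urllib.request

# Documented value range for project_id: 32 lowercase letters or digits.
PROJECT_ID_PATTERN = re.compile(r"[a-z0-9]{32}")


def build_supported_metrics_request(
    endpoint: str, project_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for this API.

    endpoint is an assumed base URL such as "https://<service-host>";
    the real host is not given in this document.
    """
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError("project_id must be 32 lowercase letters or digits")

    url = (
        f"{endpoint.rstrip('/')}/v1/{project_id}"
        "/maas/monitoring/generation-supported-metrics"
    )
    # The token is the X-Subject-Token value returned by IAM.
    return urllib.request.Request(
        url, headers={"X-Auth-Token": token}, method="GET"
    )
```

The returned object can be passed to `urllib.request.urlopen` once a real endpoint and token are available.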
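Response Parameters
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| [Array element] | Array of GenerationMetricsItem objects | The result item for a model type in GenerationSupportMetricsResponse. |

| Parameter | Type | Description |
|---|---|---|
| type | String | Definition: Model type. Value range: the model types returned by this API, such as Text Generation and Rerank (see the response example). |
| metrics | Array of strings | Definition: Monitoring metrics supported by this model type. Value range: N/A. |
| desc_zh | String | Definition: Description in Chinese. Value range: N/A. |
| desc_en | String | Definition: Description in English. Value range: N/A. |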
Status code: 400
Request Example
Query the call statistics metrics supported by each model type.
/v1/{project_id}/maas/monitoring/generation-supported-metrics
Response Example
Status code: 200
Successful response.
[ {
"type" : "Text Generation",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "prompt_token", "completion_token", "avg_prompt_token", "p50_prompt_token", "p80_prompt_token", "p90_prompt_token", "p99_prompt_token", "max_prompt_token", "avg_completion_token", "p50_completion_token", "p80_completion_token", "p90_completion_token", "p99_completion_token", "max_completion_token", "avg_latency", "rpm", "tpm", "avg_ttft", "p50_ttft", "p80_ttft", "p90_ttft", "p99_ttft", "max_ttft", "avg_tpot", "p50_tpot", "p80_tpot", "p90_tpot", "p99_tpot", "max_tpot", "cache_token", "cache_hit_ratio" ],
"desc_zh" : "文本生成类模型",
"desc_en" : "Text generation model"
}, {
"type" : "Video Generation",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "completion_token", "avg_completion_token", "p50_completion_token", "p80_completion_token", "p90_completion_token", "p99_completion_token", "max_completion_token", "avg_generation_time", "rpm", "tpm" ],
"desc_zh" : "视频生成类模型",
"desc_en" : "Video generation model"
}, {
"type" : "Image Generation",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "avg_latency", "avg_generation_time" ],
"desc_zh" : "图像生成类模型",
"desc_en" : "Image generation model"
}, {
"type" : "Vector Model",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "prompt_token", "completion_token", "avg_prompt_token", "p50_prompt_token", "p80_prompt_token", "p90_prompt_token", "p99_prompt_token", "max_prompt_token", "avg_completion_token", "p50_completion_token", "p80_completion_token", "p90_completion_token", "p99_completion_token", "max_completion_token", "avg_latency", "rpm", "tpm", "avg_ttft", "p50_ttft", "p80_ttft", "p90_ttft", "p99_ttft", "max_ttft", "avg_tpot", "p50_tpot", "p80_tpot", "p90_tpot", "p99_tpot", "max_tpot" ],
"desc_zh" : "文本向量化",
"desc_en" : "Vector model"
}, {
"type" : "Embedding",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "prompt_token", "avg_prompt_token", "p50_prompt_token", "p80_prompt_token", "p90_prompt_token", "p99_prompt_token", "max_prompt_token", "avg_latency", "rpm", "tpm" ],
"desc_zh" : "Embedding模型",
"desc_en" : "Embedding model"
}, {
"type" : "Image Understanding",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "prompt_token", "completion_token", "avg_prompt_token", "p50_prompt_token", "p80_prompt_token", "p90_prompt_token", "p99_prompt_token", "max_prompt_token", "avg_completion_token", "p50_completion_token", "p80_completion_token", "p90_completion_token", "p99_completion_token", "max_completion_token", "avg_latency", "rpm", "tpm", "avg_ttft", "p50_ttft", "p80_ttft", "p90_ttft", "p99_ttft", "max_ttft", "avg_tpot", "p50_tpot", "p80_tpot", "p90_tpot", "p99_tpot", "max_tpot" ],
"desc_zh" : "图像理解类模型",
"desc_en" : "Image understanding model"
}, {
"type" : "Rerank",
"metrics" : [ "request_count", "succ_2xx_count", "error_count", "req_count4xx5xx", "error_rate", "total_token", "avg_total_token", "max_total_token", "p50_total_token", "p80_total_token", "p90_total_token", "p99_total_token", "avg_latency", "rpm", "tpm" ],
"desc_zh" : "重排序类模型",
"desc_en" : "Rerank model"
} ]
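A response shaped like the example above can be indexed by model type to check whether a metric is available before querying it. This is a minimal sketch: the function names are hypothetical, and `SAMPLE_BODY` is a trimmed copy of the example response with most metrics omitted for brevity.

```python
import json
from typing import Dict, List

# Trimmed from the response example; the real lists are much longer.
SAMPLE_BODY = """
[
  {"type": "Text Generation",
   "metrics": ["request_count", "avg_ttft", "cache_hit_ratio"],
   "desc_zh": "文本生成类模型", "desc_en": "Text generation model"},
  {"type": "Rerank",
   "metrics": ["request_count", "avg_latency", "rpm", "tpm"],
   "desc_zh": "重排序类模型", "desc_en": "Rerank model"}
]
"""


def index_metrics_by_type(body: str) -> Dict[str, List[str]]:
    """Map each item's "type" to its "metrics" list."""
    items = json.loads(body)
    return {item["type"]: item["metrics"] for item in items}


def supports_metric(
    index: Dict[str, List[str]], model_type: str, metric: str
) -> bool:
    """Return True if model_type is known and lists the metric."""
    return metric in index.get(model_type, [])


metrics_index = index_metrics_by_type(SAMPLE_BODY)
```

With this index, `supports_metric(metrics_index, "Rerank", "avg_ttft")` is false, which matches the example response: the Rerank type lists no time-to-first-token metrics.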
Status code: 400
Failed response.
{
"error_msg" : "The project ID in the request does not match that in the token.",
"error_code" : "ModelArts.0210"
}
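A failure body like the one above can be turned into a Python exception so that callers see the error code and message together. This is a minimal sketch: `ApiError` and `raise_for_error` are hypothetical names, and treating every non-200 status as a failure is an assumption based on the two status codes this document lists.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status plus the error_code and error_msg fields."""

    def __init__(self, status: int, error_code: str, error_msg: str):
        super().__init__(f"HTTP {status} {error_code}: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def raise_for_error(status: int, body: str) -> None:
    """Raise ApiError for any non-200 status; return None on success."""
    if status == 200:
        return
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Fall back to the raw body when the expected fields are missing.
    raise ApiError(
        status,
        payload.get("error_code", ""),
        payload.get("error_msg", body),
    )
```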
Status Codes
| Status Code | Description |
|---|---|
| 200 | Successful response. |
| 400 | Failed response. |
Error Codes
See Error Codes.