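Last updated: 2026-06-04 GMT+08:00

Building an AI Infrastructure Layer with Inference Pool and Envoy Gateway

This solution combines Envoy Gateway and Envoy AI Gateway with InferencePool to address the pain points enterprises face when deploying generative AI services in production: heavy vendor dependence, weak security controls, costs that are hard to monitor, and high operational complexity. The goal is a stable, secure, and highly scalable AI infrastructure layer.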

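Background

In today's deployment architectures for large language model (LLM) inference services, enterprises are moving from traditional microservice architectures to AI-native ones. As generative AI (GenAI) enters production, infrastructure teams face several challenges:

- Vendor lock-in and fragile connectivity: Enterprises typically integrate several LLM sources, including cloud services such as OpenAI, Anthropic, and AWS Bedrock as well as self-hosted models. Without a unified abstraction layer, switching providers or handling a single point of failure can interrupt services, and automatic failover across providers is hard to implement, which undermines business continuity.
- Missing enterprise-grade security isolation: Calls to AI services lack unified access control (RBAC) and rate limiting. Exposing API keys directly to applications risks leaking sensitive credentials, and egress traffic cannot be authenticated in a unified way, which poses serious security risks.
- Cost and performance as a "black box": LLM calls are expensive and response latency fluctuates widely. Traditional monitoring tools cannot analyze token consumption, model usage patterns, or response performance in depth, so enterprises struggle to understand GenAI cost structure and performance bottlenecks and cannot optimize or allocate resources effectively.
- No traffic management standard: As LLM inference services scale, there is no standardized way (for example, Kubernetes-native APIs) to manage model version switching, weighted traffic splitting, and complex header-based routing. As a result, the operations workload grows steeply with the number of models and O&M efficiency drops.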

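Solution

To address these pain points, this solution uses an integrated "Envoy Gateway + Envoy AI Gateway + InferencePool" architecture. Building on Envoy's proxy technology, which has been proven in large-scale production, it gives enterprises a stable, secure, and highly scalable AI infrastructure layer and removes the infrastructure bottlenecks that hinder GenAI adoption.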

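Standardized implementation based on the Kubernetes Gateway API
- Advanced traffic management: HTTPRoute provides weighted traffic splitting for inference services and supports progressive delivery strategies such as blue-green releases.
- Deep protocol-aware routing: The gateway recognizes OpenAI protocol requests and routes traffic at fine granularity based on the `model` field or custom business headers.
- Mesh-native compatibility: The gateway integrates with mainstream service meshes (for example, Istio) to provide end-to-end traffic encryption and centralized governance.
Resilient connectivity across providers
- Multi-source abstraction: Cloud services such as OpenAI and AWS Bedrock and self-hosted InferencePools are all reached through one unified model invocation interface.
- Intelligent failover: When a self-hosted model pool is overloaded or a third-party API is unavailable, traffic can be switched automatically to a standby model to maintain high availability and business continuity.
Enterprise-grade security and compliance
- Upstream authentication: Provider API keys are managed centrally at the gateway layer, which isolates credentials from the application layer and reduces the risk of leaks.
- Fine-grained control: Policy-based access control and multi-dimensional rate limiting prevent API abuse and help keep the system stable and compliant.
Comprehensive observability and extensibility
- Cost and performance analysis: Token consumption, model usage distribution, and response time are tracked in real time, providing key performance indicators (KPIs) for resource optimization and cost control.
- Pluggable architecture: The gateway inherits Envoy's extensibility, so custom features (for example, request rewriting and custom filters) can be developed quickly as plugins to keep pace with the evolving AI ecosystem.

For more information about Envoy AI Gateway, see the Envoy AI Gateway community documentation.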

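Prerequisites

A cluster of v1.32 or later has been created.

The required images are ready.

This tutorial requires the following images. Download them in advance on a machine that has Docker installed and can access the public network.

Download the images.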

docker pull docker.io/envoyproxy/gateway-dev:latest
docker pull docker.io/envoyproxy/ratelimit:master
docker pull docker.io/envoyproxy/ai-gateway-extproc
docker pull docker.io/envoyproxy/ai-gateway-controller
docker pull ghcr.io/llm-d/llm-d-inference-sim:v0.4.0
docker pull registry.k8s.io/gateway-api-inference-extension/epp:v1.0.1
docker pull docker.io/envoyproxy/ai-gateway-testupstream:latest
docker pull docker.io/envoyproxy/envoy:distroless-dev

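Upload the downloaded images to an SWR image repository so that all nodes in the Kubernetes cluster can pull them.
For details about how to upload images, see "Uploading an Image".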
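
The tag-and-push step can be scripted. The sketch below is only an illustration: the registry prefix `swr.example.com/my-org` is a placeholder rather than a real SWR address, and `swr_ref` and `push_to_swr` are hypothetical helper names. It assumes you have already logged in to SWR with `docker login` and that keeping only the image name and tag under a single organization suits your repository layout.

```shell
# Hypothetical SWR registry prefix (assumption): replace it with your own
# SWR endpoint and organization name.
SWR_PREFIX="swr.example.com/my-org"

# Map a source image reference to its SWR reference by keeping only the
# last path component (name:tag) and prepending the SWR prefix.
swr_ref() {
  printf '%s/%s\n' "$SWR_PREFIX" "${1##*/}"
}

# Tag one locally pulled image for SWR and push it.
push_to_swr() {
  docker tag "$1" "$(swr_ref "$1")" && docker push "$(swr_ref "$1")"
}

# Usage (run on the machine that pulled the images):
#   push_to_swr docker.io/envoyproxy/ratelimit:master
#   push_to_swr registry.k8s.io/gateway-api-inference-extension/epp:v1.0.1
```

The value printed by `swr_ref` is also the image name to write into the "values.yaml" files and manifests in the later steps.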

Procedure

Run the following commands on the target node to install Envoy Gateway.

Install Helm. This document uses version 3.19.3 as an example.

curl -O https://get.helm.sh/helm-v3.19.3-linux-amd64.tar.gz
tar xvf helm-v3.19.3-linux-amd64.tar.gz 
cp ./linux-amd64/helm /usr/local/bin/ 
helm version

If the following output is displayed, Helm has been installed.

version.BuildInfo{Version:"v3.19.3", GitCommit:"0707f566a3f4ced24009ef14d67fe0ce69db****", GitTreeState:"clean", GoVersion:"go1.24.10"}

Obtain the Helm chart package.

helm pull oci://docker.io/envoyproxy/gateway-helm --version v0.0.0-latest
tar xvf gateway-helm-v0.0.0-latest.tgz
cd gateway-helm

Modify the image information in the "values.yaml" file.
```
vi values.yaml
```
Replace the following default images with the corresponding images uploaded to Huawei Cloud SWR.
```
docker.io/envoyproxy/gateway-dev:latest
docker.io/envoyproxy/ratelimit:master
```

Prepare the Envoy Gateway configuration files.

Create the base configuration file envoy-gateway-values.yaml.

# Copyright Envoy AI Gateway Authors
# SPDX-License-Identifier: Apache-2.0
# The full text of the Apache license is available in the LICENSE file at
# the root of the repo.


# This file contains the base Envoy Gateway helm values needed for AI Gateway integration.
# This is the minimal configuration that all AI Gateway deployments need.
#
# Use this file when installing Envoy Gateway with:
#   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
#     --version v0.0.0-latest \
#     --namespace envoy-gateway-system \
#     --create-namespace \
#     -f envoy-gateway-values.yaml
#
# For additional features, combine with addon values files:
#   -f envoy-gateway-values.yaml -f examples/token_ratelimit/envoy-gateway-values-addon.yaml
#   -f envoy-gateway-values.yaml -f examples/inference-pool/envoy-gateway-values-addon.yaml


config:
  envoyGateway:
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    logging:
      level:
        default: info
    provider:
      type: Kubernetes
    extensionApis:
      # Not strictly required, but recommended for backward/future compatibility.
      enableEnvoyPatchPolicy: true
      # Required: Enable Backend API for AI service backends.
      enableBackend: true
    # Required: AI Gateway needs to fine-tune xDS resources generated by Envoy Gateway.
    extensionManager:
      hooks:
        xdsTranslator:
          translation:
            listener:
              includeAll: true
            route:
              includeAll: true
            cluster:
              includeAll: true
            secret:
              includeAll: true
          post:
            - Translation
            - Cluster
            - Route
      service:
        fqdn:
          # IMPORTANT: Update this to match your AI Gateway controller service
          # Format: <service-name>.<namespace>.svc.cluster.local
          # Default if you followed the installation steps above:
          hostname: ai-gateway-controller.envoy-ai-gateway-system.svc.cluster.local
          port: 1063

Create the add-on configuration file envoy-gateway-values-addon.yaml.

# Copyright Envoy AI Gateway Authors
# SPDX-License-Identifier: Apache-2.0
# The full text of the Apache license is available in the LICENSE file at
# the root of the repo.

# This addon file adds InferencePool support to Envoy Gateway.
# Use this in combination with the base envoy-gateway-values.yaml:
#
#   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
#     --version v0.0.0-latest \
#     --namespace envoy-gateway-system \
#     --create-namespace \
#     -f ../../manifests/envoy-gateway-values.yaml \
#     -f envoy-gateway-values-addon.yaml
#
# You can also combine with rate limiting:
#   -f ../../manifests/envoy-gateway-values.yaml \
#   -f ../token_ratelimit/envoy-gateway-values-addon.yaml \
#   -f envoy-gateway-values-addon.yaml

config:
  envoyGateway:
    extensionManager:
      # Enable InferencePool custom resource support
      backendResources:
        - group: inference.networking.k8s.io
          kind: InferencePool
          version: v1

Install Envoy Gateway.

helm upgrade -i eg . \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  -f envoy-gateway-values.yaml \
  -f envoy-gateway-values-addon.yaml

If STATUS in the command output is deployed, the installation succeeded.

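Verify the deployment status on the console.
1. On the "Workloads" page of the target cluster, click the "Deployments" tab and confirm that the status of envoy-gateway is "Running".
2. On the "Services" page of the target cluster, click the "Services" tab and confirm that the Service for envoy-gateway has been created.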
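
The same check can also be done from the command line. This is a minimal sketch, assuming `kubectl` is configured for the target cluster; `check_envoy_gateway` is a hypothetical helper name, and the 2-minute timeout is an arbitrary choice.

```shell
# Wait for the envoy-gateway Deployment to become Available, then list the
# Services in the envoy-gateway-system namespace.
check_envoy_gateway() {
  kubectl wait --timeout=2m -n envoy-gateway-system \
    deployment/envoy-gateway --for=condition=Available &&
    kubectl get svc -n envoy-gateway-system
}

# Usage:
#   check_envoy_gateway
```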

Obtain and install the Inference Pool CRDs.

wget https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/manifests.yaml
kubectl apply -f manifests.yaml

Install Envoy AI Gateway.
1. Obtain and install the Envoy AI Gateway CRDs.
```
# Pull the Helm chart for the Envoy AI Gateway CRDs.
helm pull oci://docker.io/envoyproxy/ai-gateway-crds-helm --version v0.0.0-latest

# Extract the chart.
tar xvf ai-gateway-crds-helm-v0.0.0-latest.tgz

# Install the Envoy AI Gateway CRDs.
cd ai-gateway-crds-helm

helm upgrade -i aieg-crd . \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace
```
  If STATUS in the command output is deployed, the installation succeeded.
2. Obtain the Helm chart for the Envoy AI Gateway controller and extract it.
```
helm pull oci://docker.io/envoyproxy/ai-gateway-helm --version v0.0.0-latest
tar xvf ai-gateway-helm-v0.0.0-latest.tgz
```
3. Go to the chart directory and modify the image information in the "values.yaml" file.
```
cd ai-gateway-helm
vi values.yaml
```
  Replace the following default images with the corresponding images uploaded to Huawei Cloud SWR.
```
docker.io/envoyproxy/ai-gateway-extproc
docker.io/envoyproxy/ai-gateway-controller
```
4. Install the Envoy AI Gateway controller.
```
helm upgrade -i aieg . \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace
```
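  If STATUS in the command output is deployed, the installation succeeded.
5. Verify the deployment status on the console.
  1. On the "Workloads" page of the target cluster, click the "Deployments" tab and confirm that the status of ai-gateway-controller is "Running".
  2. On the "Services" page of the target cluster, click the "Services" tab and confirm that the Service for ai-gateway-controller has been created.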

Deploy workloads to test the gateway.

Obtain and deploy the simulated vLLM model (Llama3-8b).

Obtain the configuration files.

# vLLM simulation backend
wget https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/v1.0.1/config/manifests/vllm/sim-deployment.yaml
# InferenceObjective
wget https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/inferenceobjective.yaml
# InferencePool resources
wget https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/v1.0.1/config/manifests/inferencepool-resources.yaml

Modify the image information in the "sim-deployment.yaml" file.
```
vi sim-deployment.yaml
```
Replace the following default image with the corresponding image uploaded to Huawei Cloud SWR.
```
ghcr.io/llm-d/llm-d-inference-sim:v0.4.0
```
Modify the image information in the "inferencepool-resources.yaml" file.
```
vi inferencepool-resources.yaml
```
Replace the following default image with the corresponding image uploaded to Huawei Cloud SWR.
```
registry.k8s.io/gateway-api-inference-extension/epp:v1.0.1
```
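
The text above does not show the command that applies the edited files. A minimal sketch, assuming the three downloaded files are in the current directory and `kubectl` points at the target cluster, could look like this (`deploy_llama_sim` is a hypothetical helper name):

```shell
# Apply the simulated vLLM backend, the InferenceObjective, and the
# InferencePool resources after their image fields have been edited.
deploy_llama_sim() {
  kubectl apply \
    -f sim-deployment.yaml \
    -f inferenceobjective.yaml \
    -f inferencepool-resources.yaml
}

# Usage:
#   deploy_llama_sim
```

The YAML manifests in the following steps can be applied the same way after saving each one to a file of your choice.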

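Obtain and deploy the simulated Mistral model.

Replace the following image names in the code with the corresponding images prepared in Prerequisites.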

docker.io/envoyproxy/ai-gateway-testupstream:latest
registry.k8s.io/gateway-api-inference-extension/epp:v1.0.1

The complete example is as follows.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mistral-upstream
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mistral-upstream
  template:
    metadata:
      labels:
        app: mistral-upstream
    spec:
      containers:
        - name: testupstream
          image: docker.io/envoyproxy/ai-gateway-testupstream:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          env:
            - name: TESTUPSTREAM_ID
              value: test
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 1
            periodSeconds: 1
---
apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
  name: mistral
  namespace: default
spec:
  targetPorts:
    - number: 8080
  selector:
    matchLabels:
      app: mistral-upstream
  endpointPickerRef:
    name: mistral-epp
    port:
      number: 9002
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: mistral
  namespace: default
spec:
  priority: 10
  poolRef:
    # Bind the InferenceObjective to the InferencePool.
    name: mistral
---
apiVersion: v1
kind: Service
metadata:
  name: mistral-epp
  namespace: default
spec:
  selector:
    app: mistral-epp
  ports:
    - protocol: TCP
      port: 9002
      targetPort: 9002
      appProtocol: http2
  type: ClusterIP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mistral-epp
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mistral-epp
  namespace: default
  labels:
    app: mistral-epp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mistral-epp
  template:
    metadata:
      labels:
        app: mistral-epp
    spec:
      serviceAccountName: mistral-epp
      # Conservatively, this timeout should mirror the longest grace period of the pods within the pool
      terminationGracePeriodSeconds: 130
      containers:
        - name: epp
          image: registry.k8s.io/gateway-api-inference-extension/epp:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - --pool-name
            - "mistral"
            - "--pool-namespace"
            - "default"
            - --v
            - "4"
            - --zap-encoder
            - "json"
            - --grpc-port
            - "9002"
            - --grpc-health-port
            - "9003"
            - "--config-file"
            - "/config/default-plugins.yaml"
          ports:
            - containerPort: 9002
            - containerPort: 9003
            - name: metrics
              containerPort: 9090
          livenessProbe:
            grpc:
              port: 9003
              service: inference-extension
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            grpc:
              port: 9003
              service: inference-extension
            initialDelaySeconds: 5
            periodSeconds: 10
          volumeMounts:
            - name: plugins-config-volume
              mountPath: "/config"
      volumes:
        - name: plugins-config-volume
          configMap:
            name: plugins-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: plugins-config
  namespace: default
data:
  default-plugins.yaml: |
    apiVersion: inference.networking.x-k8s.io/v1alpha1
    kind: EndpointPickerConfig
    plugins:
    - type: queue-scorer
    - type: kv-cache-utilization-scorer
    - type: prefix-cache-scorer
    schedulingProfiles:
    - name: default
      plugins:
      - pluginRef: queue-scorer
      - pluginRef: kv-cache-utilization-scorer
      - pluginRef: prefix-cache-scorer
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-read
  namespace: default
rules:
  - apiGroups: ["inference.networking.x-k8s.io"]
    resources: ["inferenceobjectives", "inferencepools"]
    verbs: ["get", "watch", "list"]
  - apiGroups: ["inference.networking.k8s.io"]
    resources: ["inferencepools"]
    verbs: ["get", "watch", "list"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-read-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: mistral-epp
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-read
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auth-reviewer
rules:
  - apiGroups:
      - authentication.k8s.io
    resources:
      - tokenreviews
    verbs:
      - create
  - apiGroups:
      - authorization.k8s.io
    resources:
      - subjectaccessreviews
    verbs:
      - create
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auth-reviewer-binding
subjects:
  - kind: ServiceAccount
    name: mistral-epp
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: auth-reviewer

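Obtain and deploy a traditional backend using AIServiceBackend.

Replace the following image name in the code with the corresponding image prepared in Prerequisites.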

docker.io/envoyproxy/ai-gateway-testupstream:latest

The complete example is as follows.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: envoy-ai-gateway-basic-testupstream
  namespace: default
spec:
  endpoints:
    - fqdn:
        hostname: envoy-ai-gateway-basic-testupstream.default.svc.cluster.local
        port: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy-ai-gateway-basic-testupstream
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy-ai-gateway-basic-testupstream
  template:
    metadata:
      labels:
        app: envoy-ai-gateway-basic-testupstream
    spec:
      containers:
        - name: testupstream
          image: docker.io/envoyproxy/ai-gateway-testupstream:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          env:
            - name: TESTUPSTREAM_ID
              value: test
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 1
            periodSeconds: 1
---
apiVersion: v1
kind: Service
metadata:
  name: envoy-ai-gateway-basic-testupstream
  namespace: default
spec:
  selector:
    app: envoy-ai-gateway-basic-testupstream
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP

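Deploy the Gateway.

So that this step also works in an intranet environment, the EnvoyProxy configuration is customized and the Envoy Service is set to the NodePort type, which makes it easy to test gateway routing quickly.

Replace the following image name in the code with the corresponding image prepared in Prerequisites.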

docker.io/envoyproxy/envoy:distroless-dev

The complete example is as follows.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: nodeport-config
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        type: NodePort
      envoyDeployment:
        container:
          image: docker.io/envoyproxy/envoy:distroless-dev
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: inference-pool-with-aigwroute
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: nodeport-config
    namespace: envoy-gateway-system
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-pool-with-aigwroute
  namespace: default
spec:
  gatewayClassName: inference-pool-with-aigwroute
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: inference-pool-with-aigwroute
  namespace: default
spec:
  parentRefs:
    - name: inference-pool-with-aigwroute
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    # Route for vLLM Llama model via InferencePool
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: meta-llama/Llama-3.1-8B-Instruct
      backendRefs:
        - group: inference.networking.k8s.io
          kind: InferencePool
          name: vllm-llama3-8b-instruct
    # Route for Mistral model via InferencePool
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: mistral:latest
      backendRefs:
        - group: inference.networking.k8s.io
          kind: InferencePool
          name: mistral
    # Route for traditional backend (non-InferencePool)
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: some-cool-self-hosted-model
      backendRefs:
        - name: envoy-ai-gateway-basic-testupstream

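Verify the deployment status.
1. On the "Workloads" page of the target cluster, click the "Deployments" tab and confirm that all workloads are in the "Running" state.
2. On the "Services" page of the target cluster, click the "Services" tab and confirm that the corresponding Services have been created.
  On this page, obtain and record the access address and NodePort of the Envoy Service, and combine them in the format [IP address]:[NodePort]. In all subsequent tests, replace `$GATEWAY_IP` in the commands with this address.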
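
The address can also be assembled from the command line. The sketch below rests on several assumptions: the generated Envoy Service lives in the envoy-gateway-system namespace and carries the label `gateway.envoyproxy.io/owning-gateway-name`, its first port is the HTTP listener, and you supply a reachable node IP yourself. `gateway_addr` is a hypothetical helper name and `192.0.2.10` is a placeholder IP. Check the label on your own Service before relying on it.

```shell
# Print <node IP>:<NodePort> for the Gateway named inference-pool-with-aigwroute.
# $1 is a node IP address that your client can reach.
gateway_addr() {
  node_ip="$1"
  # Assumption: the Service generated for the Gateway is selected by this label.
  node_port="$(kubectl get svc -n envoy-gateway-system \
    -l gateway.envoyproxy.io/owning-gateway-name=inference-pool-with-aigwroute \
    -o jsonpath='{.items[0].spec.ports[0].nodePort}')"
  printf '%s:%s\n' "$node_ip" "$node_port"
}

# Usage (192.0.2.10 is a placeholder node IP):
#   export GATEWAY_IP="$(gateway_addr 192.0.2.10)"
```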

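Test the gateway's ability to route to different models and backends.

Test routing to the Llama-3 model.

Run the following command to verify that the gateway routes requests to the Llama3 model.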

curl -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "Hi. Say this is a test"
            }
        ]
    }' \
  http://$GATEWAY_IP/v1/chat/completions

If the response contains a content field, the request was routed successfully and the model is running.

{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"The temperature there is twenty-five degrees centigrade. Give a man a fish and you feed him for a day; Teach a man to fish","role":"assistant"}}],"created":1767755896,"do_remote_decode":false,"do_remote_prefill":false,"id":"chatcmp-561ca69e-9716-411f-9656-7a96d9******","model":"meta-llama/llama-3.1-8B-Instruct","object":"chat.completion","remote_block_id":"","remote_engine_id":"","remote_host":"","remote_port":0,"usage":{"completion_tokens":28,"prompt_tokens":7,"total_tokens":35}}

Test routing to the Mistral model.

Run the following command to verify that the gateway routes requests to Mistral.

curl -H "Content-Type: application/json" \
  -d '{
        "model": "mistral:latest",
        "messages": [
            {
                "role": "user",
                "content": "Hi. Say this is a test"
            }
        ]
    }' \
  http://$GATEWAY_IP/v1/chat/completions

If the response contains a content field, the request was routed successfully and the model is running.

{"choices":[{"message":{"content":"This is a test.","role":"assistant"}}]}

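Test routing to the regular backend workload.

Run the following command to verify that the gateway routes requests to the custom backend workload.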

curl -H "Content-Type: application/json" \
  -d '{
        "model": "some-cool-self-hosted-model",
        "messages": [
            {
                "role": "user",
                "content": "Hi. Say this is a test"
            }
        ]
    }' \
  http://$GATEWAY_IP/v1/chat/completions

If the response contains a content field, the request was routed successfully and the backend is running.

{"choices":[{"message":{"role":"assistant","content":"I am the captain of my soul."}}]}
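
The three tests differ only in the `model` value, so they can be run in one loop. This is a sketch rather than part of the original procedure: `chat_payload` and `chat_via_gateway` are hypothetical helper names, and `$GATEWAY_IP` is assumed to already hold the [IP address]:[NodePort] value recorded earlier.

```shell
# Build the OpenAI-style chat completion request body for a given model name.
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"Hi. Say this is a test"}]}' "$1"
}

# Send one chat completion request through the gateway for the given model.
chat_via_gateway() {
  curl -s -H "Content-Type: application/json" \
    -d "$(chat_payload "$1")" \
    "http://$GATEWAY_IP/v1/chat/completions"
}

# Usage: exercise all three routes defined in the AIGatewayRoute.
#   for m in meta-llama/Llama-3.1-8B-Instruct mistral:latest some-cool-self-hosted-model; do
#     chat_via_gateway "$m"; echo
#   done
```

Each request should come back with a content field, as in the individual tests above.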
