Secure Boundaries

The shared responsibility model is a cooperative arrangement in which providers and customers each take on part of the security and compliance responsibilities for cloud services.

Providers manage the cloud infrastructure and supply secure hardware and software to ensure service availability. Customers protect their own data and applications and meet the compliance requirements that apply to them.

Providers are responsible for the services and functions they offer and should:

- Establish and maintain secure infrastructure, including networks, servers, and storage devices.
- Provide reliable underlying platforms to ensure runtime security for the environment.
- Provide identity authentication and access control to ensure that only authorized users can access the cloud services and tenants are isolated from each other.
- Provide reliable backup and disaster recovery to prevent data loss due to hardware faults or natural disasters.
- Provide transparent monitoring and incident response services, security updates, and vulnerability patches.

Customers should:

- Encrypt data and applications for confidentiality and integrity.
- Ensure that the model software is securely updated and vulnerabilities are fixed.
- Comply with related regulations, such as GDPR, HIPAA, and PCI DSS.
- Control access to ensure that only authorized users can access and manage resources such as online services.
- Monitor and report any abnormal activity and take actions in a timely manner.

Inference Deployment Security Responsibilities

Providers
- Apply patches to the underlying ECSs.
- Upgrade Kubernetes (K8s) and fix its vulnerabilities.
- Maintain the VM OS throughout its lifecycle.
- Ensure the security and compliance of the ModelArts inference platform.
- Improve the security of containerized application services.
- Upgrade the model runtime environment and fix vulnerabilities periodically.

Customers
- Authorize resource use and control access.
- Ensure the security of applications, their supply chains, and their dependencies through security scanning, auditing, and access verification.
- Minimize permissions and limit the distribution of credentials.
- Ensure the security of models (custom images, OBS models, and dependencies) during runtime.
- Apply updates and fix vulnerabilities in a timely manner.
- Securely store sensitive data such as credentials.

Best Practices for Inference Deployment Security

External service authorization
ModelArts inference needs authorization to access other cloud services. Grant only the permissions that are actually required. For example, grant a tenant access permission on an OBS bucket for model management.
Internal resource authorization
ModelArts inference supports fine-grained permission control. Configure each user's permissions according to actual needs so that access to specific resources is restricted.
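
Both kinds of authorization above come down to granting no more than a task needs. The following is a minimal sketch of that check, not a ModelArts or OBS API: the function name and the action names (`read_object`, `list_bucket`, `delete_object`) are hypothetical placeholders chosen for illustration.

```python
def excess_permissions(granted, required):
    """Return the granted actions that the task does not actually need."""
    return sorted(set(granted) - set(required))


# Hypothetical action names, for illustration only.
required = {"read_object", "list_bucket"}
granted = {"read_object", "list_bucket", "delete_object"}

# Anything listed here is a candidate for removal from the grant.
extra = excess_permissions(granted, required)
```

A review step like this could run before a policy is applied, so that any action outside the required set is noticed and removed.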
Model management
To decouple models from images and protect model assets, import models dynamically from training jobs or OBS. Upgrade the dependency packages of your models, and fix vulnerabilities in open-source and third-party packages. Decouple sensitive information from the model and configure it when you deploy the real-time service. Select the runtime environment recommended by ModelArts, because earlier environments may contain security vulnerabilities.

When you create a model from a container image, choose an open, trusted base image, for example, one from openEuler or Ubuntu. Run the image as a non-root user rather than as root. Install only the secure packages required at runtime, keep the image small, and upgrade installed packages to the latest versions without known vulnerabilities. Decouple sensitive information from the image and provide it during service deployment; do not hardcode it in the Dockerfile. Scan images for security issues periodically and install patches to fix vulnerabilities. To facilitate alarm reporting and fault rectification, add a health check interface and ensure that it returns the service status correctly. To protect service data, use HTTPS for transmission and strong cipher suites in containers.
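
The health check interface mentioned above could look like the following minimal sketch, which uses only the Python standard library. The `/health` path, port 8080, the `model_loaded` flag, and the JSON body are assumptions made for illustration; ModelArts may expect a different path or response format, and a production service would also serve over HTTPS.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def health_payload(model_loaded: bool) -> Tuple[int, bytes]:
    """Return the HTTP status code and JSON body describing service health."""
    status = 200 if model_loaded else 503
    body = json.dumps({"status": "ok" if model_loaded else "unavailable"})
    return status, body.encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical flag; a real service would set it once the model has loaded.
    model_loaded = True

    def do_GET(self):
        if self.path != "/health":  # hypothetical health check path
            self.send_error(404)
            return
        status, body = health_payload(self.model_loaded)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # The service port is the only port that listens on all interfaces.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

Returning 503 while the model is unavailable lets the platform distinguish a container that is running from a service that is ready to answer requests.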
Model deployment
To prevent services from being overloaded and resources from being wasted, set appropriate compute node specifications during deployment. Do not listen on additional ports in the container. If other ports must be accessed locally, bind them to localhost. Do not pass sensitive information directly through environment variables. Encrypt sensitive information with an encryption component before transmitting it.
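
As an illustration of the localhost advice above, the sketch below opens an auxiliary TCP listener that is reachable only from inside the container. The function name `open_local_listener` is hypothetical; the socket calls are from the Python standard library.

```python
import socket


def open_local_listener(port: int = 0) -> socket.socket:
    """Open a TCP listener bound to the loopback interface only."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to 127.0.0.1 rather than 0.0.0.0 so that the port is not
    # exposed outside the container. Port 0 lets the OS pick a free port.
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = open_local_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

Binding to `0.0.0.0` instead would make the auxiliary port reachable from any network interface of the container, which is what the guidance is meant to avoid.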

The app authentication key is an access credential for real-time services. Keep the app key secure.
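
One way a client might store the app key safely is to keep it in a file that only its owner can access, rather than hardcoding it in source code. The sketch below assumes a POSIX file system; the function name `load_app_key` and the file layout are hypothetical and not part of ModelArts.

```python
import os
import stat


def load_app_key(path: str) -> str:
    """Read an app key from a file that only its owner can access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Refuse files that grant any permission to group or other users.
    if mode & 0o077:
        raise PermissionError(
            f"{path} is accessible to other users (mode {mode:o})"
        )
    with open(path, encoding="utf-8") as f:
        return f.read().strip()
```

A caller would create the key file with mode `600` and pass its path to `load_app_key`; a file with looser permissions is rejected before the key is read.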