Function Instance Types and Usage Modes
This section describes the CPU and GPU instance types, billing modes, and specifications.
Instance Types
Function instances are classified into the following types:
- CPU instances: Basic function instances, suitable for burst traffic and compute-intensive scenarios.
- GPU instances: Instances equipped with GPUs based on the Turing architecture, suitable for audio/video, AI, and image processing scenarios. GPU hardware accelerates these workloads to improve processing efficiency.
GPU instances can be deployed only using container images or custom runtimes.
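For container image deployment, the image typically runs an HTTP server that receives invocation requests and returns results. The following Python sketch shows such an entry point; the listening port (8000), request path, and payload format are illustrative assumptions rather than the exact FunctionGraph contract, so follow the custom image specifications for the real interface.

```python
# Minimal sketch of an HTTP entry point for a container-image-based function.
# The port (8000) and the request/response format are assumptions for
# illustration only; the actual contract is defined by the FunctionGraph
# custom image specifications.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class InvokeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the invocation payload sent by the platform.
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")

        # Place GPU-accelerated inference or media processing here.
        result = {"status": "ok", "received_keys": list(event.keys())}

        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), InvokeHandler).serve_forever()
```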
Usage Modes
CPU and GPU instances can be used in two modes: on-demand and reserved. The two modes are described as follows:
In on-demand mode, FunctionGraph automatically scales instances based on invocations.
Instances are created when requests arrive and destroyed after 1 minute of inactivity. Initial invocations therefore involve a cold start.
Constraints:
By default, a single Huawei Cloud account (IAM account) can have a maximum of 1,000 instances in a region. If you need more instances, submit a service ticket.
Billing mode:
In on-demand mode, billing begins when a request triggers the function and ends when the request is completed. Depending on the configuration, a single instance can process one request at a time or multiple requests concurrently. For details about the execution duration, see Table 1. For details about how to configure concurrent processing, see Configuring Single-Instance Multi-Concurrency.
When no function is invoked, the system does not allocate compute resources and no fees are generated. You will be billed only when the function is invoked and executed. For details about service billing, see FunctionGraph Billing Overview.
Table 1 Execution duration by request mode

| Request Mode | Duration | Example |
|---|---|---|
| Single-instance single-concurrency | The execution duration starts when the request arrives at the instance and ends when the request is completed. | If a request arrives at the instance at 00:00:00 and is completed at 00:00:05, the execution duration is 5 seconds. |
| Single-instance multi-concurrency | The execution duration starts when the first request arrives at the instance and ends when the last request is completed. | If the first request arrives at 00:00:00 and is completed at 00:00:05, and the last request arrives at 00:00:03 and is completed at 00:00:08, and both run in the same instance, the total duration is 8 seconds. |
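To make the difference concrete, the following sketch computes the execution duration for both request modes from request arrival and completion times, using the figures from the multi-concurrency example in Table 1. It is an illustration only; actual billing follows FunctionGraph Billing Overview.

```python
# Illustrative calculation of execution duration per request mode.
# Times are seconds relative to 00:00:00 and match the Table 1 example.
requests = [
    {"arrive": 0, "complete": 5},   # first request: 00:00:00 -> 00:00:05
    {"arrive": 3, "complete": 8},   # last request:  00:00:03 -> 00:00:08
]

# Single-instance single-concurrency: each request occupies its own instance,
# so each instance bills its own request's duration.
per_request = [r["complete"] - r["arrive"] for r in requests]
single_concurrency_total = sum(per_request)

# Single-instance multi-concurrency: the duration spans from the first
# arrival to the last completion on the same instance.
multi_concurrency_total = (
    max(r["complete"] for r in requests) - min(r["arrive"] for r in requests)
)

print(per_request)               # [5, 5] seconds per request
print(single_concurrency_total)  # 10 seconds in total across two instances
print(multi_concurrency_total)   # 8 seconds, as in the Table 1 example
```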
In reserved mode, you manage the lifecycle of function instances yourself, giving you flexible control over computing resources. When reserved instances are configured for a function, FunctionGraph preferentially schedules incoming requests to these instances. If traffic exceeds the capacity of the reserved resources, FunctionGraph automatically scales out and dynamically allocates on-demand instances to ensure service continuity.
After reserved instances are created for a function, its code, dependencies, and initializer are automatically loaded. Reserved instances stay alive in the execution environment, eliminating the impact of cold starts on your services. You are advised to configure a fixed number of reserved instances based on service requirements, or to use scheduled scaling, metric-based scaling, and intelligent recommendation policies aligned with traffic peaks and troughs.
Note:
To ensure service stability and reliability, do not rely on the initializer of a reserved instance to execute one-time tasks.
Billing mode:
For details, see the execution duration (reserved instances) billing item in Billing Items and Reserved Instance Billing.
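As an illustration of the scheduled scaling mentioned above, the sketch below picks a reserved instance count from a simple hour-of-day schedule that follows traffic peaks and troughs. The schedule values and function name are hypothetical; actual policies are configured in FunctionGraph rather than in function code.

```python
# Hypothetical illustration of a scheduled scaling policy for reserved
# instances: more instances during peak hours, fewer during troughs.
# The numbers below are examples, not recommended values.
from datetime import datetime

SCHEDULE = [
    # (start_hour, end_hour, reserved_instance_count)
    (9, 18, 20),   # business hours: keep a larger warm pool
    (18, 23, 10),  # evening: moderate traffic
    (23, 9, 2),    # night trough: small warm pool to avoid cold starts
]

def reserved_count(now: datetime) -> int:
    """Return the reserved instance count for the hour of the given time."""
    hour = now.hour
    for start, end, count in SCHEDULE:
        # Handle windows that wrap around midnight (e.g., 23:00-09:00).
        in_window = (start <= hour < end) if start < end else (hour >= start or hour < end)
        if in_window:
            return count
    return 0

print(reserved_count(datetime(2024, 1, 1, 10)))  # 20 during peak hours
print(reserved_count(datetime(2024, 1, 1, 2)))   # 2 during the night trough
```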
Specifications
- CPU instance
CPU instances include the following specifications.
Table 2 CPU instance specifications

| Memory | Max. Code Package Size | Max. Execution Duration | Max. Disk Size |
|---|---|---|---|
| 128–32,768 MB. The value must be a multiple of 64. | ZIP file: 1.5 GB (after decompression); OBS bucket: 300 MB (after compression) | 259,200 seconds. If the execution takes longer than 900 seconds, use asynchronous invocation. | 512 MB (default) or 10 GB |
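As a quick way to check a memory setting against the constraints in Table 2, the following sketch validates that a value lies in the 128–32,768 MB range and is a multiple of 64; the function name is illustrative only.

```python
# Illustrative check of a CPU instance memory setting against Table 2:
# 128-32,768 MB, in multiples of 64.
def is_valid_cpu_memory(memory_mb: int) -> bool:
    return 128 <= memory_mb <= 32768 and memory_mb % 64 == 0

print(is_valid_cpu_memory(128))    # True (minimum)
print(is_valid_cpu_memory(1000))   # False (not a multiple of 64)
print(is_valid_cpu_memory(32768))  # True (maximum)
```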
- GPU instance
GPU instances have the following specifications.
Table 3 GPU instance specifications

| GPU | Total Graphics Memory | Compute (TFLOPS): FP16 | Compute (TFLOPS): FP32 | Optional Split: Graphics Memory (MB) | Optional Split: Memory (MB) | On-demand Mode | Reserved Mode | Idle Mode |
|---|---|---|---|---|---|---|---|---|
| NVIDIA T4 | 16 GB | 65 | 8 | 1024–16384 (1–16 GB). The value must be a multiple of 1024 MB. | 128–32768. The value must be a multiple of 64. | Supported | Supported | Supported |