ModelArts Metric Collector
Description
Metric Collector, a default built-in plug-in of ModelArts, runs as a node daemon to collect node and job metrics and report them to AOM. For details about the metric list, see Viewing All ModelArts Monitoring Metrics on the AOM Console.

Constraints
- The plug-in is automatically installed during resource pool creation and cannot be uninstalled.
- This plug-in is automatically installed if ModelArts Node Agent is upgraded to the latest version for an existing resource pool.
- During a plug-in upgrade, the metric collection pod restarts, so metrics may not be reported for a short period. Exercise caution when upgrading.
Components
Component | Description | Resource Type |
---|---|---|
modelarts-metric-collector | Node and container metrics collection | DaemonSet |
Parameters
Parameter | Description |
---|---|
Standby Node Metric Reporting | Whether the standby nodes of a dedicated pool report metrics. The default value is false. |
Enable Exporter | Whether third-party monitoring systems such as Prometheus can obtain the metrics collected by ModelArts. If this function is disabled, those systems cannot collect the metrics. This function is enabled by default. For a dedicated pool, enable this function if you want to use inference job metrics for scaling. |
Report Metrics to a Custom Common Prometheus Instance on AOM | Whether metrics are reported to a custom common Prometheus instance. If this function is enabled, metrics are reported to the custom common Prometheus instance, as shown in Figure 2. If this function is disabled (the default), metrics are reported to the Prometheus_AOM_Default instance of AOM, as shown in Figure 3. |
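
When Enable Exporter is on, a third-party system reads the metrics in a form Prometheus understands. The following is a minimal sketch of how a consumer might parse the Prometheus text exposition format. This page does not specify the exporter endpoint or the metric names, so the sample payload and the names `example_node_cpu_usage` and `example_job_gpu_util` are purely illustrative assumptions, not actual ModelArts metrics.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical payload in Prometheus text exposition format.
# The metric and label names are illustrative assumptions only.
SAMPLE = """\
# HELP example_node_cpu_usage Illustrative node CPU usage ratio.
# TYPE example_node_cpu_usage gauge
example_node_cpu_usage{node="node-1"} 0.42
example_node_cpu_usage{node="node-2"} 0.77
example_job_gpu_util{job="train-a",gpu="0"} 93
"""

# metric_name{labels} value [timestamp]
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# label="value" pairs, allowing backslash escapes inside the value
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Return (name, labels, value) for each sample line in the text."""
    samples: List[Sample] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and HELP/TYPE/comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, label_blob, value = match.groups()
        labels = dict(LABEL_RE.findall(label_blob or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

In practice a Prometheus server or an existing client library would do this parsing; the sketch only shows the shape of the data a third-party consumer receives.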