Updated on 2024-11-27 GMT+08:00

Monitoring Metrics

Overview

Cloud Eye monitors the running status of cloud services and the usage of each metric, and lets you create alarm rules for the monitored metrics.

After you enable ROMA Connect, Cloud Eye automatically associates with ROMA Connect monitoring metrics to help you understand the running status of ROMA Connect.

Enabling Cloud Eye

Cloud Eye is enabled by default.

For details about how to view ROMA Connect monitoring metrics, see Querying Cloud Service Monitoring Metrics.

Create an alarm rule to send an alarm notification when the monitoring data meets the specified conditions. For details, see Creating an Alarm Rule.
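The following is a minimal sketch of how an alarm condition of this kind could be evaluated, assuming a rule that fires when the most recent samples of a metric all meet a comparison against a threshold. The function name, the operator table, and the sample data are hypothetical and are not part of the Cloud Eye API; actual alarm rules are configured in Cloud Eye as described in Creating an Alarm Rule.

```python
import operator
from typing import Sequence

# Hypothetical operator table for this sketch; not Cloud Eye's own syntax.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def alarm_triggered(samples: Sequence[float], comparison: str,
                    threshold: float, consecutive: int = 3) -> bool:
    """Return True if the last `consecutive` samples all meet the condition."""
    if consecutive <= 0 or len(samples) < consecutive:
        return False
    compare = COMPARATORS[comparison]
    return all(compare(value, threshold) for value in samples[-consecutive:])


# Made-up fail_task_count samples, one per 5-minute raw data period.
fail_task_count = [0, 0, 2, 3, 1]
print(alarm_triggered(fail_task_count, ">=", 1, consecutive=3))  # True
```

Requiring several consecutive periods is one common way to avoid alarms on a single noisy sample; whether that suits a given metric depends on the project.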

Metrics Supported by FDI

Table 1 Metrics supported by FDI

| ID | Metric | Description | Value Range | Monitored Object | Raw Data Monitoring Period (Minute) |
| --- | --- | --- | --- | --- | --- |
| active_task_count | Active Tasks | Total number of running tasks in the instance. Configure this metric when you want to receive alarm notifications once an exception occurs. This metric is suitable for stable projects where there are few changes in task quantity. | ≥ 0. Unit: count | Instance | 5 |
| task_count | Total Tasks | Total number of FDI tasks in the instance, regardless of the running status. Configure this metric when you want to receive alarm notifications once a task is mistakenly deleted. This metric is suitable for stable projects where there are few changes in task quantity or only a few tasks need to be added or deleted. | ≥ 0. Unit: count | Instance | 5 |
| data_size | Data Size | Total size of data written by all tasks in the instance in the last period. Configure this metric if you want to receive alarm notifications once the total size of written data crosses a given threshold. | ≥ 0. Unit: byte, KB, MB, GB, TB or PB | Instance | 5 |
| data_count | Data Records | Total number of data records written by all tasks in the instance in the last period. Configure this metric if you want to receive alarm notifications once the total number of written data records crosses a given threshold. | ≥ 0. Unit: records | Instance | 5 |
| success_task_count | Successful Tasks | Total number of successful tasks in the instance in the last period. | ≥ 0. Unit: count | Instance | 5 |
| fail_task_count | Failed Tasks | Total number of failed tasks in the instance in the last period. | ≥ 0. Unit: count | Instance | 5 |
| task_fail_count | Failed Count | Number of times a task fails to be executed in the last period. | ≥ 0. Unit: count | Instance | 5 |
| cdc_unsubmitted_transaction_delay | Delay of Earliest Transaction Not Submitted by CDC | The difference between the time of the earliest transaction that is being processed but not submitted by the CDC composite task and the current time. For example, for a MySQL task, this metric indicates the difference between the current system time and the time when the binary log is being read in the task. The value of this metric is consistent with that of Real-time read monitoring at the read end on the View Log page of a task. | ≥ 0. Unit: ms | Task | 1 |
| cdc_submitted_transaction_delay | Delay of Latest Transaction Submitted by CDC | The interval between the time of the latest transaction that has been submitted by the CDC composite task and the current time. This metric applies only to Oracle tasks. It indicates the difference between the current time and the time of the latest transaction that has been CDC processed and successfully synchronized to the destination. Configure the delay threshold based on the actual project data volume. A value of greater than or equal to 3600 seconds (1 hour, that is, 3,600,000 ms in this metric's unit) is recommended. | ≥ 0. Unit: ms | Task | 1 |
| cdc_big_transaction_count | Oversized CDC Transaction Count | Number of oversized transactions read by the CDC task. This metric applies only to Oracle tasks. It indicates the total number of oversized transactions (containing more than 100,000 data records) in the last period (5 minutes). For example, if a service should not have transactions with more than 100,000 data records, set the threshold to greater than or equal to 1. | ≥ 0. Unit: count | Task | 5 |
| cdc_expired_transaction_count | Timed-out CDC Transaction Count | Number of timed-out transactions read by the CDC task. | ≥ 0. Unit: count | Task | 1 |
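Table 1 reports cdc_submitted_transaction_delay in milliseconds, while the recommended threshold is stated in seconds. The sketch below shows the unit conversion and a simple threshold check. The helper names and the sample value are hypothetical, chosen only for illustration.

```python
MS_PER_SECOND = 1000


def seconds_to_ms(seconds: float) -> int:
    """Convert a threshold given in seconds to the metric's unit (ms)."""
    return int(seconds * MS_PER_SECOND)


def delay_exceeds(delay_ms: float, threshold_seconds: float = 3600) -> bool:
    """Return True if a delay sample (ms) meets or exceeds the threshold."""
    return delay_ms >= seconds_to_ms(threshold_seconds)


print(seconds_to_ms(3600))       # 3600000
print(delay_exceeds(4_200_000))  # True: 70 minutes is above the 1-hour threshold
```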

Metrics Supported by APIC

Table 2 Metrics supported by APIC

| ID | Metric | Description | Value Range | Monitored Object | Raw Data Monitoring Period (Minute) |
| --- | --- | --- | --- | --- | --- |
| data_api_request_count | Data API Calls | Number of times that a data API has been called | ≥ 0 | Instance | 1 |
| data_api_max_latency | Maximum Latency for Data API | Maximum latency of a data API | ≥ 0. Unit: ms | Instance | 1 |
| data_api_avg_latency | Average Latency for Data API | Average latency of a data API | ≥ 0. Unit: ms | Instance | 1 |
| data_api_errors | Data API Failures | Number of times that a data API fails | ≥ 0 | Instance | 1 |
| func_api_request_count | Function API Calls | Number of times that a function API has been called | ≥ 0 | Instance | 1 |
| func_api_max_latency | Maximum Latency for Function API | Maximum latency of a function API | ≥ 0. Unit: ms | Instance | 1 |
| func_api_avg_latency | Average Latency for Function API | Average latency of a function API | ≥ 0. Unit: ms | Instance | 1 |
| func_api_errors | Function API Failures | Number of times that a function API fails | ≥ 0 | Instance | 1 |
| requests | API Calls | Number of times that an API has been called | ≥ 0 | Instance | 1 |
| error_4xx | 4xx Errors | Number of times that an API returns a 4xx error | ≥ 0 | Instance | 1 |
| error_5xx | 5xx Errors | Number of times that an API returns a 5xx error | ≥ 0 | Instance | 1 |
| throttled_calls | Throttled API Calls | Number of times that an API call has been throttled | ≥ 0 | Instance | 1 |
| avg_latency | Average Latency | Average latency of an API | ≥ 0. Unit: ms | Instance | 1 |
| max_latency | Maximum Latency | Maximum latency of an API | ≥ 0. Unit: ms | Instance | 1 |
| req_count | API Calls | Number of API calls | ≥ 0 | API | 1 |
| req_count_2xx | 2xx Responses | Number of times that the API returns a 2xx response | ≥ 0 | API | 1 |
| req_count_4xx | 4xx Errors | Number of times that the API returns a 4xx error | ≥ 0 | API | 1 |
| req_count_5xx | 5xx Errors | Number of times that the API returns a 5xx error | ≥ 0 | API | 1 |
| req_count_error | Total Errors | Total number of API errors | ≥ 0 | API | 1 |
| avg_latency | Average Latency | Average latency of the API | ≥ 0. Unit: ms | API | 1 |
| max_latency | Maximum Latency | Maximum latency of the API | ≥ 0. Unit: ms | API | 1 |
| input_throughput | Incoming Traffic | Incoming traffic of the API | ≥ 0. Unit: byte, KB, MB, GB, TB or PB | API | 1 |
| output_throughput | Outgoing Traffic | Outgoing traffic of the API | ≥ 0. Unit: byte, KB, MB, GB, TB or PB | API | 1 |
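The count metrics in Table 2 can be combined into ratios when a relative threshold is more useful than an absolute one. The sketch below derives the share of 5xx responses from req_count and req_count_5xx for one period. This ratio is not a metric listed in the table; the function name and the sample counts are hypothetical.

```python
def share(part: int, total: int) -> float:
    """Return part/total, or 0.0 when there were no calls in the period."""
    if total <= 0:
        return 0.0
    return part / total


# Made-up values for one 1-minute raw data period of a single API.
req_count = 1200
req_count_5xx = 18

rate_5xx = share(req_count_5xx, req_count)
print(f"{rate_5xx:.1%}")  # 1.5%
```

Guarding against a zero total avoids a division error in periods where the API received no calls.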

Metrics Supported by MQS

Table 3 Metrics supported by MQS

| ID | Metric | Description | Value Range | Monitored Object | Raw Data Monitoring Period (Minute) |
| --- | --- | --- | --- | --- | --- |
| current_partitions | Partitions | Number of used partitions in an instance | ≥ 0. Unit: count | Instance | 1 |
| current_topics | Topics | Number of created topics in an instance | ≥ 0. Unit: count | Instance | 1 |
| group_msgs | Accumulated Messages | Total number of accumulated messages in all consumer groups of an instance | ≥ 0. Unit: count | Instance | 1 |
| broker_data_size | Message Size | Total size of messages in the broker | ≥ 0. Unit: byte, KB, MB, GB, TB or PB | Broker | 1 |
| broker_messages_in_rate | Message Creation Rate | Number of messages created per second | ≥ 0. Unit: count/second | Broker | 1 |
| broker_bytes_out_rate | Message Retrieval | Number of bytes retrieved per second | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_bytes_in_rate | Message Creation | Number of bytes created per second | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_public_bytes_in_rate | Public Inbound Traffic | Inbound traffic over public networks per second of the broker | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_public_bytes_out_rate | Public Outbound Traffic | Outbound traffic over public networks per second of the broker | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_fetch_mean | Average Message Retrieval Processing Duration | Average time that the broker spends processing message retrieval requests | ≥ 0. Unit: ms | Broker | 1 |
| broker_produce_mean | Average Message Creation Processing Duration | Average time that the broker spends processing message creation requests | ≥ 0. Unit: ms | Broker | 1 |
| broker_alive | Broker Alive | Whether the MQS broker is alive | ≥ 0 | Broker | 1 |
| broker_connections | Connections | Total number of TCP connections on the MQS broker | ≥ 0. Unit: count | Broker | 1 |
| broker_cpu_usage | CPU Usage | CPU usage of the MQS VM | ≥ 0. Unit: percent | Broker | 1 |
| broker_disk_read_await | Average Disk Read Time | Average time for each disk I/O read in the monitoring period | ≥ 0. Unit: ms | Broker | 1 |
| broker_disk_write_await | Average Disk Write Time | Average time for each disk I/O write in the monitoring period | ≥ 0. Unit: ms | Broker | 1 |
| broker_total_bytes_in_rate | Inbound Traffic | Inbound traffic per second of the MQS broker | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_total_bytes_out_rate | Outbound Traffic | Outbound traffic per second of the MQS broker | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| broker_cpu_core_load | Average Load per CPU Core | Average load of each CPU core of the MQS VM | ≥ 0 | Broker | 1 |
| broker_disk_usage | Disk Capacity Usage | Disk usage of the MQS VM | ≥ 0. Unit: percent | Broker | 1 |
| broker_memory_usage | Memory Usage | Memory usage of the MQS VM | ≥ 0. Unit: percent | Broker | 1 |
| broker_heap_usage | JVM Heap Memory Usage of Kafka | Heap memory usage of the MQS Kafka JVM | ≥ 0. Unit: percent | Broker | 1 |
| produced_messages | Created Messages | Number of messages created per minute by the Rest node | ≥ 0. Unit: count | Broker | 1 |
| topic_bytes_in_rate | Message Creation | Total size of messages created per second by the Rest node | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| topic_bytes_out_rate | Message Retrieval | Total size of messages retrieved per second by the Rest node | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Broker | 1 |
| topic_messages_in_rate | Message Creation Rate | Number of messages created per second | ≥ 0. Unit: count/second | Queue | 1 |
| topic_bytes_out_rate | Message Retrieval | Number of bytes retrieved per second | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Queue | 1 |
| topic_bytes_in_rate | Message Creation | Number of bytes created per second | ≥ 0. Unit: byte/s, KB/s, MB/s, GB/s, TB/s, or PB/s | Queue | 1 |
| topic_messages | Total Messages | Total number of messages in the queue | ≥ 0. Unit: count | Queue | 1 |
| produced_messages | Created Messages | Number of messages that have been created | ≥ 0. Unit: count | Queue | 1 |
| partition_messages | Partition Messages | Total number of messages in the partition | ≥ 0. Unit: count | Queue | 1 |
| messages_consumed | Partition Retrieved Messages | Number of messages retrieved by the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
| messages_remained | Partition Available Messages | Number of messages that can be retrieved in the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
| topic_messages_remained | Topic Available Messages | Number of remaining messages that can be retrieved from the specified topic in the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
| topic_messages_consumed | Topic Retrieved Messages | Number of messages that have been retrieved from the specified topic in the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
| consumer_messages_remained | Consumer Available Messages | Number of remaining messages that can be retrieved in the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
| consumer_messages_consumed | Consumer Retrieved Messages | Number of messages that have been retrieved in the consumer group | ≥ 0. Unit: count | Consumer group | 1 |
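The "available messages" metrics in Table 3 are sampled once per minute, so a short series of samples can indicate whether a consumer group is falling behind. The sketch below computes the average per-minute change in consumer_messages_remained. It is an illustrative calculation, not something Cloud Eye provides; the function name and the sample series are hypothetical.

```python
from typing import Sequence


def backlog_growth_per_minute(remained: Sequence[int]) -> float:
    """Average change per 1-minute sample in the number of available messages.

    A positive result means messages are created faster than they are retrieved.
    """
    if len(remained) < 2:
        return 0.0
    return (remained[-1] - remained[0]) / (len(remained) - 1)


# Made-up consumer_messages_remained samples, one per minute.
samples = [1000, 1400, 1900, 2500]
print(backlog_growth_per_minute(samples))  # 500.0
```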

Metrics Supported by LINK

Table 4 Metrics supported by LINK

| ID | Metric | Description | Value Range | Monitored Object | Raw Data Monitoring Period (Minute) |
| --- | --- | --- | --- | --- | --- |
| online_connections | Online Devices | Number of online devices of a user | ≥ 0. Unit: count | Instance | 1 |
| msg_count | Total Number of Messages | Number of messages sent by all devices of a user | ≥ 0. Unit: count | Instance | 1 |
| msg_tps | TPS | Number of messages sent by devices per second in a measurement period | ≥ 0. Unit: count/s | Instance | 1 |
| msg_max_latency | Maximum Latency for Message Sending | Maximum delay, in milliseconds, of messages sent by devices in a measurement period | ≥ 0. Unit: ms | Instance | 1 |

Dimensions

| Key | Value |
| --- | --- |
| instance_id | ROMA Connect instance |
| fdi | Data integration |
| apic | Service integration |
| kafka_instance_id | Message integration instance |
| kafka_broker | Message integration broker node |
| kafka_rest | Message integration Rest node |
| kafka_topics | Message integration queue |
| kafka_partitions | Message integration partition |
| kafka_groups-partitions | Consumer group of the message integration partition |
| kafka_groups_topics | Consumer group of the message integration queue |
| kafka_groups | Consumer group of message integration |
| link | Device integration |
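When metric data is queried programmatically, each dimension is typically passed as a key and a value. The sketch below validates keys against the list above and builds query parameters, assuming a `dim.N=key,value` encoding. That encoding, the function name, and the placeholder values are assumptions made for illustration; check the Cloud Eye API reference for the exact request format and for which dimensions apply to a given metric.

```python
from typing import Dict

# Dimension keys taken from the Dimensions list in this section.
DIMENSION_KEYS = frozenset({
    "instance_id", "fdi", "apic", "kafka_instance_id", "kafka_broker",
    "kafka_rest", "kafka_topics", "kafka_partitions",
    "kafka_groups-partitions", "kafka_groups_topics", "kafka_groups", "link",
})


def dimension_params(dimensions: Dict[str, str]) -> Dict[str, str]:
    """Build query parameters of the assumed form dim.N=key,value."""
    params: Dict[str, str] = {}
    for index, (key, value) in enumerate(dimensions.items()):
        if key not in DIMENSION_KEYS:
            raise ValueError(f"unknown dimension key: {key!r}")
        params[f"dim.{index}"] = f"{key},{value}"
    return params


# Placeholder value; a real instance ID comes from your own account.
print(dimension_params({"instance_id": "example-instance-id"}))
# {'dim.0': 'instance_id,example-instance-id'}
```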