Monitoring a Job
After a job is created, you can view the job details through the following operations:
- Viewing Job Details
- Checking the Dashboard
- Viewing the Job Execution Plan
- Viewing the Task List of a Job
- Viewing Job Audit Logs
- Viewing Job Running Logs
- Viewing Job Tags
Viewing Job Details
This section describes how to view job details. After you create and run a job, you can view job details, including SQL statements and parameter settings. For a user-defined job, you can only view its parameter settings.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column, click the job name to go to the Job Details page. On the Job Details page, you can view the SQL statements, the total cost of the job, and the Parameter List.
Table 1 Parameter description
Parameter
Description
Type
Job type, for example, Flink streaming SQL job.
ID
Job ID.
Status
Status of a job.
Running Mode
If you create a job in a shared cluster, this parameter is Shared.
If you create a job in a user-defined cluster, this parameter is Exclusive.
Cluster
If you create a job in a shared cluster, this parameter is Cluster Shared.
If you create a job in a user-defined cluster, the specific cluster name is displayed.
SPUs
Number of SPUs for a job.
Parallelism
Number of tasks that a CS job can run simultaneously.
Enable Checkpoint
Select Enable Checkpoint to save the intermediate job running results to OBS, thereby preventing data loss in the event of exceptions.
Checkpoint Interval (s)
This parameter is valid only when Enable Checkpoint is set to true.
Interval between storing intermediate job running results to OBS.
Checkpoint Mode
This parameter is valid only when Enable Checkpoint is set to true.
Checkpoint mode. Values include:
- AtLeastOnce: indicates that events are processed at least once.
- ExactlyOnce: indicates that events are processed exactly once.
Save Job Log
Select Save Job Log to save job run logs to OBS so that you can use them to locate faults.
OBS Bucket
This parameter is valid when Enable Checkpoint is true or Save Job Log is true.
Name of the OBS bucket where data is dumped.
Topic Name
SMN topic name. If an exception occurs during job running, CS notifies users of the exception over SMN.
Auto Restart upon Exception
If you enable this function, CS automatically restarts and restores abnormal jobs upon job exceptions.
Idle State Retention Time
Defines for how long the state of a key is retained without being updated before it is removed in GroupBy or Window.
Created
Time when a job is created.
Start Time
Start time of a job.
Enterprise Project
Name of the enterprise project to which a job belongs.
Total Billing Time
Total running duration of a job that is used for billing.
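Several parameters in Table 1 take effect only when another parameter is enabled. The following Python sketch illustrates those dependencies. The `JobSettings` class, its field names, and the `check_settings` function are hypothetical and are not part of any CS API. Treating a missing interval, mode, or bucket as a problem is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobSettings:
    """Hypothetical container mirroring a few rows of Table 1."""
    enable_checkpoint: bool = False
    checkpoint_interval_s: Optional[int] = None
    checkpoint_mode: Optional[str] = None  # "AtLeastOnce" or "ExactlyOnce"
    save_job_log: bool = False
    obs_bucket: Optional[str] = None


VALID_CHECKPOINT_MODES = {"AtLeastOnce", "ExactlyOnce"}


def check_settings(s: JobSettings) -> List[str]:
    """Return a list of dependency problems; an empty list means none found."""
    problems = []
    if s.enable_checkpoint:
        # Interval and mode are valid only when Enable Checkpoint is true.
        if s.checkpoint_interval_s is None or s.checkpoint_interval_s <= 0:
            problems.append("Checkpoint Interval (s) should be a positive number")
        if s.checkpoint_mode not in VALID_CHECKPOINT_MODES:
            problems.append("Checkpoint Mode should be AtLeastOnce or ExactlyOnce")
    elif s.checkpoint_interval_s is not None or s.checkpoint_mode is not None:
        problems.append("Checkpoint Interval (s) and Checkpoint Mode are ignored "
                        "because Enable Checkpoint is false")
    # OBS Bucket is valid when checkpointing or log saving is enabled.
    if (s.enable_checkpoint or s.save_job_log) and not s.obs_bucket:
        problems.append("OBS Bucket should be set")
    return problems


if __name__ == "__main__":
    settings = JobSettings(enable_checkpoint=True, checkpoint_interval_s=10,
                           checkpoint_mode="ExactlyOnce")
    for problem in check_settings(settings):
        print(problem)
```

In this sketch, a job with checkpointing enabled but no bucket yields a single problem about the OBS bucket.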
Checking the Dashboard
You can view details about job data input and output through the dashboard.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column on the page, click the desired job name. On the displayed page, click Job Monitoring.
The following table describes monitoring metrics related to Spark jobs.
Table 2 Monitoring metrics related to Spark jobs
Metric
Description
InputSize (records/sec)
Provides the number of input records for a Spark job.
ProcessingTime (ms)
Provides the processing time distribution chart of all mini-batch tasks.
SchedulingDelay (ms)
Provides the scheduling delay distribution chart of all mini-batch tasks.
TotalDelay (ms)
Provides the total scheduling delay of all mini-batch tasks.
- Click the refresh icon to refresh all the charts.
- Click a chart and scroll the mouse wheel to zoom in or out.
- You can only view monitoring information about running jobs.
The following table describes monitoring metrics related to Flink jobs.
Table 3 Monitoring metrics related to Flink jobs
Metric
Description
Data Input Rate
Provides the data input rate of a Flink job. Unit: Data records/s
Total Input Records
Provides the total number of input data records in a Flink job. Unit: Data records
Total Input Bytes
Provides the total input bytes of a Flink job. Unit: Byte
Data Output Rate
Provides the data output rate of a Flink job. Unit: Data records/s
Total Output Records
Provides the total number of output data records in a Flink job. Unit: Data records
Total Output Bytes
Provides the total output bytes of a Flink job. Unit: Byte
CPU Load (%)
Provides the CPU usage.
Memory Usage (%)
Provides the heap memory usage of a job.
- Click Real-Time Refresh to refresh the running jobs in real time. The charts are updated every 10 seconds.
- Click [icon]. In the displayed dialog box, specify the parameters as required.
- Click the zoom icon in the upper right corner of a chart to zoom in on the chart.
- Click the delete icon to delete a metric.
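The rate metrics and the cumulative totals in Table 3 describe the same data stream. The following Python sketch shows one way to derive a records-per-second rate from successive samples of a cumulative total, assuming one sample per 10-second chart refresh. The function name and the sample values are hypothetical, and the sketch is not a description of how CS computes its charts.

```python
from typing import List

REFRESH_INTERVAL_S = 10  # the charts are updated every 10 seconds


def rates_from_totals(totals: List[int],
                      interval_s: float = REFRESH_INTERVAL_S) -> List[float]:
    """Return per-interval rates (records/s) from consecutive cumulative totals."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    rates = []
    for prev, curr in zip(totals, totals[1:]):
        # Assumption: a counter that goes down (for example after a job
        # restart) is treated as zero new records rather than a negative rate.
        delta = max(curr - prev, 0)
        rates.append(delta / interval_s)
    return rates


if __name__ == "__main__":
    # Hypothetical samples of Total Input Records taken at each refresh.
    print(rates_from_totals([0, 500, 1500, 1500]))
```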
Viewing the Job Execution Plan
You can view the execution plan to understand the operator stream information about the running job.
Execution plans of Spark jobs cannot be viewed.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column on the page, click the desired job name. On the displayed page, click Execution Plan.
- Scroll the mouse wheel or click the zoom icons to zoom in or out.
- The stream diagram displays the operator stream information about the running job in real time.
Viewing the Task List of a Job
You can view details about each task running on a job, including the task start time, number of received and transmitted bytes, and running duration.
Task lists of Spark jobs cannot be viewed.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column on the page, click the desired job name. On the displayed page, click Task List.
- View the operator task list.
Table 4 Parameter description
Parameter
Description
Name
Name of an operator.
Duration
Running duration of an operator.
Parallelism
Number of parallel tasks in an operator.
Task
Operator tasks are categorized into the following:
- The digit in red indicates the number of failed tasks.
- The digit in light gray indicates the number of canceled tasks.
- The digit in yellow indicates the number of tasks that are being canceled.
- The digit in green indicates the number of finished tasks.
- The digit in blue indicates the number of running tasks.
- The digit in sky blue indicates the number of tasks that are being deployed.
- The digit in dark gray indicates the number of tasks in a queue.
Status
Status of an operator task.
Back Pressure Status
Working load status of an operator. Available options are as follows:
- OK: indicates that the operator is in normal working load.
- LOW: indicates that the operator is in slightly high working load.
- HIGH: indicates that the operator is in high working load.
Delay
Time elapsed from when source data starts being processed to when the data reaches the current operator. Unit: ms
Sent Records
Number of records sent by an operator.
Sent Bytes
Number of bytes sent by an operator.
Received Bytes
Number of bytes received by an operator.
Received Records
Number of records received by an operator.
Start Time
Time when an operator starts running.
End Time
Time when an operator stops running.
- Click [icon] to view the task list.
Table 5 Parameter description
Parameter
Description
Start Time
Time when a task starts running.
End Time
Time when a task stops running.
Duration
Task running duration.
Received Bytes
Number of bytes received by a task.
Received Records
Number of records received by a task.
Sent Bytes
Number of bytes sent by a task.
Sent Records
Number of records sent by a task.
Attempts
Number of retry attempts after a task is suspended.
Host
Host IP address of the operator.
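The failed-task count and the Back Pressure Status in Table 4 are the two fields most likely to point to a problem operator. The following Python sketch shows how rows copied from the operator list could be screened on those two fields. The dictionary keys and the function name are hypothetical. The rule (any failed task, or a Back Pressure Status of HIGH) is an assumption chosen for illustration.

```python
from typing import Dict, List


def operators_needing_attention(rows: List[Dict]) -> List[str]:
    """Return the names of operators with failed tasks or HIGH back pressure."""
    flagged = []
    for row in rows:
        failed = row.get("failed_tasks", 0)          # the digit shown in red
        back_pressure = row.get("back_pressure", "OK")  # OK, LOW, or HIGH
        if failed > 0 or back_pressure == "HIGH":
            flagged.append(row["name"])
    return flagged


if __name__ == "__main__":
    # Hypothetical rows transcribed from the operator task list.
    sample = [
        {"name": "Source", "failed_tasks": 0, "back_pressure": "OK"},
        {"name": "Window", "failed_tasks": 0, "back_pressure": "HIGH"},
        {"name": "Sink", "failed_tasks": 2, "back_pressure": "LOW"},
    ]
    print(operators_needing_attention(sample))
```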
Viewing Job Audit Logs
You can view job operation records in audit logs, such as records of creating, submitting, running, and stopping a job.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column on the page, click the desired job name to switch to the Job Details page.
- Click Audit Log to view audit logs of the job.
Figure 1 Viewing job audit logs
A maximum of 50 logs can be displayed. For more audit logs, query them in CTS. For details about how to view audit logs in CTS, see section "Querying Real-Time Traces" in the Cloud Trace Service Quick Start.
If no information is displayed on the page, you need to enable CTS.
- Click Enable to switch to the page.
- Click OK.
You can also log in to the CTS management console to enable CTS. For details, see Enabling CTS.
Table 6 Parameters related to audit logs
Parameter
Description
Event Name
Name of an event.
Resource Name
Name of a running job.
Resource ID
ID of a running job.
Type
Job operation type.
Level
Event level. Available options include the following:
- incident
- warning
- normal
Operator
Account used to run a job.
Generated
Time when an event occurs.
Source IP Address
IP address of the operator.
Operation Result
Operation result.
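The fields in Table 6 can also be handled as plain records once the audit events are available outside the console. The following Python sketch filters events by the Level values listed above and keeps the newest 50, matching the display limit on the Audit Log page. The record keys, the timestamp format, and the function name are hypothetical. The sketch does not call CS or CTS.

```python
from typing import Dict, Iterable, List

MAX_DISPLAYED = 50  # the Audit Log page displays at most 50 logs


def select_events(events: List[Dict[str, str]],
                  levels: Iterable[str] = ("incident", "warning")
                  ) -> List[Dict[str, str]]:
    """Return events whose level is in `levels`, newest first, capped at 50."""
    wanted = set(levels)
    matching = [e for e in events if e.get("level") in wanted]
    # Assumption: "generated" holds ISO 8601 timestamps in a single format,
    # so sorting the strings also sorts the events by time.
    matching.sort(key=lambda e: e["generated"], reverse=True)
    return matching[:MAX_DISPLAYED]


if __name__ == "__main__":
    # Hypothetical audit records using Table 6 fields as keys.
    sample = [
        {"event_name": "createJob", "level": "normal",
         "generated": "2020-01-01T10:00:00Z"},
        {"event_name": "stopJob", "level": "warning",
         "generated": "2020-01-01T11:00:00Z"},
    ]
    for event in select_events(sample):
        print(event["event_name"], event["level"])
```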
Viewing Job Running Logs
You can view run logs to locate faults that occur while a job is running.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column on the page, click the desired job name. On the displayed page, click Running Logs.
On the displayed page, you can view information about Job Manager and Task Manager for running jobs.
Information about Job Manager and Task Manager is updated every minute. By default, only the run logs generated within the last minute are displayed. You can click Log history to view more logs.
If you select an OBS bucket for saving job logs during the job configuration, you can switch to the OBS bucket and download log files to view more historical logs.
If the job is not running, information on the Task Manager page cannot be viewed.
Viewing Job Tags
You can view, add, modify, and delete job tags.
- In the left navigation pane of the CS management console, choose Job Management to go to the Job Management page.
- In the Name column, click the name of the job whose tags you want to view to go to the Job Details page.
- Click Tags to display the tag information about the current job.
For more information about job tags, see Managing Job Tags.