Setting Up Scheduling for a Job
This section describes how to set up scheduling for an orchestrated job.
- If the processing mode of a job is batch processing, configure scheduling types for jobs. Three scheduling types are supported: run once, run periodically, and event-based. For details, see Setting Up Scheduling for a Job Using the Batch Processing Mode.
- If the processing mode of a job is real-time processing, configure scheduling types for nodes. Three scheduling types are supported: run once, run periodically, and event-based. For details, see Setting Up Scheduling for Nodes of a Job Using the Real-Time Processing Mode.
Prerequisites
- You have developed a job by following the instructions in Developing a Job.
- You have locked the job. If the job is not locked, click Lock before you develop it. A job that you create or import is locked by you by default. For details, see the lock function.
Constraints
- Set an appropriate recurrence for jobs that run periodically. A maximum of five instances of a job can be executed concurrently. If a job instance starts later than the configured job execution time, the job instances in subsequent batches are queued, and the job takes longer than expected. For CDM and ETL jobs, the recurrence must be at least 5 minutes. In addition, adjust the recurrence based on the data volume of the job table and the update frequency of the source table.
- If you use DataArts Studio DataArts Factory to schedule a CDM migration job and configure a scheduled task for the job in DataArts Migration, both configurations take effect. To ensure unified service logic and avoid scheduling conflicts, enable job scheduling in DataArts Factory and do not configure a scheduled task for the job in DataArts Migration.
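The recurrence constraint above can be checked before a job is scheduled. The following is a minimal sketch in Python, not part of DataArts Factory: the function names and the steady-state overlap model are assumptions made for illustration, and only the two limits (five concurrent instances, 5-minute minimum recurrence for CDM and ETL jobs) come from this section.

```python
import math

MAX_CONCURRENT_INSTANCES = 5   # limit stated in the constraints above
MIN_RECURRENCE_MINUTES = 5     # applies to CDM and ETL jobs

def check_recurrence(job_kind: str, recurrence_minutes: float) -> None:
    """Raise ValueError if a CDM or ETL job recurs more often than every 5 minutes."""
    if job_kind.upper() in {"CDM", "ETL"} and recurrence_minutes < MIN_RECURRENCE_MINUTES:
        raise ValueError(
            f"{job_kind} jobs need a recurrence of at least {MIN_RECURRENCE_MINUTES} minutes"
        )

def overlapping_instances(recurrence_minutes: float, run_minutes: float) -> int:
    """Estimate how many instances run at once if every run takes run_minutes.

    Assumption: instances start exactly on schedule and all take the same time.
    """
    return max(1, math.ceil(run_minutes / recurrence_minutes))

def will_queue(recurrence_minutes: float, run_minutes: float) -> bool:
    """True if the estimated overlap exceeds the five-instance limit."""
    return overlapping_instances(recurrence_minutes, run_minutes) > MAX_CONCURRENT_INSTANCES

check_recurrence("CDM", 5)      # accepted
print(will_queue(5, 20))        # four overlapping instances
print(will_queue(5, 30))        # six overlapping instances
```

Under this simplified model, a job that recurs every 5 minutes but needs 30 minutes per run would exceed the limit, so later instances would wait in the queue. Lengthening the recurrence or shortening the run avoids that.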
Setting Up Scheduling for a Job Using the Batch Processing Mode
Three scheduling types are available: Run once, Run periodically, and Event-based. The procedure is as follows:
Click the Scheduling Setup tab on the right of the canvas to expand the configuration page and configure the scheduling parameters listed in Table 1.
Parameter | Description |
---|---|
Scheduling Type | Scheduling type of the job. Available options include Run once, Run periodically, and Event-based. |
Dry run | If you select this option, the job is not executed, and a success message is returned. |
Parameters displayed when Scheduling Type is Run periodically
Parameter | Description |
---|---|
From and to | The period during which the scheduling task takes effect. |
Recurrence | The frequency at which the scheduling task is executed. Set an appropriate value for this parameter. A maximum of five instances of a job can be executed concurrently. If a job instance starts later than the configured job execution time, the job instances in subsequent batches are queued, and the job takes longer than expected. For CDM and ETL jobs, the recurrence must be at least 5 minutes. In addition, adjust the recurrence based on the data volume of the job table and the update frequency of the source table. |
Dependency job | If you select a dependency job that is executed periodically, the current job is executed only when an instance of the dependency job is executed within a certain period of time. You can search for jobs only by name. For details about the conditions of dependency jobs and how a job runs after its dependency jobs are set, see Job Dependency. If you select multiple dependency jobs, the current job is executed only after all dependency job instances are executed within a specified time range (for details, see How a Job Runs After a Dependency Job Is Set for It). |
Policy for Current job If Dependency job Fails | Policy for processing the current job when one or more instances of its dependency job fail to be executed in its period. For example, the recurrence of the current job is 1 hour and that of its dependency jobs is 5 minutes. |
Run After Dependency job Ends | If a job depends on other jobs, the job is executed only after its dependency job instances are executed within a specified time range (for details, see How a Job Runs After a Dependency Job Is Set for It). If the dependency job instances are not successfully executed, the current job remains in the waiting state. If you select this option, the system checks whether all job instances in the previous cycle have been executed before executing the current job. |
Cross-Cycle Dependency | Dependency between job instances. |
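The dependency example above (a current job that recurs every hour and depends on a job that recurs every 5 minutes) can be worked through in code. This is a sketch under stated assumptions: it assumes the checked time range is exactly one cycle of the current job, and the function names and status strings are hypothetical. The actual time range rules are described in Job Dependency.

```python
from datetime import datetime, timedelta

def dependency_instances(cycle_start: datetime,
                         current_recurrence: timedelta,
                         dependency_recurrence: timedelta) -> list:
    """Planned start times of dependency instances in one cycle of the current job.

    Assumption: the window is [cycle_start, cycle_start + current_recurrence).
    """
    times = []
    t = cycle_start
    end = cycle_start + current_recurrence
    while t < end:
        times.append(t)
        t += dependency_recurrence
    return times

def can_run(statuses: list) -> bool:
    """The current job may start only if every dependency instance succeeded."""
    return all(status == "success" for status in statuses)

window = dependency_instances(
    datetime(2024, 1, 1, 8, 0),
    timedelta(hours=1),
    timedelta(minutes=5),
)
print(len(window))                                # 12 dependency instances per hour
print(can_run(["success"] * 12))                  # True
print(can_run(["success"] * 11 + ["failed"]))     # False
```

With these recurrences, twelve dependency instances fall in each hourly cycle, so a single failed instance is enough to trigger the configured failure policy for the current job.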
Parameters displayed when Scheduling Type is Event-based
Parameter | Description |
---|---|
Event Type | Type of the event that triggers job running. |
Parameters for KAFKA event-triggered jobs | |
Connection Name | Before selecting a data connection, ensure that a Kafka data connection has been created in the Management Center. |
Topic | Topic of the message to be sent to Kafka. |
Concurrent Events | Number of jobs that can be concurrently processed. The maximum number of concurrent events is 128. |
Event Detection Interval | Interval at which the system detects the stream for new messages. The unit of the interval can be Second or Minute. |
Access Policy | Select the location where data is to be accessed. |
Failure Policy | Select a policy to be performed after scheduling fails. |
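The event-based parameters for a batch job can be modeled as a small validated record. The sketch below is illustrative only and is not a DataArts Factory API: the class name, field names, and sample values are hypothetical, and the lower bound of 1 for concurrent events is an assumption. The upper limit of 128 and the Second and Minute units come from the table above.

```python
from dataclasses import dataclass

MAX_CONCURRENT_EVENTS = 128            # limit stated for batch processing jobs
INTERVAL_UNITS = ("Second", "Minute")  # units listed for Event Detection Interval

@dataclass(frozen=True)
class KafkaEventTrigger:
    connection_name: str      # Kafka data connection created in Management Center
    topic: str
    concurrent_events: int
    detection_interval: int
    interval_unit: str

    def __post_init__(self) -> None:
        # Assumption: at least one concurrent event is required.
        if not 1 <= self.concurrent_events <= MAX_CONCURRENT_EVENTS:
            raise ValueError(f"concurrent_events must be 1 to {MAX_CONCURRENT_EVENTS}")
        if self.interval_unit not in INTERVAL_UNITS:
            raise ValueError(f"interval_unit must be one of {INTERVAL_UNITS}")
        if self.detection_interval <= 0:
            raise ValueError("detection_interval must be positive")

    def interval_seconds(self) -> int:
        """Detection interval normalized to seconds."""
        factor = 60 if self.interval_unit == "Minute" else 1
        return self.detection_interval * factor

trigger = KafkaEventTrigger("kafka_conn", "orders", 10, 2, "Minute")
print(trigger.interval_seconds())      # 120
```

Validating the values in one place makes an out-of-range setting fail early, before the job is submitted for scheduling.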
Setting Up Scheduling for Nodes of a Job Using the Real-Time Processing Mode
Three scheduling types are available: Run once, Run periodically, and Event-based. The procedure is as follows:
Select a node. On the node development page, click the Scheduling Parameter Setup tab. On the displayed page, configure the parameters listed in Table 4.
Parameter | Description |
---|---|
Scheduling Type | Scheduling type of the job. Available options include Run once, Run periodically, and Event-based. |
Parameters displayed when Scheduling Type is Run periodically | |
From and to | The period during which the scheduling task takes effect. |
Recurrence | The frequency at which the scheduling task is executed. For CDM and ETL jobs, the recurrence must be at least 5 minutes. In addition, adjust the recurrence based on the data volume of the job table and the update frequency of the source table. |
Cross-Cycle Dependency | Dependency between job instances. |
Parameters displayed when Scheduling Type is Event-based | |
Event Type | Type of the event that triggers job running. |
Connection Name | Before selecting a data connection, ensure that a Kafka data connection has been created in the Management Center. |
Topic | Topic of the message to be sent to Kafka. |
Consumer Group | A scalable and fault-tolerant group of consumers in Kafka. Consumers in a group share the same ID. They collaborate with each other to consume all partitions of subscribed topics. A partition in a topic can be consumed by only one consumer in the group. NOTE: If you select KAFKA for Event Type, the consumer group ID is automatically displayed. You can also manually change the consumer group ID. |
Concurrent Events | Number of jobs that can be concurrently processed. The maximum number of concurrent events is 10. |
Event Detection Interval | Interval at which the system detects the stream for new messages. The unit of the interval can be Seconds or Minutes. |
Failure Policy | Select a policy to be performed after scheduling fails. |
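The Consumer Group description states that consumers in a group divide the partitions of a topic among themselves and that each partition is consumed by only one consumer in the group. The following sketch shows that property with a simplified round-robin split. It is not Kafka's actual assignor implementation, and the function name and consumer names are hypothetical.

```python
def assign_partitions(partitions: list, consumers: list) -> dict:
    """Give every partition to exactly one consumer, in round-robin order."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment

result = assign_partitions([0, 1, 2, 3, 4], ["consumer-a", "consumer-b"])
print(result)   # {'consumer-a': [0, 2, 4], 'consumer-b': [1, 3]}
```

Because every partition has a single owner in the group, adding consumers beyond the number of partitions leaves the extra consumers idle.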