Obtaining the Maximum Value and Transferring It to a CDM Job Using a Query SQL Statement
Scenario
You can run an SQL query to obtain a maximum time value and transfer it to a CDM job. In the advanced attributes of the CDM job, a where clause uses this value as the lower bound of the time range, so that only the data to be migrated is selected and the incremental data migration is completed.
Constraints
- You have completed operations in Creating a Data Connection.
- You have completed operations in Creating a Database.
Examples
Creating an SQL Script
- In the left navigation pane of DataArts Factory, choose .
- Create an SQL script. This section uses the MRS Spark SQL script as an example.
- Select a created data connection and database.
- Write the SQL statement to obtain the maximum time value from table1.
select max(time) from table1
- Save and submit the version. The maxtime script is created.
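To make the shape of the script's result concrete, here is a minimal Python sketch that runs the same query against an in-memory SQLite database as a stand-in for MRS Spark SQL. The sample rows and the `query_max_time` helper are hypothetical; only the query text comes from the step above.

```python
import sqlite3

def query_max_time(conn):
    """Run the script's query and return all result rows."""
    return conn.execute("select max(time) from table1").fetchall()

# In-memory SQLite is only a stand-in for the real MRS Spark SQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer, time text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [
        (1, "2023-01-01 00:00:00"),
        (2, "2023-01-03 12:30:00"),
        (3, "2023-01-02 08:15:00"),
    ],
)

rows = query_max_time(conn)
print(rows)  # a result set with a single row and a single column
```

The point of the sketch is that an aggregate such as `max(time)` yields one row with one column, which is the value later passed on to the CDM job.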
Creating a Pipeline Subjob
- In the left navigation pane of DataArts Factory, choose .
- Select a CDM Job node and configure the node properties.
Figure 1 Configuring CDM Job node properties
Select a CDM cluster and associate the node with an existing CDM job.
Configure the job parameters and add a job parameter named maxtime.
Figure 2 Configuring job parameters
- Save and submit the version. The subjob sub is created.
Creating a Pipeline Job
- In the left navigation pane of DataArts Factory, choose .
- Add an MRS Spark SQL node and a For Each node. The For Each node executes the CDM subjob in a loop.
- Configure properties of the MRS Spark SQL node and associate the node with the created maxtime script.
Figure 3 Configuring properties for the MRS Spark SQL node
- Configure properties of the For Each node and associate the node with the created CDM subjob.
Figure 4 Configuring properties for the For Each node
After associating the node with the created subjob sub, write a parameter expression.
#{Loop.current[0]}
Configure the data set. EL expressions are supported.
#{Job.getNodeOutput("maxtime")}
- Save and submit the version. The job is created.
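The interaction between the data set and the parameter expression can be sketched in Python. This is an illustration only, under the assumption that the output of the maxtime node is a two-dimensional array (rows of columns) and that the For Each node runs the subjob once per row. The `run_subjob` and `for_each` functions and the sample value are hypothetical stand-ins, not DataArts Factory APIs.

```python
def run_subjob(params):
    """Hypothetical stand-in for one run of the subjob 'sub'.

    Returns the where clause the CDM job would end up with.
    """
    return "dt > '{}'".format(params["maxtime"])

def for_each(dataset, subjob):
    """Run the subjob once per row of the data set."""
    results = []
    for current in dataset:               # 'current' plays the role of Loop.current
        params = {"maxtime": current[0]}  # mimics #{Loop.current[0]}
        results.append(subjob(params))
    return results

# Assumed shape of #{Job.getNodeOutput("maxtime")}: one row, one column.
node_output = [["2023-01-03 12:30:00"]]

clauses = for_each(node_output, run_subjob)
print(clauses)
```

Because the query returns a single row, the loop body in this sketch runs exactly once, and the first column of that row becomes the value of the maxtime parameter.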
Using the Maximum Time Value in a Where Clause of the CDM Job to Select the Data to Be Migrated
- Open the created subjob.
- Click next to the job name to go to the job configuration page.
Figure 5 Editing the CDM job
- In the advanced attributes of the source job configuration, configure a where clause to select the data to be migrated. When the job is executed, the data selected from the source is exported and then imported to the destination.
Figure 6 Configuring a where clause
The where clause is as follows:
dt > '${maxtime}'
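As a rough illustration of what this where clause does once `${maxtime}` is replaced with a value, the sketch below substitutes a sample timestamp and filters an in-memory SQLite table. The table name `source_table`, its rows, and the `render_where` helper are hypothetical; a real CDM job performs the substitution itself.

```python
import sqlite3
from string import Template

def render_where(template, maxtime):
    """Replace ${maxtime} in the where clause template with a concrete value."""
    return Template(template).substitute(maxtime=maxtime)

where = render_where("dt > '${maxtime}'", "2023-01-03 12:30:00")

# In-memory SQLite stands in for the real source database.
conn = sqlite3.connect(":memory:")
conn.execute("create table source_table (id integer, dt text)")
conn.executemany(
    "insert into source_table values (?, ?)",
    [
        (1, "2023-01-02 08:15:00"),
        (2, "2023-01-03 12:30:00"),  # equal to maxtime, so excluded by '>'
        (3, "2023-01-04 09:00:00"),
        (4, "2023-01-05 18:45:00"),
    ],
)

# Building SQL by string interpolation is for illustration only;
# it is not safe for untrusted input.
query = f"select id from source_table where {where} order by id"
migrated = [row[0] for row in conn.execute(query)]
print(where)
print(migrated)
```

Note that the comparison is strict: a row whose dt equals the maximum time is not selected, so only rows newer than the previously recorded maximum are migrated.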