Updated on 2023-02-06 GMT+08:00

What Should I Do If the Number of Read Rows Is the Same as That of Write Rows, Both Numbers No Longer Increase, and the Job Stays in Running State?

Possible Cause

CDM first writes data to a Hive temporary table and then runs a Spark SQL statement to write the data to Hudi. The number of written rows shown for the job is the number of rows written to the Hive temporary table. When this number stops increasing, all the source data has been read and written to the Hive table, but the job is still running the Spark SQL statement. The job finishes only after that Spark SQL statement completes.

Troubleshooting

Open the job log and search for insert into. The matching log line contains the Yarn ApplicationId. Use this ID to view the Yarn task details on MRS Resource Manager.
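The step above can be sketched in a few lines. The log excerpt below is hypothetical (real CDM log lines may look different), but Yarn application IDs always follow the pattern application_&lt;clusterTimestamp&gt;_&lt;sequence&gt;, which a regular expression can extract:

```python
import re

# Hypothetical CDM job log excerpt; the exact wording of real log lines may differ.
log_excerpt = """
INFO  Executing SQL: insert into hudi_db.target_table select * from hive_tmp_table
INFO  Submitted application application_1675650000000_0042 to ResourceManager
"""

def extract_yarn_application_ids(log_text):
    """Pull Yarn application IDs (application_<clusterTs>_<seq>) out of log text."""
    return re.findall(r"application_\d+_\d+", log_text)

print(extract_yarn_application_ids(log_excerpt))
# ['application_1675650000000_0042']
```

With the extracted ID, open the application's page on MRS Resource Manager to check whether the Spark SQL task is still running and how much of its work remains.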

How fast the Spark SQL statement executes depends largely on the resources available to the tenant queue. Before running Hudi jobs, ensure that the tenant queue has sufficient resources.
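One way to check queue resources is the Yarn ResourceManager scheduler REST API (GET http://&lt;rm-host&gt;:8088/ws/v1/cluster/scheduler). The sketch below parses a sample response shaped like the capacity-scheduler output; the queue names and capacity values are made up for illustration:

```python
import json

# Sample payload shaped like the Yarn ResourceManager scheduler REST response;
# queue names and numbers are invented for this example.
sample = json.loads("""
{"scheduler": {"schedulerInfo": {"queues": {"queue": [
  {"queueName": "default", "capacity": 60.0, "usedCapacity": 95.0},
  {"queueName": "hudi",    "capacity": 40.0, "usedCapacity": 10.0}
]}}}}
""")

def busy_queues(payload, threshold=90.0):
    """Return names of queues whose used capacity (%) exceeds the threshold."""
    queues = payload["scheduler"]["schedulerInfo"]["queues"]["queue"]
    return [q["queueName"] for q in queues if q["usedCapacity"] > threshold]

print(busy_queues(sample))
# ['default']
```

If the queue that runs the Hudi job shows up as busy, the insert into statement will queue behind other applications, which is consistent with the job staying in the running state while row counts no longer increase.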