Updated on 2023-09-05 GMT+08:00

Sqoop Failed to Read Data from MySQL and Write Parquet Files to OBS

Issue

An error is reported when Sqoop reads MySQL data and writes the data to OBS in Parquet format. However, the data can be successfully written to OBS if the Parquet format is not specified.

Symptom

Cause Analysis

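Sqoop's Parquet output (the --as-parquetfile option) does not support Hive 3, so the import fails when that option is specified. The data can be written through HCatalog instead.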

Procedure

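Use HCatalog to write the data. In the script, remove --as-parquetfile together with the --target-dir, --delete-target-dir, --null-string, and --null-non-string options, specify the Hive database and table with --hcatalog-database and --hcatalog-table, and change the SQL statement so that it lists the columns explicitly instead of using select *.

The original and modified scripts are as follows: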

Original script:

sqoop import \
--connect 'jdbc:mysql://10.160.5.65/huawei_pos_online_00?zeroDateTimeBehavior=convertToNull' \
--username root \
--password Mrs@2022 \
--split-by id \
--num-mappers 2 \
--query 'select * from pos_remark where 1=1 and $CONDITIONS' \
--target-dir obs://za-test/dev/huawei_pos_online_00/pos_remark \
--delete-target-dir \
--null-string '\\N' \
--null-non-string '\\N' \
--as-parquetfile

Modified script:

sqoop import \
--connect 'jdbc:mysql://10.160.5.65/huawei_pos_online_00?zeroDateTimeBehavior=convertToNull' \
--username root \
--password Mrs@2022 \
--split-by id \
--num-mappers 2 \
--query 'select id,pos_case_id,pos_transaction_id,remark,update_time,update_user,is_deleted,creator,modifier,gmt_created,gmt_modified,update_user_id,tenant_code from pos_remark where 1=1 and $CONDITIONS' \
--hcatalog-database huawei_dev \
--hcatalog-table ods_pos_remark
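
The modified script can also be kept in a small wrapper so that the column list, Hive database, and Hive table are easy to change. The following is a minimal sketch, not part of the original procedure. The variable names (COLUMN_LIST, HCAT_DB, HCAT_TABLE, QUERY, DRY_RUN) are hypothetical, the Hive table huawei_dev.ods_pos_remark is assumed to exist already, and the plaintext --password is replaced with Sqoop's -P option (prompt on the console), which is a substitution made for this example only. By default the wrapper only prints the arguments and does not contact MySQL or Hive.

```shell
#!/bin/sh
set -eu

# Hypothetical variables; the values mirror the modified script.
HCAT_DB='huawei_dev'
HCAT_TABLE='ods_pos_remark'
COLUMN_LIST='id,pos_case_id,pos_transaction_id,remark,update_time,update_user,is_deleted,creator,modifier,gmt_created,gmt_modified,update_user_id,tenant_code'

# The columns are listed explicitly (no "select *").
# $CONDITIONS stays in single quotes so the shell passes it to Sqoop literally.
QUERY="select ${COLUMN_LIST} from pos_remark where 1=1 and "'$CONDITIONS'

# Build the command in the positional parameters so every argument stays one word.
# -P prompts for the password (assumption: replaces --password for this sketch).
set -- sqoop import \
  --connect 'jdbc:mysql://10.160.5.65/huawei_pos_online_00?zeroDateTimeBehavior=convertToNull' \
  --username root \
  -P \
  --split-by id \
  --num-mappers 2 \
  --query "$QUERY" \
  --hcatalog-database "$HCAT_DB" \
  --hcatalog-table "$HCAT_TABLE"

# DRY_RUN defaults to 1: print one argument per line instead of running Sqoop.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%s\n' "$@"
else
  "$@"
fi
```

Run the wrapper once with the default setting to review the arguments, then run it with DRY_RUN=0 to start the import.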