Help Center/ MapReduce Service/ Troubleshooting/ Using Sqoop/ Failed to Use Sqoop to Read MySQL Data and Write Parquet Files to OBS
Updated on 2024-05-28 GMT+08:00

Failed to Use Sqoop to Read MySQL Data and Write Parquet Files to OBS

Issue

An error is reported when Sqoop reads MySQL data and writes the data to OBS in Parquet format. However, the data can be successfully written to OBS if the Parquet format is not specified.

Symptom

Cause Analysis

The Parquet format does not support Hive 3. Data can be written using HCatalog.

Procedure

Use HCatalog to write data: Specify the Hive database and table in parameters and modify the SQL statement in the script. There can be security risks if a command contains the authentication password. You are advised to disable the command recording function (history) before running the command.

Original script:

sqoop import --connect 'jdbc:mysql://10.160.5.65/huawei_pos_online_00?zeroDateTimeBehavior=convertToNull' --username root --password xxx
--split-by id
--num-mappers 2
--query 'select * from pos_remark where 1=1 and $CONDITIONS'
--target-dir obs://za-test/dev/huawei_pos_online_00/pos_remark
--delete-target-dir
--null-string '\\N'
--null-non-string '\\N'
--as-parquetfile

Modified script:

sqoop import --connect 'jdbc:mysql://10.160.5.65/huawei_pos_online_00?zeroDateTimeBehavior=convertToNull' --username root --password xxx
--split-by id
--num-mappers 2
--query 'select id,pos_case_id,pos_transaction_id,remark,update_time,update_user,is_deleted,creator,modifier,gmt_created,gmt_modified,update_user_id,tenant_code from pos_remark where 1=1 and $CONDITIONS'
--hcatalog-database huawei_dev
--hcatalog-table ods_pos_remark