
Common Sqoop Commands and Parameters

Updated on 2022-09-15 GMT+08:00

Common Sqoop Commands

Table 1 Common Sqoop commands

| Command | Description |
| --- | --- |
| import | Imports data into a cluster. |
| export | Exports data from a cluster. |
| codegen | Generates a Java class file from a table in the database and packages it into a JAR file. |
| create-hive-table | Creates a Hive table. |
| eval | Executes an SQL statement and displays the result. |
| import-all-tables | Imports all tables in a database to HDFS. |
| job | Generates a Sqoop job. |
| list-databases | Lists database names. |
| list-tables | Lists table names. |
| merge | Merges data in different HDFS directories and saves it to a specified directory. |
| metastore | Starts the metadata database that records the metadata of Sqoop jobs. |
| help | Prints help information. |
| version | Prints version information. |
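For reference, a minimal `import` invocation might look like the following sketch. The JDBC URL, credentials, table name, and HDFS path are placeholders for illustration, not values from this document:

```shell
# Hypothetical example: import the "employees" table from a MySQL database
# into HDFS using 4 map tasks. All connection details are placeholders.
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password xxx \
  --table employees \
  --target-dir /user/hadoop/employees \
  -m 4
```

The same connection options (`--connect`, `--username`, `--password`) apply to the other commands, such as `export` and `eval`.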

Common Parameters

Table 2 Common parameters

| Category | Parameter | Description |
| --- | --- | --- |
| Database connection | --connect | Specifies the JDBC URL for connecting to a relational database. |
| | --connection-manager | Specifies the connection manager class to use. |
| | --driver | Specifies the JDBC driver class for the database connection. |
| | --help | Prints help information. |
| | --password | Specifies the password for connecting to the database. |
| | --username | Specifies the username for connecting to the database. |
| | --verbose | Prints detailed information on the console. |
| import | --fields-terminated-by | Specifies the field delimiter, which must match that of the target Hive table or HDFS file. |
| | --lines-terminated-by | Specifies the line delimiter, which must match that of the target Hive table or HDFS file. |
| | --mysql-delimiters | Uses the default MySQL delimiter set. |
| export | --input-fields-terminated-by | Specifies the field delimiter of the input data. |
| | --input-lines-terminated-by | Specifies the line delimiter of the input data. |
| Hive | --hive-delims-replacement | Replaces characters such as \r and \n in the data with a user-defined string. |
| | --hive-drop-import-delims | Removes characters such as \r and \n when data is imported into Hive. |
| | --map-column-hive | Specifies the data type of columns when the Hive table is generated. |
| | --hive-partition-key | Specifies the Hive partition key. |
| | --hive-partition-value | Imports data into the specified partition. |
| | --hive-home | Specifies the Hive installation directory. |
| | --hive-import | Imports data from a relational database into Hive. |
| | --hive-overwrite | Overwrites existing data in the Hive table. |
| | --create-hive-table | Creates the destination Hive table if it does not exist. The default value is false. |
| | --hive-table | Specifies the Hive table into which data is imported. |
| | --table | Specifies the relational database table. |
| | --columns | Specifies the columns of the relational database table to import. |
| | --query | Specifies a query statement whose result is imported. |
| HCatalog | --hcatalog-database | Specifies the Hive database into which data is imported using HCatalog. |
| | --hcatalog-table | Specifies the Hive table into which data is imported using HCatalog. |
| Others | -m or --num-mappers | Specifies the number of map tasks used by the Sqoop job. |
| | --split-by | Specifies the column by which Sqoop splits work units. This parameter is used together with -m. |
| | --target-dir | Specifies the temporary HDFS directory. |
| | --null-string | Specifies the string to be written for a null value in string columns. |
| | --null-non-string | Specifies the string to be written for a null value in non-string columns. |
| | --check-column | Specifies the column used to determine which rows to import incrementally. |
| | --incremental append or lastmodified | Imports data incrementally. append: appends records whose check column value is greater than the value specified by --last-value. lastmodified: imports records modified after the time specified by --last-value. |
| | --last-value | Specifies the maximum value of the check column from the previous import. |
| | --input-null-string | Specifies the string to be interpreted as null for string columns. |
| | --input-null-non-string | Specifies the string to be interpreted as null for non-string columns. If this parameter is not specified, NULL is used. |
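The incremental-import parameters above (`--check-column`, `--incremental`, `--last-value`) are typically used together. The following sketch appends only rows whose key exceeds the value recorded by a previous run; the connection details, table, and column names are placeholders for illustration:

```shell
# Hypothetical example: incrementally append rows of "orders" whose "id"
# column exceeds 1000 (the last value from the previous import).
# All connection details are placeholders.
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password xxx \
  --table orders \
  --target-dir /user/hadoop/orders \
  --check-column id \
  --incremental append \
  --last-value 1000
```

After each run, note the new maximum value of the check column and pass it as `--last-value` in the next run, or use the `job` command so that Sqoop stores it for you.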
