Customizing Row Separators in Hive Tables

Scenario

In most cases, Hive tables stored as text files use a newline character as the row delimiter; that is, a newline marks the end of a row during queries. However, some data files delimit rows with special characters instead of newlines.

MRS Hive allows you to use a different character or character combination to delimit rows of Hive text data. When creating a table, set inputformat to SpecifiedDelimiterInputFormat, and then set the following parameter before each query. The table data is then split into rows by the specified delimiter.

set hive.textinput.record.delimiter='';

The Hue component of the current version does not support setting multiple separators when files are imported to a Hive table.
This section applies to MRS 3.x or later.

Procedure

Specify inputFormat and outputFormat when creating a table.

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  [(col_name data_type [COMMENT col_comment], ...)]
  [ROW FORMAT row_format]
  STORED AS
    inputformat 'org.apache.hadoop.hive.contrib.fileformat.SpecifiedDelimiterInputFormat'
    outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

Specify the delimiter before running a query.

set hive.textinput.record.delimiter='!@!';

Hive will use '!@!' as the row delimiter.
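
The two steps above can be combined into one session, as in the following minimal sketch. The table name demo_events, its columns, the comma field separator, and the /tmp/demo_events location are hypothetical and chosen only for illustration; the inputformat and outputformat class names and the parameter name are the ones given in this section.

```sql
-- Hypothetical table, columns, and location, for illustration only.
-- Assume the data file under /tmp/demo_events contains a single line:
--   1,first row!@!2,second row!@!3,third row
CREATE EXTERNAL TABLE IF NOT EXISTS demo_events (
  id INT,
  message STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.SpecifiedDelimiterInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/tmp/demo_events';

-- Set the row delimiter in the same session, before the query runs.
set hive.textinput.record.delimiter='!@!';

-- With the delimiter set, each '!@!'-separated segment should be read as one row.
SELECT id, message FROM demo_events;
```

Under these assumptions, the query should treat the file as three rows rather than one. Because the parameter is set per session, it has to be set again in each new session before the table is queried.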
