Application Development Overview

Updated on 2022-09-14 GMT+08:00

Hive Introduction

Hive is an open-source data warehouse built on Hadoop. It stores structured data and provides basic data analysis services using the Hive Query Language (HQL), a SQL-like language. Hive converts HQL statements into MapReduce or Spark jobs that query and analyze massive data stored in Hadoop clusters.

Hive provides the following features:

  • Extracts, transforms, and loads (ETL) data using HQL.
  • Analyzes massive structured data using HQL.
  • Supports flexible data storage formats, including JavaScript Object Notation (JSON), comma-separated values (CSV), TextFile, RCFile, ORCFile, and SequenceFile, as well as custom extensions.
  • Supports multiple client connection modes through interfaces such as JDBC and Thrift.
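
As an illustration of the ETL and storage-format features above, the following minimal sketch assembles two HQL statements as plain strings: a table definition with a STORED AS clause and an INSERT ... SELECT that aggregates rows. The table names (raw_logs, user_views), the column names, and the helper functions are hypothetical and chosen only for this example. The sketch only builds the statement text; it does not connect to a cluster.

```python
# Storage format keywords accepted by Hive's STORED AS clause.
# This is a subset chosen for the example, not a complete list.
STORED_AS = {"TEXTFILE", "RCFILE", "ORC", "SEQUENCEFILE"}


def create_table_hql(table, columns, stored_as="ORC"):
    """Build a CREATE TABLE statement for the given columns and format."""
    fmt = stored_as.upper()
    if fmt not in STORED_AS:
        raise ValueError(f"unsupported storage format: {stored_as}")
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}) STORED AS {fmt}"


def etl_hql(source, target):
    """Build an INSERT ... SELECT that counts rows per user (hypothetical schema)."""
    return (
        f"INSERT OVERWRITE TABLE {target} "
        f"SELECT user_id, COUNT(*) AS views FROM {source} "
        f"WHERE user_id IS NOT NULL GROUP BY user_id"
    )


# Hypothetical tables: raw_logs is the source, user_views is the target.
ddl = create_table_hql("user_views", [("user_id", "STRING"), ("views", "BIGINT")])
etl = etl_hql("raw_logs", "user_views")
print(ddl)
print(etl)
```

The resulting strings could then be submitted through any of the client interfaces listed above, for example a JDBC connection or the Beeline command-line client.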

Hive is suited to offline analysis of massive data (such as log and cluster status analysis), large-scale data mining (such as user behavior analysis, interest region analysis, and region display), and other scenarios.

To ensure Hive high availability (HA), user data security, and service access security, MRS incorporates the following features based on Hive 3.1.0:

  • Kerberos security authentication
  • Data file encryption
  • Complete rights management

For details about Hive features in the open-source community, see https://cwiki.apache.org/confluence/display/hive/designdocs.
