Storing Hive Table Partitions to OBS and HDFS
Scenario
In the scenario where storage and compute resources are separated, you can specify different storage sources, for example, OBS or HDFS, for partitions in a Hive partitioned table.
This feature applies only to MRS 3.2.0 or later. This section describes the capability of specifying storage sources for partitioned tables. For details about how to connect Hive to OBS in the storage-compute decoupling scenario, see Interconnecting Hive with OBS.
Prerequisites
The Hive client has been installed.
Example
- Log in to the node where the Hive client is installed as the Hive client installation user.
- Run the following command to go to the client installation directory:
cd Client installation directory
For example, if the client installation directory is /opt/client, run the following command:
cd /opt/client
- Run the following command to configure environment variables:
source bigdata_env
- Check whether the cluster authentication mode is in security mode.
- If yes, run the following command to authenticate the user (replace Hive service user with the actual service user name):
kinit Hive service user
- If no, skip authentication and go to the next step.
- Run the following command to log in to the Hive client:
beeline
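Taken together, and assuming the example client path /opt/client and a cluster in security mode, the client login steps above can be sketched as follows (hiveuser is a placeholder user name):

```shell
cd /opt/client        # client installation directory
source bigdata_env    # configure environment variables
kinit hiveuser        # security mode only; authenticate as the Hive service user
beeline               # log in to the Hive client
```

In a cluster in normal (non-security) mode, the kinit step is skipped.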
- Run the following commands to create a Hive partitioned table named table_1 and set the locations of partitions pt='2021-12-12' and pt='2021-12-18' to hdfs://xxx and obs://xxx, respectively:
create table table_1(id string) partitioned by(pt string) [stored as [orc|textfile|parquet|...]];
alter table table_1 add partition(pt='2021-12-12') location 'hdfs://xxx';
alter table table_1 add partition(pt='2021-12-18') location 'obs://xxx';
- After data is inserted into table_1, it is stored in the corresponding storage source. You can run the desc formatted command to view the location of each partition.
desc formatted table_1 partition(pt='2021-12-18');
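As an illustration of the behavior described above, inserting a row into each partition writes that row's data files to the storage source configured for the partition (the id values are placeholders):

```sql
insert into table_1 partition(pt='2021-12-12') values ('id_001');  -- data files written under hdfs://xxx
insert into table_1 partition(pt='2021-12-18') values ('id_002');  -- data files written under obs://xxx
```

The Location field in the desc formatted output then shows where each partition's data resides.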