Accessing Alluxio Using a Data Application

The port number used for accessing the Alluxio file system is 19998, and the access address is alluxio://<Master node IP address of Alluxio>:19998/<PATH>. This section uses examples to describe how to access the Alluxio file system using data applications (Spark, Hive, Hadoop MapReduce, and Presto).
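As a minimal sketch of how this address is put together, the shell function below builds an Alluxio URI from a Master node IP address and a path. The function name alluxio_uri and the sample address 192.0.2.10 are hypothetical and serve only as illustration; the port 19998 is the one stated above.

```shell
#!/usr/bin/env bash
# Sketch only: alluxio_uri is a hypothetical helper, not part of the Alluxio CLI.
ALLUXIO_PORT=19998   # port stated in this section

# Build alluxio://<master-ip>:19998/<path> from an IP address and a path.
alluxio_uri() {
  local master_ip="$1"
  local path="${2#/}"   # drop one leading slash so the URI contains exactly one
  printf 'alluxio://%s:%s/%s\n' "$master_ip" "$ALLUXIO_PORT" "$path"
}

# 192.0.2.10 is a placeholder documentation address, not a real node.
alluxio_uri 192.0.2.10 /input
# prints alluxio://192.0.2.10:19998/input
```

The same string can then be passed to any of the data applications described in the following sections.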

Using Alluxio as the Input and Output of a Spark Application

Log in to the Master node in a cluster as user root using the password set during cluster creation.
Run the following command to configure environment variables:

source /opt/client/bigdata_env
If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. Otherwise, skip this step.

kinit <MRS cluster user>

Example: kinit admin
Prepare an input file and copy local data to the Alluxio file system.

For example, prepare the input file test_input.txt in the local /home directory, and run the following command to save the test_input.txt file to Alluxio:

alluxio fs copyFromLocal /home/test_input.txt /input
Run the following command to start spark-shell:

spark-shell
Run the following commands in spark-shell (replace <Master node IP address of Alluxio> with the actual IP address):

val s = sc.textFile("alluxio://<Master node IP address of Alluxio>:19998/input")

val double = s.map(line => line + line)

double.saveAsTextFile("alluxio://<Master node IP address of Alluxio>:19998/output")
Run the alluxio fs ls / command to check that the root directory of Alluxio contains the output directory /output. This directory holds the content of the input file with each line written twice.
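One way to check the result is to compute the expected output locally and compare it with what Spark wrote. The sketch below rests on several assumptions: saveAsTextFile wrote part files under /output, the alluxio client is on the PATH, and alluxio fs cat accepts a quoted wildcard path. The helper name double_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: double_lines is a hypothetical helper that mirrors
# the Spark transformation line => line + line.
double_lines() {
  awk '{ print $0 $0 }'
}

# Compare only when the Alluxio client and the local input file are available.
if command -v alluxio >/dev/null 2>&1 && [ -f /home/test_input.txt ]; then
  # Sorting makes the comparison independent of how lines are spread across part files.
  expected="$(double_lines < /home/test_input.txt | sort)"
  # Assumption: the wildcard is expanded by the Alluxio shell, so it is quoted here.
  actual="$(alluxio fs cat '/output/part-*' | sort)"
  if [ "$expected" = "$actual" ]; then
    echo "output matches"
  else
    echo "output differs"
  fi
fi
```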

Creating a Hive Table on Alluxio

Log in to the Master node in a cluster as user root using the password set during cluster creation.
Run the following command to configure environment variables:

source /opt/client/bigdata_env
If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. Otherwise, skip this step.

kinit <MRS cluster user>

Example: kinit admin
Prepare an input file. For example, prepare the hive_load.txt input file in the local /home directory. The file content is as follows:
```
1,Alice,company A
2,Bob,company B
```
Run the following command to import the hive_load.txt file to Alluxio:

alluxio fs copyFromLocal /home/hive_load.txt /hive_input
Run the following command to start the Hive beeline:

beeline
Run the following commands in the beeline (replace <Master node IP address of Alluxio> with the actual IP address) to create a table and load the input file from Alluxio into it:

CREATE TABLE u_user(id INT, name STRING, company STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

LOAD DATA INPATH 'alluxio://<Master node IP address of Alluxio>:19998/hive_input' INTO TABLE u_user;
Run the following command to view the created table:

select * from u_user;
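The statements in this procedure can also be generated with the Master node IP address already filled in, which avoids editing the placeholder by hand. This is only a sketch: the helper name hive_load_statements and the address 192.0.2.10 are hypothetical, and running the result with beeline -f assumes that the cluster client's beeline connects the same way as in the interactive step.

```shell
#!/usr/bin/env bash
# Sketch only: hive_load_statements is a hypothetical helper that prints the
# HiveQL from this procedure with the Master node IP address substituted.
hive_load_statements() {
  local master_ip="$1"
  cat <<EOF
CREATE TABLE u_user(id INT, name STRING, company STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA INPATH 'alluxio://${master_ip}:19998/hive_input' INTO TABLE u_user;
SELECT * FROM u_user;
EOF
}

# 192.0.2.10 is a placeholder address; replace it with the real Master node IP.
hive_load_statements 192.0.2.10

# Assumption: the output could be saved to a file and run non-interactively, e.g.
#   hive_load_statements 192.0.2.10 > u_user.sql && beeline -f u_user.sql
```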

Running Hadoop Wordcount in Alluxio

Log in to the Master node in a cluster as user root using the password set during cluster creation.
Run the following command to configure environment variables:

source /opt/client/bigdata_env
If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. Otherwise, skip this step.

kinit <MRS cluster user>

Example: kinit admin
Prepare an input file and copy local data to the Alluxio file system.

For example, prepare the input file test_input.txt in the local /home directory, and run the following command to save the test_input.txt file to Alluxio:

alluxio fs copyFromLocal /home/test_input.txt /input
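Run the wordcount job using yarn jar. (Replace <Master node IP address of Alluxio>, <Hadoop version>, and <MRS cluster version> with the actual values.) The output directory must not exist before the job starts. If /output already exists, for example from the Spark example, delete it or specify a different output path.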

yarn jar /opt/share/hadoop-mapreduce-examples-<Hadoop version>-mrs-<MRS cluster version>/hadoop-mapreduce-examples-<Hadoop version>-mrs-<MRS cluster version>.jar wordcount alluxio://<Master node IP address of Alluxio>:19998/input alluxio://<Master node IP address of Alluxio>:19998/output
Run the alluxio fs ls / command to check whether the output directory /output containing the wordcount result exists in the root directory of Alluxio.
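To sanity-check the job result for a small input, the expected counts can be computed locally. The sketch below approximates the wordcount example by splitting on whitespace and counting each token. The helper name local_wordcount is hypothetical. The comparison assumes that the job wrote part files under /output with one tab-separated word and count per line, and that alluxio fs cat accepts a quoted wildcard path.

```shell
#!/usr/bin/env bash
# Sketch only: local_wordcount is a hypothetical helper that approximates
# the wordcount example (tokens separated by whitespace, one count per token).
local_wordcount() {
  # 1. one token per line   2. drop empty lines   3. byte-order sort
  # 4. count each distinct token   5. print word<TAB>count
  tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
    awk '{ print $2 "\t" $1 }'
}

# Small self-contained demonstration.
printf 'hello alluxio\nhello hadoop\n' | local_wordcount
# prints:
# alluxio	1
# hadoop	1
# hello	2

# Compare with the job output only when the client and input file are available.
if command -v alluxio >/dev/null 2>&1 && [ -f /home/test_input.txt ]; then
  expected="$(local_wordcount < /home/test_input.txt)"
  actual="$(alluxio fs cat '/output/part-*' | LC_ALL=C sort)"
  if [ "$expected" = "$actual" ]; then
    echo "counts match"
  else
    echo "counts differ"
  fi
fi
```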

Using Presto to Query Tables in Alluxio

Log in to the Master node in a cluster as user root using the password set during cluster creation.
Run the following command to configure environment variables:

source /opt/client/bigdata_env
If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. Otherwise, skip this step.

kinit <MRS cluster user>

Example: kinit admin
Start the Hive beeline to create a table in Alluxio. (Replace <Master node IP address of Alluxio> with the actual IP address.)

beeline

CREATE TABLE u_user (id INT, name STRING, company STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION 'alluxio://<Master node IP address of Alluxio>:19998/u_user';

INSERT INTO u_user VALUES (1, 'Alice', 'Company A'), (2, 'Bob', 'Company B');
Start the Presto client. For details, see steps 2 to 8 in "Using a Client to Execute Query Statements".
On the Presto client, run the select * from hive.default.u_user; statement to query the table created in Alluxio:

Figure 1 Using Presto to query the table created in Alluxio
