Best Practices for Beginners

Updated on 2025-01-22 GMT+08:00

After an MRS cluster is deployed, you can work through the following practices to see how MRS can meet your service requirements.

Table 1 Best practices

Practice

Description

Data analytics

Using Spark2x to Analyze IoV Drivers' Driving Behavior

This practice describes how to use the Spark2x component to analyze Internet of Vehicles (IoV) driving behavior and, along the way, get familiar with the basic functions of MRS. You analyze the driving data and count the number of violations in a specified period, such as sudden acceleration and deceleration, coasting, speeding, and fatigue driving.
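To make the counting step concrete, the following is a minimal plain-Python sketch of the aggregation, not the Spark2x job used in the practice. The record layout (driver ID, timestamp, violation type), the sample rows, and the function name count_violations are all assumptions made for illustration; the real dataset's schema is not given on this page.

```python
from collections import Counter
from datetime import datetime

# Hypothetical record layout: (driver_id, timestamp, violation_type).
# These rows are made-up sample data, not the dataset from the practice.
RECORDS = [
    ("driver_a", "2017-01-01 08:00:00", "speeding"),
    ("driver_a", "2017-01-02 09:30:00", "sudden_acceleration"),
    ("driver_b", "2017-01-02 10:00:00", "speeding"),
    ("driver_a", "2017-02-01 11:00:00", "speeding"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def count_violations(records, start, end):
    """Count violations per (driver, type) where start <= timestamp < end."""
    start_ts = datetime.strptime(start, TIME_FORMAT)
    end_ts = datetime.strptime(end, TIME_FORMAT)
    counts = Counter()
    for driver_id, timestamp, violation in records:
        if start_ts <= datetime.strptime(timestamp, TIME_FORMAT) < end_ts:
            counts[(driver_id, violation)] += 1
    return counts


# Violations recorded in January 2017 (the end bound is exclusive).
january = count_violations(RECORDS, "2017-01-01 00:00:00", "2017-02-01 00:00:00")
```

In a Spark job the same idea would typically be expressed as a filter on the time range followed by a grouped count, with the work distributed across the cluster.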

Using Hive to Load HDFS Data and Analyze Book Scores

This practice describes how to use Hive to import and analyze raw data and how to build an elastic, affordable offline big data analytics solution. In this practice, reader comments from the backend of a book website are used as the raw data. After the data is imported to a Hive table, you can run SQL statements to query the most popular best-selling books.
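The shape of such a query can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for Hive so that the example runs anywhere, the table name book_score and its columns are hypothetical, and "most popular" is interpreted here as "most ratings, then highest average score". HQL syntax and the practice's actual table definition may differ.

```python
import sqlite3

# sqlite3 is a stand-in for Hive; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_score (user_id TEXT, book_id TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO book_score VALUES (?, ?, ?)",
    [
        ("u1", "b1", 5),
        ("u2", "b1", 4),
        ("u3", "b2", 5),
        ("u1", "b2", 5),
        ("u2", "b3", 2),
    ],
)

# Rank books by number of ratings, breaking ties by average score.
top_books = conn.execute(
    "SELECT book_id, COUNT(*) AS num_ratings, AVG(score) AS avg_score "
    "FROM book_score "
    "GROUP BY book_id "
    "ORDER BY num_ratings DESC, avg_score DESC "
    "LIMIT 2"
).fetchall()

conn.close()
```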

Using Hive to Load OBS Data and Analyze Enterprise Employee Information

This practice describes how to use Hive to import and analyze raw data from OBS and how to build elastic and affordable big data analytics based on decoupled storage and compute resources. It covers how to develop a Hive data analysis application and, after you connect to Hive through the client, how to run HQL statements to access Hive data stored in OBS, for example, to manage and query enterprise employee information.

Using Flink Jobs to Process OBS Data

This practice describes how to use the built-in Flink WordCount program of an MRS cluster to analyze the source data stored in the OBS file system and calculate the number of occurrences of specified words in the data source.

MRS supports decoupled storage and compute in scenarios where a large storage capacity is required and compute resources need to be scaled on demand. This allows you to store your data in OBS and use an MRS cluster only for data computing.
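The word counting performed in the Flink practice can be illustrated with a minimal plain-Python sketch. This is not the built-in Flink WordCount program: the function name word_count, the tokenization rule (lowercased runs of word characters), and the sample lines are assumptions, and a real job would read its input from an OBS path and run distributed across the cluster.

```python
import re
from collections import Counter


def word_count(lines):
    """Count occurrences of each lowercased word across the given lines."""
    counts = Counter()
    for line in lines:
        # Assumed tokenization: runs of word characters, case-insensitive.
        counts.update(re.findall(r"\w+", line.lower()))
    return counts


# Made-up sample input standing in for the source data stored in OBS.
sample = ["To be or not to be", "that is the question"]
counts = word_count(sample)
```

Looking up a specific word is then a dictionary access, for example counts["be"].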
