Configuring the Policy for Clearing Recycle Bin Directories of MRS Cluster Components
Scenarios
In MRS 3.2.0-LTS.1 or later, components protect against accidental deletion by default. That is, file data deleted by component users is not removed immediately but is moved to the recycle bin directory in the OBS file system. This function is compatible with the native trash mechanism of Hadoop FS and provides additional data protection for OBS-based Hadoop big data systems.
This section describes how to set the lifecycle policy of the recycle bin directory in the OBS file system to automatically clear related data periodically. The recycle bin directory is created for each required user. If a new user is added to the MRS cluster and has the permission to delete component data, you need to configure the policy for clearing the recycle bin directory for the new user.

For clusters that use decoupled storage and compute, configure a lifecycle policy for the related directories by referring to this section. Otherwise, the storage space may be used up and storage fees may increase. For details about OBS billing, see OBS Billing Overview.
You need to configure lifecycle policies for the recycle bin directories of preset users in the MRS cluster and for the recycle bin directories of new users who need accidental deletion prevention. If a low-privilege agency is used, or if MRS users have been granted access only to specific OBS file system directories by referring to Configuring Fine-Grained OBS Access Permissions for MRS Cluster Users, the operation permission for the recycle bin directory must also be granted.
| Cluster Version | Directory Type | Component | Directory | How to Create |
|---|---|---|---|---|
| Versions earlier than MRS 3.3.0-LTS | Recycle bin directories that must be configured by default for each component in an MRS cluster | Hive | | If the .Trash folder does not exist, create it on the cluster client as user omm. Run the following command: hdfs dfs -mkdir -p obs://<Name of the OBS parallel file system where the table is stored>/<Folder path> |
| | | Spark | | |
| | | HetuEngine | | |
| | | HBase | | |
| | Recycle bin directories of users who need accidental deletion prevention | Hive/Spark/HetuEngine | user/<New service user>/.Trash | |
| MRS 3.3.0-LTS or later | Default recycle bin directories configured for each component in an MRS cluster | Hive/Spark/HetuEngine | /user/.Trash | |
For example, if a user with the following permissions has been added to the cluster, you need to create a recycle bin directory clearing rule for the user in the parallel file system:
- Permission to delete HDFS files
- DROP, INSERT OVERWRITE, and TRUNCATE permissions on Hive tables
- DROP, TRUNCATE, DELETE, INSERT OVERWRITE, and LOAD OVERWRITE permissions on HetuEngine
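As a rough illustration of preparing the recycle bin directory for such a user, the following Python sketch assembles the hdfs dfs -mkdir -p command described in this section. It assumes the per-user layout user/<Username>/.Trash; the file system name example-pfs, the user name new_service_user, and the helper names trash_dir and mkdir_command are hypothetical and used for illustration only. The sketch only builds the command string. The command itself still has to be run on the cluster client as user omm.

```python
import shlex


def trash_dir(username: str) -> str:
    """Return the user's recycle bin path, relative to the file system root.

    Assumes the per-user layout user/<Username>/.Trash.
    """
    return f"user/{username}/.Trash"


def mkdir_command(file_system: str, username: str) -> str:
    """Build the command that creates the user's .Trash folder in OBS.

    file_system is the name of the OBS parallel file system (hypothetical
    value in the usage below).
    """
    target = f"obs://{file_system}/{trash_dir(username)}"
    parts = ["hdfs", "dfs", "-mkdir", "-p", target]
    # Quote each argument so unusual names cannot break the shell command.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Hypothetical file system and user names.
    for user in ["omm", "new_service_user"]:
        print(mkdir_command("example-pfs", user))
```

The directory returned by trash_dir is also the value that would be entered as the lifecycle rule prefix for that user.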
Configuring the Lifecycle Rule of an OBS Directory
- Log in to the OBS console.
- Click Parallel File Systems and click the name of the file system used by the current MRS cluster.
- In the navigation pane, choose Data Management > Lifecycle Rules. Click Create to create a lifecycle rule for a specified directory. For details about the parameters, see Configuring a Lifecycle Rule.
Table 2 Parameters for creating a lifecycle rule

| Parameter | Description | Example Value |
|---|---|---|
| Status | Whether to enable the lifecycle rule. | Enable |
| Rule Name | Rule name that identifies different lifecycle configurations. | rule-test |
| Prefix | Prefix of the objects to which the lifecycle rule applies. Objects that have the specified prefix will be managed by the lifecycle rule. The prefix cannot start with a slash (/), have consecutive slashes (/), or contain the following special characters: \ : * ? " < > \| If this parameter is not specified, the rule will take effect for the entire file system. WARNING: To prevent other service data from being deleted by mistake, you are not advised to use the lifecycle rule configured for the entire file system or high-level directories. Generally, the recycle bin directory of MRS components is in the following format: user/<Username>/.Trash. If the folder does not exist, create it. | user/omm/.Trash |
| Delete Files After (Days) | The object within the rule configuration scope expires and is automatically deleted by OBS if the number of days since its last update reaches this parameter value. The recommended value is 1 to 7 days. | 2 days |
- Click OK to complete the lifecycle rule configuration.
You can click Edit in the Operation column of a lifecycle rule to edit it. You can also click Disable or Enable to disable or enable it.
- Repeat the preceding steps to create a recycle bin directory clearing rule for each user who has the data deletion permission in the current MRS cluster, until all recycle bin directories in the OBS file system are covered.
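Before you enter the rules one by one on the console, it can help to draft the parameter values for every user and check each prefix against the constraints stated for the Prefix parameter. The following Python sketch is one way to do that under stated assumptions: the function names validate_prefix and plan_rules, the rule naming pattern trash-<Username>, and the sample user names are hypothetical, and the sketch does not call any OBS API. It only produces the values to type into the console.

```python
from typing import Dict, List

# Characters the Prefix parameter must not contain: \ : * ? " < > |
FORBIDDEN_CHARS = set('\\:*?"<>|')


def validate_prefix(prefix: str) -> List[str]:
    """Return the problems found in a lifecycle rule prefix (empty if none)."""
    if prefix == "":
        # An unspecified prefix makes the rule apply to the whole file system.
        return ["empty prefix applies the rule to the entire file system"]
    problems = []
    if prefix.startswith("/"):
        problems.append("prefix must not start with a slash")
    if "//" in prefix:
        problems.append("prefix must not contain consecutive slashes")
    bad = sorted(FORBIDDEN_CHARS.intersection(prefix))
    if bad:
        problems.append("prefix contains forbidden characters: " + " ".join(bad))
    return problems


def plan_rules(usernames: List[str], days: int = 2) -> List[Dict[str, object]]:
    """Draft one set of console parameters per user's recycle bin directory."""
    rules = []
    for name in usernames:
        prefix = f"user/{name}/.Trash"
        problems = validate_prefix(prefix)
        if problems:
            raise ValueError(f"{prefix}: {'; '.join(problems)}")
        rules.append({
            "Status": "Enable",
            "Rule Name": f"trash-{name}",  # hypothetical naming pattern
            "Prefix": prefix,
            "Delete Files After (Days)": days,  # 1 to 7 days is recommended
        })
    return rules


if __name__ == "__main__":
    # Hypothetical list of users who have the data deletion permission.
    for rule in plan_rules(["omm", "new_service_user"]):
        print(rule)
```

Rejecting an empty prefix in the draft is a deliberate choice: it guards against the situation the warning for the Prefix parameter describes, where a rule for the entire file system deletes other service data.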