LakeFormation Overview
LakeFormation is a one-stop, enterprise-class service for building data lakes and data warehouses. It provides APIs and a GUI for unified management of data lake metadata, and is compatible with Hive metadata and the Ranger permission model. LakeFormation connects to multiple compute engines and big data cloud services, so that data lakes can be built quickly and operated easily.
You can create a LakeFormation instance and connect it to an MRS cluster to centrally manage data lake metadata and permissions.
Restrictions and Constraints
Before interconnecting LakeFormation with MRS clusters, pay attention to the following restrictions:
- MRS clusters and LakeFormation instances must belong to the same cloud account and region.
- The access client created in LakeFormation must be in the same VPC as the MRS clusters.
- The MRS cluster can only interconnect with the catalog named hive in the LakeFormation instance.
- For existing MRS clusters, you need to migrate the metadata database and permission policies to the LakeFormation instance, and then configure the interconnection.
- If metadata from multiple MRS clusters needs to be migrated to the same LakeFormation instance, database names must be unique across those clusters.
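The uniqueness restriction in the last item can be checked before migration. The following is a minimal sketch, assuming you have already exported the list of database names from each cluster (for example, from the output of SHOW DATABASES). The function name find_duplicate_databases and the cluster names are hypothetical and are not part of any MRS or LakeFormation API. Names are compared case-insensitively because Hive treats database names as case-insensitive.

```python
from collections import defaultdict


def find_duplicate_databases(cluster_databases):
    """Return {database_name: [cluster, ...]} for names used by more than one cluster.

    cluster_databases maps a cluster name to the database names exported from it.
    Names are lowercased before comparison.
    """
    owners = defaultdict(set)
    for cluster, databases in cluster_databases.items():
        for name in databases:
            owners[name.lower()].add(cluster)
    return {
        name: sorted(clusters)
        for name, clusters in owners.items()
        if len(clusters) > 1
    }


# Hypothetical inventory of two clusters that will share one LakeFormation instance.
inventory = {
    "mrs-cluster-a": ["sales", "logs"],
    "mrs-cluster-b": ["Sales", "finance"],
}

conflicts = find_duplicate_databases(inventory)
for name, clusters in sorted(conflicts.items()):
    print(f"{name}: used by {', '.join(clusters)}")
```

Any name reported here would need to be renamed in one of the clusters before the metadata is migrated.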
After MRS is interconnected with LakeFormation, MRS components are subject to the following constraints:
- Hive does not support temporary tables.
- Hive does not support cross-cluster column encryption.
- Hive WebHCat cannot interconnect with LakeFormation.
- Hive cannot create an internal table if the designated directory already contains files.
- Before creating a Hudi table, you need to authorize the path of the Hudi table directory on LakeFormation so that OBS read and write permissions are granted.
- Fields in a Hudi table cannot be edited on the LakeFormation console. You can add, delete, or modify table fields only on the Hudi client.
- When Flink reads and writes Hive tables, only hive_sync.mode=jdbc can be used to synchronize Hive tables. The HMS mode is not supported.
- If a low-permission user lacks OBS path access permission for the default database, Spark will display a permission error message but will still successfully create the database.
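The Flink constraint above can be turned into a simple pre-submission check. The following is a minimal sketch, assuming the table options of a job are available as a plain dictionary. The function check_hive_sync_mode is hypothetical and does not call any Flink or LakeFormation API; only the option name hive_sync.mode and the value jdbc come from the constraint itself.

```python
REQUIRED_SYNC_MODE = "jdbc"  # the only mode allowed by the constraint above


def check_hive_sync_mode(table_options):
    """Return a list of problems found in a dict of table options (empty if none)."""
    problems = []
    mode = table_options.get("hive_sync.mode")
    if mode is None:
        problems.append("hive_sync.mode is not set; set it to 'jdbc' explicitly")
    elif mode.strip().lower() != REQUIRED_SYNC_MODE:
        problems.append(f"hive_sync.mode={mode} is not supported; use 'jdbc'")
    return problems


# Hypothetical options taken from a job definition.
options = {"hive_sync.mode": "hms"}
for problem in check_hive_sync_mode(options):
    print(problem)
```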
After MRS is interconnected with LakeFormation, the permission policy restrictions are as follows:
- In LakeFormation authorization, only LakeFormation roles can be used as authorization entities. IAM users or user groups cannot be used as authorization entities.
- The PolicySync process does not modify the default policy of the RangerAdmin Hive module in the cluster. The default policy still takes effect.
- After the PolicySync process is started, it compares the policies in the cluster with those in the LakeFormation instance and deletes non-default policies that do not exist in LakeFormation. You are advised to migrate the permission policies to the LakeFormation instance before starting the process.
- For the Hive module on the RangerAdmin web UI, do not add or delete non-default policies. Grant permissions on the data permission page of LakeFormation instances.
- After the interconnection between the MRS cluster and LakeFormation is canceled, the non-default policies of RangerAdmin will not be cleared. You need to manually clear them.
- Hive does not support SQL statements for granting permissions. You need to grant permissions on the Data Permissions page.
- MRS does not support LakeFormation row filtering.
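The PolicySync behavior described above (default policies are kept, and non-default policies missing from LakeFormation are deleted) can be modeled as a set difference. The following is a simplified sketch for estimating which RangerAdmin Hive policies would be lost if they are not migrated first. It assumes policies can be matched by name, which is an illustrative simplification and not how PolicySync is documented to compare permissions. The function name and the sample policy names are hypothetical.

```python
def policies_removed_by_sync(ranger_policies, lakeformation_policies):
    """Return the names of Ranger Hive policies that would be deleted under this model.

    ranger_policies: dict mapping policy name -> True if it is a default policy.
    lakeformation_policies: set of policy names that exist in LakeFormation.
    """
    return sorted(
        name
        for name, is_default in ranger_policies.items()
        if not is_default and name not in lakeformation_policies
    )


# Hypothetical state before PolicySync starts.
ranger = {
    "all - database, table, column": True,  # default policy, left untouched
    "analyst_read_sales": False,            # already migrated to LakeFormation
    "etl_write_logs": False,                # not migrated yet
}
lakeformation = {"analyst_read_sales"}

print(policies_removed_by_sync(ranger, lakeformation))
```

In this example only etl_write_logs is reported, which is the kind of policy that should be migrated to the LakeFormation instance before the interconnection is configured.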