Updated on 2024-02-21 GMT+08:00

Performing Secondary Development After Interconnection

You can perform secondary development as required. Currently, the following examples are provided.

  • Custom authentication information obtaining class: used to obtain IAM authentication information for accessing LakeFormation.
  • Custom user information obtaining class: used to obtain the information of the user who accesses LakeFormation.

Custom Authentication Information Obtaining Class

The IdentityGenerator class is used to obtain the IAM authentication information for accessing LakeFormation: a token, a permanent AK/SK, or a temporary AK/SK with a securityToken.

LakeFormation provides a default class for obtaining authentication information. This class reads the AK/SK from the configuration file and uses it to generate the authentication information.

In addition to the default authentication information obtaining class provided by LakeFormation, you can implement a custom authentication information obtaining class.

  1. Develop code.

    Add the lakeformation-lakecat-client dependency to the POM file of the Maven project:

    <dependency>
        <groupId>com.huawei.lakeformation</groupId>
        <artifactId>lakeformation-lakecat-client</artifactId>
        <version>${lakeformation.version}</version>
    </dependency>

    Add a class for obtaining authentication information to implement the IdentityGenerator API.
    
    /*
     * Copyright (c) Huawei Technologies Co., Ltd. 2023-2023. All rights reserved.
     */

    package com.huawei.cloud.dalf.lakecat.examples;

    import com.huawei.cloud.dalf.lakecat.client.ConfigCenter;
    import com.huawei.cloud.dalf.lakecat.client.identity.Identity;
    import com.huawei.cloud.dalf.lakecat.client.identity.IdentityGenerator;

    import java.util.Collections;

    /**
     * Identity information generator example
     */
    public class LakeFormationExampleIdentityGenerator implements IdentityGenerator {
        public String token;

        @Override
        public void initialize(ConfigCenter configCenter) {
            // Perform initialization.
        }

        @Override
        public Identity generateIdentity() {
            // Build and return the IAM authentication information here.
            // The placeholder exception keeps this skeleton compilable until the logic is added.
            throw new UnsupportedOperationException("generateIdentity() is not implemented yet");
        }
    }

  2. Configure integration.

    Use Maven to package the code into a JAR file and place the JAR file in the spark/jars directory.

    Add the configuration that matches your interconnection method:

    • If SparkCatalogPlugin is used for interconnection, add the following configuration to the spark-defaults.conf configuration file:

      # Authentication information obtaining class. Set this parameter to the fully qualified name of your implementation class. The value is for reference only.
      spark.sql.catalog.catalog_name.lakecat.auth.identity.util.class=com.huawei.cloud.dalf.lakecat.client.spark.v31.impl.SparkDefaultIdentityGenerator

    • If MetastoreClient is used for interconnection, use either of the following methods:

      Add the following configuration to spark-defaults.conf:

      # Authentication information obtaining class. Set this parameter to the fully qualified name of your implementation class. The value is for reference only.
      spark.hadoop.lakecat.auth.identity.util.class=com.huawei.cloud.dalf.lakecat.client.spark.v31.impl.SparkDefaultIdentityGenerator

      Alternatively, add the following configuration to hive-site.xml:

      <!-- Authentication information obtaining class. The value is for reference only. -->
      <property>
          <name>lakecat.auth.identity.util.class</name>
          <value>com.huawei.cloud.dalf.lakecat.examples.LakeFormationExampleIdentityGenerator</value>
      </property>
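
The skeleton in step 1 leaves both method bodies empty, so the following is a minimal sketch of how the two methods might cooperate: initialize() keeps the configuration, and generateIdentity() builds the credentials from it, much like the default class that reads the AK/SK from the configuration file. Everything here is an assumption made for illustration. SketchConfig, SketchIdentity, and SketchIdentityGenerator are local stand-ins for ConfigCenter, Identity, and IdentityGenerator, whose real signatures are not shown in this document, and the example.auth.* key names are invented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class IdentityGeneratorSketch {
    // Hypothetical stand-in for ConfigCenter: a read-only key/value lookup.
    interface SketchConfig {
        String get(String key);
    }

    // Hypothetical stand-in for Identity: an AK/SK pair plus an optional securityToken.
    static final class SketchIdentity {
        final String accessKey;
        final String secretKey;
        final String securityToken; // null when the AK/SK is permanent

        SketchIdentity(String accessKey, String secretKey, String securityToken) {
            this.accessKey = Objects.requireNonNull(accessKey, "accessKey");
            this.secretKey = Objects.requireNonNull(secretKey, "secretKey");
            this.securityToken = securityToken;
        }

        boolean isTemporary() {
            return securityToken != null;
        }
    }

    // Hypothetical stand-in for IdentityGenerator, mirroring the two methods of the skeleton.
    interface SketchIdentityGenerator {
        void initialize(SketchConfig config);

        SketchIdentity generateIdentity();
    }

    // Reads the credentials from the configuration each time an identity is requested.
    static final class ConfigBackedIdentityGenerator implements SketchIdentityGenerator {
        // Key names invented for this sketch; they are not LakeFormation parameters.
        static final String AK_KEY = "example.auth.ak";
        static final String SK_KEY = "example.auth.sk";
        static final String TOKEN_KEY = "example.auth.security.token";

        private SketchConfig config;

        @Override
        public void initialize(SketchConfig config) {
            this.config = Objects.requireNonNull(config, "config");
        }

        @Override
        public SketchIdentity generateIdentity() {
            if (config == null) {
                throw new IllegalStateException("initialize() must be called before generateIdentity()");
            }
            // The token is optional: it is present only for temporary credentials.
            return new SketchIdentity(required(AK_KEY), required(SK_KEY), config.get(TOKEN_KEY));
        }

        private String required(String key) {
            String value = config.get(key);
            if (value == null || value.isEmpty()) {
                throw new IllegalStateException("Missing configuration value: " + key);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put(ConfigBackedIdentityGenerator.AK_KEY, "demo-ak");
        settings.put(ConfigBackedIdentityGenerator.SK_KEY, "demo-sk");

        ConfigBackedIdentityGenerator generator = new ConfigBackedIdentityGenerator();
        generator.initialize(settings::get);
        SketchIdentity identity = generator.generateIdentity();
        System.out.println("temporary credentials: " + identity.isTemporary());
    }
}
```

Failing fast on a missing AK or SK is a design choice of this sketch: it surfaces a configuration mistake when the identity is first requested instead of sending an incomplete credential to the service.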

Custom User Information Obtaining Class

The AuthenticationManager class is used to obtain the information of the user who accesses LakeFormation, which may be an IAM user or a local LDAP user. The default user information obtaining class obtains the user information using UserGroupInformation.getCurrentUser().

In addition to the default user information obtaining class described here, you can implement a custom user information obtaining class.

If user authentication information is used to access LakeFormation, the user information must be consistent with the user identity information (that is, the username and source must be consistent).
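
The consistency rule above can be sketched as a simple comparison. This is an illustration only: UserRef is a hypothetical stand-in type, the source labels "IAM" and "LDAP" are assumed string values, and exact, case-sensitive matching is an assumption rather than documented LakeFormation behavior.

```java
import java.util.Objects;

final class UserConsistencySketch {
    // Hypothetical stand-in for a user description: a username plus the source it comes from.
    static final class UserRef {
        final String name;
        final String source;

        UserRef(String name, String source) {
            this.name = Objects.requireNonNull(name, "name");
            this.source = Objects.requireNonNull(source, "source");
        }
    }

    // Both the username and the source must match for the two descriptions to be consistent.
    static boolean isConsistent(UserRef fromIdentity, UserRef fromUserInfo) {
        return fromIdentity.name.equals(fromUserInfo.name)
                && fromIdentity.source.equals(fromUserInfo.source);
    }

    public static void main(String[] args) {
        UserRef identityUser = new UserRef("alice", "IAM");
        System.out.println(isConsistent(identityUser, new UserRef("alice", "IAM")));  // prints "true"
        System.out.println(isConsistent(identityUser, new UserRef("alice", "LDAP"))); // prints "false"
    }
}
```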

  1. Develop code.

    Add the lakeformation-lakecat-client dependency to the POM file of the Maven project:

    <dependency>
        <groupId>com.huawei.lakeformation</groupId>
        <artifactId>lakeformation-lakecat-client</artifactId>
        <version>${lakeformation.version}</version>
    </dependency>

    Add a class for obtaining user information to implement the AuthenticationManager API.
    
    /*
     * Copyright (c) Huawei Technologies Co., Ltd. 2023-2023. All rights reserved.
     */

    package com.huawei.cloud.dalf.lakecat.examples;

    import com.huawei.cloud.dalf.lakecat.client.ConfigCenter;
    import com.huawei.cloud.dalf.lakecat.client.identity.AuthenticationManager;
    import com.huawei.cloud.dalf.lakecat.client.model.Principal;

    /**
     * User information obtaining class example
     */
    public class ExampleAuthenticationManager implements AuthenticationManager {
        @Override
        public void initialize(ConfigCenter configCenter) {
            // Perform initialization.
        }

        @Override
        public Principal getCurrentUser() {
            // Build and return the information about the current user here.
            // The placeholder exception keeps this skeleton compilable until the logic is added.
            throw new UnsupportedOperationException("getCurrentUser() is not implemented yet");
        }
    }

  2. Configure integration.

    Use Maven to package the code into a JAR file and place the JAR file in the spark/jars directory.

    Add the configuration that matches your interconnection method:

    • If SparkCatalogPlugin is used for interconnection, add the following configuration to the spark-defaults.conf configuration file:

      # Optional parameter. Authentication manager implementation class, which is used to obtain the information of the current user. The value configured here is for reference only.
      spark.sql.catalog.catalog_name.lakeformation.authentication.manager.class=com.huawei.cloud.dalf.lakecat.examples.ExampleAuthenticationManager
      # Optional parameter. Specifies whether to designate the current user as the resource owner when a resource is created. The default value is false.
      spark.sql.catalog.catalog_name.lakeformation.owner.designate=true

    • If MetastoreClient is used for interconnection, use either of the following methods:

      Add the following configuration to spark-defaults.conf:

      # Optional parameter. Authentication manager implementation class, which is used to obtain the information of the current user. The value configured here is for reference only.
      spark.hadoop.lakeformation.authentication.manager.class=com.huawei.cloud.dalf.lakecat.examples.ExampleAuthenticationManager
      # Optional parameter. Specifies whether to designate the current user as the resource owner when a resource is created. The default value is false.
      spark.hadoop.lakeformation.owner.designate=true

      Alternatively, add the following configuration to hive-site.xml:

      <!-- Optional parameter. Authentication manager implementation class, which is used to obtain the information of the current user. The value configured here is for reference only. -->
      <property>
          <name>lakeformation.authentication.manager.class</name>
          <value>com.huawei.cloud.dalf.lakecat.examples.ExampleAuthenticationManager</value>
      </property>
      <!-- Optional parameter. Specifies whether to designate the current user as the resource owner when a resource is created. The default value is false. -->
      <property>
          <name>lakeformation.owner.designate</name>
          <value>true</value>
      </property>
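
As with the identity generator, the user information skeleton in step 1 has empty method bodies. The following is a minimal sketch of one possible shape, under stated assumptions: SketchConfig, SketchPrincipal, and SketchAuthenticationManager are local stand-ins for ConfigCenter, Principal, and AuthenticationManager, whose real signatures are not shown in this document. The example.user.source key and the "LDAP" default are invented, and the operating system account (the user.name system property) is used in place of UserGroupInformation.getCurrentUser() so that the sketch runs without Hadoop on the classpath.

```java
import java.util.Objects;

final class AuthenticationManagerSketch {
    // Hypothetical stand-in for ConfigCenter: a read-only key/value lookup.
    interface SketchConfig {
        String get(String key);
    }

    // Hypothetical stand-in for Principal: a username plus the source it comes from.
    static final class SketchPrincipal {
        final String name;
        final String source;

        SketchPrincipal(String name, String source) {
            this.name = Objects.requireNonNull(name, "name");
            this.source = Objects.requireNonNull(source, "source");
        }
    }

    // Hypothetical stand-in for AuthenticationManager, mirroring the two methods of the skeleton.
    interface SketchAuthenticationManager {
        void initialize(SketchConfig config);

        SketchPrincipal getCurrentUser();
    }

    // Reports the operating system account as the current user.
    static final class OsUserAuthenticationManager implements SketchAuthenticationManager {
        // Key name and default label invented for this sketch.
        static final String SOURCE_KEY = "example.user.source";
        static final String DEFAULT_SOURCE = "LDAP";

        private String source = DEFAULT_SOURCE;

        @Override
        public void initialize(SketchConfig config) {
            String configured = config.get(SOURCE_KEY);
            if (configured != null && !configured.isEmpty()) {
                source = configured;
            }
        }

        @Override
        public SketchPrincipal getCurrentUser() {
            String name = System.getProperty("user.name");
            if (name == null || name.isEmpty()) {
                throw new IllegalStateException("Cannot determine the current user");
            }
            return new SketchPrincipal(name, source);
        }
    }

    public static void main(String[] args) {
        OsUserAuthenticationManager manager = new OsUserAuthenticationManager();
        manager.initialize(key -> null); // no overrides, so the default source is kept
        SketchPrincipal user = manager.getCurrentUser();
        System.out.println(user.name + " (" + user.source + ")");
    }
}
```

Making the source configurable rather than hard-coded is one way to keep the reported user aligned with the identity used for authentication, because the username and source of the two must be consistent.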