Updated on 2025-07-31 GMT+08:00

Creating LakeFormation Metadata

Metadata objects managed by LakeFormation include catalogs, databases, tables, and functions.

Prerequisites

  • A LakeFormation instance has been created and runs properly.
  • Catalog data is stored in OBS and you have permissions to perform operations on OBS.
  • You have created an OBS bucket for storing catalog data by referring to Creating a LakeFormation Metadata Storage Path.

Creating a Catalog

A catalog is a metadata management object that can contain multiple databases.

Multiple catalogs can be created and managed in LakeFormation to isolate metadata of different external clusters.

  1. Log in to the LakeFormation console.
  2. Select the target LakeFormation instance from the drop-down list on the left and choose Metadata > Catalog in the navigation pane.
  3. Click Create and configure the parameters.

    1. In the Basic Information area, configure the following parameters.
      Table 1 Parameter description

      Parameter

      Description

      Catalog Name

      Name of the catalog to be created.

      The value can contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed.

      Catalog Type

      Catalog type. Options:

      • DEFAULT
      • CLICKHOUSE

      Select Location

      (Optional) Location where catalog data is stored in the OBS bucket.

      Click the location selection icon, select Parallel file system or Object storage bucket as required, select a location, and click OK.

      • The location you specify must start with obs:// and must include a storage object. For example, select obs://lakeformation-test/catalog1. If no suitable OBS bucket is available, click go to OBS and create one.
      • To prevent data conflicts, the path cannot be the metadata storage path that is being used by other LakeFormation instances.
      • You are advised to select a folder that is not selected by other catalogs.

      Description

      Description of the catalog to be created.

      The value can contain 0 to 4,000 bytes.

    2. (Optional) Click Add under Database Storage Locations, select a database storage location as required, and click OK. Multiple locations can be added.

      Database Storage Locations is optional. If it is configured, each database in this catalog must be stored in a subpath of Database Storage Locations or of the catalog's Select Location.

    3. Click Submit.

  4. View information about the created catalog on the Catalog page.
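
The constraints in Table 1 can be checked before you submit the form. The following is a minimal Python sketch, not part of LakeFormation: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and the description length is assumed to be measured in UTF-8 bytes.

```python
import re

# Rules taken from Table 1. All names in this sketch are hypothetical.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,256}")  # assumes ASCII letters
MAX_DESCRIPTION_BYTES = 4000
OBS_PREFIX = "obs://"


def is_valid_catalog_name(name):
    """1 to 256 characters; only letters, numbers, and underscores."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_obs_location(location):
    """Must start with obs:// and name a storage object inside a bucket."""
    if not location.startswith(OBS_PREFIX):
        return False
    bucket, _, obj = location[len(OBS_PREFIX):].partition("/")
    return bool(bucket) and bool(obj.strip("/"))


def is_valid_description(text):
    """0 to 4,000 bytes (UTF-8 encoding is an assumption)."""
    return len(text.encode("utf-8")) <= MAX_DESCRIPTION_BYTES
```

For example, obs://lakeformation-test/catalog1 passes the location check, while obs://lakeformation-test (a bucket with no storage object) does not.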

Creating a Database

Multiple databases can be created under a LakeFormation catalog. Centralized metadata management can maximize the value of data assets.

  1. Log in to the LakeFormation console.
  2. Select the target LakeFormation instance from the drop-down list on the left and choose Metadata > Database in the navigation pane.
  3. Select a catalog from the Catalog drop-down list in the upper right corner. The databases contained in this catalog are displayed.
  4. Click Create and configure the parameters.

    1. In the Basic Information area, configure the following parameters.
      Table 2 Parameter description

      Parameter

      Description

      Database Name

      Name of the database to be created.

      The value can contain 1 to 128 characters. Only letters, numbers, and underscores (_) are allowed.

      Catalog

      Catalog to which the database to be created belongs.

      Select Location

      Location where the database information is stored in the OBS bucket.

      Click the location selection icon, select Parallel file system or Object storage bucket as required, select a location, and click OK.

      • The location you specify must start with obs:// and must include a storage object. For example, select obs://lakeformation-test/catalog1/database1. If no suitable OBS bucket is available, click go to OBS and create one.
      • The path must differ from the storage path of the associated catalog (that is, the Select Location parameter configured during catalog creation).
      • To prevent data conflicts, the path cannot be the metadata storage path that is being used by other LakeFormation instances.
      • If Database Storage Locations is set for the catalog the database belongs to, set this parameter to a subpath of Database Storage Locations or Select Location of the catalog.

      Description

      Description of the database to be created.

      The value can contain 0 to 4,000 bytes.

    2. (Optional) Click Add under Data Table Storage Locations, select a table storage path as required, and click OK. Multiple locations can be added.
      • Data Table Storage Locations is optional.
      • The data table storage path can be set to the catalog path or one of its subpaths, or to the database storage location or one of its subpaths.
      • If this parameter is set, each table in the database must be stored in a subpath of Data Table Storage Locations or of the database's Select Location.
    3. (Optional) Click Add under Function Storage Locations, select a function storage path as required, and click OK. Multiple locations can be added.
      • Function Storage Locations is optional.
      • The function storage location can be set to the catalog path or one of its subpaths, or to the database storage location or one of its subpaths.
      • If this parameter is set, each function in the database must be stored in a subpath of Function Storage Locations or of the database's Select Location.
    4. Click Submit.

  5. View the database information on the Database page.
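
The location rules in Table 2 amount to a path-containment check. The following Python sketch illustrates them under stated assumptions: the helper names are hypothetical, "subpath" is read as a path strictly below its parent, and paths are compared segment by segment so that obs://b/catalog10 is not treated as lying under obs://b/catalog1. The check against metadata paths of other LakeFormation instances is not covered.

```python
OBS_PREFIX = "obs://"


def _parts(path):
    """Split an obs:// path into its bucket and object segments."""
    rest = path[len(OBS_PREFIX):] if path.startswith(OBS_PREFIX) else path
    return [segment for segment in rest.split("/") if segment]


def is_subpath(path, parent):
    """True if path lies strictly below parent (segment-wise comparison)."""
    child, base = _parts(path), _parts(parent)
    return len(child) > len(base) and child[:len(base)] == base


def check_database_location(location, catalog_location, database_storage_locations):
    """Return the violated rules; an empty list means the path passes.

    catalog_location may be None because Select Location is optional
    for a catalog.
    """
    problems = []
    if catalog_location and _parts(location) == _parts(catalog_location):
        problems.append("same as the catalog storage path")
    if database_storage_locations:
        allowed = list(database_storage_locations)
        if catalog_location:
            allowed.append(catalog_location)
        if not any(is_subpath(location, parent) for parent in allowed):
            problems.append("not a subpath of an allowed location")
    return problems
```

With the catalog stored at obs://lakeformation-test/catalog1 and no Database Storage Locations configured, obs://lakeformation-test/catalog1/database1 returns an empty list, while reusing the catalog path itself is reported as a violation.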

Creating a Table

  1. Log in to the LakeFormation console.
  2. Select a LakeFormation instance from the drop-down list box on the left, choose Metadata > Table, and select a catalog and database from the Catalog and Database drop-down lists in the upper right corner. View the tables contained in the selected database.
  3. Click Create and set related parameters.

    1. In the Basic Information area, configure the following parameters.
      Table 3 Parameter description

      Parameter

      Description

      Table Name

      Name of the metadata table to be created.

      The value can contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed.

      Catalog

      Catalog to which the table to be created belongs.

      Database

      Database to which the table to be created belongs.

      Table Type

      Type of the table to be created. Options:

      • MANAGED_TABLE: managed table. If a managed table or partition is deleted, the data and metadata associated with the table or partition will be deleted.
      • EXTERNAL_TABLE: external table. Use an external table when the data files already exist or are stored in a remote location.
      • VIRTUAL_VIEW: virtual view. It does not store actual data and does not occupy physical space.
      • MATERIALIZED_VIEW: materialized view. It stores actual data and occupies physical space.

      Data Storage Location

      Directory in the OBS bucket that the table is mapped to.

      Click the location selection icon, select a location for storing the table in the OBS bucket, and click OK.
      • This parameter is optional. If it is not set, the table storage path defaults to <database storage path>/<table name>.
      • The selected location must start with obs:// and must contain a storage object. For example, select obs://lakeformation-test/catalog1/database1/table1. If no suitable OBS bucket is available, click go to OBS and create one.
      • The path must differ from the storage paths of its associated catalog and database.
      • To prevent data conflicts, the path cannot be the metadata storage path that is being used by other LakeFormation instances.
      • If Data Table Storage Locations is set for the database the table belongs to, set this parameter to a subpath of Select Location or Data Table Storage Locations of the database.

      Compress Data

      Whether to compress the data table.

      A compressed table stores its data in a compressed format, which saves storage space and can improve performance.

      Data Source Format

      Data source format of the table to be created. Options:

      • Avro
      • Json
      • Xml
      • Parquet
      • Csv
      • Orc
      • Text
      • Rc
      • Sequence
      • Custom

        If Data Source Format is set to Custom, the Input Format, Output Format, Serde name, and SerializationLib parameters are displayed. Set these parameters as required.

      Separator

      This parameter is displayed if Data Source Format is set to Csv. Options:

      • Comma(,)
      • Vertical bar(|)
      • Semicolon(;)
      • Tab(\u0009)
      • Ctrl-A(\u0001)

      Description

      Description of the table to be created.

      The value can contain 0 to 4,000 bytes.

    2. (Optional) Click Add in the Table Field area. Manually add one or more metadata table fields as required and click OK.

      A table field is a single column of the table. Each record in the table holds one value for each field.

    3. (Optional) Click Add in the Partition Key area. Manually add one or more partition keys as required and click OK.

      A partition key is an ordered set of one or more table columns. The values of the partition key columns determine which data partition a row belongs to.

    4. (Optional) Click Add in the Table Attributes area. Manually add one or more table attributes as required and click OK.

      A table attribute enables you to tag table definitions with your own metadata key-value pairs.

    5. Click Submit.

  4. View the table information on the Table page.
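
Three of the table settings above can be illustrated in a few lines of Python. This is a sketch under stated assumptions, not LakeFormation code: the function and variable names are hypothetical, and the dictionary keys are shortened forms of the Separator option labels in Table 3. It shows how the default storage path is derived when Data Storage Location is not set, which characters the Separator options stand for, and how an ordered partition key identifies the partition of a row.

```python
def resolve_table_location(database_location, table_name, data_storage_location=None):
    """Use the explicit location if given, else <database path>/<table name>."""
    if data_storage_location:
        return data_storage_location
    return database_location.rstrip("/") + "/" + table_name


# Separator options for the Csv format, mapped to the characters they denote.
SEPARATORS = {
    "Comma": ",",
    "Vertical bar": "|",
    "Semicolon": ";",
    "Tab": "\u0009",
    "Ctrl-A": "\u0001",
}


def partition_of(row, partition_keys):
    """Return the partition-key values of a row, in partition-key order."""
    return tuple(row[key] for key in partition_keys)


default_path = resolve_table_location(
    "obs://lakeformation-test/catalog1/database1", "table1"
)
row = {"id": 7, "region": "cn", "dt": "2025-07-31"}
partition = partition_of(row, ["dt", "region"])
```

Here default_path is obs://lakeformation-test/catalog1/database1/table1, and partition is ("2025-07-31", "cn"). Because the partition key is ordered, listing the columns as region and then dt would describe a different partition layout.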

Creating a Function

  1. Log in to the LakeFormation console.
  2. Select the target LakeFormation instance from the drop-down list box on the left and choose Metadata > Function. In the upper right corner, select a catalog and database from the Catalog and Database drop-down lists. You can view the functions contained in the selected database.
  3. Click Create and set related parameters.

    1. In the Basic Information area, set the related parameters.
      Table 4 Parameter description

      Parameter

      Description

      Function Name

      Name of the function to be created.

      The value can contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed.

      Catalog

      Catalog to which the function to be created belongs.

      Database

      Database to which the function to be created belongs.

      Type

      Type of the function to be created. Currently, JAVA is supported.

      Class Name

      Class name of the function to be created.

    2. (Optional) Click Add under Function Locations to add the function package type and location as required, and click OK. Multiple locations can be added.
      • Function Locations is optional.
      • If Function Storage Locations is set for the database the function belongs to, set this parameter to a path or subpath of Select Location or Function Storage Locations of the database.
    3. Click Submit.

  4. View the function information on the Function page.

Reference

To view, modify, or delete the metadata you created, see LakeFormation Metadata Management.