Updated on 2025-07-25 GMT+08:00

Creating a DataArts Lake Formation Instance and Planning Metadata

Scenario

This document walks through creating a DataArts Lake Formation (LakeFormation) instance from scratch and setting up catalogs, together with the databases, tables, and other metadata they contain, within the instance.

LakeFormation allows you to create, modify, view, and delete catalogs, databases, and data tables. It simplifies initializing and operating your data lake and manages all metadata under a LakeFormation instance in one place, which speeds up the planning and deployment of data lake services.

Procedure

Before you start, complete the operations described in Preparations. Then, follow these steps:

  1. Create a LakeFormation Instance: Create an exclusive LakeFormation instance.
  2. Create an OBS Path for Storing Metadata: Create a parallel file system and folders in OBS for storing metadata.
  3. Create a Catalog: Create a catalog named catalog1.
  4. Create a Database: Create a database named database1 in catalog catalog1.
  5. Create a Data Table: Create a data table named table_A in database database1.

Preparations

Step 1: Create a LakeFormation Instance

  1. Log in to the management console as the user prepared in Preparations.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. On the displayed page, select the checkbox next to "I have read and agree with the LakeFormation Service Statement" and click Authorize.

    If authorization has been completed, skip this step.

  4. Click Buy Now or Buy Instance in the upper right corner of the Overview page.

    If a LakeFormation instance exists on the page, Buy Instance is displayed. Otherwise, Buy Now is displayed.

  5. Set the parameters listed below.

    Table 1 Parameters for creating a LakeFormation instance

    Parameter: Type
    Example Value: Exclusive
    Description: Select an instance type: Shared or Exclusive.

    Parameter: Billing Mode
    Example Value: Pay-per-use
    Description: Billing mode of the instance.

    Parameter: Project
    Example Value: xxx
    Description: Select the project the instance belongs to.

    Parameter: Name
    Example Value: lakeformation-test
    Description: Name of the LakeFormation instance.

    Parameter: QPS
    Example Value: 10000
    Description: Maximum number of requests per second. You do not need to set this parameter when Type is set to Shared.

    Parameter: Enterprise Project
    Example Value: xxx
    Description: Enterprise project the instance belongs to. If there is no enterprise project available, click Create to create one.

    Parameter: Description
    Example Value: -
    Description: Description of the instance.

    Parameter: Label
    Example Value: -
    Description: Enter a tag key and value and click Add.

  6. Click Buy Now, confirm the configuration, and pay.
  7. Click Back to Console. You can check information about the newly created LakeFormation instance on the console.

    Pay attention to the quota notification when creating an instance. If the resource quota is insufficient, apply for sufficient resources as prompted and then create an instance.

    Wait until the instance status changes to Running.
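
The parameters in Table 1 can be written down and sanity-checked before you open the purchase page. The following is a minimal Python sketch, not a LakeFormation API: the InstanceRequest class and check_instance_request function are hypothetical names, and the rule that an Exclusive instance should have QPS set is inferred from the note that QPS is not needed for Shared instances.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InstanceRequest:
    """Hypothetical record of the values entered on the purchase page."""
    name: str
    instance_type: str = "Exclusive"   # "Shared" or "Exclusive" (Table 1)
    billing_mode: str = "Pay-per-use"
    qps: Optional[int] = None          # maximum requests per second


def check_instance_request(req: InstanceRequest) -> List[str]:
    """Return a list of problems found in the request (empty if none)."""
    problems = []
    if req.instance_type not in ("Shared", "Exclusive"):
        problems.append("Type must be Shared or Exclusive")
    # Assumption: QPS is expected for Exclusive instances, because the
    # table only says it can be skipped when Type is Shared.
    if req.instance_type == "Exclusive" and req.qps is None:
        problems.append("QPS should be set for an Exclusive instance")
    if req.instance_type == "Shared" and req.qps is not None:
        problems.append("QPS is not needed for a Shared instance")
    return problems


example = InstanceRequest(name="lakeformation-test", qps=10000)
print(check_instance_request(example))
```

An empty list means the example values are consistent with the notes in Table 1. The check says nothing about quotas or pricing, which only the console can confirm.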

Step 2: Create an OBS Path for Storing Metadata

  1. Log in to the LakeFormation console.
  2. In the upper left corner of the page, click the service list icon and choose Storage > Object Storage Service to access the Object Storage Service console.
  3. Click Parallel File Systems and click Create Parallel File System. On the displayed page, set the parameters, and click Create Now.

    • File System Name: Set the name of the parallel file system as required, for example, lakeformation-test.
    • Set other parameters based on the site requirements.

  4. On the Parallel File Systems page, click the name of the file system you created, for example, lakeformation-test.
  5. Click Files in the navigation pane, click Create Folder, enter a folder name, and click OK. Click the folder name and click Create Folder to create a subfolder.

    Repeat this step to create paths for storing metadata in sequence. The following paths are examples:

    • Catalog storage path: lakeformation-test/catalog1
    • Database storage path: lakeformation-test/catalog1/database1
    • Table storage path: lakeformation-test/catalog1/database1/table1 and lakeformation-test/catalog1/database1/table2
    • Function storage path: lakeformation-test/catalog1/database1/udf1
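
The example paths above follow a simple nesting rule: each level is a subfolder of the level above it. As a rough illustration, the Python sketch below derives the same layout from the example names. The metadata_paths helper is hypothetical and only builds strings; it does not create anything in OBS.

```python
from posixpath import join


def metadata_paths(file_system, catalog, database, tables, functions):
    """Build the nested folder layout used in the examples above."""
    catalog_path = join(file_system, catalog)
    database_path = join(catalog_path, database)
    return {
        "catalog": catalog_path,
        "database": database_path,
        "tables": [join(database_path, name) for name in tables],
        "functions": [join(database_path, name) for name in functions],
    }


paths = metadata_paths(
    "lakeformation-test", "catalog1", "database1",
    tables=["table1", "table2"], functions=["udf1"],
)
for kind, value in paths.items():
    print(kind, value)
```

Listing the paths up front makes it easier to create the folders in the right order, parent before child.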

Step 3: Create a Catalog

  1. Log in to the management console.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. From the drop-down list box on the left, select the LakeFormation instance you have created, for example, lakeformation-test. Choose Metadata > Catalog in the navigation pane on the left.
  4. On the displayed Catalog page, click Create. Set parameters by referring to the table below, retain the default values for other parameters, and click Submit.

    Table 2 Parameters for creating a catalog

    Parameter: Catalog Name
    Example Value: catalog1
    Description: Name of the catalog to be created. The value can contain up to 256 characters. Only letters, numbers, and underscores (_) are allowed.

    Parameter: Catalog Type
    Example Value: DEFAULT
    Description: Select a catalog type.

    Parameter: Select Location
    Example Value: obs://lakeformation-test/catalog1
    Description: (Optional) Location where catalog data is stored in OBS. Click the icon for selecting a location, select Parallel file system or Object storage bucket for Buckets, select a location, and click OK.
    • The location you specify must start with obs:// and must include a storage object. For example, select obs://lakeformation-test/catalog1. If there is no appropriate OBS path available, click go to OBS and follow Step 2: Create an OBS Path for Storing Metadata to create one.
    • To prevent data conflicts, the path cannot be a metadata storage path that is being used by another LakeFormation instance.
    • You are advised to select a folder that is not used by any other catalog.

    Parameter: Description
    Example Value: xxx
    Description: Description of the catalog to be created.

  5. After the catalog is created, you can check its information on the Catalog page.
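
The naming rules in Tables 2 to 4 (a length limit, and only letters, numbers, and underscores) are easy to check before you submit a form. The sketch below is one way to do so in Python. The is_valid_name helper is hypothetical, and it assumes that "letters" means ASCII letters and that a name must contain at least one character; the console remains the authority on what it accepts.

```python
import re

# Length limits taken from the parameter tables in this document.
MAX_NAME_LENGTH = {"catalog": 256, "database": 128, "table": 256}

# Assumption: "letters" means ASCII letters; at least one character is required.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(kind, name):
    """Check a catalog, database, or table name against the documented rules."""
    within_limit = len(name) <= MAX_NAME_LENGTH[kind]
    allowed_chars = NAME_PATTERN.fullmatch(name) is not None
    return within_limit and allowed_chars


for kind, name in [("catalog", "catalog1"),
                   ("database", "database1"),
                   ("table", "table_A"),
                   ("catalog", "catalog-1")]:
    print(kind, name, is_valid_name(kind, name))
```

The last example fails because a hyphen is not among the allowed characters.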

Step 4: Create a Database

  1. Log in to the management console.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. From the drop-down list box on the left, select the LakeFormation instance you have created, for example, lakeformation-test. Choose Metadata > Database in the navigation pane on the left.
  4. On the displayed Database page, select the catalog you have created from the Catalog drop-down list box in the upper right corner, for example, catalog1.
  5. Click Create. Set parameters by referring to the table below, retain the default values for other parameters, and click Submit.

    Table 3 Parameters for creating a database

    Parameter: Database Name
    Example Value: database1
    Description: Name of the database to be created. The value can contain up to 128 characters. Only letters, numbers, and underscores (_) are allowed.

    Parameter: Catalog
    Example Value: catalog1
    Description: Catalog the database to be created belongs to.

    Parameter: Select Location
    Example Value: obs://lakeformation-test/catalog1/database1
    Description: Location where database information is stored in OBS. Click the icon for selecting a location, select Parallel file system or Object storage bucket for Buckets, select a location, and click OK.
    • The location you specify must start with obs:// and must include a storage object. For example, select obs://lakeformation-test/catalog1/database1. If there is no appropriate OBS path available, click go to OBS and follow Step 2: Create an OBS Path for Storing Metadata to create one.
    • The path must differ from the storage path of the associated catalog (that is, the Select Location parameter configured during catalog creation).
    • To prevent data conflicts, the path cannot be a metadata storage path that is being used by another LakeFormation instance.
    • If Database Storage Locations is set for the catalog the database belongs to, set this parameter to a subpath of Database Storage Locations or Select Location of the catalog.

    Parameter: Description
    Example Value: xxx
    Description: Description of the database to be created.

  6. After the database is created, you can check its information on the Database page.
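
The location rules for a database (an obs:// prefix, a storage object in the path, and a path that differs from the catalog path and sits beneath it) can be checked with plain string handling. The Python sketch below shows one possible check. The function names are hypothetical, and "must include a storage object" is interpreted here as "has a non-empty path after the bucket or file system name", which is an assumption rather than a documented definition.

```python
PREFIX = "obs://"


def parse_location(location):
    """Split 'obs://name/key' into (name, key); raise ValueError if malformed."""
    if not location.startswith(PREFIX):
        raise ValueError(f"{location!r} must start with {PREFIX}")
    name, _, key = location[len(PREFIX):].partition("/")
    key = key.strip("/")
    # Assumption: a "storage object" is any non-empty path after the name.
    if not name or not key:
        raise ValueError(f"{location!r} must include a storage object")
    return name, key


def is_strict_subpath(child, parent):
    """True if child lies below parent and is not the same location."""
    child_name, child_key = parse_location(child)
    parent_name, parent_key = parse_location(parent)
    return child_name == parent_name and child_key.startswith(parent_key + "/")


catalog_location = "obs://lakeformation-test/catalog1"
database_location = "obs://lakeformation-test/catalog1/database1"
print(is_strict_subpath(database_location, catalog_location))
```

Requiring a strict subpath covers two rules at once: the database path cannot equal the catalog path, and it must sit beneath it. The check cannot tell whether another LakeFormation instance already uses the path.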

Step 5: Create a Data Table

  1. Log in to the management console.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. From the drop-down list box on the left, select the LakeFormation instance you have created, for example, lakeformation-test. Choose Metadata > Table in the navigation pane on the left. In the upper right corner of the displayed Table page, select the catalog and database you have created, for example, catalog1 and database1, from the Catalog and Database drop-down list boxes, respectively.
  4. Click Create. Set parameters by referring to the table below, retain the default values for other parameters, and click Submit.

    Table 4 Basic information parameters

    Parameter: Data Table
    Example Value: table_A
    Description: Name of the metadata table to be created. The value can contain up to 256 characters. Only letters, numbers, and underscores (_) are allowed.

    Parameter: Catalog
    Example Value: catalog1
    Description: Catalog the table to be created belongs to.

    Parameter: Database
    Example Value: database1
    Description: Database the table to be created belongs to.

    Parameter: Table Type
    Example Value: MANAGED_TABLE
    Description: Type of the table to be created. The options are:
    • MANAGED_TABLE: managed table. If a managed table or partition is deleted, the data and metadata associated with the table or partition will be deleted.
    • EXTERNAL_TABLE: external table. Use an external table when a file already exists or is located in a remote location.
    • VIRTUAL_VIEW: virtual view. It does not store actual data and does not occupy physical space.
    • MATERIALIZED_VIEW: materialized view. It stores actual data and occupies physical space.

    Parameter: Data Storage Location
    Example Value: obs://lakeformation-test/catalog1/database1/table1
    Description: OBS file directory the table is mapped to. Click the icon for selecting a location, select the OBS location where the table is stored, and click OK.
    • This parameter is optional. If it is not set, the table storage path defaults to <database storage path>/<table name>.
    • The location you specify must start with obs:// and must include a storage object. For example, select obs://lakeformation-test/catalog1/database1/table1. If there is no appropriate parallel file system available, click go to OBS and follow Step 2: Create an OBS Path for Storing Metadata to create one.
    • The path must differ from the storage paths of its associated catalog and database.
    • To prevent data conflicts, the path cannot be a metadata storage path that is being used by another LakeFormation instance.
    • If Data Table Storage Locations is set for the database the table belongs to, set this parameter to a subpath of Select Location or Data Table Storage Locations of the database.

    Parameter: Compress Data
    Example Value: Selected
    Description: Whether to compress the data table. A compressed table stores its data in a compressed format, which saves storage space and can improve performance.

    Parameter: Data Source Format
    Example Value: Parquet
    Description: Data source format of the table to be created.

    Parameter: Separator
    Example Value: -
    Description: This parameter is available and mandatory when Data Source Format is set to Csv.

    Parameter: Description
    Example Value: xxx
    Description: Description of the table to be created. The value can contain 0 to 4,000 bytes.

  5. After the table is created, you can check its information on the Table page.
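
Two rules from Table 4 lend themselves to a quick illustration: the default storage path when Data Storage Location is left empty, and the byte limit on the description. The Python sketch below uses hypothetical helper names and assumes the description is measured in UTF-8 bytes; the document does not state which encoding the service uses.

```python
def resolve_table_location(database_location, table_name, table_location=None):
    """Return the explicit location if given, else database path + table name."""
    if table_location:
        return table_location
    return database_location.rstrip("/") + "/" + table_name


def description_fits(description, max_bytes=4000):
    """Check the description against the 0 to 4,000 byte limit.

    Assumption: bytes are counted in UTF-8 encoding.
    """
    return len(description.encode("utf-8")) <= max_bytes


database_location = "obs://lakeformation-test/catalog1/database1"
print(resolve_table_location(database_location, "table_A"))
print(resolve_table_location(database_location, "table_A",
                             "obs://lakeformation-test/catalog1/database1/table1"))
print(description_fits("Example table for the getting-started guide"))
```

Counting bytes rather than characters matters for non-ASCII text, where one character can take several bytes.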