Updated on 2024-02-02 GMT+08:00

Managing Catalogs

A data catalog is a metadata management object that can contain multiple databases.

Multiple catalogs can be created and managed in LakeFormation to isolate metadata of different external clusters.

Prerequisites

  • A LakeFormation instance has been created and is running properly.
  • Catalog data is stored in OBS, and you have obtained the permission to perform operations on OBS.
  • You have created an OBS parallel file system for storing catalog data by referring to Creating a Metadata Storage Path.

Creating a Catalog

  1. Log in to the management console.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. From the drop-down list on the left, select the LakeFormation instance you want to manage, and choose Metadata > Catalog in the navigation pane.
  4. Click Create and set the parameters.

    1. In the Basic Information area, set the related parameters.
      Table 1 Parameters for creating a catalog

      Parameter        | Description
      -----------------+------------------------------------------------------------
      Catalog Name     | Enter a catalog name. The value must contain 1 to 256 characters. Only letters, numbers, and underscores (_) are allowed.
      Catalog Type     | Select DEFAULT or CLICKHOUSE.
      Select Location  | Select the location in the OBS parallel file system where catalog data is stored. Click the icon for selecting a path, select a location, and click OK. The selected location must start with obs:// and must contain one storage object, for example, obs://lakeformation-test/catalog1. If no suitable parallel file system is available, click go to OBS and create one. You are advised to select a folder that is not used by any other catalog.
      Description      | Enter a description of the catalog. The description can contain 0 to 4000 bytes. Each Chinese character occupies 3 bytes.

    2. (Optional) Click Add under Database Storage Locations. On the Add Database Storage Location page, click the icon for selecting a path to manually select a database storage path as required, and then click OK.

      Click the add icon to add more storage paths, or click the delete icon to delete a storage path.

      Adding a database storage path is optional. If one is added, the databases in this catalog must be stored in a subpath of either the database storage path or the selected catalog location.

    3. Click Submit.

  5. After the catalog is created, you can view the catalog information on the Catalog page.

    Click Modify in the Operation column to modify the configurations of a catalog.

    Click Database in the Operation column to view the databases in a catalog.

    Click More to authorize or delete a catalog, or view the permissions of a catalog.

    If files are deleted when metadata is deleted, the system processes the files based on the deletion policy that you have configured. For details about how to configure a deletion policy, see Configuring Metadata Lifecycle. If no deletion policy is configured, data is moved to the recycle bin (OBS path lake-formation-trash-dir/table_id) by default.
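
The input rules in Table 1 and the database storage location constraint can be checked locally before you fill in the console form. The following Python sketch is illustrative only: the function names are hypothetical and are not part of any LakeFormation SDK. It assumes that "letters" means ASCII letters, that "one storage object" means at least one path segment after the bucket name, that byte length is measured in UTF-8 (consistent with 3 bytes per Chinese character), and that a database path must be a strict subpath of an allowed location.

```python
import re

# Hypothetical helpers; the rules mirror Table 1, the names are not a LakeFormation API.
_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,256}")  # assumption: ASCII letters only


def is_valid_catalog_name(name: str) -> bool:
    """1 to 256 characters; only letters, digits, and underscores."""
    return _NAME_RE.fullmatch(name) is not None


def is_valid_location(location: str) -> bool:
    """Must start with obs:// and name a bucket plus at least one path segment (assumption)."""
    if not location.startswith("obs://"):
        return False
    parts = [p for p in location[len("obs://"):].split("/") if p]
    return len(parts) >= 2


def is_valid_description(text: str) -> bool:
    """0 to 4000 bytes, measured in UTF-8 (assumption)."""
    return len(text.encode("utf-8")) <= 4000


def is_under(path: str, parent: str) -> bool:
    """True if path is a strict subpath of parent."""
    parent = parent.rstrip("/") + "/"
    return path.startswith(parent) and len(path) > len(parent)


def database_path_allowed(db_path, catalog_location, db_storage_locations):
    """A database path must sit under the catalog location or an added storage location."""
    allowed = [catalog_location, *db_storage_locations]
    return any(is_under(db_path, parent) for parent in allowed)
```

For example, with the catalog location obs://lakeformation-test/catalog1 from Table 1, `database_path_allowed("obs://lakeformation-test/catalog1/db1", "obs://lakeformation-test/catalog1", [])` evaluates to `True`, while a path under obs://lakeformation-test/catalog10 is rejected because the check compares whole path segments rather than string prefixes.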