Updated on 2024-02-02 GMT+08:00

Metadata

Data Catalogs

A data catalog is the top-level metadata resource of a LakeFormation instance. Multiple catalogs can be created in one instance. Each catalog includes metadata such as its name, description, and location. Catalogs can be created, modified, and deleted.

Location indicates the file directory of the OBS parallel file system mapped to the catalog.

Databases

Databases are stored in the data catalogs of a LakeFormation instance, and multiple databases can be created under a catalog. Each database includes metadata such as its name, description, and location. You can create, modify, and delete databases, as well as grant and check database permissions.

Location indicates the file directory of the OBS parallel file system mapped to the database.

Tables

You can create multiple tables in a database. Each table includes metadata such as basic information, format and serialization information, fields, and attributes. You can create, modify, and delete tables, as well as grant and check table permissions.
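
The catalog, database, and table levels described above nest inside one another. The following is a minimal Python sketch of that hierarchy, not LakeFormation API code: the class names, field names, and the obs://example-bucket paths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical in-memory model of the metadata hierarchy.
# None of these names come from a LakeFormation SDK.


@dataclass
class Table:
    name: str
    columns: List[str] = field(default_factory=list)


@dataclass
class Database:
    name: str
    description: str = ""
    location: str = ""  # placeholder for an OBS parallel file system directory
    tables: Dict[str, Table] = field(default_factory=dict)


@dataclass
class Catalog:
    name: str
    description: str = ""
    location: str = ""  # placeholder for an OBS parallel file system directory
    databases: Dict[str, Database] = field(default_factory=dict)


# Build one catalog containing one database containing one table.
catalog = Catalog(name="sales_catalog", location="obs://example-bucket/sales")
db = Database(name="orders_db", location="obs://example-bucket/sales/orders_db")
db.tables["orders"] = Table(name="orders", columns=["order_id", "amount", "dt"])
catalog.databases[db.name] = db

# Resolve a table through the three levels: catalog -> database -> table.
table = catalog.databases["orders_db"].tables["orders"]
print(table.columns)
```

In this sketch the database location sits under the catalog location. That layout is an assumption made for the example, not a rule stated here.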

Functions

Functions perform specific processing on data in SQL queries. They include built-in functions and user-defined functions (UDFs).

User-defined functions are classified into the following types:

  • Common UDFs: operate on a single data row and output a single data row.
  • User-defined aggregating functions (UDAFs): take multiple data rows as input and output a single data row.
  • User-defined table-generating functions (UDTFs): operate on a single data row and output multiple data rows.
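
The three types differ only in how many rows go in and how many come out. The plain Python sketch below illustrates those shapes. It does not show how functions are registered in LakeFormation or in any SQL engine, and the function names and sample rows are hypothetical.

```python
from typing import Iterable, Iterator


def to_upper(name: str) -> str:
    """UDF shape: one input row produces one output row."""
    return name.upper()


def total(amounts: Iterable[float]) -> float:
    """UDAF shape: many input rows produce one output row."""
    return sum(amounts)


def split_tags(tags: str) -> Iterator[str]:
    """UDTF shape: one input row produces many output rows."""
    for tag in tags.split(","):
        yield tag.strip()


# Sample rows: (name, amount, comma-separated tags).
rows = [("alice", 10.0, "a, b"), ("bob", 5.5, "c")]

upper_names = [to_upper(r[0]) for r in rows]            # 2 rows in, 2 rows out
grand_total = total(r[1] for r in rows)                 # 2 rows in, 1 value out
exploded = [t for r in rows for t in split_tags(r[2])]  # 2 rows in, 3 rows out

print(upper_names, grand_total, exploded)
```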

Partitions

Partitioning splits a data table by row so that specific SQL operations read and write less data, which shortens the response time.
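
The effect can be sketched in a few lines of Python. The example assumes a hypothetical orders table partitioned by a dt column. It models only the idea of reading one partition instead of the whole table, not how any particular engine stores or prunes partitions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical table contents; "dt" is the assumed partition column.
rows: List[Row] = [
    {"order_id": 1, "amount": 10.0, "dt": "2024-01-01"},
    {"order_id": 2, "amount": 7.5, "dt": "2024-01-01"},
    {"order_id": 3, "amount": 3.0, "dt": "2024-01-02"},
]


def partition_by(table: List[Row], key: str) -> Dict[object, List[Row]]:
    """Group rows by the value of the partition column."""
    parts: Dict[object, List[Row]] = defaultdict(list)
    for row in table:
        parts[row[key]].append(row)
    return dict(parts)


def read_partition(
    partitions: Dict[object, List[Row]], value: object
) -> Tuple[List[Row], int]:
    """Return the rows of one partition and the number of rows scanned."""
    matched = partitions.get(value, [])
    return matched, len(matched)


partitions = partition_by(rows, "dt")
matched, scanned = read_partition(partitions, "2024-01-01")

# Without partitioning, a filter on dt would scan every row in the table.
print(f"scanned {scanned} of {len(rows)} rows")
```

A query that filters on the partition column touches only the matching group, which is where the reduction in reads comes from.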