Catalog Introduction
The Iceberg Catalog is the top-level component in Iceberg. It manages the metadata of all Iceberg tables and the operations on that metadata. The Catalog governs table structure and metadata and provides the interfaces for creating, querying, and modifying tables, so it is the entry point through which users work with Iceberg tables. Through the Catalog, users can find the location of the current metadata file for each table, which makes it essential for both reading from and writing to Iceberg tables.
The current DataArts Fabric SQL version supports using Hadoop Catalog as the Catalog component for Iceberg tables.
Hadoop Catalog
Hadoop Catalog operates independently of external systems and can utilize any file system, recording the metadata file paths of tables within a specific directory.
As Hadoop supports decoupled storage and compute, the underlying data files may reside on HDFS or an object storage system like OBS.
To locate a table in Hadoop Catalog, you only need to specify the table path, because all of the table's metadata is stored in files under that path.
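As a rough illustration of path-based lookup, the sketch below resolves a table's current metadata file from nothing but the table directory. It assumes the layout used by the open-source Apache Iceberg Hadoop catalog, in which `metadata/version-hint.text` holds the current version number N and the metadata file is named `vN.metadata.json`. The helper names `current_metadata_file` and `build_demo_table`, the demo table path, and the file contents are hypothetical, and the behavior in DataArts Fabric SQL may differ in detail.

```python
import json
import tempfile
from pathlib import Path


def current_metadata_file(table_path: Path) -> Path:
    """Resolve the current metadata file of a table from its path alone."""
    metadata_dir = table_path / "metadata"
    # Assumed layout: version-hint.text stores the current version number.
    version = int((metadata_dir / "version-hint.text").read_text().strip())
    return metadata_dir / f"v{version}.metadata.json"


def build_demo_table(root: Path) -> Path:
    """Create a fake table directory with two metadata versions (demo only)."""
    table_path = root / "db" / "orders"
    metadata_dir = table_path / "metadata"
    metadata_dir.mkdir(parents=True)
    for version in (1, 2):
        content = {"format-version": 2, "current-snapshot-id": 100 + version}
        (metadata_dir / f"v{version}.metadata.json").write_text(json.dumps(content))
    # The hint points at the newest version, so readers skip older files.
    (metadata_dir / "version-hint.text").write_text("2")
    return table_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        table = build_demo_table(Path(tmp))
        latest = current_metadata_file(table)
        print(latest.name, json.loads(latest.read_text())["current-snapshot-id"])
```

No external service is consulted here: the directory itself acts as the catalog, which is why a Hadoop Catalog table can be located by path alone.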
LakeFormation Catalog
LakeFormation Catalog relies on the LakeFormation metadata service to manage the latest snapshot of each table.
It uses an optimistic concurrency control (OCC) mechanism to maintain data integrity during concurrent writes across multiple tenants.
All Iceberg tables currently created on DataArts Fabric SQL are LakeFormation Catalog tables.
The following figure outlines the concurrency handling process for LakeFormation Catalog commits.
1. Read the current table snapshot information from LakeFormation, including the latest metadata file path and snapshot ID.
2. Write new data based on the current snapshot.
3. Load the latest snapshot.
4. Detect data conflicts. If a conflict occurs, the statement execution fails. Otherwise, attempt to commit the transaction.
5. Write metadata, including manifest files, manifest list, and metadata file.
6. Submit the latest metadata file path and snapshot ID to LakeFormation. If the submission fails, retry from step 3.
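The commit steps above can be sketched as an optimistic retry loop. The following is a minimal in-memory simulation, not the LakeFormation API: `InMemoryCatalog`, `Pointer`, `commit`, the `before_submit` test hook, the retry limit, sequential snapshot IDs, and partition-level conflict detection are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pointer:
    """What the catalog stores per table: metadata file path and snapshot ID."""
    metadata_path: str
    snapshot_id: int


class ConflictError(Exception):
    """Raised when a concurrent commit touched the same data (step 4)."""


@dataclass
class InMemoryCatalog:
    """Hypothetical stand-in for the metadata service."""
    pointer: Pointer
    touched: dict = field(default_factory=dict)  # snapshot_id -> partitions changed

    def load(self) -> Pointer:
        return self.pointer

    def compare_and_swap(self, expected: Pointer, new: Pointer, partitions: set) -> bool:
        # Succeeds only if nobody committed since `expected` was loaded.
        if self.pointer != expected:
            return False
        self.pointer = new
        self.touched[new.snapshot_id] = set(partitions)
        return True

    def changed_since(self, base: Pointer, latest: Pointer) -> set:
        changed = set()
        for sid, parts in self.touched.items():
            if base.snapshot_id < sid <= latest.snapshot_id:
                changed |= parts
        return changed


def commit(catalog, partitions, max_retries=3, before_submit=None):
    base = catalog.load()  # step 1: read current snapshot info
    # step 2: write new data files based on `base` (omitted in this sketch)
    for attempt in range(max_retries + 1):
        latest = catalog.load()  # step 3: load the latest snapshot
        if catalog.changed_since(base, latest) & partitions:  # step 4
            raise ConflictError("concurrent commit modified the same partitions")
        new_id = latest.snapshot_id + 1
        new = Pointer(f"metadata/v{new_id}.metadata.json", new_id)  # step 5
        if before_submit is not None:
            before_submit(attempt)  # hook used only to simulate a race
        if catalog.compare_and_swap(latest, new, partitions):  # step 6
            return new
        # Submission failed: loop back to step 3.
    raise RuntimeError("commit retries exhausted")


def demo(rival_partition: str) -> Pointer:
    catalog = InMemoryCatalog(Pointer("metadata/v1.metadata.json", 1))

    def rival_commit(attempt):
        # Another writer commits just before our first submission.
        if attempt == 0:
            cur = catalog.load()
            nxt = Pointer("metadata/v2.metadata.json", cur.snapshot_id + 1)
            catalog.compare_and_swap(cur, nxt, {rival_partition})

    return commit(catalog, {"2024-01-01"}, before_submit=rival_commit)


if __name__ == "__main__":
    print(demo("2024-01-02"))
```

In this sketch, a rival commit to a different partition only forces a retry from step 3, after which the commit succeeds. Calling `demo("2024-01-01")` instead makes the rival touch the same partition, so conflict detection raises `ConflictError` and the statement fails rather than retrying.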