Updated on 2024-05-29 GMT+08:00

Multi-Catalog

The multi-source data catalog (Multi-Catalog) makes it easier to connect Doris to external data catalogs, which enhances Doris' data lake analysis and federated query capabilities.

The Multi-Catalog function adds a catalog layer above the original metadata layers, forming a three-level metadata hierarchy: Catalog -> Database -> Table. A catalog can map directly to an external data catalog.
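As a rough illustration of the three-level hierarchy, a table can be addressed by a fully qualified catalog.database.table name. This is only a sketch: hive_catalog is the example catalog name used later on this page, while sales_db, demo_db, orders, and local_orders are hypothetical names that do not come from this document.

```sql
-- Hypothetical names: sales_db, demo_db, orders, local_orders.
-- Table in an External Catalog, addressed as catalog.database.table.
SELECT COUNT(*) FROM hive_catalog.sales_db.orders;

-- Table in the built-in Internal Catalog, addressed the same way.
SELECT COUNT(*) FROM internal.demo_db.local_orders;
```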

Basic Concepts

  • Internal Catalog

    The existing databases and tables in Doris belong to the Internal Catalog. The Internal Catalog is the built-in default catalog and cannot be modified or deleted by users.

  • External Catalog

    You can run the CREATE CATALOG command to create an External Catalog and the SHOW CATALOGS command to view existing catalogs.

  • Switching Catalogs

    After logging in, you are in the Internal Catalog by default, so default usage is the same as in earlier versions. You can run SHOW DATABASES to view the databases and USE DB to switch to your target database.

    You can run the SWITCH command to switch the catalog. For example:

    SWITCH internal;
    SWITCH hive_catalog;

    After switching, you can run the SHOW DATABASES and USE DB commands to view and switch databases in the corresponding catalog. Doris automatically synchronizes the databases and tables in the catalog. You can view and access data in an External Catalog the same way as in the Internal Catalog.

    Currently, Doris supports only read-only access to data in External Catalogs.

  • Deleting Catalogs

    Databases and tables in an External Catalog are read-only, but the catalog itself can be deleted (the Internal Catalog cannot). You can run the DROP CATALOG command to delete an External Catalog.

    This operation deletes only the mapping information of the catalog in Doris. It does not change the content of any external data catalog.

  • Resource

    A resource is a set of configurations. You can run the CREATE RESOURCE command to create a resource and then reference that resource when creating a catalog.

    Multiple catalogs can use the same resource, so the configuration it holds is written once and reused.
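The concepts above can be sketched as one end-to-end session. This is a minimal sketch, not a definitive procedure: it assumes a Hive Metastore data source, and the resource name hms_resource, the database name sales_db, the table name orders, and the metastore address are all hypothetical placeholders. The property keys follow the Hive Metastore example in the open-source Doris documentation; the keys your deployment requires may differ.

```sql
-- 1. Create a reusable resource (hypothetical name and metastore address).
CREATE RESOURCE hms_resource PROPERTIES (
    'type' = 'hms',
    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'
);

-- 2. Create an External Catalog that reuses the resource's configuration.
CREATE CATALOG hive_catalog WITH RESOURCE hms_resource;

-- 3. List all catalogs; the built-in internal catalog is always present.
SHOW CATALOGS;

-- 4. Switch to the External Catalog and browse it like the Internal Catalog.
SWITCH hive_catalog;
SHOW DATABASES;
USE sales_db;                   -- hypothetical database
SELECT * FROM orders LIMIT 10;  -- hypothetical table; access is read-only

-- 5. Return to the Internal Catalog.
SWITCH internal;

-- 6. Remove only the catalog mapping in Doris; the external data is untouched.
DROP CATALOG hive_catalog;
```

Because the connection settings live in the resource, a second catalog could be created from hms_resource without repeating them.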