
Updated on 2025-10-28 GMT+08:00


DataArts Fabric SQL Features

Introduction to DataArts Fabric SQL

DataArts Fabric SQL is a fully managed data platform designed for elasticity and lakehouse workloads. It builds on Huawei Cloud infrastructure, which provides resource pooling and massive storage, and on an architecture that combines parallel execution, metadata decoupling, and compute-storage separation. The service is delivered as Software as a Service (SaaS). Because it is serverless, you can process complex business logic in SQL without managing any infrastructure.

Built on the Huawei Cloud DataArts Fabric platform, DataArts Fabric SQL consists of a service access layer, a computing layer, and a storage layer. This design decouples metadata services, computing, cache, and storage, and makes each of them elastic. Each layer can allocate resources dynamically without affecting the performance or availability of the others. Statement-level elastic scaling and high-performance distributed analysis engines allow TB-level queries to complete in seconds and PB-level queries in minutes.

DataArts Fabric SQL can process and analyze data in open structured formats such as Iceberg, ORC, and Parquet. Because it embraces the open lake ecosystem, data can be shared across multiple data lake services.

Embracing the Data+AI ecosystem, DataArts Fabric SQL offers a Python UDF feature. This allows users to execute Python scripts directly within SQL for one-stop AI data processing.
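As an illustration of the kind of logic a Python UDF could carry, the sketch below defines a plain Python scalar function that masks an email address. The function name `mask_email` and its behavior are hypothetical, chosen only for this example. The statement that registers a function as a UDF is not shown, because its syntax is not covered on this page.

```python
def mask_email(value):
    """Hypothetical scalar function: hide most of the local part of an email address."""
    # Pass None (SQL NULL) and values without "@" through unchanged.
    if value is None or "@" not in value:
        return value
    # Split on the first "@" only; everything after it is treated as the domain.
    local, _, domain = value.partition("@")
    if not local:
        return value
    # Keep the first character of the local part and replace the rest with asterisks.
    return local[0] + "*" * (len(local) - 1) + "@" + domain
```

Called row by row, a function like this would turn `alice@example.com` into `a****@example.com` and leave NULL values untouched.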

DataArts Fabric SQL provides a visualized interface and a JDBC driver for easy interaction with existing applications and third-party tools. Additionally, it offers REST and Python APIs, allowing developers to manage and transform data using familiar programming languages.

Product Architecture

Figure 1 DataArts Fabric SQL architecture

Functions

The following table describes the key functions of DataArts Fabric SQL.

**Table 1** DataArts Fabric SQL features
| Feature | Description |
| --- | --- |
| Standard SQL statements | Supports ANSI standard SQL, extended to include GBK, UTF-8, SQL ASCII, and Latin-1 character sets. |
| DDL | Supports CREATE, ALTER, DROP, SHOW, and DESCRIBE operations for schemas and tables. |
| Data types | Supports smallint, int, bigint, float, double, numeric, timestamp, date, varchar, char, bool, binary, and string. |
| APIs | Supports standard JDBC 4.0, RESTful APIs, and Python Connector APIs. |
| Transaction capabilities | Supports partition-level transactions and concurrency control. Iceberg offers full transaction capabilities. |
| Multi-tenant management | Tenants are isolated via dedicated CNs/DNs. Each CN/DN occupies an exclusive POD for isolation. |
| Importing and exporting data | Data import and export use INSERT INTO. Data export from foreign tables supports format conversion (Parquet <-> ORC). |
| Scalability | Supports two-level elasticity. Elastic compute units (nodes) scale within seconds (less than 2s) in a resource pool based on query characteristics. Additional compute resources scale based on resource usage. NOTE: This is a beta feature. Capacity expansion operations require the DataArts Fabric service to provision containers via the management plane. |
| SMP | Provides intra-node Symmetric Multi-Processing (SMP) for full utilization of multi-core CPUs. query_dop is enabled by default. |
| Vectorized execution | The vectorized execution engine enhances OLAP performance. |
| Statistics collection | ANALYZE collects statistics to improve optimizer accuracy and ensure stable, efficient database performance. |
| Storage formats | Supports Parquet, ORC, and Iceberg. |
| DML | Supports INSERT INTO and INSERT OVERWRITE. |
| Partitioned tables | Supports partitioned tables for Parquet, ORC, and Iceberg. |
| Views | Supports views. |
| User-defined functions (UDFs) | Extends SQL statements with user-defined functions for unified execution. Currently, only Python is supported. |
| Elastic computing scale | A single query supports up to 256 elastic compute nodes. |
| Fine-grained access control | Table metadata is managed by LakeFormation, utilizing IAM permissions. LakeFormation currently oversees overall permission control. |
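
The import and export entry in the table says that data moves between tables with INSERT INTO, including conversion between Parquet and ORC. A minimal sketch of how a client program might assemble such a statement is shown below. The helper `build_copy_sql`, the table names, and the identifier check are assumptions made for this example, and the generic `INSERT INTO ... SELECT` form should be checked against the product's SQL syntax reference before use.

```python
import re

# Conservative pattern for unquoted names such as "table" or "schema.table".
# This is an assumption for the sketch, not the engine's actual identifier rules.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_copy_sql(source_table, target_table):
    """Return an INSERT INTO ... SELECT statement that copies all rows."""
    # Reject anything that is not a plain identifier so that untrusted
    # input cannot inject extra SQL into the statement text.
    for name in (source_table, target_table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported table name: {name!r}")
    return f"INSERT INTO {target_table} SELECT * FROM {source_table}"


# Hypothetical tables: a Parquet-backed source and an ORC-backed target.
statement = build_copy_sql("lake.sales_parquet", "lake.sales_orc")
print(statement)
```

The resulting text could then be submitted through whichever interface is in use, for example the JDBC driver or the REST API. The format conversion itself comes from the storage formats of the two tables, not from the statement.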