Updated on 2025-07-08 GMT+08:00

DataArtsFabric SQL Features

Introduction to DataArtsFabric SQL

DataArtsFabric SQL is a fully managed data platform designed for elasticity and lakehouse workloads. It runs on Huawei Cloud infrastructure, which provides resource pooling and massive storage, and it is delivered as Software as a Service (SaaS). Its architecture combines parallel execution, metadata decoupling, and compute-storage separation. Because the service is serverless, you can process complex business logic in SQL without managing infrastructure.

Built upon the Huawei Cloud DataArtsFabric platform, DataArtsFabric SQL's architecture comprises a service access layer, a computing layer, and a storage layer. This design decouples metadata services, computing, cache, and storage into separate tiers, each of which scales elastically. Each layer can dynamically allocate resources without impacting the performance or availability of the others. Statement-level elastic scaling and high-performance distributed analysis engines allow TB-level data queries to complete in seconds and PB-level queries in minutes.
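To put the seconds-level figure for TB-level queries in perspective, the following back-of-the-envelope sketch computes the scan throughput such a query implies. The 10-second duration is an illustrative assumption, not a published figure; the 256-node count is the per-query maximum listed in Table 1. The function name is hypothetical.

```python
from typing import Tuple

TB = 10**12  # bytes in one decimal terabyte


def required_throughput(bytes_scanned: int, seconds: float, nodes: int) -> Tuple[float, float]:
    """Return (aggregate, per-node) scan throughput in bytes per second."""
    aggregate = bytes_scanned / seconds
    return aggregate, aggregate / nodes


# Assumption: a full 1 TB scan finishing in 10 seconds on 256 compute nodes.
agg, per_node = required_throughput(TB, 10.0, 256)
print(f"aggregate: {agg / 1e9:.0f} GB/s, per node: {per_node / 1e9:.2f} GB/s")
```

Under these assumptions each node would need to scan roughly 0.39 GB/s. Real queries typically read far less than the full data set thanks to partition pruning and columnar formats, so this is an upper-bound style estimate.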

DataArtsFabric SQL can process and analyze data in open structured formats such as Iceberg, ORC, and Parquet. Because it builds on the open lake ecosystem, data can be shared seamlessly across multiple data lake ecosystem services.

Embracing the Data+AI ecosystem, DataArtsFabric SQL offers a Python UDF feature. This allows users to execute Python scripts directly within SQL for one-stop AI data processing.
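As a rough illustration of the kind of logic a Python UDF might carry, the sketch below defines a plain Python function that cleans a text value one row at a time. The function name and the row-at-a-time calling convention are assumptions made for illustration. How such a function is registered and invoked from SQL is not described on this page, so that step is deliberately left out.

```python
from typing import Optional


def normalize_label(raw: Optional[str]) -> Optional[str]:
    """Hypothetical scalar UDF body: trim, collapse whitespace, lowercase."""
    if raw is None:  # assumption: SQL NULL arrives as None
        return None
    return " ".join(raw.split()).lower()


# Simulate applying the function to a column of values.
rows = ["  Hello   World ", None, "DATA\tLake"]
cleaned = [normalize_label(r) for r in rows]
print(cleaned)
```

Keeping the function pure (no I/O, no global state) makes it easy to test locally before it is wired into a query.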

DataArtsFabric SQL provides a visualized interface and a JDBC driver for easy interaction with existing applications and third-party tools. Additionally, it offers REST and Python APIs, allowing developers to manage and transform data using familiar programming languages.

Product Architecture

Figure 1 DataArtsFabric SQL architecture

Functions

The following table describes the key functions of DataArtsFabric SQL.

Table 1 DataArtsFabric SQL features

| Feature | Description |
| --- | --- |
| Standard SQL statements | Supports ANSI standard SQL, extended to include GBK, UTF-8, SQL ASCII, and Latin-1 character sets. |
| DDL | Supports CREATE, ALTER, DROP, SHOW, and DESCRIBE operations for schemas and tables. |
| Data types | Supports smallint, int, bigint, float, double, numeric, timestamp, date, varchar, char, bool, binary, and string. |
| APIs | Supports standard JDBC 4.0, RESTful APIs, and Python Connector APIs. |
| Transaction capabilities | Supports partition-level transactions and concurrency control. Iceberg offers full transaction capabilities. |
| Multi-tenant management | Tenants are isolated via dedicated CNs/DNs. Each CN/DN occupies an exclusive POD for isolation. |
| Importing and exporting data | Data import and export use INSERT INTO. Data export from foreign tables supports format conversion (Parquet <-> ORC). |
| Scalability | Supports two-level elasticity. Elastic compute units (nodes) scale within seconds (less than 2s) in a resource pool based on query characteristics. Additional compute resources scale based on resource usage. NOTE: This is a beta feature. Capacity expansion operations require the DataArtsFabric service to provision containers via the management plane. |
| SMP | Provides intra-node Symmetric Multi-Processing (SMP) for full utilization of multi-core CPUs. query_dop is enabled by default. |
| Vectorized execution | The vectorized execution engine enhances OLAP performance. |
| Statistics collection | ANALYZE collects statistics to improve optimizer accuracy and ensure stable, efficient database performance. |
| Storage formats | Supports Parquet, ORC, and Iceberg. |
| DML | Supports INSERT INTO and INSERT OVERWRITE. |
| Partitioned tables | Supports partitioned tables for Parquet, ORC, and Iceberg. |
| Views | Supports views. |
| User-defined functions (UDFs) | Extends SQL statements with user-defined functions for unified execution. Currently, only Python is supported. |
| Elastic computing scale | A single query supports up to 256 elastic compute nodes. |
| Fine-grained access control | Table metadata is managed by LakeFormation, utilizing IAM permissions. LakeFormation currently oversees overall permission control. |
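To show how several of the listed statement types might fit together in one load cycle, the sketch below assembles a sequence of SQL strings in Python. It is not taken from the DataArtsFabric SQL syntax reference: the schema, table, and column names are hypothetical, the statements use generic SQL forms, and clauses for storage format or partitioning are omitted because this page does not specify them. In particular, the exact form of INSERT OVERWRITE is an assumption and should be checked against the syntax reference.

```python
from typing import List


def build_statements(schema: str, table: str, source: str) -> List[str]:
    """Return illustrative statements for one create-load-analyze cycle."""
    target = f"{schema}.{table}"
    columns = "id, name, created"
    return [
        # DDL: column types are drawn from the supported data types in Table 1.
        f"CREATE TABLE {target} (id bigint, name varchar(64), created date)",
        f"DESCRIBE {target}",
        # DML: append rows from a source table.
        f"INSERT INTO {target} SELECT {columns} FROM {source}",
        # Assumption: the exact INSERT OVERWRITE form may differ in the service.
        f"INSERT OVERWRITE INTO {target} SELECT {columns} FROM {source}",
        # Refresh optimizer statistics after loading.
        f"ANALYZE {target}",
        f"DROP TABLE {target}",
    ]


statements = build_statements("demo_schema", "events", "demo_schema.staging_events")
for sql in statements:
    print(sql + ";")
```

The resulting strings could be submitted through any of the listed access paths (JDBC, REST, or the Python connector); the submission call itself is omitted because its interface is not covered here.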