Updated on 2025-08-25 GMT+08:00

Overview

Scenario

This section introduces a pandas-like Python DataFrame SDK that lets you write data processing jobs in Python. Backed by the computing power of the DataArts Fabric SQL kernel, it gives data scientists and AI engineers easy-to-use, efficient data processing functions.

This feature is built on the open-source Ibis Python DataFrame framework and integrates the Ibis frontend with the DataArts Fabric SQL engine. You write data processing scripts using the familiar Ibis DataFrame API, and Ibis translates those Python calls into SQL statements that the DataArts Fabric SQL engine can execute, so the computation runs efficiently inside the engine.

What Is DataFrame?

A DataFrame is a two-dimensional tabular data structure akin to an Excel spreadsheet or a table in a relational database, supporting both row and column labels.

Its key features include:

  • Tabular structure: Data is organized in rows and columns, and each column can hold a different data type, such as integers, strings, or floats.
  • Indexing support: Both rows and columns typically come with indexes, facilitating easier data selection, manipulation, and analysis.
  • Comprehensive functionality: Supports operations such as data cleansing, transformation, aggregation, and merging, making it an essential tool for data analysis and scientific computing.
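
The features listed above can be illustrated with a short pandas sketch. pandas is used here only because it is the reference point for the "pandas-like" API; the data, column names, and row labels are made up for the example, and the SDK described in this section may expose these operations differently.

```python
import pandas as pd

# Tabular structure: each column has its own data type.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Oslo"],   # strings
        "temp_c": [4.5, 19.0, None],        # floats, with one missing value
        "visits": [3, 7, 2],                # integers
    },
    index=["a", "b", "c"],                  # row labels
)

# Indexing support: select a single value by row label and column label.
lima_temp = df.loc["b", "temp_c"]

# Data cleansing: drop rows where the temperature is missing.
clean = df.dropna(subset=["temp_c"])

# Transformation: derive a new column from an existing one.
clean = clean.assign(temp_f=clean["temp_c"] * 9 / 5 + 32)

# Aggregation: total visits per city.
visits_by_city = df.groupby("city")["visits"].sum()

# Merging: attach a region to each row from a second table.
regions = pd.DataFrame(
    {"city": ["Oslo", "Lima"], "region": ["Europe", "South America"]}
)
merged = df.merge(regions, on="city", how="left")
```

Each step returns a new DataFrame or Series, so operations can be chained in the same style that the Ibis DataFrame API uses.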