Updated on 2025-01-09 GMT+08:00

DLI Datasource V1 Table and Datasource V2 Table

What Are DLI Datasource V1 and V2 Tables?

  • DLI datasource V1 table (referred to as V1 table): A DLI-specific datasource table format. It uses DLI's custom create/insert/truncate commands, and the table's data is stored at $tablepath/UUID/Data file.
    Figure 1 DLI datasource v1 table
  • DLI datasource V2 table (referred to as V2 table): The open-source datasource table format of Spark. It uses Spark's open-source create/insert/truncate commands, and the table's data is stored at $tablepath/Data file.
    Figure 2 DLI datasource v2 table
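
To make the statement families named above concrete, the following is a minimal sketch in standard Spark SQL, the syntax that applies to V2 tables. The table name and the inserted value are placeholders chosen for illustration. DLI's V1-specific behavior is not shown, and which statements a given queue accepts depends on the compatibility tables on this page.

```sql
-- Create a datasource table (Spark open-source syntax).
CREATE TABLE IF NOT EXISTS table_name (id STRING) USING parquet;

-- Append one row; the value is illustrative only.
INSERT INTO table_name VALUES ('1');

-- Remove all rows while keeping the table definition.
TRUNCATE TABLE table_name;
```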

Compatibility of DLI Spark Versions with V1 and V2 Tables

Table 1 Compatibility of DLI Spark versions with v1 and v2 tables

Table Type | Spark 2.3 SQL Queue | Spark 2.3 General-Purpose Queue | Spark 2.4 SQL Queue | Spark 2.4 General-Purpose Queue | Spark 3.1 SQL Queue | Spark 3.1 General-Purpose Queue | Spark 3.3 SQL Queue | Spark 3.3 General-Purpose Queue

V1 table

Partially supported

V2 table

×

×

×

×

Table 2 Syntax support list for Spark 3.3 general-purpose queues

Table Type | select | create table | create table like | CTAS | insert into | insert overwrite | load data | alter table set location | truncate table

V1 table

×

×

×

×

×

×

V2 table

How Do I Check Whether a User-Created Table Is a V1 or V2 Table?

1. Use the datasource syntax to create a table:

CREATE TABLE IF NOT EXISTS table_name (id STRING) USING parquet;

2. Run SHOW CREATE TABLE on the table and check the value of the version field under TBLPROPERTIES.

If the value is v1, the table is a V1 table; if the value is v2, it is a V2 table.
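
As a sketch of step 2, the statement below prints the definition of the table created in step 1. Here table_name is the placeholder from step 1. The exact layout of the output can vary with the Spark version, so look for the version key inside the TBLPROPERTIES clause rather than at a fixed position.

```sql
-- Print the full table definition, including the TBLPROPERTIES clause.
SHOW CREATE TABLE table_name;
```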

To change a V1 table to a V2 table, submit a service ticket to contact customer support.

Example Upgrade

Upgrading the Spark engine and modifying data tables involves creating a queue. Your billed resource costs may change if the new queue uses a different type of compute resource than the original queue:

  • If the original queue uses compute resources of the elastic resource pool type, creating a queue does not involve changes in the cost of compute resources.
  • If the original queue uses compute resources of a non-elastic resource pool type, creating a queue within an elastic resource pool will change the cost of compute resources. Refer to the price details of compute resources for specifics.
  • Example 1: Does upgrading Spark from version 2.4.x to Spark 3.3.1 affect the version of data tables when using a SQL queue?

    No. SQL queues in Spark 2.4.x support both V1 and V2 tables, so when upgrading you only need to consider the compatibility of the Spark version with SQL syntax.

  • Example 2: Does upgrading Spark from version 2.4.x to Spark 3.3.1 affect the version of data tables when using a general-purpose queue?

    General-purpose queues in Spark 2.4.x support both V1 and V2 tables, but general-purpose queues in Spark 3.3.x only partially support V1 tables (see Table 2).

    Therefore, to upgrade Spark from version 2.4.x to 3.3.1, follow these steps:

    1. Change V1 tables in Spark 2.4.x to V2 tables.
    2. Upgrade V2 tables in Spark 2.4.x to V2 tables in Spark 3.3.1.
      Consider the compatibility of Spark Jar job API syntax as well.
      Table 3 Compatibility of DLI Spark versions with v1 and v2 tables

      Table Type | Spark 2.4 General-Purpose Queue | Spark 3.3 General-Purpose Queue
      V1 table   | √                               | Partially supported
      V2 table   | √                               | √

  • Example 3: How do I upgrade V1 tables in Spark 2.3.2 to V2 tables in Spark 3.3.1 using a general-purpose queue?

    General-purpose queues in Spark 2.3.2 do not support V2 tables, and general-purpose queues in Spark 3.3.1 only partially support V1 tables (see Table 2). Upgrade through Spark 2.4.5 as an intermediate step:

    1. Upgrade V1 tables in Spark 2.3.2 to V1 tables in Spark 2.4.5.
    2. Change V1 tables in Spark 2.4.5 to V2 tables.
    3. Upgrade V2 tables in Spark 2.4.5 to V2 tables in Spark 3.3.1.

      Consider the compatibility of Spark Jar job API syntax as well.

    Table 4 Compatibility of DLI Spark versions with v1 and v2 tables

    Table Type | Spark 2.3 General-Purpose Queue | Spark 2.4 General-Purpose Queue | Spark 3.3 General-Purpose Queue
    V1 table   | √                               | √                               | Partially supported
    V2 table   | ×                               | √                               | √