Updated on 2023-12-14 GMT+08:00

What's New in Spark 2.4.5

DLI stays consistent with the releases of the open source Spark compute engine. This document describes the updates in Spark 2.4.5.

For more information about Spark 2.4.5, see Spark Release Notes.

Spark 2.4.5 Release Date

| Version | Release Date | Status | EOM Date | EOS Date |
| --- | --- | --- | --- | --- |
| DLI Spark 2.4.5 | December 2021 | Released | December 31, 2023 | December 31, 2024 |

For more version support information, see Lifecycle of DLI Compute Engine Versions.

Spark 2.4.5 Description

Table 1 lists the main features of Spark 2.4.5.

For more new features, see Release Notes - Spark 2.4.5.

Table 1 Main features of Spark 2.4.5

| Feature | Description |
| --- | --- |
| Merging small files | If SQL execution generates a large number of small files, job execution and table queries take a long time. In this case, you are advised to merge the small files. For details, see How Do I Merge Small Files? |
| Modifying column comments of non-partitioned or partitioned tables | You can modify the column comments of non-partitioned or partitioned tables. |
| Collecting statistics on the CPU usage of SQL jobs | You can view the total CPU usage of SQL jobs on the console. |
| Viewing Spark logs of container clusters | Spark logs of container clusters are viewed in the container. |
| Dynamic UDF loading (OBT) | A UDF takes effect without restarting the queue. |
| Supporting flame graphs on the Spark UI | Flame graphs can be created on the Spark UI. |
| Optimizing the query performance of the NOT IN statement for SQL jobs | The query performance of the NOT IN statement is improved. |
| Optimizing the query performance of the Multi-INSERT statement | The query performance of the Multi-INSERT statement is improved. |
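
The small-file merge listed in Table 1 can be illustrated with a minimal sketch. It is not taken from this document: it assumes the common Spark SQL approach of rewriting a table with DISTRIBUTE BY over a random bucket, so that rows are shuffled into a bounded number of output files. The helper names (target_file_count, build_merge_statement), the table name demo_db.events, and the 128 MB target size are hypothetical and chosen for illustration only. The procedure described in How Do I Merge Small Files? takes precedence.

```python
import math


def target_file_count(total_bytes: int, target_file_bytes: int) -> int:
    """Return how many files are needed so that each holds about target_file_bytes.

    Hypothetical helper: the sizes are supplied by the caller, not read from DLI.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_file_bytes must be > 0")
    # Always write at least one file, even for an empty table.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def build_merge_statement(table: str, file_count: int) -> str:
    """Build a Spark SQL statement that rewrites `table` into at most file_count buckets.

    Assumption: the table may be overwritten while it is being read, which
    depends on the table type and must be verified before running the statement.
    """
    if file_count < 1:
        raise ValueError("file_count must be >= 1")
    # floor(rand() * N) assigns each row a random bucket in 0..N-1, so the
    # shuffle produces at most N non-empty partitions, hence at most N files.
    return (
        f"INSERT OVERWRITE TABLE {table} "
        f"SELECT * FROM {table} "
        f"DISTRIBUTE BY floor(rand() * {file_count})"
    )


if __name__ == "__main__":
    # Example: about 1000 MB of data, aiming for files of about 128 MB each.
    count = target_file_count(1000 * 1024**2, 128 * 1024**2)
    print(build_merge_statement("demo_db.events", count))
```

The number of files actually written can be lower than the computed count, because it is also bounded by the shuffle partition setting (spark.sql.shuffle.partitions) and several buckets can hash to the same partition. Treat the result as an upper bound rather than an exact figure.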