How Do I Merge Small Files?

Updated on 2023-03-21 GMT+08:00

If SQL execution generates a large number of small files, subsequent jobs and table queries take longer to run. In that case, merge the small files as follows.

  1. Set the following configuration item:

    spark.sql.shuffle.partitions = Number of partitions (in this case, the number of files generated by the merge)

  2. Execute the following SQL statement:

    INSERT OVERWRITE TABLE tablename
    SELECT * FROM tablename DISTRIBUTE BY rand()
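
The two steps above can be sketched as a small helper that picks a partition count and renders the statements to run. This is only an illustration: the function name `merge_statements`, the table name `demo_table`, and the 128 MB target file size are assumptions, not values from this page. Whether the setting can be applied with a `SET` statement or must be entered in the job's parameter settings also depends on your environment.

```python
import math

# Hypothetical default: aim for files of roughly 128 MB after the merge.
DEFAULT_TARGET_FILE_BYTES = 128 * 1024 * 1024


def merge_statements(table, total_bytes, target_file_bytes=DEFAULT_TARGET_FILE_BYTES):
    """Return the setting and the SQL statement used to merge small files.

    table: name of the table to rewrite (placeholder in this sketch).
    total_bytes: approximate total size of the table data.
    target_file_bytes: desired size of each merged file (an assumption).
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be positive")

    # One output file per shuffle partition, and never fewer than one partition.
    partitions = max(1, math.ceil(total_bytes / target_file_bytes))

    return [
        f"SET spark.sql.shuffle.partitions={partitions}",
        f"INSERT OVERWRITE TABLE {table} "
        f"SELECT * FROM {table} DISTRIBUTE BY rand()",
    ]


if __name__ == "__main__":
    # Example: about 1 GB of data spread across many small files.
    for statement in merge_statements("demo_table", 1024 ** 3):
        print(statement)
```

Distributing by `rand()` spreads rows roughly evenly across the shuffle partitions, so each partition writes one file of similar size. A smaller partition count produces fewer, larger files.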