Updated on 2024-08-20 GMT+08:00

Configuring a DLI Job Bucket

Scenario

A DLI job needs an OBS bucket to store temporary data generated while the job runs, such as job logs and job results.

Configure a DLI job bucket on the Global Configuration > Project page of the DLI management console.

Before configuring the job bucket, create an OBS bucket or a parallel file system (PFS). A PFS is recommended for big data scenarios. PFS is a high-performance file system provided by OBS, with millisecond-level access latency, bandwidth of up to TB/s, and millions of IOPS, which makes it well suited to high-performance computing (HPC) workloads.

For details about PFS, see Parallel File System Feature Guide.

Notes

  • The OBS bucket must be used to store temporary data generated by DLI, such as job logs and job results.
  • Do not use the OBS bucket for other purposes.
  • Only the main account can set or modify the job bucket. Member users do not have this permission.
  • If the bucket is not configured, you will not be able to view job logs.
  • You can create lifecycle rules to automatically delete objects or change storage classes for objects that meet specified conditions.
  • Inappropriate modifications of the job bucket may lead to loss of historical data.
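
The lifecycle rule mentioned in the notes above can be illustrated with a minimal sketch. It only builds the XML body of a rule that expires objects under a prefix after a number of days, and it does not call any OBS API. The S3-style element names (LifecycleConfiguration, Rule, Expiration, Days) are assumed to match what OBS accepts, so check them against the OBS API reference before use. The function name, rule ID, and prefix are hypothetical, because this guide does not state the path under which DLI writes its temporary data.

```python
import xml.etree.ElementTree as ET


def build_lifecycle_xml(rule_id: str, prefix: str, expire_days: int) -> str:
    """Build an S3-style lifecycle configuration that expires objects under a prefix.

    Assumption: OBS accepts this S3-compatible layout; verify before applying it.
    """
    if expire_days <= 0:
        raise ValueError("expire_days must be a positive integer")

    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    ET.SubElement(rule, "Prefix").text = prefix          # objects the rule applies to
    ET.SubElement(rule, "Status").text = "Enabled"
    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(expire_days)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical rule ID and prefix, used for illustration only.
    print(build_lifecycle_xml("expire-dli-temp-data", "hypothetical-prefix/", 30))
```

Choose the retention period with care: an expiration rule that is too aggressive deletes job logs and results you may still need, which is the kind of historical data loss the last note warns about.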

Procedure

  1. In the navigation pane of the DLI console, choose Global Configuration > Project.
  2. On the Project page, click the icon next to Job Bucket to configure bucket information.
    Figure 1 Project
  3. Click the icon to view available buckets.
  4. Select the bucket for storing the temporary data of the DLI job and click OK.
    Temporary data generated while DLI jobs run is then stored in this bucket.
    Figure 2 Setting the job bucket