Updated on 2024-05-24 GMT+08:00

Dumping Kafka Data to Object Storage Service (OBS)

Scenario

Create a Smart Connect task to dump Kafka instance data to OBS.

  • This function is unavailable for single-node instances.
  • Data in the source Kafka instance is synchronized to the dumping file in real time.

Restrictions

  • A maximum of 18 Smart Connect tasks can be created for an instance.
  • After a Smart Connect task is created, task parameters cannot be modified.

Prerequisites

  • You have enabled Smart Connect.
  • A Kafka instance has been created and is in the Running state.
  • A topic has been created.

Procedure

  1. Log in to the console.
  2. Click the region icon in the upper left corner to select a region.

    Select the region where your Kafka instance is located.

  3. Click the service list icon and choose Middleware > Distributed Message Service for Kafka to open the console of DMS for Kafka.
  4. Click the desired Kafka instance to view its details.
  5. In the navigation pane, choose Smart Connect.
  6. On the displayed page, click Create Task.
  7. For Task Name, enter a unique Smart Connect task name.
  8. For Task Type, select Dumping.
  9. For Start Immediately, specify whether to start the task as soon as it is created. This option is enabled by default. If you disable it, you can start the task later from the task list.
  10. In the Source area, retain the default setting.
  11. In the Topics area, set parameters based on the following table.

    Table 1 Topic parameters

    Parameter | Description
    Regular expression | A regular expression is used to subscribe to topics whose messages you want to dump.
    Enter/Select | Enter or select the names of the topics to be dumped. Separate them with commas (,). A maximum of 20 topics can be entered or selected.

  12. In the Target area, set parameters based on the following table.

    Table 2 Target parameters

    Parameter | Description
    Offset | Options: Minimum offset (dump the earliest data) or Maximum offset (dump the latest data).
    Dumping Period (s) | Interval for periodically dumping data, in seconds. The default interval is 300 seconds. No package files are generated if there is no data within an interval.
    AK | Access key ID. For details about how to obtain the AK, see Access Keys.
    SK | Secret access key used together with the access key ID. For details about how to obtain the SK, see Access Keys.
    Dumping Address | The OBS bucket used to store the topic data. You can select an existing OBS bucket from the drop-down list or click Create Dumping Address to create a new OBS bucket.
    Dumping Directory | Directory for storing topic files dumped to OBS. Use slashes (/) to separate directory levels.
    Time Directory Format | Data is saved to a hierarchical time directory in the dumping directory. For example, if the time directory is accurate to the day, the directory is in the format bucket name/file directory/year/month/day.
    Record Separator | Select a separator to separate OBS dumping records.
    Use Storage Key | Specifies whether to dump keys. Do not use the key of a message as the dumping file name.

  13. Click Create. The Smart Connect task list page is displayed. The message "Task xxx was created successfully." is displayed in the upper right corner of the page.
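
Because task parameters cannot be modified after the task is created, it can help to check the topic selection and the expected OBS path locally before you click Create. The following Python sketch is illustrative only and does not call any DMS or OBS API. The function names, the sample topic and bucket names, and the precision keywords are hypothetical. It also assumes that a regular expression must match the whole topic name and that month and day are zero-padded in the time directory; neither is confirmed by this page, so verify both against your own instance.

```python
import re
from datetime import datetime

MAX_TOPICS = 20  # limit stated for the Enter/Select field


def parse_topic_list(text):
    """Split a comma-separated topic list and enforce the 20-topic limit."""
    topics = [t.strip() for t in text.split(",") if t.strip()]
    if len(topics) > MAX_TOPICS:
        raise ValueError(
            f"{len(topics)} topics given; at most {MAX_TOPICS} are allowed"
        )
    return topics


def topics_matching(pattern, existing_topics):
    """Return the topics that a regular expression would select.

    Assumption: the whole topic name must match (re.fullmatch).
    """
    compiled = re.compile(pattern)
    return [t for t in existing_topics if compiled.fullmatch(t)]


def time_directory(bucket, dump_dir, when, precision="day"):
    """Build bucket name/file directory/year/month/day for a timestamp.

    The precision keywords are hypothetical labels for this sketch.
    Assumption: month, day, and hour are zero-padded.
    """
    formats = {
        "year": "%Y",
        "month": "%Y/%m",
        "day": "%Y/%m/%d",
        "hour": "%Y/%m/%d/%H",
    }
    parts = [bucket, dump_dir.strip("/"), when.strftime(formats[precision])]
    return "/".join(parts)


if __name__ == "__main__":
    print(parse_topic_list("orders, payments,"))
    # ['orders', 'payments']
    print(topics_matching(r"orders-.*", ["orders-eu", "orders-us", "payments"]))
    # ['orders-eu', 'orders-us']
    print(time_directory("my-bucket", "kafka/dump", datetime(2024, 5, 24, 13, 0)))
    # my-bucket/kafka/dump/2024/05/24
```

Running the sketch against the topic names in your instance shows which topics a pattern would pick up and where the first dumped files should appear in the bucket.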