Updated on 2023-07-14 GMT+08:00

Step 1: Creating a DIS Stream

You can create a DIS stream on the DIS management console.

Procedure

  1. Log in to the DIS console using your account.
  2. In the upper left corner of the page, click the region selector and select a region and project.
  3. Click Buy Stream and set related parameters.

    Table 1 Stream parameters

    Parameter

    Description

    Example

    Billing Mode

    Pay-per-use

    Pay-per-use

    Region

    Physical location of the cloud service. You can select a different region from the drop-down list.

    -

    Basic Information

    Stream Name

    Name of the DIS stream to be created. A stream name is 1 to 64 characters long. Only letters, digits, hyphens (-), and underscores (_) are allowed.

    dis-Tido

    Stream Type

    • Common: Each partition supports a maximum read speed of 2 MB/s and a maximum write speed of 1 MB/s.
    • Advanced: Each partition supports a maximum read speed of 10 MB/s and a maximum write speed of 5 MB/s.

    -

    Partitions

    Partitions are the base throughput unit of a DIS stream.

    5

    Partition Calculator

    Calculator used to calculate the estimated number of partitions based on the information you entered.
    1. Click Partition Calculator.
    2. In the Partition Calculator dialog box, configure the Average Record Size (KB), Max. Records Written, and Consumer Applications parameters. The Estimated Partitions field then displays the recommended number of partitions. The value of this field cannot be modified.
      NOTE:
      Partition calculation formulas:
      • Based on the traffic (round the result up):

        Common stream: Average record size x (1 + 20%) x Maximum records written / (1 x 1024 KB), where 20% is the reserved partition percentage.

        Advanced stream: Average record size x (1 + 20%) x Maximum records written / (5 x 1024 KB), where 20% is the reserved partition percentage.

      • Based on the number of consumer programs (round the result up):

        (Number of consumer programs / 2) x Number of partitions calculated based on the traffic. Keep the value of Number of consumer programs / 2 to two decimal places.

        The largest of the values calculated using the preceding formulas is used as the estimated number of partitions.

    3. Click Use Estimated Value. The estimated value is automatically used as the value of Partitions.

    -

    Data Retention (hours)

    The maximum number of hours for which data can be preserved in DIS. Data will be deleted when the retention period expires.

    Value range: an integer ranging from 24 to 72.

    24

    Source Data Type

    • BLOB: a collection of binary data stored as a single entity in a database management system. If Source Data Type is set to BLOB, the supported Dump Destination can be OBS or MRS.
    • JSON: an open-standard file format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types. If Source Data Type is set to JSON, the supported Dump Destination can be OBS, MRS, DLI, or DWS.
    • CSV: a simple text format for storing tabular data in a plain text file. If Source Data Type is set to CSV, the supported Dump Destination can be OBS, MRS, DLI, or DWS.

    JSON

    Auto-Scaling

    Whether to enable auto-scaling when creating the stream.

    Click the toggle to disable or enable auto-scaling.

    NOTE:

    You can also modify the auto-scaling attributes of a stream after it is created.

    Auto-Scale Down To

    Lower limit for automatic scale-down. The number of target partitions for automatic scale-down must be greater than or equal to the lower limit.

    -

    Auto-Scale Up To

    Upper limit for automatic scale-up. The number of target partitions for automatic scale-up must be less than or equal to the upper limit.

    -

    Data Delimiter

    Data delimiter when Source Data Type is CSV.

    -

    Schema

    Whether to create a schema when creating a stream. This parameter is available when Source Data Type is JSON or CSV.

    Click the toggle to disable or enable the schema configuration.

    NOTE:

    If you do not create a data schema when creating the stream, you can create one later on the Stream Management page. For details, see Managing a Source Data Schema.

    You can create a schema only when the source data type is set to JSON or CSV.

    Source Data Schema

    You can enter or import source data samples in JSON or CSV format. For details, see Managing a Source Data Schema.

    1. In the left text box, enter a JSON or CSV source data sample, or click the import icon to import one.
    2. In the left text box, click the delete icon to delete the entered or imported source data sample.
    3. In the left text box, click the generate icon to generate an Avro schema in the right text box based on the source data sample.
    4. In the right text box, click the delete icon to delete the generated Avro schema.
    5. In the right text box, click the edit icon to modify the generated Avro schema.

    This parameter is mandatory only when Schema is set to Enable.

    Enterprise Project

    Enterprise project to which the stream belongs. You can configure this parameter only when the Enterprise Management service is enabled. The default value is default.

    An enterprise project facilitates project-level management and grouping of cloud resources and users.

    You can select the default enterprise project (default) or other existing enterprise projects. To create an enterprise project, log in to the Enterprise Management console. For details, see the Enterprise Management User Guide.

    -

    Configure

    Click Configure now. The Tag parameter is displayed.

    For details about how to add a tag, see Managing Stream Tags.

    -

    Skip

    No advanced settings need to be configured.

    -

    Tag

    Identifier of the stream. Adding tags to streams can help you identify and manage your stream resources.

    -

  4. Click Next. The Details page is displayed.
  5. Click Submit.
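
The partition formulas and value ranges in Table 1 can be checked before you fill in the console form. The following Python sketch is illustrative only and is not part of DIS or its SDK: the function names are hypothetical, "letters" in the stream name rule is assumed to mean ASCII letters, and the consumer-based formula is assumed to use the already rounded-up traffic-based value. The console's Partition Calculator remains the authoritative source for the estimate.

```python
import math
import re
from fractions import Fraction

# Maximum write speed per partition in MB/s, from the Stream Type row of Table 1.
WRITE_MB_PER_PARTITION = {"common": 1, "advanced": 5}

# 1 + 20% reserved partition percentage, kept exact to avoid float rounding.
RESERVED = Fraction(6, 5)

# Assumption: "letters" in the Stream Name rule means ASCII letters.
STREAM_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_stream_name(name: str) -> bool:
    """Check the rule: 1 to 64 letters, digits, hyphens, or underscores."""
    return STREAM_NAME_RE.fullmatch(name) is not None


def is_valid_retention(hours: int) -> bool:
    """Check the rule: an integer from 24 to 72."""
    return isinstance(hours, int) and 24 <= hours <= 72


def estimate_partitions(avg_record_kb, max_records, consumers, stream_type="common"):
    """Estimate partitions from the NOTE formulas in the Partition Calculator row.

    avg_record_kb: Average Record Size (KB)
    max_records:   Max. Records Written (integer)
    consumers:     Consumer Applications (integer)
    """
    if avg_record_kb <= 0 or max_records <= 0 or consumers <= 0:
        raise ValueError("all inputs must be positive")

    capacity_kb = WRITE_MB_PER_PARTITION[stream_type] * 1024

    # Traffic-based value, rounded up.
    by_traffic = math.ceil(
        Fraction(str(avg_record_kb)) * RESERVED * max_records / capacity_kb
    )

    # Consumer-based value, rounded up. For an integer consumer count,
    # consumers / 2 always fits in two decimal places, so no rounding is needed.
    # Assumption: the traffic term here is the rounded-up value from above.
    by_consumers = math.ceil(Fraction(consumers, 2) * by_traffic)

    return max(by_traffic, by_consumers)
```

For example, with an average record size of 1 KB, 10000 maximum records written, and 3 consumer applications, `estimate_partitions(1, 10000, 3, "common")` gives 12 partitions based on traffic and 18 based on consumers, so it returns 18. The same inputs with `"advanced"` return 5.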