Updated on 2025-08-12 GMT+08:00

Creating an Identification Task

Sensitive data identification leverages a data identification engine to scan, classify, and grade structured data (e.g., RDS, DWS) and unstructured data (e.g., OBS).

Prerequisites

Creating a Sensitive Data Identification Task

  1. Log in to the DSC console.
  2. Click the icon in the upper left corner of the management console and select a region or project.
  3. In the navigation pane on the left, choose Sensitive Data Identification > Identification Task.
  4. In the upper left corner of the task list, click Create Task.
  5. On the displayed page, configure the parameters by referring to Table 1.

    Table 1 Parameter description

    Parameter

    Description

    Task Name

    You can customize the task name.

    The task name must meet the following requirements:

    • Contain 4 to 255 characters.
    • Consist of letters, digits, underscores (_), and hyphens (-).
    • Start with a letter.
    • Be unique.

    Data Type

    Type of data to be identified. You can select multiple types.

    • OBS: DSC identifies sensitive data in the added Huawei Cloud OBS assets. For details about how to add OBS assets, see Adding an OBS Asset.
    • Database: DSC identifies sensitive data of authorized database assets. For details about how to authorize DSC to access your database assets, see Adding and Authorizing Database Assets.
    • Big data: DSC identifies sensitive data of authorized big data assets. For details about how to authorize DSC to access your big data assets, see Adding and Authorizing Big Data Assets.
    • MRS: DSC identifies sensitive data of authorized MRS assets. For details about how to authorize DSC to access your MRS assets, see Adding and Authorizing Big Data Assets.
    • LTS: DSC identifies sensitive data of authorized LTS assets. For details about how to add log streams, see Adding a Log Stream.

    Identification Template

    You can select a built-in or custom template. DSC displays data by level and category based on the template you select. For details about how to add a template, see Creating an Identification Template.

    Identification Scope

    This parameter is displayed when Data Type is set to LTS. Set this parameter to 1 day, 2 days, or 3 days.

    Identification Intensity

    This parameter is displayed when Data Type is set to LTS. Select the log identification intensity, which can be High, Medium, or Low. A higher intensity indicates more sampled data.

    Identification Period

    Set the execution policy of the data identification task.

    • Once: The task will be executed once at a specified time or immediately.
    • Daily: The task is executed at a fixed time every day.
    • Weekly: The task is executed at a specified time every week.
    • Monthly: The task is executed at a specified time every month.

    When to Execute

    This parameter is displayed when Identification Period is set to Once.
    • Now: Select this option and click OK. The system executes the data identification task immediately.
    • As scheduled: The task will be executed at a specified time.

    Start Time

    This parameter is displayed when Identification Period is set to Daily, Weekly, or Monthly.

    Select the time at which the task is executed. The task then runs at this time every day, every week, or every month, depending on the selected identification period.

    (Optional) Topic

    • Select an existing topic from the drop-down list or click View Topic to create a topic for receiving alarm notifications.
    • If you do not configure a topic, you can view the identification result in the identification task list. For details, see Viewing and Downloading Sensitive Data Identification Results.

    (Optional) Add Identification Scope

    This parameter is displayed after you select a data type and a specific asset. Click Add OBS Identification Scope, Add Database Identification Scope, Add Big Data Identification Scope, Add MRS Identification Scope, or Add LTS Identification Scope to add an asset identification scope. If no scope is specified, global scanning is performed on the selected assets by default. For details, see Adding an Identification Scope.

    AI-assisted Identification of Unstructured Data

    This parameter is displayed when Data Type is set to OBS and a bucket asset is selected.

    AI-assisted identification uses DSC's proprietary natural language processing engine to automatically scan, identify, and process data. When enabled, it performs secondary identification on data that matches the built-in rules supporting AI-assisted verification. Enabling this function may slow down identification.

    NOTE:

    If no sensitive data matches the built-in rules, secondary identification is skipped, regardless of whether AI-assisted identification of unstructured data is enabled.

  6. Click OK. A message is displayed indicating the task is created successfully.
  7. Locate the task to be started and click Start Identification in the Operation column. If a message indicating that the scan task has started is displayed in the upper right corner, the operation is successful.

    If you want to stop an ongoing task, click Stop in the Operation column of the row containing the target task.

    To disable a scheduled task, choose More > Stop Task in the Operation column of the target task.
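
    The task name rules in Table 1 can be checked before you open the console, for example when task names are generated by a script. The following is a minimal sketch, not part of DSC: the function name is_valid_task_name is hypothetical, "letters" is assumed to mean ASCII letters, and uniqueness is assumed to be case-sensitive.

```python
import re
from typing import Collection

# Assumed pattern derived from the rules in Table 1: the name starts with a
# letter, contains only letters, digits, underscores, and hyphens, and is
# 4 to 255 characters long in total.
TASK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{3,254}")


def is_valid_task_name(name: str, existing_names: Collection[str] = ()) -> bool:
    """Return True if name meets the documented format and is not already used."""
    if TASK_NAME_RE.fullmatch(name) is None:
        return False
    # Uniqueness is checked against names the caller already knows about.
    return name not in existing_names


if __name__ == "__main__":
    for candidate in ("scan_obs-01", "abc", "1task", "task name"):
        print(candidate, is_valid_task_name(candidate))
```

    The console remains the authority on whether a name is accepted; this check only mirrors the rules listed in the table.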

Adding an Identification Scope

By default, DSC performs a global scan on the selected assets. You can also add a scan scope by referring to this section.

  1. Log in to the DSC console.
  2. Click the icon in the upper left corner of the management console and select a region or project.
  3. In the navigation pane on the left, choose Sensitive Data Identification > Identification Task.
  4. Click Create Task. The Create Task page is displayed.
  5. Select the data type, select the name of the asset to be scanned, and click OK.
  6. In the lower left corner of the page, click the button to add an identification scope. You can add multiple scopes at the same time. For details about the parameter settings, see Table 2.

    Table 2 Parameters for configuring the scan scope

    Asset Type

    Configuration Parameter

    Description

    OBS

    Asset

    Select the bucket to be scanned from the drop-down list. You can select multiple buckets.

    Scan Scope

    • File Name Prefix: For example, if you add dsc_ as the inclusion condition, all files whose names start with dsc_ will be scanned.

      A maximum of one inclusion condition can be added for the file name prefix.

    • File name extension: The file name extension is the part of the file name that indicates the file type and follows the dot (.). For example, for the file dsc_security.txt, the extension can be specified as security.txt or .txt. Only the files that meet all the filtering conditions are scanned.

      A maximum of one inclusion condition can be added for the file name extension.

    • Directory name: Specifies the directory to be scanned. All files in the specified directory are scanned.

      A maximum of one inclusion condition can be added for the directory.

    After entering the file name prefix, file name extension, or directory name, click Add as Inclusion Condition to add it as an inclusion condition, or click Add as Exclusion Condition to add it as an exclusion condition.

    For example, if you select File name prefix, enter the prefix dsc_, and click Add as Inclusion Condition, only the files whose names start with dsc_ are scanned. If you click Add as Exclusion Condition to add the prefix as an exclusion condition, only files whose names do not start with dsc_ are scanned.

    Scan Depth

    • Global Scan: If this parameter is selected, all data in the specified scope is scanned.
    • Specify Scan Scope: Select Specify Scan Scope and enter the Scan Depth. The root directory has a depth of 1, and the depth increases by 1 for each directory level. The scan depth cannot exceed 10.

    Database/Big Data/MRS

    Asset

    Select an instance name from the drop-down list. You can select multiple instances.

    Scan Scope

    • Table name prefix: A maximum of one inclusion condition can be added for the table name prefix. For example, if you enter dsc_ as the prefix of a table name and click Add as Inclusion Condition, only the tables whose names start with dsc_ are scanned. If you click Add as Exclusion Condition to add the prefix as an exclusion condition, only tables whose names do not start with dsc_ are scanned.
    • Table name suffix: A maximum of one inclusion condition can be added for the table name suffix. The principle is the same as that of the prefix.

    LTS

    Asset

    Select an instance name from the drop-down list. You can select multiple instances.

    Scan Scope

    • Key prefix: If this parameter is added as an inclusion condition, only the log content that contains the key prefix is scanned. If this parameter is added as an exclusion condition, only the log content that does not contain the key prefix is scanned.
    • Key suffix: The principle is the same as that of the key prefix.
      NOTE:
      • A maximum of one inclusion condition can be added for each of the key prefix and suffix.
      • A maximum of 10 exclusion conditions can be added for key prefixes and suffixes.

    Figure 1 Configuring the scan scope
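
    To preview which OBS objects a file name prefix condition and a scan depth would select, the rules in Table 2 can be modeled locally. The following is a rough sketch under stated assumptions, not DSC's implementation: the names ObsScope, object_depth, and in_scope are hypothetical, objects directly under the bucket root are assumed to be at depth 1, and an object is assumed to be scanned only if it matches the inclusion condition (when one is set) and matches no exclusion condition.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SCAN_DEPTH = 10  # upper bound on Scan Depth stated in Table 2


@dataclass
class ObsScope:
    """Hypothetical local model of an OBS identification scope."""
    include_prefix: Optional[str] = None    # at most one inclusion condition
    exclude_prefixes: tuple[str, ...] = ()  # exclusion conditions
    max_depth: Optional[int] = None         # None models Global Scan


def object_depth(key: str) -> int:
    """Depth of an object key; objects in the bucket root are assumed to be at depth 1."""
    return key.strip("/").count("/") + 1


def in_scope(key: str, scope: ObsScope) -> bool:
    """Return True if the object key would be scanned under the assumed rules."""
    if scope.max_depth is not None:
        if not 1 <= scope.max_depth <= MAX_SCAN_DEPTH:
            raise ValueError("scan depth must be between 1 and 10")
        if object_depth(key) > scope.max_depth:
            return False
    # Prefix conditions apply to the file name, not to the full object path.
    file_name = key.rsplit("/", 1)[-1]
    if scope.include_prefix is not None and not file_name.startswith(scope.include_prefix):
        return False
    return not any(file_name.startswith(p) for p in scope.exclude_prefixes)


if __name__ == "__main__":
    scope = ObsScope(include_prefix="dsc_", max_depth=2)
    for key in ("dsc_report.txt", "logs/dsc_audit.txt", "logs/2025/dsc_old.txt", "notes.txt"):
        print(key, in_scope(key, scope))
```

    How DSC combines inclusion and exclusion conditions internally is not described in this section, so treat the result only as a planning aid and confirm the actual scope in the console.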

Related Operations

  • In the Operation column of the target task, choose More > Edit. On the displayed Edit Task page, modify the task information.
  • Click More > Delete in the Operation column of the target task. In the displayed dialog box, click OK to delete the task.
    • If an identification task is running, stop the task or wait until the task is complete, then delete it.
    • The deletion operation cannot be undone.