Updated on 2024-11-19 GMT+08:00

Injecting Watermarks to Databases

Prerequisites

Constraints

  • In GaussDB(DWS), watermarks can be injected into columns of the following data types: smallint, integer, bigint, float4, float8, varchar, text, and char.
  • In MRS Hive, watermarks can be injected into columns of the following data types: smallint, int, long, float, double, and string.
  • A single column in the embedding target cannot have more than 30% redundant data.
  • The database encoding must be UTF-8.
  • Watermarks can be injected only into non-primary-key columns.
  • It is recommended that the source table contain more than 1500 rows.
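
The constraints above can be checked against a column before a task is created. The following Python sketch is illustrative only and is not part of DSC: the function names and the `DWS` / `MRS_HIVE` keys are hypothetical, and "redundant data" is assumed to mean the share of values that duplicate another value in the same column.

```python
# Column data types listed in the constraints above.
SUPPORTED_TYPES = {
    "DWS": {"smallint", "integer", "bigint", "float4", "float8", "varchar", "text", "char"},
    "MRS_HIVE": {"smallint", "int", "long", "float", "double", "string"},
}
MAX_REDUNDANCY = 0.30        # a column may hold at most 30% redundant data
RECOMMENDED_MIN_ROWS = 1500  # more than 1500 rows is recommended


def redundancy_ratio(values):
    """Share of values that repeat another value (assumed definition of redundancy)."""
    if not values:
        return 0.0
    return 1 - len(set(values)) / len(values)


def check_column(source_type, column_type, values, is_primary_key):
    """Return a list of problems; an empty list means no constraint was violated."""
    problems = []
    if column_type.lower() not in SUPPORTED_TYPES.get(source_type, set()):
        problems.append(f"type {column_type!r} is not supported for {source_type}")
    if is_primary_key:
        problems.append("primary key columns cannot be watermarked")
    ratio = redundancy_ratio(values)
    if ratio > MAX_REDUNDANCY:
        problems.append(f"redundancy {ratio:.0%} exceeds 30%")
    if len(values) <= RECOMMENDED_MIN_ROWS:
        problems.append("fewer rows than recommended (more than 1500)")
    return problems
```

For example, `check_column("DWS", "integer", list(range(2000)), False)` returns an empty list, because the column type is supported, the column is not a primary key, no value repeats, and the table has more than 1500 rows.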

Creating a Watermark Injection Task

  1. Log in to the management console.
  2. In the upper left corner, select a region or project.
  3. In the navigation tree on the left, choose Security & Compliance > Data Security Center.
  4. In the navigation pane, choose Data Asset Protection > Database Watermark.
  5. Click Create Task. The Configure basic information page is displayed.

    Figure 1 Configuring basic information

    Table 1 Parameters for configuring basic information

    | Parameter | Description |
    | --- | --- |
    | Task Name | Enter a task name. The value can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. |
    | Watermark ID | Enter the watermark identifier to be injected. |
    | Embedding Scheme | Select a watermark embedding scheme from the drop-down list. The options are listed after this table. |

    Embedding Scheme options:

    • Lossless - pseudo-column watermark: A pseudocolumn related to other attributes of the relational table is generated. The pseudocolumn is designed to look genuine to attackers. Watermarks are embedded into the pseudocolumn, so the original data is not damaged.
    • Lossless - pseudo-line watermark: Pseudo lines are generated based on the data type, data format, and value range. Watermarks are embedded into these pseudo lines, so the original data is not damaged.
    • Lossy - column watermark: Watermarks are embedded directly into column data, which modifies the original data.

    NOTE:

    A higher error correction level means more watermark bits and a lower bit error rate (BER) during source tracing. However, a higher error correction level also requires more data in the embedding target to ensure that the watermark is embedded completely. The default value is 1.

  6. Click Next. On the Configure source and target information page, set related parameters.

    • Lossless - pseudo-column watermark: Watermarks are embedded into newly created columns, so the original data is not modified.
      Figure 2 Pseudocolumn watermarks

      Table 2 Source and destination parameters of pseudocolumn watermarks

      | Parameter | Description |
      | --- | --- |
      | Data Source Type | Select a data source type from the drop-down list. When Embedding Scheme is set to Lossy - column watermark, DWS and MRS_HIVE are supported. When Embedding Scheme is set to Lossless - pseudo-column watermark or Lossless - pseudo-line watermark, DWS, PostgreSQL, and MySQL are supported. |
      | Database Instance | Select a database instance from the drop-down list. If no database instance is available, add databases by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets. |
      | Database | Select a database from the drop-down list. |
      | Schema | This parameter is displayed when Data Source Type is DWS or PostgreSQL. Select a schema as required. |
      | Source Table | Select the name of the source table. |
      | Column Name | Enter the name of the pseudocolumn. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. |
      | Column Data Type | Select the data type of the embedded pseudocolumn: Numeric, String, or Date. |
      | Example Value | After you configure Setting Field Rules, an example of the pseudocolumn data to be embedded is displayed. |
      | Setting Field Rules | Numeric: pseudo data is generated as random numbers. You can specify the range and precision; if they are not specified, pseudo data is generated randomly. String: select a type of pseudo data, such as person name, ID card number, or mobile number, from the drop-down list. Date: you can specify a date range; if no date range is specified, pseudo data is generated randomly. |
      | Add a Pseudo Column | You can click Add a Pseudo Column to add two pseudo-columns. |
      | Target Table | Enter the target table name. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. |

    • Lossless - pseudo-line watermark: Watermarks are embedded into generated pseudo lines rather than the original rows, so the original data is not modified.
      Figure 3 Pseudo-line watermark

      Table 3 Source and destination parameters of pseudo-line watermarks

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | Data Source Type | Select a data source type from the drop-down list. DWS, PostgreSQL, and MySQL are supported. | DWS |
      | Database Instance | Select a database instance from the drop-down list. If no database instance is available, add databases by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets. | DWS-dsc-Test |
      | Database | Select a database from the drop-down list. | gaussdb |
      | Mode | This parameter is displayed when Data Source Type is DWS or PostgreSQL. Select a mode as required. | pg_catalog |
      | Source Table | Select the name of the source table. | pg_proc |
      | Number of Pseudo-Line Spans | Enter the number of pseudo lines. The value must be an integer greater than 1. | 10 |
      | Target Table | Enter the name of the table that stores the watermarked data. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. | Test_Table |

    • Lossy - column watermark: Watermarks are embedded directly into the column data.
      Figure 4 Lossy column watermarks

      Table 4 Source and destination parameters of lossy column watermarks

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | Data Source Type | Select a data source type from the drop-down list. DWS and MRS_HIVE are supported. | DWS |
      | Database Instance | Select a database instance from the drop-down list. If no database instance is available, add databases by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets. | DWS-dsc-Test |
      | Database | Select a database from the drop-down list. | gaussdb |
      | Mode | This parameter is displayed when Data Source Type is DWS. Select a mode as required. | pg_catalog |
      | Source Table | Select the name of the source table. | pg_proc |
      | Watermark Embedding Bar | Select the columns into which watermarks are to be embedded. You can select multiple columns. NOTE: The source database character set must be UTF-8, and a single column in the embedding target cannot have more than 30% redundant data. | - |
      | Target Table | Enter the name of the table that stores the watermarked data. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. | Test_Table |

  7. Click Next. The Configure scheduling page is displayed.

    Figure 5 Configuring scheduling
    • If Scheduling Parameter is set to Once, select Now or As scheduled to determine when the watermark embedding task starts.
    • If Scheduling Parameter is set to Daily, Weekly, or Monthly, the watermark embedding task starts at the specified time every day, week, or month.

  8. Click Finish.
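
The input rules stated in the steps above (allowed name characters, data source types supported by each embedding scheme, and the pseudo-line count) can be checked before the values are entered in the console. The following Python sketch is a minimal illustration under stated assumptions: the dictionary keys, the scheme identifiers, and the `validate_task` function are hypothetical and do not correspond to any DSC API.

```python
import re

# Names may contain only letters, digits, underscores, and hyphens, up to 255 characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,255}")

# Data source types supported by each embedding scheme, as listed in the parameter tables.
# The scheme identifiers are hypothetical labels chosen for this sketch.
SOURCES_BY_SCHEME = {
    "lossless_pseudo_column": {"DWS", "PostgreSQL", "MySQL"},
    "lossless_pseudo_line": {"DWS", "PostgreSQL", "MySQL"},
    "lossy_column": {"DWS", "MRS_HIVE"},
}


def validate_task(task):
    """Return a list of problems found in a task dict (hypothetical structure)."""
    problems = []
    for field in ("task_name", "target_table"):
        if not NAME_PATTERN.fullmatch(str(task.get(field) or "")):
            problems.append(f"{field} is empty or has invalid characters or length")
    scheme = task.get("scheme")
    allowed = SOURCES_BY_SCHEME.get(scheme)
    if allowed is None:
        problems.append(f"unknown embedding scheme: {scheme!r}")
    elif task.get("source_type") not in allowed:
        problems.append(f"{task.get('source_type')!r} is not supported by {scheme}")
    if scheme == "lossless_pseudo_line":
        spans = task.get("pseudo_line_spans")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(spans, int) or isinstance(spans, bool) or spans <= 1:
            problems.append("pseudo_line_spans must be an integer greater than 1")
    return problems


example = {
    "task_name": "wm-task_1",
    "target_table": "Test_Table",
    "scheme": "lossless_pseudo_line",
    "source_type": "DWS",
    "pseudo_line_spans": 10,
}
print(validate_task(example))
```

Returning a list of problems instead of raising on the first one lets all invalid fields be reported in a single pass.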

Running Tasks

  1. Log in to the management console.
  2. In the upper left corner, select a region or project.
  3. In the navigation tree on the left, choose Security & Compliance > Data Security Center.
  4. In the navigation pane, choose Data Asset Protection > Database Watermark.

    Figure 6 Database watermark injection

  5. In the Operation column of the target task, choose More > Running.

Starting a Task

This operation is available only when the watermarking task is a scheduled task.

  1. Log in to the management console.
  2. In the upper left corner, select a region or project.
  3. In the navigation tree on the left, choose Security & Compliance > Data Security Center.
  4. In the navigation pane, choose Data Asset Protection > Database Watermark.

    Figure 7 Database watermark injection

  5. In the Operation column of the target task, choose More > Start Task.

Stopping a Task

This operation is available only when the watermarking task is a scheduled task.

  1. Log in to the management console.
  2. In the upper left corner, select a region or project.
  3. In the navigation tree on the left, choose Security & Compliance > Data Security Center.
  4. In the navigation pane, choose Data Asset Protection > Database Watermark.

    Figure 8 Database watermark injection

  5. In the Operation column of the target task, choose More > Stop Task.

Editing and Deleting an Embedded Watermark Task

A running watermark embedding task cannot be edited or deleted.

  • To edit a task, click Edit in the Operation column of the target task and modify the watermark embedding task configuration.
    Figure 9 Editing a watermark embedding task
  • To delete a task, click Delete in the Operation column of the target task. You can also select multiple tasks and click Batch Delete to delete them.
    Figure 10 Deleting a watermark embedding task

    The deletion cannot be undone.