Inserting Watermarks
Prerequisites
- DSC has been allowed to access cloud assets.
- Database assets have been added. For details, see Database Assets.
- You have added MRS assets. For details, see Adding MRS Assets.
- You have configured the GaussDB(DWS) and MRS Hive permissions. For details, see Configuring GaussDB(DWS) and MRS Hive.
Constraints
- For GaussDB(DWS), watermarks can be embedded into columns of the following data types: smallint, integer, bigint, float4, float8, varchar, text, and char.
- For MRS Hive, watermarks can be embedded into columns of the following data types: smallint, int, long, float, double, and string.
- A single column in the embedding target cannot have more than 30% redundant data.
- The database encoding must be UTF-8.
- Watermarks can be injected only into non-primary key columns.
- It is recommended that the data table contain more than 1500 rows.
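The constraints above can be checked before a task is created. The following is a minimal sketch and not part of DSC: the helper name `check_column`, its parameters, and the reading of "redundant data" as the share of rows that repeat an already seen value are all assumptions made for illustration.

```python
# Column data types listed in the constraints above.
DWS_TYPES = {"smallint", "integer", "bigint", "float4", "float8", "varchar", "text", "char"}
HIVE_TYPES = {"smallint", "int", "long", "float", "double", "string"}


def check_column(values, data_type, source="dws", is_primary_key=False):
    """Return a list of constraint violations for one candidate column (hypothetical helper)."""
    problems = []
    supported = DWS_TYPES if source == "dws" else HIVE_TYPES
    if data_type.lower() not in supported:
        problems.append(f"unsupported data type: {data_type}")
    if is_primary_key:
        problems.append("primary key columns cannot be watermarked")
    if len(values) <= 1500:
        problems.append("fewer rows than recommended (more than 1500)")
    if values:
        # Assumption: a row is "redundant" if its value already appeared earlier.
        duplicates = len(values) - len(set(values))
        if duplicates / len(values) > 0.30:
            problems.append("more than 30% redundant data")
    return problems
```

An empty list means the column passes every check. The row count is only a recommendation in the constraints, so a caller may choose to treat that message as a warning.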
Creating a Watermark Embedding Task
- Log in to the management console.
- Click the icon in the upper left corner and select a region or project.
- In the navigation pane on the left, open the service list and go to DSC.
- In the navigation pane, choose Data privacy protection > Database Watermark.
Figure 1 Database watermark injection
- Click Create Task. The Configure basic information page is displayed.
Figure 2 Configuring basic information
Table 1 Parameters for configuring basic information

| Parameter | Description |
| --- | --- |
| Task Name | Enter a task name. |
| Watermark ID | Enter the watermark identifier to be injected. |
| Embedding Scheme | Select a watermark embedding scheme from the drop-down list. The options are listed below. |

Embedding Scheme options:
- Lossless - pseudo-column watermark: A pseudocolumn that correlates with other attributes of the relational table is generated, so that it looks plausible to attackers. Watermarks are embedded into the pseudocolumn to avoid damaging the original data.
- Lossless - pseudo-line watermark: Pseudo lines are generated based on the data type, data format, and value range. Watermarks are embedded into these pseudo lines to avoid damaging the original data.
- Lossy - column watermark: Watermarks are embedded directly into the column data, which modifies the original values.
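To make the difference between the lossy and lossless schemes concrete, the sketch below writes watermark bits into the least significant bit of integer values, a textbook way of building a lossy column watermark. It is not DSC's algorithm, which this page does not describe. The names `watermark_bits`, `embed_lsb`, and `extract_lsb` and the bit layout are hypothetical.

```python
def watermark_bits(watermark_id: str) -> list[int]:
    """Expand a watermark ID into bits (UTF-8 bytes, most significant bit first)."""
    bits = [(byte >> shift) & 1
            for byte in watermark_id.encode("utf-8")
            for shift in range(7, -1, -1)]
    if not bits:
        raise ValueError("watermark ID must not be empty")
    return bits


def embed_lsb(column: list[int], watermark_id: str) -> list[int]:
    """Overwrite the lowest bit of each value with one watermark bit, cycling through the bits."""
    bits = watermark_bits(watermark_id)
    return [(value & ~1) | bits[i % len(bits)] for i, value in enumerate(column)]


def extract_lsb(column: list[int], length: int) -> str:
    """Read `length` bytes of watermark back from the lowest bits of the first values."""
    bits = [value & 1 for value in column[: length * 8]]
    data = bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))
    return data.decode("utf-8")


original = list(range(1000, 1100))
marked = embed_lsb(original, "dsc")
```

In this toy model each marked value differs from the original by at most 1, which is the kind of modification the lossy option implies. The pseudo-column and pseudo-line options avoid it by placing the watermark in generated columns or rows instead.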
- Click Next. On the Configure source and target information page, set related parameters.
- Lossless - pseudo-column watermark: Watermarks are embedded into newly created columns to avoid data loss.
Figure 3 Pseudocolumn watermarks
Table 2 Source and destination parameters of pseudocolumn watermarks

| Parameter | Description |
| --- | --- |
| Data Source Type | Select a data source type from the drop-down list. When Embedding Scheme is set to Lossy - column watermark, GaussDB(DWS) (see Adding a Cloud Database) and MRS_HIVE (see Adding MRS Assets) are supported. When Embedding Scheme is set to Lossless - pseudo-column watermark or Lossless - pseudo-line watermark, GaussDB(DWS) (see Adding a Cloud Database), PostgreSQL, and MySQL (for both, see Adding a Self-Built Database) are supported. |
| Database Instance | Select a database instance from the drop-down list. |
| Database | Select a database from the drop-down list. |
| Schema | Displayed only when the data source type is DWS or PostgreSQL. Select a schema as required. |
| Source Table | Select the name of the source table. |
| Column Name | Enter the name of the pseudocolumn. Only letters, digits, underscores (_), and hyphens (-) are allowed, up to 255 characters. |
| Column Data Type | Select the data type of the embedded pseudocolumn: Numeric, String, or Date. |
| Example Value | After you configure Setting Field Rules, an example of the embedded pseudocolumn data is displayed. |
| Setting Field Rules | Numeric: a random number is generated, and you can specify its range and precision. String: select the kind of pseudo data, such as person name, ID card number, or mobile number, from the drop-down list. Date: you can specify a date range. If no range or precision is specified, pseudo data is generated randomly. |
| Add a Pseudo Column | Click Add a Pseudo Column to add two pseudocolumns. |
| Target Table | Enter the target table name. The name can contain only letters, digits, underscores (_), and hyphens (-) and cannot exceed 255 characters. |
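The field rules in Table 2 can be pictured as small generators of pseudo data. The following is a rough sketch, assuming that a numeric rule is a range plus a number of decimal places and that a date rule is a start and an end date. The function names and default ranges are hypothetical and do not reflect how DSC generates values.

```python
import random
from datetime import date, timedelta


def pseudo_numeric(rng, low=0.0, high=1_000_000.0, precision=2):
    """Random number between low and high, rounded to `precision` decimal places."""
    return round(rng.uniform(low, high), precision)


def pseudo_date(rng, start=date(2000, 1, 1), end=date(2030, 12, 31)):
    """Random date between start and end, both inclusive."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))


rng = random.Random(42)  # fixed seed so that the example is repeatable
sample_numbers = [pseudo_numeric(rng, 10, 20, precision=1) for _ in range(3)]
sample_dates = [pseudo_date(rng, date(2024, 1, 1), date(2024, 12, 31)) for _ in range(3)]
```

Calling the generators without a range falls back to the defaults, which mirrors the rule that pseudo data is generated randomly when no range is specified.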
- Lossless - pseudo-line watermark: Watermarks are embedded into line copies to avoid data loss.
Figure 4 Pseudo-line watermark
Table 3 Source and destination parameters of pseudo-line watermarks

| Parameter | Description | Example Value |
| --- | --- | --- |
| Data Source Type | Select a data source type from the drop-down list. The following types are supported: GaussDB(DWS) (see Adding a Cloud Database), PostgreSQL, and MySQL (for both, see Adding a Self-Built Database). | DWS |
| Database Instance | Select a database instance from the drop-down list. | DWS-dsc-Test |
| Database | Select a database from the drop-down list. | gaussdb |
| Mode | Displayed only when the data source type is DWS or PostgreSQL. Select a mode (schema) as required. | pg_catalog |
| Source Table | Select the name of the source data table. | pg_proc |
| Number of Pseudo-Line Spans | Enter the number of pseudo lines. The value must be an integer greater than 1. | 10 |
| Target Table | Enter the name of the table that stores the watermarked data. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. | Test_Table |
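The naming and count rules in Table 3 are easy to check before the form is submitted. The following is a minimal sketch. The helper names are hypothetical, and the regular expression encodes only the rule stated above (letters, digits, underscores, and hyphens, at most 255 characters).

```python
import re

# Letters, digits, underscores, and hyphens; 1 to 255 characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,255}")


def is_valid_table_name(name: str) -> bool:
    """Check a target table name against the documented character and length rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_pseudo_line_count(value) -> bool:
    """Check that the pseudo-line count is an integer greater than 1."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 1
```

The same name check applies to the target table name in the other embedding schemes, which state an identical rule.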
- Lossy - column watermark: Watermarks are embedded directly into the column data.
Figure 5 Lossy column watermarks
Table 4 Source and destination parameters of lossy column watermarks

| Parameter | Description | Example Value |
| --- | --- | --- |
| Data Source Type | Select a data source type from the drop-down list. The following types are supported: GaussDB(DWS) (see Adding a Cloud Database) and MRS_HIVE (see Adding MRS Assets). | DWS |
| Database Instance | Select a database instance from the drop-down list. | DWS-dsc-Test |
| Database | Select a database from the drop-down list. | gaussdb |
| Mode | Displayed only when the data source type is DWS. Select a mode (schema) as required. | pg_catalog |
| Source Table | Select the name of the source table. | pg_proc |
| Watermark Embedding Bar | Select the columns into which watermarks are embedded. You can select multiple columns. NOTE: The source database character set must be UTF-8. A single column in the embedding target cannot have more than 30% redundant data. | - |
| Target Table | Enter the name of the table that stores the watermarked data. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. | Test_Table |
- Click Next. The Configuring scheduling page is displayed.
Figure 6 Configuring scheduling
- If Scheduling Parameter is set to Once, select Now or As scheduled to specify when the watermark embedding task starts.
- If Scheduling Parameter is set to Daily, Weekly, or Monthly, the watermark embedding task starts at the specified time every day, week, or month.
- Click Finish.
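The recurring scheduling options in the procedure above can be modeled as a function from the current time to the next start time. The sketch below is an assumption about the semantics, not a description of DSC: for example, it skips a Monthly run in months that lack the chosen day. The function `next_run` and its parameters are hypothetical, and the Once option is not modeled.

```python
from datetime import datetime, time, timedelta


def next_run(now, frequency, at, weekday=None, day_of_month=None):
    """Return the first start time strictly after `now` for a recurring schedule.

    frequency: "Daily", "Weekly", or "Monthly"
    at: time of day at which the task starts
    weekday: 0 (Monday) to 6 (Sunday), used for "Weekly"
    day_of_month: 1 to 31, used for "Monthly"
    """
    for offset in range(400):  # scan forward a little over one year
        day = (now + timedelta(days=offset)).date()
        candidate = datetime.combine(day, at)
        if candidate <= now:
            continue
        if frequency == "Daily":
            return candidate
        if frequency == "Weekly" and day.weekday() == weekday:
            return candidate
        if frequency == "Monthly" and day.day == day_of_month:
            return candidate
    raise ValueError(f"no matching date for {frequency} schedule")
```

Scanning day by day is slower than calendar arithmetic but keeps month lengths and leap years correct without special cases.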
Running Tasks
- Log in to the management console.
- Click the icon in the upper left corner and select a region or project.
- In the navigation pane on the left, open the service list and go to DSC.
- In the navigation pane, choose Data privacy protection > Database Watermark.
Figure 7 Database watermark injection
- In the Operation column of the target task, choose More > Running.
Enabling a Task
- Log in to the management console.
- Click the icon in the upper left corner and select a region or project.
- In the navigation pane on the left, open the service list and go to DSC.
- In the navigation pane, choose Data privacy protection > Database Watermark.
Figure 8 Database watermark injection
- In the Operation column of the target task, choose More > Start Task.
Stopping a Task
- Log in to the management console.
- Click the icon in the upper left corner and select a region or project.
- In the navigation pane on the left, open the service list and go to DSC.
- In the navigation pane, choose Data privacy protection > Database Watermark.
Figure 9 Database watermark injection
- In the Operation column of the target task, choose More > Stop Task.
Editing and Deleting an Embedded Watermark Task
A watermark embedding task in the Waiting or Execution state cannot be edited or deleted.
- Click Edit in the Operation column to modify the watermark embedding task configuration.
Figure 10 Edit embedded watermark task
- Click Delete in the Operation column of the target task. You can also select multiple tasks and click Batch Delete to delete them.
Figure 11 Deleting a watermark embedding task
The deletion cannot be undone.