Injecting Watermarks to Databases
Prerequisites
- Access to cloud assets has been authorized. For details, see Allowing or Disallowing Access to Cloud Assets.
- An RDS or GaussDB(DWS) database has been authorized. For details, see Authorizing Access to Database Assets.
- An MRS database has been authorized. For details, see Authorizing Access to Big Data Assets.
- You have configured the GaussDB(DWS) and MRS_Hive permissions. For details, see (Optional) Configuring GaussDB(DWS) and MRS Hive.
Constraints
- For GaussDB(DWS), watermarks can be injected into columns of the following data types: smallint, integer, bigint, float4, float8, varchar, text, and char.
- For MRS Hive, watermarks can be injected into columns of the following data types: smallint, int, long, float, double, and string.
- A single column in the embedding target cannot have more than 30% redundant data.
- The database encoding must be UTF-8.
- Watermarks can be injected only into non-primary key columns.
- It is recommended that the source table contain more than 1,500 rows.
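The constraints above can be checked before a task is created. The following is a minimal sketch, not part of the service: the function names are hypothetical, and "redundant data" is assumed here to mean rows that repeat a value already present in the column, which this page does not define.

```python
# Data types listed in the constraints above, keyed by data source type.
SUPPORTED_TYPES = {
    "DWS": {"smallint", "integer", "bigint", "float4", "float8",
            "varchar", "text", "char"},
    "MRS_HIVE": {"smallint", "int", "long", "float", "double", "string"},
}
MAX_REDUNDANCY = 0.30        # at most 30% redundant data per column
RECOMMENDED_MIN_ROWS = 1500  # more than 1,500 rows is recommended


def redundancy_ratio(values):
    """Assumed definition: share of rows that repeat an earlier value."""
    if not values:
        return 0.0
    return (len(values) - len(set(values))) / len(values)


def check_column(source_type, column_type, values, is_primary_key=False):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if column_type.lower() not in SUPPORTED_TYPES.get(source_type, set()):
        problems.append(f"unsupported data type: {column_type}")
    if is_primary_key:
        problems.append("primary key columns cannot carry watermarks")
    if redundancy_ratio(values) > MAX_REDUNDANCY:
        problems.append("more than 30% redundant data")
    if len(values) <= RECOMMENDED_MIN_ROWS:
        problems.append("fewer rows than recommended (more than 1500)")
    return problems
```

For example, `check_column("DWS", "integer", column_values)` returns an empty list when the column passes every check. The row-count check reflects a recommendation, not a hard limit.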
Creating a Watermark Embedding Task
- Log in to the management console.
- Click in the upper left corner and select a region or project.
- In the navigation tree on the left, click . Choose .
- In the navigation pane, choose Data Asset Protection > Database Watermark.
- Click Create Task. The Configure basic information page is displayed.
Figure 1 Configuring basic information
Table 1 Parameters for configuring basic information
Task Name: Enter a task name.
Watermark ID: Enter the watermark identifier to be injected.
Embedding Scheme: Select a watermark embedding scheme from the drop-down list box. The options are as follows:
- Lossless - pseudo-column watermark: A pseudo-column related to other attributes of the relational table is generated, so that it looks genuine to attackers. Watermarks are embedded into the pseudo-column to avoid damaging the original data.
- Lossless - pseudo-line watermark: Pseudo lines are generated based on the data type, data format, and value range. Watermarks are embedded into these pseudo lines to avoid damaging the original data.
- Lossy - column watermark: Watermarks are embedded directly into column data, which modifies the original data.
NOTE: A higher error correction level means more watermark bits and a lower bit error rate (BER) during source tracing. However, a higher error correction level also requires more data in the embedding target to ensure that the watermark is embedded completely. The default value is 1.
- Click Next. On the Configure source and target information page, set related parameters.
- Lossless - pseudo-column watermark: Watermarks are embedded into newly created columns to avoid data loss.
Figure 2 Pseudo-column watermarks
Table 2 Source and destination parameters of pseudo-column watermarks
Data Source Type: Select a data source type from the drop-down list box.
- When Embedding Scheme is set to Lossy - column watermark, the following data source types are supported: DWS and MRS_HIVE.
- When Embedding Scheme is set to Lossless - pseudo-column watermark or Lossless - pseudo-line watermark, the following data source types are supported: DWS, PostgreSQL, and MySQL.
Database Instance: Select a database instance from the drop-down list. If no database instance is available, add one by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets.
Database: Select a database from the drop-down list.
Schema: This parameter is displayed when Data Source Type is DWS or PostgreSQL. Select a schema as required.
Source Table: Select the name of the source table.
Column Name: Enter the name of the pseudo-column. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters.
Column Data Type: Select the data type of the embedded pseudo-column:
- Numeric
- String
- Date
Example Value: After you configure Setting Field Rules, an example of the embedded pseudo-column data is displayed.
Setting Field Rules:
- If Column Data Type is set to Numeric, the pseudo data is a random number. You can specify the range and precision of the random number. If the range and precision are not specified, pseudo data is randomly generated.
- If Column Data Type is set to String, you can select the kind of pseudo data, such as a person name, ID card number, or mobile number, from the drop-down list box.
- If Column Data Type is set to Date, you can specify a date range. If no date range is specified, pseudo data is randomly generated.
Add a Pseudo Column: You can click Add a Pseudo Column to add two pseudo-columns.
Target Table: Enter the target table name. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters.
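To illustrate what the Numeric and Date field rules describe, the sketch below generates pseudo values from a range and precision, or from a date range. It is only an illustration: the function names and default ranges are hypothetical, and the way the service actually generates pseudo data and embeds the watermark into it is not described on this page.

```python
import random
from datetime import date, timedelta


def pseudo_numeric(rng, low=0.0, high=1_000_000.0, precision=0):
    """Random number in [low, high], rounded to `precision` decimal places.

    The default range is an assumption made for this example only.
    """
    value = round(rng.uniform(low, high), precision)
    return int(value) if precision == 0 else value


def pseudo_date(rng, start=date(2000, 1, 1), end=date(2030, 12, 31)):
    """Random date between start and end, both inclusive."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))


rng = random.Random(42)  # fixed seed so repeated runs give the same values
amounts = [pseudo_numeric(rng, 10, 500, precision=2) for _ in range(3)]
dates = [pseudo_date(rng, date(2024, 1, 1), date(2024, 12, 31)) for _ in range(3)]
```

Narrowing the range so that it matches the real value range of the source table makes pseudo-column values harder to tell apart from genuine data.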
- Lossless - pseudo-line watermark: Watermarks are embedded into line copies to avoid data loss.
Figure 3 Pseudo-line watermark
Table 3 Source and destination parameters of pseudo-line watermarks
Data Source Type: Select a data source type from the drop-down list. The following data source types are supported: DWS, PostgreSQL, and MySQL. Example value: DWS
Database Instance: Select a database instance from the drop-down list. If no database instance is available, add one by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets. Example value: DWS-dsc-Test
Database: Select a database from the drop-down list. Example value: gaussdb
Mode: This parameter is displayed when Data Source Type is DWS or PostgreSQL. Select a schema (mode) as required. Example value: pg_catalog
Source Table: Select the name of the source table. Example value: pg_proc
Number of Pseudo-Line Spans: Enter the number of pseudo lines. The value must be a valid integer greater than 1. Example value: 10
Target Table: Enter the name of the table that stores the data with watermarks embedded. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. Example value: Test_Table
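The naming rule for target tables and the rule for the number of pseudo lines can be checked before the form is submitted. The following is a minimal sketch with hypothetical function names. It assumes that "letters" means ASCII letters, which this page does not state.

```python
import re

# Letters, digits, underscores, and hyphens; 1 to 255 characters.
# Assumption: only ASCII letters are accepted.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,255}")


def is_valid_name(name):
    """True if the name satisfies the target table naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def is_valid_pseudo_line_count(text):
    """True if the input is an integer greater than 1."""
    try:
        return int(text) > 1
    except (TypeError, ValueError):
        return False
```

With these helpers, `is_valid_name("Test_Table")` and `is_valid_pseudo_line_count("10")` both return `True`, while a name that contains a space or a count of 1 is rejected.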
- Lossy - column watermark: Watermarks are embedded directly into the column data.
Figure 4 Lossy column watermarks
Table 4 Source and destination parameters of lossy column watermarks
Data Source Type: Select a data source type from the drop-down list. The following data source types are supported: DWS and MRS_HIVE. Example value: DWS
Database Instance: Select a database instance from the drop-down list. If no database instance is available, add one by following the instructions in Authorizing Access to Database Assets and Authorizing Access to Big Data Assets. Example value: DWS-dsc-Test
Database: Select a database from the drop-down list. Example value: gaussdb
Mode: This parameter is displayed when Data Source Type is DWS. Select a schema (mode) as required. Example value: pg_catalog
Source Table: Select the name of the source table. Example value: pg_proc
Watermark Embedding Bar: Select the columns into which watermarks are to be embedded. You can select multiple columns.
NOTE:
- The source database character set must be UTF-8.
- A single column in the embedding target cannot have more than 30% redundant data.
Target Table: Enter the name of the table that stores the data with watermarks embedded. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 255 characters. Example value: Test_Table
- Click Next. The Configure scheduling page is displayed.
Figure 5 Configuring scheduling
- If Scheduling Parameter is set to Once, select Now or As scheduled to determine when the watermark embedding task starts.
- If Scheduling Parameter is set to Daily, Weekly, or Monthly, the watermark embedding task starts at the specified time every day, week, or month.
- Click Finish.
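The scheduling options can be read as a rule for computing the next start time of a task. The sketch below is an illustration only: the function names are hypothetical, and the month-end handling (a task first run on the 31st runs on the last day of shorter months) is an assumption, not documented service behavior.

```python
import calendar
from datetime import datetime, timedelta


def shift_months(start, months):
    """Move `start` forward by whole months, clamping the day to the month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return start.replace(year=year, month=month, day=day)


def next_run(first_run, now, frequency):
    """Return the next start time after `now`, or None if a one-off run has passed."""
    if frequency not in ("Once", "Daily", "Weekly", "Monthly"):
        raise ValueError(f"unknown frequency: {frequency}")
    if frequency == "Once":
        return first_run if first_run > now else None
    step = 0
    candidate = first_run
    while candidate <= now:
        step += 1
        if frequency == "Daily":
            candidate = first_run + timedelta(days=step)
        elif frequency == "Weekly":
            candidate = first_run + timedelta(weeks=step)
        else:  # Monthly
            candidate = shift_months(first_run, step)
    return candidate
```

Each candidate is computed from the first run time rather than from the previous candidate, so a monthly task that starts on the 31st returns to the 31st in months that have one.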
Running Tasks
- Log in to the management console.
- Click in the upper left corner and select a region or project.
- In the navigation tree on the left, click . Choose .
- In the navigation pane, choose Data Asset Protection > Database Watermark.
Figure 6 Database watermark injection
- In the Operation column of the target task, choose More > Running.
Starting a Task
This operation is available only when the watermarking task is a scheduled task.
- Log in to the management console.
- Click in the upper left corner and select a region or project.
- In the navigation tree on the left, click . Choose .
- In the navigation pane, choose Data Asset Protection > Database Watermark.
Figure 7 Database watermark injection
- In the Operation column of the target task, choose More > Start Task.
Stopping a Task
This operation is available only when the watermarking task is a scheduled task.
- Log in to the management console.
- Click in the upper left corner and select a region or project.
- In the navigation tree on the left, click . Choose .
- In the navigation pane, choose Data Asset Protection > Database Watermark.
Figure 8 Database watermark injection
- In the Operation column of the target task, choose More > Stop Task.
Editing or Deleting a Watermark Embedding Task
A running watermark embedding task cannot be edited or deleted.
- Click Edit in the Operation column to modify the watermark embedding task configuration.
Figure 9 Editing a watermark embedding task
- Click Delete in the Operation column of the target task. You can also select multiple tasks and click Batch Delete to delete them.
Figure 10 Deleting a watermark embedding task
The deletion cannot be undone.