Managing Masking Algorithms
A masking policy cannot be created without a masking algorithm. The system provides more than 20 built-in masking algorithms. To use a built-in algorithm, configure its parameters where the algorithm allows it. If the built-in algorithms cannot meet your needs, you can create custom algorithms.
This section describes built-in masking algorithms and how to create masking algorithms.
Notes and Constraints
- When creating a random or character replacement masking algorithm, if you select Sample library for Random Mode or Replacement Mode, the sample file for testing the algorithm cannot be larger than 10 KB. This restriction applies only to the algorithm test and does not apply to real static masking tasks.
- When you create a masking algorithm of the hash type, two SM3 cryptographic hash algorithms are available. dws-SM3 is dedicated to the DWS engine, returns a hexadecimal string of lowercase letters, and requires DWS cluster version 8.1.3 or later. general-SM3 is a general algorithm for the DLI or MRS engine and returns a hexadecimal string of uppercase letters.
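The letter-case difference between the two SM3 variants matters when masked values produced by different engines are compared as strings. The following is a minimal sketch only. It assumes that both variants yield the same digest for the same input and differ only in letter case, which this section does not state. SHA-256 from the Python standard library is used purely as a stand-in to produce a hexadecimal string, and `same_digest` is a hypothetical helper name.

```python
import hashlib


def same_digest(hex_a, hex_b):
    """Compare two hexadecimal digests without regard to letter case."""
    return hex_a.lower() == hex_b.lower()


# Stand-in digest (SHA-256, not SM3), used only to obtain a hexadecimal string.
digest = hashlib.sha256(b"example").hexdigest()

lowercase_style = digest          # lowercase form, as described for dws-SM3
uppercase_style = digest.upper()  # uppercase form, as described for general-SM3

# A plain string comparison fails; a case-insensitive comparison succeeds.
print(lowercase_style == uppercase_style)
print(same_digest(lowercase_style, uppercase_style))
```

Normalizing the case on one side before joining or comparing masked columns avoids false mismatches under this assumption.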
Built-in Masking Algorithms
DataArts Security provides the following built-in masking algorithms. Before selecting an algorithm, you can use the algorithm configuration and testing functions to check whether the algorithm suits your needs.
Type | Name | Description | Configurable
---|---|---|---
Hash | HMAC-SHA256 hash | Uses the HMAC-SHA256 algorithm for hash processing. | A salt value and a key can be configured.
Hash | SHA-256 | Uses the SHA-256 algorithm for hash processing. | A salt value can be configured. NOTE: You must set the salt value yourself instead of using a secure random number provided by the system. Be aware of the resulting risk.
Cut | Value truncation | Keeps the digits from the x-th digit before the decimal point upward, and replaces the x-1 digits immediately before the decimal point and all digits after the decimal point with 0. For example, if x is 3, 1234 is truncated to 1200, 999.999 is truncated to 900, and 10.7 is truncated to 0. | The number of digits before the decimal point can be configured.
Cut | Date truncation | Truncates a specified date. | The date format and masking range can be configured.
Mask | Masking of specified GaussDB(DWS) columns | Masks specified columns in GaussDB(DWS). This algorithm can be used only when both the source and destination of a static masking task are GaussDB(DWS) and the execution engine is GaussDB(DWS). | Not supported
Mask | Masking with specified characters for GaussDB(DWS) | Replaces the characters from the start position to the end position with specified characters. This algorithm can be used only when both the source and destination of a static masking task are GaussDB(DWS) and the execution engine is GaussDB(DWS). | The start position, end position, and mask flag can be configured.
Mask | Masking with specified digits for GaussDB(DWS) | Replaces the characters from the start position to the end position with specified digits. This algorithm can be used only when both the source and destination of a static masking task are GaussDB(DWS) and the execution engine is GaussDB(DWS). | The start position, end position, and mask flag can be configured.
Mask | ID masking | Masks an ID card number. | Not supported
Mask | Bank card No. masking | Masks a bank card number. | Not supported
Mask | Email masking | Masks email information. | Not supported
Mask | Mobile equipment identity masking | Masks a device code, such as an IMEI, MEID, or ESN. | The type can be configured.
Mask | IPv6 masking | Masks an IPv6 address. | Not supported
Mask | IPv4 masking | Masks an IPv4 address. | Not supported
Mask | MAC address masking | Masks a MAC address. | Not supported
Mask | Phone No. masking | Masks a phone number. | Not supported
Mask | Date type masking | Masks a date in a specified format, such as ISO, EUR, or USA. | The date format and masking range can be configured.
Mask | Masking X to Y | Masks the characters from X to Y of a string. | X and Y can be configured.
Mask | Retaining X to Y | Retains the characters from X to Y of a string. | X and Y can be configured.
Mask | Masking first n and last m characters | Masks the first n and last m characters of a string. | n and m can be configured.
Mask | Retaining first n and last m characters | Retains the first n and last m characters of a string. | n and m can be configured.
Encryption | GaussDB(DWS) column encryption | Invokes the symmetric cryptographic algorithm gs_encrypt_aes128(encryptstr,keystr) provided by GaussDB(DWS) to encrypt DWS data columns. The algorithm uses keystr as the key to encrypt the string encryptstr and returns the encrypted string. Note the following: | The key can be configured. The key length ranges from 1 byte to 16 bytes. NOTE: Before using the algorithm, you must configure a key.
Encryption | Hive column encryption | Invokes the Hive column encryption function provided by MRS to encrypt and decrypt Hive data columns. Cryptographic algorithms AES and SMS4 are supported. Note the following: | The encryption type can be configured.
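Several rules in the built-in algorithm table reduce to simple arithmetic or string operations. The following Python sketch illustrates value truncation, masking X to Y, and retaining the first n and last m characters. It is not the product's implementation: the function names are hypothetical, positions are assumed to be 1-based and inclusive, and `*` is assumed as the mask character.

```python
from decimal import Decimal


def truncate_value(value, x):
    """Zero the x-1 digits before the decimal point and all fractional digits."""
    if x < 1:
        raise ValueError("x must be at least 1")
    step = 10 ** (x - 1)
    # Decimal floor division truncates toward zero, so digits are zeroed, not rounded.
    return int(Decimal(str(value)) // step * step)


def mask_range(text, x, y, mask_char="*"):
    """Mask characters X to Y (assumed 1-based and inclusive)."""
    if not 1 <= x <= y:
        raise ValueError("expected 1 <= x <= y")
    masked = mask_char * len(text[x - 1:y])
    return text[:x - 1] + masked + text[y:]


def keep_first_last(text, n, m, mask_char="*"):
    """Retain the first n and last m characters and mask everything between."""
    if n < 0 or m < 0:
        raise ValueError("n and m must not be negative")
    if n + m >= len(text):
        return text  # nothing is left to mask
    return text[:n] + mask_char * (len(text) - n - m) + text[len(text) - m:]


# The value truncation examples from the table, with x = 3.
print(truncate_value("1234", 3))     # 1200
print(truncate_value("999.999", 3))  # 900
print(truncate_value("10.7", 3))     # 0

# Illustrative string input (not taken from the product documentation).
print(mask_range("13812345678", 4, 7))       # 138****5678
print(keep_first_last("13812345678", 3, 4))  # 138****5678
```

Use the algorithm test function on the console to confirm the actual behavior, because position indexing and the mask character may differ from the assumptions made here.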
Creating a Masking Algorithm
If the built-in algorithms do not meet your needs, you can create custom masking algorithms of the following types: mask, truncation, hash, encryption, nulling, random masking, character replacement, key-value masking, value range conversion, and fuzzy masking.
- On the DataArts Studio console, locate a workspace and click DataArts Security.
- In the left navigation pane, choose Masking Algorithms.
- Click Create.
Figure 1 Creating a masking algorithm
- Set the parameters listed in Table 2 and click OK.
Figure 2 Configuring algorithm parameters
The following table lists the masking algorithm parameters.
Table 2 Parameters for the masking algorithm
Parameter
Description
*Algorithm
Algorithm name. It can contain a maximum of 64 characters.
Description
Brief description of the algorithm. It can contain a maximum of 255 characters.
*Masking Algorithm
The following options are available:
- Mask: This algorithm supports characters, numeric values, and date values. It replaces data at specified positions with fixed values.
- Truncate: This algorithm supports date and numeric values. It truncates a date to the month, day, hour, minute, or second, and rounds numeric values.
- Hash: This algorithm supports all types of data. The selected algorithm is used to calculate the hash value.
Compared with the built-in algorithms, two extra hash algorithms are available: dws-SM3 and general-SM3. dws-SM3 is dedicated to the DWS engine, returns a hexadecimal string of lowercase letters, and requires DWS cluster version 8.1.3 or later. general-SM3 is a general algorithm for the DLI or MRS engine and returns a hexadecimal string of uppercase letters.
- ENCRYPT: This algorithm supports all types of data. The selected encryption algorithm is used to encrypt data from a specified source.
- SET_NULL: This algorithm supports all types of data. It sets the value to null.
- RANDOM: Replaces date or numeric values with values within a specified range or in a sample library. For how to create a sample library, see Managing Sample Libraries. If you select Sample library for Random Mode, the OBS sample file can only be used for static DLI data masking tasks and the HDFS sample file can only be used for static MRS data masking tasks. For details about the mapping between static masking scenarios and engines, see Reference: Static Data Masking Scenarios.
If you enable Keep Association with Source Data, the same source value produces the same masked result in different databases when the same rule is applied. Enabling this parameter increases the risk that the masked data can be cracked. If you must enable it, you are advised to configure a random salt value to defend against dictionary attacks.
- CHARACTER_REPLACEMENT: This algorithm replaces numeric values and characters at specified positions with fixed values or values in sample files in the sample library. Random digits or lowercase letters can be used to replace the characters at custom positions. If you select Replace the last digit of an ID card number, Bits can only be 1, and there must be 17 or more bits to be masked before the selected bit.
For how to create a sample library, see Managing Sample Libraries. If you select Sample library for Replacement Mode, the OBS sample file can only be used for static DLI data masking tasks and the HDFS sample file can only be used for static MRS data masking tasks. For details about the mapping between static masking scenarios and engines, see Reference: Static Data Masking Scenarios.
If you enable Keep Association with Source Data, the same source value produces the same masked result in different databases when the same rule is applied. Enabling this parameter increases the risk that the masked data can be cracked. If you must enable it, you are advised to configure a random salt value to defend against dictionary attacks.
- KEY_VALUE: This algorithm replaces numeric keys and values with values calculated using custom expressions. Expressions support the following operations on the source data: addition (+), subtraction (-), multiplication (*), division (/), modulo (%), and parentheses (()). For example, the expression ((X*4+3)%100)/2-1 replaces 3 with 6.5.
- INTERVAL_TRANSFORMATION: This algorithm converts digits in a specified range into specified values.
- FUZZY: This algorithm replaces a numeric value with a random value within a fuzzy percentage or absolute value range. For example, in percentage blurring mode, if the percentage ranges from –10% to 20%, value 10 will be replaced with a random value from 9 to 12.
If you enable Keep Association with Source Data, the same source value produces the same masked result in different databases when the same rule is applied. Enabling this parameter increases the risk that the masked data can be cracked. If you must enable it, you are advised to configure a random salt value to defend against dictionary attacks.
Test
Enter the data to be tested and click Test. You can view the masking result in the Test Result area.
NOTE: When creating a random or character replacement masking algorithm, if you select Sample library for Random Mode or Replacement Mode, the sample file for testing the algorithm cannot be larger than 10 KB.
Test Result
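The KEY_VALUE and FUZZY examples in Table 2 can be checked with a short Python sketch. It is illustrative only and does not show how the service evaluates expressions. The names `evaluate_key_value` and `fuzzy_percent` are hypothetical, and Python's semantics for `/` and `%` (true division, and a modulo result that takes the sign of the divisor) are assumed, which may differ from the execution engine for negative or integer inputs.

```python
import ast
import operator
import random

# Operators listed for KEY_VALUE expressions: + - * / %
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}


def evaluate_key_value(expression, x):
    """Evaluate an arithmetic expression in which X stands for the source value."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "X":
            return x
        raise ValueError("unsupported element in expression")

    return walk(ast.parse(expression, mode="eval").body)


def fuzzy_percent(value, low_pct, high_pct, rng=random):
    """Return a random value between value*(1+low_pct%) and value*(1+high_pct%)."""
    low = value * (1 + low_pct / 100)
    high = value * (1 + high_pct / 100)
    return rng.uniform(low, high)


# KEY_VALUE example from the table: 3 is replaced with 6.5.
print(evaluate_key_value("((X*4+3)%100)/2-1", 3))  # 6.5

# FUZZY example from the table: 10 with a range of -10% to 20% falls between 9 and 12.
print(fuzzy_percent(10, -10, 20))
```

Walking the parsed syntax tree instead of calling `eval` restricts the expression to the listed operators, which is one reasonable way to keep a user-supplied expression from running arbitrary code.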
Related Operations
- Editing an algorithm: On the Masking Algorithms page, locate an algorithm and click Edit in the Operation column.
The parameters that can be edited vary depending on the algorithm type.
- Testing an algorithm: On the Masking Algorithms page, locate an algorithm and click Test in the Operation column.
Before using an algorithm, you are advised to test it to ensure that it meets your needs.
Whether the test function is available varies depending on the algorithm type.
- Deleting algorithms: On the Masking Algorithms page, locate an algorithm and click Delete in the Operation column. To delete multiple algorithms, select them and click Delete above the list.
Built-in algorithms cannot be deleted. Custom algorithms that are used by masking policies or specified column masking cannot be deleted. To delete such algorithms, cancel the reference first.
The deletion operation cannot be undone. Exercise caution when performing this operation.