Creating Rule Templates
DataArts Quality can monitor offline data, in which quality rules play a vital role. There are 26 built-in rule templates, such as database-level, table-level, field-level, and cross-field rule templates.
Rule Type |
Dimension |
Template |
Description |
---|---|---|---|
Database-level |
Integrity |
Database null value scan |
Calculates the number of rows in all the tables of a database in which all fields have a null value. |
Table-level |
Accuracy |
Table rows |
Calculates the number of rows in a data table. |
Integrity |
Data table null value scan |
Calculates the number of rows in a table in which all fields have a null value. |
|
Field-level |
Uniqueness |
Field with a unique value |
Calculates the number of rows in a data table in which a specified field has a unique value. |
Field with duplicate values |
Calculates the number of rows in a data table in which a specified field has duplicate values. |
||
Unique combination of multiple fields |
Checks whether the combination of multiple fields in a DWS table is unique. A maximum of 10 fields can be combined. |
||
Integrity |
Field with a null value |
Calculates the number of rows in a data table in which a specified field has a null value. |
|
Accuracy |
Average field value |
Calculates the average value of a specified field in a data table. |
|
Total field values |
Calculates the total values of a specified field in a data table. |
||
Maximum field value |
Calculates the maximum value of a specified field in a data table. |
||
Minimum field value |
Calculates the minimum value of a specified field in a data table. |
||
Field length verification |
Checks whether the length of a field in the DWS table is within the allowed range. |
||
Field value range verification |
Checks whether the value of a field in the DWS table is within the allowed range. |
||
Field time verification |
Checks whether the time of a field in the DWS table is within the allowed range. Currently, only fields of the date and timestamp types are supported. Fields of the time type is not supported. |
||
Effectiveness |
ID card verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
|
Mailbox verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
||
Regular expression verification |
Verifies the validity of a specified field in a data table based on a custom regular expression. |
||
IP address verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
||
Phone number format verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
||
Postal code format verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
||
Date format verification |
Verifies the validity of a specified field in a data table based on built-in regular expression rules. |
||
Validity verification |
Verifies the validity of a specified field in a data table based on a custom regular expression. |
||
Enumerated value verification |
Verifies the validity of a specified field in a data table based on a custom enumerated value. |
||
Ignoring of null values in enumerated value verification |
Verifies the validity of a specified field in a data table based on a custom enumerated value. Null values are counted in valid rows. |
||
Cross-field level |
Consistency |
Field consistency verification |
Checks consistency between different fields from the same data source. |
Accuracy |
Cross-field time verification |
Checks whether the time relationship between a specified field in the data table and the reference field meets the expectation. Currently, only fields of the date and timestamp types are supported. Fields of the time type is not supported. |
- Custom template: Choose and click Create. The created rule template will be automatically classified into the corresponding rule type and displayed as a custom template. A quality job using a custom template does not support exceptional data output and quality scoring.
- Custom rule: When creating a quality job, set Rule Type to Custom rule and enter an SQL statement to define how to monitor the quality of data objects.
This section describes how to create a rule using a custom template. For details about how to create a custom rule, see Creating Quality Jobs.
- Choose Create on the page displayed.
, and click Figure 1 Rule Template page
- In the dialog box displayed, enter the rule template name, select the rule matching dimension, define the SQL template, and describe the output result.
- Dimension: You can complete single-column, cross-column, cross-row, and cross-table analysis from six dimensions: completeness, validity, timeliness, consistency, accuracy, and uniqueness. When customizing a quality rule, select a dimension for rule matching.
- Directory: Select the directory where the rule is located.
- Define Relationship: Enter SQL statements to search for data.
For example, to count the number of rows in a table, enter select count(${Column1}) from ${Schema_Table1}. The value of ${Column1} is generated by clicking Add Field Parameter, and the value of ${Schema_Table1} is generated by clicking Add Database/Table Parameter.
- Output Description describes each column in the SQL result. Column descriptions are separated by commas (,).
For example, if the relationship is set to select max (${Column1}), min(${Column2}) from ${Schema_Table1}, the output result is Maximum value,Minimum value. The result description must correspond to the output result sequence defined by the relationship.Figure 2 Configuring a rule template
- Abnormal Table Template: You need to enter a complete SQL statement to specify the abnormal data to be exported.
- After you click Yes, the system publishes the rule template by default. The default version is V1.0.
Managing a Rule Template
A published version of a custom rule template cannot be directly modified. If you want to modify a rule template, you can publish a new version. In addition, you can suspend the historical version and migrate jobs associated with the historical version to the new version.
- On the DataArts Quality homepage, select Rule Template from the left navigation bar. Locate the target rule template in the displayed list and click Publish in the Operation column.
Figure 3 Publishing a rule template
- Dimensions and output description can be modified, and relationships can be redefined.
- Click Publish. In the displayed dialog box, set the version.
Figure 4 Publishing a new version
- After the rule template is submitted for publishing, you can click View Publish History in the Operation column. You can view the publish history, change the version, and suspend the version.
Figure 5 Publish History page
- To suspend a historical version, click Suspend on the right of the historical version.
- If the version is not associated with any job, click OK to suspend it.
- If the version has associated jobs, select a new version, associate the jobs with the new version, and click OK.
Figure 6 Migrating and suspending a version
- On the Version Comparison tab page, you can compare the versions to see their differences.
Figure 7 Version comparison
Exporting Rule Templates
To export custom rule templates, perform the following steps (you can export a maximum of 200 rule templates at a time):
- In the left navigation pane, choose Quality Monitoring > Rule Templates, and select the templates to export in the right pane.
- Click Export. The Export Rule Template dialog box is displayed.
- Click Export to switch to the Export Records tab.
- In the list of exported files, locate an exported template and click Download in the Operation column to download the Excel file of the rule template to the local PC.
Importing Rule Templates
You can import a file containing a maximum of 4 MB data.
- In the left navigation pane, choose Quality Monitoring > Rule Templates. In the right pane, click Import.
- On the Import Configurations tab page, set Duplicate Name Policy.
- Terminate: If template names repeat, all templates will fail to be imported.
- Skip: If template names repeat, the templates will still be imported.
- Click Upload and select the prepared data file.
Edit the data file using either of the following methods:
- (Recommended) Click Export to export data and import the data to the system directly or import it after modification.
- Click Download Template, fill in the template with the data to import, and import the data to the system.
- Configure the mapped resource for the catalog and select the directory where rule template has been imported.
- Click Import to import the Excel template to the system.
- Click the Import Records tab to view the import records.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.