Updated on 2024-10-23 GMT+08:00

Creating Data Standards

Data standards describe the data meanings and business rules that enterprises have agreed on and must comply with.

A data standard, also called a data element, is the smallest unit of data used. It cannot be further divided. A data standard is a data unit whose definition, identifiers, representations, and allowed values are specified by a group of properties. You can associate data standards with databases of a wide range of businesses. The identifier, data type, expression format, and value range are the basis of data exchange. They are used to describe field metadata of a table and standardize data information stored in a field.

This topic describes how to create a data standard. A created data standard can be associated with fields in a business table created during ER modeling, ensuring that fields in the business table comply with the specified data standards.

Constraints

A maximum of 500 data standard directories and 20,000 standards can be created in a workspace.

Creating a Data Standard Directory

  1. On the DataArts Studio console, locate a workspace and click DataArts Architecture.
  2. On the DataArts Architecture page, choose Standards > Data Standards in the left navigation pane.
  3. When you access the Data Standards page for the first time, the page for customizing a standard template is displayed. Select the required options for Optional, add custom items, and click Update.

    After saving the template settings, you can modify them on the Standard Templates tab page of Configuration Center. For details, see Standard Templates. When creating a data standard, you must set all the options selected in the template.

  4. On the Data Standards page, select a directory and click the directory creation icon to create a directory under it. The first time you create a directory, create it under the root directory.
    Figure 1 Data Standards page
  5. In the dialog box displayed, set the parameters and click OK.
    Figure 2 Create Directory dialog box
    Table 1 Parameters for creating directories

    Parameter

    Description

    *Name

    The following characters are not allowed: / \ < > .

    *Select Directory

    Select an existing directory, and create a subdirectory under it.

    Click the refresh icon to refresh the directories.

    Click the synchronization icon to refresh the directories and synchronize subject directories to data standard directories.

    • Before synchronizing subject directories, check whether there are released subjects in the current workspace. If there are no released subjects, an error will occur during the synchronization.
    • A maximum of five levels of subject directories can be synchronized to data standard directories. Subject directories beyond the fifth level are not synchronized. The number of directories after the synchronization cannot exceed the upper limit (generally 500). Otherwise, an error occurs and the synchronization is canceled. Before a synchronization, the system checks for and deletes empty data standard directories, that is, directories that contain no data standard in themselves or in any of their subdirectories.
    • The synchronized subject directories are displayed as L1 to L5 icons, and the existing data standard directories are displayed as their original icons.
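
The synchronization limits above can be expressed as a small pre-check. The following Python sketch is illustrative only. The function name plan_sync, the use of '/'-separated paths to represent directory levels, and the assumption that every synchronized directory adds to the existing directory count are all hypothetical and do not describe how DataArts Studio implements synchronization.

```python
MAX_SYNC_DEPTH = 5      # at most five levels of subject directories are synchronized
MAX_DIRECTORIES = 500   # directory limit per workspace (see Constraints)


def plan_sync(subject_paths, existing_directory_count):
    """Return the subject directory paths that would be synchronized.

    subject_paths: iterable of '/'-separated paths such as 'Sales/Orders'
        (hypothetical representation; '/' is not allowed in directory names).
    existing_directory_count: number of data standard directories that remain
        after empty directories have been deleted.
    """
    # Directories deeper than the fifth level are skipped, not rejected.
    selected = [p for p in subject_paths if len(p.split("/")) <= MAX_SYNC_DEPTH]

    # Exceeding the limit cancels the whole synchronization.
    if existing_directory_count + len(selected) > MAX_DIRECTORIES:
        raise ValueError("synchronization would exceed the directory limit")
    return selected
```

For example, plan_sync(["A", "A/B/C/D/E/F"], 10) keeps only "A", because the second path is six levels deep.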

Creating a Data Standard

  1. On the Data Standards page, select a directory and click Create.
  2. Set the parameters based on Table 2.

    On the page for creating a data standard, only the selected parameters and custom parameters that have been added on the Standard Templates tab page of the Configuration Center are displayed. Table 2 lists all parameters that are available in a data standard template. For details on how to configure a data standard template, see Standard Templates.

    Table 2 Parameters for creating a data standard

    Parameter

    Description

    *Standard Name

    Newline characters and the following characters are not allowed: \ < > % " ' ;

    If Data Standard Allows Duplicate Names is disabled, ensure that the standard name is unique in the current workspace. To check whether Data Standard Allows Duplicate Names is enabled, go to DataArts Architecture > Configuration Center > Functions.

    *Standard Code

    The value can be Auto Generate or Custom.

    The value must be unique in the current workspace. It is used to identify a data standard record. For details, see Table 2.

    *Data Type

    The possible values are STRING, BIGINT, DOUBLE, TIMESTAMP, DATE, BOOLEAN, and DECIMAL.

    The data type varies according to the system. The system converts the data type internally. If the required data type does not exist, you can add one. See Field Types.

    Name (EN)

    English name of the data standard

    It must start with a letter. Only letters, digits, brackets, spaces, and underscores (_) are allowed.

    Data Length

    Data length
    • You can leave this parameter blank. If it is left blank, there is no limit to the data length.
    • Select and enter a number ranging from 1 to 10000.
    • Select and set a range from 1 to 10000.

    If you set this parameter and select STRING for Data Type, a data quality job will be created for the attribute matching the data standard. If you select any other data type, no data quality job will be created.

    Allowed Value Exist

    If Allowed Value Exist is enabled, you can specify one or more allowed values.

    Allowed Value

    This parameter is available only when Allowed Value Exist is enabled. You can type a value and press Enter to add it. You can add up to 20 allowed values.

    Lookup Table

    • Select a created lookup table and the corresponding table fields to associate the lookup table fields with the data standard. If no lookup table has been created, create one. See Creating a Lookup Table. If Create Data Quality Jobs is selected for Model Design Process on the Function Settings tab page of Configuration Center, and the data standard that references the lookup tables is associated with business tables in ER modeling, the system automatically creates quality jobs in DataArts Quality when the business tables are published, and generates quality rules based on the associated data standard and lookup tables. If the quality jobs have already been published, the system updates them and adds the quality rules generated from the data standard and lookup tables.
    • If a public workspace is available, you must manually set the source of the referenced lookup table to Public workspace or Current workspace when selecting a lookup table in a common workspace. When Public workspace is enabled, lookup tables of the public workspace can be referenced in the common workspace.

    Quality Rule

    This parameter is available if Quality rule is selected on the Standard Templates tab page on the Configuration Center page. You can associate a system quality rule or a quality rule you have created.

    Click the icon for associating a quality rule. In the dialog box displayed, click Add Rule.

    For example, add a rule named Unique value, select the rule, and click OK. Then enter an alarm condition expression in the Alarm Condition text box. Add other rules in the same way, and click OK.

    An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered.

    The alarm parameters of each data quality rule are listed as buttons.

    Figure 3 Associate Quality Rule dialog box

    Rule Designer

    Select a rule designer from the drop-down list box. This owner is responsible for making quality rules. You can enter an owner name or select an existing owner.

    Rule Implementer

    Select a rule implementer from the drop-down list box. This owner is responsible for implementing quality rules. You can enter an owner name or select an existing owner.

    Level

    • global indicates the global level.
    • domain indicates non-global level.

    Custom Item

    A custom item added on the Standard Templates tab page in Metrics > Configuration Center. You can add one or more custom items based on project requirements. For more information about adding custom items, see Standard Templates.

    Description

    A description of the data standard to create. Up to 600 characters are supported.

  3. Click Save.
  4. Select the standard and click Publish. In the displayed dialog box, select a reviewer and click OK. After the application is approved, the Data Standards page is displayed. You can view the created data standard in the list, and the status of the data standard is Published. Only published data standards can be used.

    If you have been added as a reviewer, you can select Auto-review and click OK. After the application is approved, the status changes to Published.

    If you select multiple reviewers, the data standard is published only after all reviewers have approved the publishing request. If any reviewer rejects the request, the data standard is not published.
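
Several of the rules in Table 2 can be checked before you submit a standard. The following Python sketch shows one way to do so. It is not part of DataArts Studio: the function name validate_standard and its parameters are hypothetical, and the sketch assumes that "brackets" in the Name (EN) rule means round brackets.

```python
import re

# Characters rejected in Standard Name, plus newline characters.
FORBIDDEN_NAME_CHARS = set('\\<>%"\';\r\n')

# Assumption: "brackets" means the round brackets ( and ).
EN_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_ ()]*")

MAX_ALLOWED_VALUES = 20
MAX_DESCRIPTION_LENGTH = 600


def validate_standard(name, name_en="", allowed_values=(), description=""):
    """Return a list of rule violations; an empty list means all checks passed."""
    errors = []
    if not name:
        errors.append("Standard Name is required")
    elif FORBIDDEN_NAME_CHARS.intersection(name):
        errors.append("Standard Name contains a forbidden character")

    # Name (EN) is optional, so it is checked only when provided.
    if name_en and not EN_NAME_PATTERN.fullmatch(name_en):
        errors.append("Name (EN) must start with a letter and contain only "
                      "letters, digits, brackets, spaces, and underscores")

    if len(allowed_values) > MAX_ALLOWED_VALUES:
        errors.append("At most 20 allowed values can be added")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        errors.append("Description supports up to 600 characters")
    return errors
```

Calling validate_standard("Customer ID", name_en="customer_id") returns an empty list, while a name such as "a;b" produces one violation. The sketch does not check uniqueness of the standard name or code, because that depends on the contents of the workspace.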

Importing a Data Standard

  1. On the DataArts Architecture page, choose Standards > Data Standards in the left navigation pane.
  2. In the directory structure of data standards, select a directory and choose More > Import.

    Figure 4 Import Data Standard dialog box

  3. In the Import Data Standard dialog box, determine whether to update existing data. Existing data is uniquely identified by its standard code. If a standard code in the import template already exists in the current workspace, the system treats the corresponding record in the template as existing data.
  4. On the Import tab page, click Data standard import template to download the template. Open the template, set the parameters in the template based on service requirements, and save the settings.

    Table 3 and Table 4 describe the parameters required for importing a data standard. Parameters whose names start with an asterisk (*) are mandatory, and other parameters are optional.

    Table 3 Parameters in the Standards sheet

    Parameter

    Description

    *Directory

    The directory that the imported data standard belongs to.

    *Standard Name

    The name of the data standard to import.

    Newline characters and the following characters are not allowed: \ < > % " ' ;

    *Standard Code

    You can select Auto Generate or Custom.

    The value must be unique in the workspace. It is used to identify a data standard record. For details, see Table 2.

    *Data Type

    The possible values are STRING, BIGINT, DOUBLE, TIMESTAMP, DATE, BOOLEAN, and DECIMAL.

    The data type varies according to the system. The system converts the data type internally. If the required data type does not exist, you can add one. See Field Types.

    Data Length

    Data length
    • You can leave this parameter blank. If it is left blank, there is no limit to the data length.
    • Enter a number ranging from 1 to 10000.
    • Set a range from 1 to 10000, for example (1,20).

    If you enter a value and select STRING for Data Type, a data quality job will be created for the attribute matching the data standard. If you select any other data type, no data quality job will be created.

    Allowed Value

    The value true indicates that there are allowed values, and the value false indicates that there are no allowed values.

    Allowed Value List

    If you select true for Allowed Value, you must enter an allowed value.

    You can add up to 20 values. Multiple values must be separated by commas (,), for example, 1,2,3.

    Lookup Table

    Set this parameter to the name of a created lookup table.

    Lookup Table Field

    If Lookup Table is not left blank, you must set Lookup Table Field. In this way, the lookup table field can be associated with the data standard.

    Owner of Business Rules

    Enter the business rule owner. You can enter the name of an owner or select an existing owner.

    Owner of Data Monitoring

    Enter the data monitoring owner. You can enter the name of an owner or select an existing owner.

    Standard Level

    • global indicates the global level.
    • domain indicates non-global level.

    Description

    A description of the data standard to import. Up to 600 characters are supported.

    (Optional) Custom Item

    If you have added one or more custom fields when customizing a data standard template, you must also fill in the corresponding fields in the import template. If no custom field is added, you do not need to fill in the fields. For details on how to customize a data standard template, see Standard Templates.

    If Quality rule is selected on the Standard Templates tab page on the Configuration Center page, the downloaded template contains the Quality Rules sheet on which you can add quality rules for the data standard.

    Table 4 Parameters in the Quality Rules sheet

    Parameter

    Description

    *Code

    The code of the data standard that a quality rule is added to.

    Rule Name

    Enter an existing rule name. In the upper left corner of the DataArts Studio console, select DataArts Quality from the drop-down list box. Then, you can view the existing rule names on the Rule Templates page.

    Alarm Config

    An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered.

    In the alarm condition expression, alarm parameters are represented by variables such as ${1}, ${2}, and ${3}. The number in a variable identifies an alarm parameter of the specified quality rule: ${1} indicates the first alarm parameter, ${2} indicates the second alarm parameter, and so on. To view the alarm parameters that a data quality rule supports, select DataArts Quality from the drop-down list box in the upper left corner of the DataArts Studio console, access the Rule Templates page, and check the Result Description column.

    Example: ${1} > 100

    Expression

    This parameter must be configured when Rule Name is set to Expression or Validity Verification.

  5. Return to the Import Data Standard dialog box, select the data standard template file configured in the previous step, and click Upload.

    If the uploaded template file fails the verification, modify the file and upload it again.

  6. In the Import Data Standard dialog box, the import result is displayed on the Last Import tab page. If the import is successful, click Close. If the import fails, you can view the failure cause, correct the template file, and upload it again.

    Figure 5 Last Import tab page
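
To make the Alarm Config semantics in Table 4 concrete, the following Python sketch substitutes ${n} variables with alarm parameter values and evaluates the result to true or false. It is an illustration under stated assumptions, not the evaluator used by DataArts Quality: the function name evaluate_alarm is hypothetical, alarm parameters are assumed to be numeric, and Python-style comparison operators with and/or are assumed for the logical operators, which the console may spell differently.

```python
import ast
import operator
import re

# Comparison operators accepted by this sketch (assumed syntax).
_COMPARE = {
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def _eval(node):
    """Evaluate a restricted expression tree: numbers, comparisons, and/or."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    if isinstance(node, ast.BoolOp):
        values = [_eval(v) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        compare = _COMPARE.get(type(node.ops[0]))
        if compare is not None:
            return compare(_eval(node.left), _eval(node.comparators[0]))
    raise ValueError("unsupported alarm expression")


def evaluate_alarm(expression, params):
    """Return True if the alarm would be triggered for the given parameter values.

    params[0] is the value of ${1}, params[1] the value of ${2}, and so on.
    """
    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(params):
            raise ValueError(f"no value for alarm parameter ${{{index}}}")
        return repr(float(params[index - 1]))

    text = re.sub(r"\$\{(\d+)\}", substitute, expression)
    return bool(_eval(ast.parse(text, mode="eval").body))
```

With the example from Table 4, evaluate_alarm("${1} > 100", [150]) returns True, so the alarm is triggered, and evaluate_alarm("${1} > 100", [50]) returns False. The expression is parsed into a restricted syntax tree instead of being passed to eval, so unsupported constructs are rejected rather than executed.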

Managing a Data Standard

On the DataArts Architecture page, choose Standards > Data Standards in the left navigation pane. On the page displayed, you can manage data standards as required.

  • The data standards created in the public workspace can be queried in a common workspace, but the data standards created in a common workspace cannot be queried in the public workspace.
  • A common workspace can edit only the data standards and directories created in that workspace. It can view the data standards and directories of the public workspace but cannot perform any operation on them.
Figure 6 List of data standards
On the Data Standards page, you can perform the following operations:
  • Search

    Above the data standard list, select a filter such as the standard name, data type, creator, or reviewer, and click the search icon to search for data standards.

    After locating the specified data standards, you can perform the following operations:

    • Edit
    • Publish
    • Suspend
  • Import

    Choose More > Import to import a data standard. Download the template, fill it in, upload it, and click Close.

  • Export
    • Export data standards from a specified directory.

      In the data standard directory structure, select a directory and choose More > Export above the data standard list to export all data standards in the directory.

    • Export specified data standards.

      In the data standard list, select the data standards you want to export and choose More > Export above the list to export the selected data standards.

  • Delete

    Select a data standard and choose More > Delete. A data standard in the publishing review, published, or suspension review state cannot be deleted. Referenced data standards cannot be deleted either.

  • Publish

    Select a data standard and click Publish. In the displayed dialog box, perform either of the following operations:

    • Select a reviewer. If no reviewer is available in the drop-down list, click the add icon to add one.
    • Select Auto-review.

      Auto-review is available only when the current account is in the reviewer list.

    Click OK. If a reviewer is selected, the data standard is published after the application is approved. If Auto-review is selected, the data standard will be published immediately.
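
The deletion and publishing rules described in this topic can be summarized as two small checks. The following Python sketch is for illustration only. The state labels, the function names can_delete and publish_result, and the 'pending' and 'rejected' return values are hypothetical and are not identifiers used by DataArts Studio.

```python
# Hypothetical labels for the states in which deletion is not allowed.
NON_DELETABLE_STATES = {"publishing_review", "published", "suspension_review"}


def can_delete(state, is_referenced):
    """A data standard can be deleted only if its state allows it and nothing references it."""
    return state not in NON_DELETABLE_STATES and not is_referenced


def publish_result(reviewer_decisions, auto_review=False):
    """Return 'published', 'rejected', or 'pending' for a publishing request.

    reviewer_decisions maps each selected reviewer to 'approved', 'rejected',
    or None if that reviewer has not decided yet.
    """
    if auto_review:
        # Auto-review publishes the data standard immediately.
        return "published"
    decisions = list(reviewer_decisions.values())
    if any(d == "rejected" for d in decisions):
        # A single rejection blocks publishing.
        return "rejected"
    if decisions and all(d == "approved" for d in decisions):
        # Every selected reviewer must approve.
        return "published"
    return "pending"
```

For example, publish_result({"alice": "approved", "bob": None}) returns "pending", because one of the two selected reviewers has not yet approved the request.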

Exporting a Data Standard

  1. On the DataArts Architecture page, choose Standards > Data Standards in the left navigation pane.
  2. In the data standard directory structure, right-click a directory name and choose Export.