Updated on 2022-09-23 GMT+08:00

Creating Summary Tables

A summary table consists of specific analysis objects (such as members) and related statistical metrics. The metrics included in a summary table all have the same level of granularity (such as members). A summary table provides users with all of the available statistics on themed data (such as a member theme market), sorted by levels of granularity.

A summary table can be manually or automatically aggregated. This topic describes how to manually create a summary table.

On the DataArts Architecture page, choose Metrics > Configuration Center in the left navigation pane, and click the Functions tab. On the page displayed, if Create data development jobs is selected for Model Design Process, the system creates a data development job with a name starting with Database name_Table code. Choose DataArts Factory > Develop Job to view the created job. By default, this job has no scheduling configuration. You need to configure scheduling for the job in the DataArts Factory module.

Prerequisites

A dimension, a dimension table, a fact table, and a derivative metric have been created, published, and reviewed.

Creating and Publishing a Summary Table

  1. On the DataArts Studio console, locate an instance and click Access. On the displayed page, locate a workspace and click DataArts Architecture.
    Figure 1 DataArts Architecture
  2. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane. On the displayed page, click the Summary Tables tab.
  3. Select a subject from the subject tree on the left and click Create.
  4. On the Create Summary Table page, perform the following operations:
    1. Set the parameters in the Basic Settings area.
      Figure 2 Basic Settings area
      Table 1 Parameters in the Basic Settings area

      Parameter

      Description

      *Subject

      Select a subject catalog (business domain group > business domain > business object) where you can place the summary table.

      *Table Name

      The name of the table to create. It must start with letters. Only letters, digits, and the following special characters are allowed: ()-_

      *Table Code

      The code of the table to create. Table codes must start with letters. Only letters, numbers, and underscores (_) are allowed.

      *Statistical Dimension

      Select a statistical dimension. The drop-down list displays only the statistical dimensions configured on the Derivative Metrics page. If no statistical dimension is available in the drop-down list box, create one by referring to Creating Dimensions.

      After a summary table is created, all dimension attributes of the specified dimensions are automatically added to the summary table as fields in the summary table. After creating a summary table, go to the summary table page and click the table name to view the field details in the table.

      *Data Connection Type

      The parameter value must be the same as that of the dimension table and fact table.

      *Data Connection Name

      It is recommended that the same data connection be used for dimension modeling.

      *Database

      The name of the database. Select a database from the drop-down list box.

      Queue

      DLI queue. This parameter is available only for DLI data connections.

      Schema

      DWS or POSTGRESQL mode. This parameter is displayed only for DWS and POSTGRESQL data connections.

      Table Type

      DWS connections support the following tables:

      • DWS_ROW: Tables are stored to disk partitions by row.
      • DWS_COLUMN: Tables are stored to disk partitions by column.

      MRS_HIVE supports only HIVE_TABLE.

      Distributed By

      This parameter is displayed only for DWS data connections. Currently, only REPLICATION and HASH are supported. You can select multiple fields.

      • REPLICATION: A full table is stored on each DN. The advantage of this option is that each DN has all the data of a table. During the join operation, data redistribution can be avoided, reducing network overhead. The disadvantage is that each DN retains the complete data of a table, resulting in data redundancy. Generally, this option is recommended for small dimension tables.
      • HASH: If you select this option, you must specify a distribution key for the user table. When a record is inserted, the system performs hash computing based on values in the distribute keys and then stores data on the corresponding DN. In a Hash table, I/O resources on each node can be used during data read/write, which improves the read/write speed of a table. Generally, this option is recommended for large dimension tables (a large dimension table contains over 1 million records).

      *Owner

      You can enter an owner name or select an existing owner.

      * Description

      A description of the summary table to create. It allows 1 to 600 characters.

    2. In the Time Partition area, enter the field code and select the data type. After a table is published, data is written to the table based on the time partition fields.
      Figure 3 Time Partition area
    3. In the Metric Fields area, click Add to add derivative or compound metrics that are associated with the specified statistical dimensions.
      Figure 4 Metric Fields area
  5. Click Publish. In the dialog box displayed, click OK.
  6. Select a reviewer to approve the summary table.

    After the summary table is approved, it is automatically created in the database.

  7. Go back to the summary table list and locate the table just published. View its synchronization status in the Sync Status column.
    • If the synchronization is successful, the summary table is successfully published and created in the database.
    • If the synchronization failed, choose More > View History in the row where the summary table is located. On the page displayed, click the History tab to view logs. Troubleshoot the problem based on the logs. After the error is rectified, choose More > Synchronize above the summary table list to issue the synchronization command again. If the problem persists, contact technical support personnel.

Managing a Summary Table

  1. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane. On the displayed page, click the Summary Tables tab.
    Figure 5 Summary Tables page
  2. Manage your summary tables as required. Refer to the following table for details.

    Operation

    Helpful Link

    Create

    Creating and Publishing a Summary Table

    Edit

    3

    Publish

    4

    View History

    5

    Preview SQL

    6

    Suspend

    7

    Associate Rule

    8

    Delete

    9

    Import

    10

    Export

    11

  3. Edit a summary table.
    1. Click Edit to the right of the target summary table.
    2. Edit the summary table as required.
    3. Click Publish.
  4. Publish a summary table.
    1. Click Publish to the right of the target summary table.
    2. In the Submit for Publication dialog box displayed, select a reviewer from the drop-down list box.
    3. Click OK.
  5. View the publish history.
    1. Select the target summary table in the list and choose More > View History on the right.
    2. On the page displayed, you can view the publish history and version comparison information of the summary table.

      If the publish log includes error logs, the publishing has failed. You can click Resynchronize to retry.

  6. Previewing an SQL statement.
    1. Select the target summary table in the list and choose More > Preview SQL on the right.
    2. On the page displayed, you can view or copy the SQL statement.
  7. Suspend a summary table.
    1. Click Suspend to the right of the target summary table.
    2. In the Submit for Suspension dialog box displayed, select a reviewer from the drop-down list box.
    3. Click OK.

      After a summary table is suspended, you can determine how to process APIs based on the actual situation in DataArts DataService. DataArts Architecture does not process the APIs.

  8. Associate a summary table with a quality rule.
    1. Select the target summary table in the summary table list and click Associate Rule above the list.
    2. In the Associate Quality Rule dialog box, you can add rules to the fields in the summary table in batches and associate the rules with the fields.
    3. Click OK.
  9. Delete a summary table.
    1. Select the target summary table and choose More > Delete above the list.
    2. In the dialog box displayed, click Yes.
  10. Import

    You can import summary tables to the system quickly.

    1. Above the summary table list, choose More > Import.
      Figure 6 Import Summary Table
    2. Download the summary table template, and edit and save it.
    3. Choose whether to update existing data.

      If a code in the template already exists in the system, the data is considered duplicate.

      • No: If the data to be imported already exists in the system, the existing data in the system will not be replaced.
      • Yes: If the data to be imported already exists in the system:
        • If the existing data in the system is in draft state, the data will be replaced and new draft data will be generated.
        • If the existing data in the system is in published state, expanded data will be generated.
    4. Click Select File and select the edited template to import.
    5. Click Upload. When the template is uploaded, the Last Import page is displayed. You can view the imported data.
    6. Click Close.
  11. Export summary tables.

    You can export summary tables to a local file.

    1. Select the summary tables to export on the Manually Created or Automatically Aggregated page.
    2. Above the summary table list, choose More > Export.
    • You can export all the summary tables of a subject by selecting the subject in the subject list on the left.
    • You can export all the summary tables of a workspace, as long as there are no more than 500 summary tables in the workspace.

Associating a Summary Table with a Quality Rule

  1. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane.
  2. Click the Summary Tables tab.
  3. Select the target summary table in the list, and click Associate Rule.
    Figure 7 Associating a summary table with a quality rule
  4. On the page displayed, set the parameters. After the configuration is complete, click OK.
    • Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
    • Table Field: This parameter applies to all fields by default. You can enter a regular expression to filter fields as required.
    • WHERE Clause: This parameter can be used to filter fields.
    • Generate Anomaly Data: If this option is selected, anomaly data is stored in the specified database based on the configured parameters.
    • Database/Schema: database or schema that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
    • Table Prefix: prefix of the table that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
    • Table Suffix: suffix of the table that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
    • Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK. An example alarm expression is as follows:

    • An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
    Figure 8 Associating a summary table with a quality rule

Associating a Summary Table Field with a Data Standard

  1. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane.
  2. Click the Summary Tables tab.
  3. Click the name of the target summary table in the list.
  4. In the table field list on the details page of the summary table, search for the target field, click corresponding to the field to configure the association between the field and the data standard.
    Figure 9 Associating a summary table field with a data standard
  5. After the configuration is complete, click OK. For details on the sources of data standards, see Creating a Data Standard.
    Figure 10 Associating a data standard

Associating a Single Field with a Quality Rule

  1. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane.
  2. Click the Summary Tables tab.
  3. In the summary table list, click the name of the target summary table.
  4. In the table field list on the summary table details page, locate the target field and click to associate the field with a quality rule.
    Figure 11 Associating a single table field with a quality rule
  5. After the configuration is complete, click OK.
    • Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
    • Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK.
    • An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
    Figure 12 Associating a quality rule

Associating Table Fields with a Quality Rule in Batches

  1. On the DataArts Architecture page, choose Models > Dimensional Modeling in the left navigation pane.
  2. Click the Summary Tables tab.
  3. In the summary table list, click the name of the target summary table.
  4. In the table field list on the summary table details page, select the target table fields and click Associate Rule.
    Figure 13 Associating fields with a quality rule
  5. On the page displayed, add a rule and set the rule parameters.
    • Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
    • Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK.
    • An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
    Figure 14 Adding a rule
  6. After the configuration is complete, click OK.