Creating Summary Tables
A summary table consists of specific analysis objects (such as members) and related statistical metrics. The metrics included in a summary table all have the same level of granularity (such as members). A summary table provides users with all of the available statistics on themed data (such as a member theme market), sorted by levels of granularity.
A summary table can be manually or automatically aggregated. This topic describes how to manually create a summary table.
On the DataArts Architecture page, choose
in the left navigation pane, and click the Functions tab. On the page displayed, if Create data development jobs is selected for Model Design Process, the system creates a data development job with a name starting with Database name_Table code. Choose to view the created job. By default, this job has no scheduling configuration. You need to configure scheduling for the job in the DataArts Factory module.Prerequisites
A dimension, a dimension table, a fact table, and a derivative metric have been created, published, and reviewed.
Creating and Publishing a Summary Table
- On the DataArts Studio console, locate an instance and click Access. On the displayed page, locate a workspace and click DataArts Architecture.
Figure 1 DataArts Architecture
- On the DataArts Architecture page, choose Summary Tables tab. in the left navigation pane. On the displayed page, click the
- Select a subject from the subject tree on the left and click Create.
- On the Create Summary Table page, perform the following operations:
- Set the parameters in the Basic Settings area.
Figure 2 Basic Settings area
Table 1 Parameters in the Basic Settings area Parameter
Description
*Subject
Select a subject catalog (business domain group > business domain > business object) where you can place the summary table.
*Table Name
The name of the table to create. It must start with letters. Only letters, digits, and the following special characters are allowed: ()-_
*Table Code
The code of the table to create. Table codes must start with letters. Only letters, numbers, and underscores (_) are allowed.
*Statistical Dimension
Select a statistical dimension. The drop-down list displays only the statistical dimensions configured on the Derivative Metrics page. If no statistical dimension is available in the drop-down list box, create one by referring to Creating Dimensions.
After a summary table is created, all dimension attributes of the specified dimensions are automatically added to the summary table as fields in the summary table. After creating a summary table, go to the summary table page and click the table name to view the field details in the table.
*Data Connection Type
The parameter value must be the same as that of the dimension table and fact table.
*Data Connection Name
It is recommended that the same data connection be used for dimension modeling.
*Database
The name of the database. Select a database from the drop-down list box.
Queue
DLI queue. This parameter is available only for DLI data connections.
Schema
DWS or POSTGRESQL mode. This parameter is displayed only for DWS and POSTGRESQL data connections.
Table Type
DWS connections support the following tables:
- DWS_ROW: Tables are stored to disk partitions by row.
- DWS_COLUMN: Tables are stored to disk partitions by column.
MRS_HIVE supports only HIVE_TABLE.
Distributed By
This parameter is displayed only for DWS data connections. Currently, only REPLICATION and HASH are supported. You can select multiple fields.
- REPLICATION: A full table is stored on each DN. The advantage of this option is that each DN has all the data of a table. During the join operation, data redistribution can be avoided, reducing network overhead. The disadvantage is that each DN retains the complete data of a table, resulting in data redundancy. Generally, this option is recommended for small dimension tables.
- HASH: If you select this option, you must specify a distribution key for the user table. When a record is inserted, the system performs hash computing based on values in the distribute keys and then stores data on the corresponding DN. In a Hash table, I/O resources on each node can be used during data read/write, which improves the read/write speed of a table. Generally, this option is recommended for large dimension tables (a large dimension table contains over 1 million records).
*Owner
You can enter an owner name or select an existing owner.
* Description
A description of the summary table to create. It allows 1 to 600 characters.
- In the Time Partition area, enter the field code and select the data type. After a table is published, data is written to the table based on the time partition fields.
Figure 3 Time Partition area
- In the Metric Fields area, click Add to add derivative or compound metrics that are associated with the specified statistical dimensions.
Figure 4 Metric Fields area
- Set the parameters in the Basic Settings area.
- Click Publish. In the dialog box displayed, click OK.
- Select a reviewer to approve the summary table.
After the summary table is approved, it is automatically created in the database.
- Go back to the summary table list and locate the table just published. View its synchronization status in the Sync Status column.
- If the synchronization is successful, the summary table is successfully published and created in the database.
- If the synchronization failed, choose in the row where the summary table is located. On the page displayed, click the History tab to view logs. Troubleshoot the problem based on the logs. After the error is rectified, choose above the summary table list to issue the synchronization command again. If the problem persists, contact technical support personnel.
Managing a Summary Table
- On the DataArts Architecture page, choose Summary Tables tab.
Figure 5 Summary Tables page
in the left navigation pane. On the displayed page, click the - Manage your summary tables as required. Refer to the following table for details.
- Edit a summary table.
- Click Edit to the right of the target summary table.
- Edit the summary table as required.
- Click Publish.
- Publish a summary table.
- Click Publish to the right of the target summary table.
- In the Submit for Publication dialog box displayed, select a reviewer from the drop-down list box.
- Click OK.
- View the publish history.
- Previewing an SQL statement.
- Select the target summary table in the list and choose on the right.
- On the page displayed, you can view or copy the SQL statement.
- Suspend a summary table.
- Click Suspend to the right of the target summary table.
- In the Submit for Suspension dialog box displayed, select a reviewer from the drop-down list box.
- Click OK.
After a summary table is suspended, you can determine how to process APIs based on the actual situation in DataArts DataService. DataArts Architecture does not process the APIs.
- Associate a summary table with a quality rule.
- Select the target summary table in the summary table list and click Associate Rule above the list.
- In the Associate Quality Rule dialog box, you can add rules to the fields in the summary table in batches and associate the rules with the fields.
- Click OK.
- Delete a summary table.
- Select the target summary table and choose above the list.
- In the dialog box displayed, click Yes.
- Import
You can import summary tables to the system quickly.
- Above the summary table list, choose More > Import.
Figure 6 Import Summary Table
- Download the summary table template, and edit and save it.
- Choose whether to update existing data.
If a code in the template already exists in the system, the data is considered duplicate.
- No: If the data to be imported already exists in the system, the existing data in the system will not be replaced.
- Yes: If the data to be imported already exists in the system:
- If the existing data in the system is in draft state, the data will be replaced and new draft data will be generated.
- If the existing data in the system is in published state, expanded data will be generated.
- Click Select File and select the edited template to import.
- Click Upload. When the template is uploaded, the Last Import page is displayed. You can view the imported data.
- Click Close.
- Above the summary table list, choose More > Import.
- Export summary tables.
You can export summary tables to a local file.
- Select the summary tables to export on the Manually Created or Automatically Aggregated page.
- Above the summary table list, choose More > Export.
- You can export all the summary tables of a subject by selecting the subject in the subject list on the left.
- You can export all the summary tables of a workspace, as long as there are no more than 500 summary tables in the workspace.
Associating a Summary Table with a Quality Rule
- On the DataArts Architecture page, choose in the left navigation pane.
- Click the Summary Tables tab.
- Select the target summary table in the list, and click Associate Rule.
Figure 7 Associating a summary table with a quality rule
- On the page displayed, set the parameters. After the configuration is complete, click OK.
- Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
- Table Field: This parameter applies to all fields by default. You can enter a regular expression to filter fields as required.
- WHERE Clause: This parameter can be used to filter fields.
- Generate Anomaly Data: If this option is selected, anomaly data is stored in the specified database based on the configured parameters.
- Database/Schema: database or schema that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
- Table Prefix: prefix of the table that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
- Table Suffix: suffix of the table that stores anomaly data. This parameter is displayed when Generate Anomaly Data is enabled.
- Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK. An example alarm expression is as follows:
- An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
Figure 8 Associating a summary table with a quality rule
Associating a Summary Table Field with a Data Standard
- On the DataArts Architecture page, choose in the left navigation pane.
- Click the Summary Tables tab.
- Click the name of the target summary table in the list.
- In the table field list on the details page of the summary table, search for the target field, click corresponding to the field to configure the association between the field and the data standard.
Figure 9 Associating a summary table field with a data standard
- After the configuration is complete, click OK. For details on the sources of data standards, see Creating a Data Standard.
Figure 10 Associating a data standard
Associating a Single Field with a Quality Rule
- On the DataArts Architecture page, choose in the left navigation pane.
- Click the Summary Tables tab.
- In the summary table list, click the name of the target summary table.
- In the table field list on the summary table details page, locate the target field and click to associate the field with a quality rule.
Figure 11 Associating a single table field with a quality rule
- After the configuration is complete, click OK.
- Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
- Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK.
- An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
Figure 12 Associating a quality rule
Associating Table Fields with a Quality Rule in Batches
- On the DataArts Architecture page, choose in the left navigation pane.
- Click the Summary Tables tab.
- In the summary table list, click the name of the target summary table.
- In the table field list on the summary table details page, select the target table fields and click Associate Rule.
Figure 13 Associating fields with a quality rule
- On the page displayed, add a rule and set the rule parameters.
- Update Existing Rule: If this option is selected, the newly added rule will overwrite the old rule.
- Add Rule: You can click Add Rule to add a rule. For example, add a rule named Unique value, select the rule, click OK, enter an alarm condition expression in the Alarm Condition text box, add other rules in the same way, and click OK.
- An alarm condition expression consists of alarm parameters and logical operators. When a quality job is running, the system calculates the result of the alarm condition expression and determines whether to trigger the alarm based on the result of the expression. If the expression result is true, the alarm will be triggered. Otherwise, no quality alarm will be triggered. In the Associate Quality Rule dialog box, the alarm parameters of each quality rule are displayed as buttons.
Figure 14 Adding a rule
- After the configuration is complete, click OK.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.