
Practices

After you have obtained an account, a DataArts Studio instance, and a workspace by performing the operations in Preparations, you can use the following common practices provided by DataArts Studio as needed.

Table 1 Common best practices


Data migration

Advanced Data Migration Guidance

This best practice provides advanced guidance for using CDM, such as how to enable incremental migration and how to write expressions with macro variables of date and time.
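
For example, to make a daily-scheduled CDM job extract only the previous day's data, you can add a WHERE clause that uses macro variables of date and time, which CDM resolves each time the job runs. A minimal sketch, assuming a create_time column (the column name is illustrative):

    WHERE create_time >= '${dateformat(yyyy-MM-dd, -1, DAY)}'
      AND create_time < '${dateformat(yyyy-MM-dd)}'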

Data development

Advanced Data Development Guidance

This best practice provides advanced guidance for using DataArts Factory, such as how to use the IF condition and the For Each node.
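
For example, an IF condition typically evaluates an EL expression against the output of an upstream node to decide which branch runs. A minimal sketch, assuming an upstream node named count_check (the node name is illustrative):

    #{StringUtil.equals(Job.getNodeOutput("count_check"), "0")}

If the expression evaluates to true, the branch connected to the IF condition is executed; otherwise it is skipped.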

DataArts Studio+X

Cross-Workspace DataArts Studio Data Migration

Each workspace in an instance provides the complete set of functions. Workspaces can be allocated by branch or subsidiary (such as group, subsidiary, or department), by business domain (such as procurement, production, or sales), or by implementation environment (such as development, test, or production). There are no fixed rules.

As your business grows, you may need to allocate workspaces at a finer granularity. In this case, you can migrate data from one workspace to another. The data includes data connections in Management Center, links and jobs in CDM, tables in DataArts Architecture, scripts and jobs in DataArts Factory, and jobs in DataArts Quality.

Authorizing Other Users to Use DataArts Studio

Suppose a data operations engineer is responsible for monitoring a company's data quality and needs only the permissions of DataArts Quality. If the admin assigns the preset developer role to this engineer, the engineer also obtains the permissions of other modules, which may pose risks.

To address this issue, the admin can create a custom role Developer_Test based on the preset developer role, remove the addition, deletion, modification, and operation permissions for the other modules, and assign the custom role to the data operations engineer. This method meets service requirements while avoiding the risks of excessive permissions.

How Do I View the Number of Table Rows and Database Size?

In the data governance process, you often need to obtain the number of rows in a data table or the size of a database. The number of rows in a data table can be obtained using SQL statements or data quality jobs, and the database size can be obtained in DataArts Catalog.
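
For example, the row count of a table can be obtained with a statement like the following, run in a SQL script (the table name is illustrative):

    SELECT COUNT(*) AS row_count FROM sales_order;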

Comparing Data Before and After Data Migration Using DataArts Quality

Data comparison checks data consistency before and after data migration or processing. This practice describes how to use the DataArts Quality module of DataArts Studio to check data consistency before and after data is migrated from GaussDB(DWS) to an MRS Hive partitioned table.

Scheduling a CDM Job by Transferring Parameters Using DataArts Factory

You can use EL expressions in DataArts Factory to pass parameters to a CDM job when scheduling it.
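
For example, a CDM Job node can receive a parameter whose value is computed from the planned execution time of the DataArts Factory job. A minimal sketch, assuming the CDM job defines a parameter that accepts a date string:

    #{DateUtil.format(DateUtil.addDays(Job.planTime, -1), "yyyy-MM-dd")}

This expression resolves to the day before the planned job time, so each scheduled run processes the corresponding day's data.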

Enabling Incremental Data Migration Through DataArts Factory

The DataArts Factory module of DataArts Studio is a one-stop, collaborative big data development platform. You can implement incremental data migration by combining online script editing in DataArts Factory with periodic scheduling of CDM jobs. This practice describes how to use DataArts Factory together with CDM to migrate incremental data from GaussDB(DWS) to OBS.

Creating Table Migration Jobs in Batches Using CDM Nodes

In a service system, data sources are usually stored in different tables to reduce the size of a single table and meet the requirements of complex application scenarios. When using CDM to integrate data, you would otherwise need to create a separate migration job for each table. This tutorial describes how to use the For Each and CDM nodes provided by DataArts Factory to create table migration jobs in batches, using the expressions sketched below.
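
In this pattern, an upstream node queries the list of table names, and the For Each node iterates over the result set, passing each table name to the CDM subjob through a job parameter. A sketch of the two key expressions (the node and parameter names are illustrative):

    Dataset of the For Each node: #{Job.getNodeOutput("select_table_list")}
    Subjob parameter value:       #{Loop.current[0]}

Here, Loop.current[0] refers to the first column of the data row currently being iterated.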

Building Graph Data Based on MRS Hive Tables and Automatically Importing the Data to GES

In DataArts Studio, you can convert raw data tables into standard vertex and edge data sets that meet GES import requirements, use the automatic metadata generation function to periodically import graph data (vertex data sets, edge data sets, and metadata) into GES, and then perform visualized graphical analysis on the latest data in GES.

Case study

Case: Trade Data Statistics and Analysis

Consulting company H uses CDM to import local trade statistics to OBS and then uses Data Lake Insight (DLI) to analyze them. In this way, company H builds its big data analytics platform simply and at a very low cost, allowing it to focus on its business and innovate continuously.

Case: IoV Big Data Service Migration to Cloud

Company H intends to build an enterprise-class cloud management platform for its IoV service to centrally manage and deploy hardware resources and general-purpose software resources, and to implement cloud-based, service-oriented transformation of its IT applications. CDM helps company H build the platform with no code changes or data loss.

Case: Building a Real-Time Alarm Platform

In this practice, you will learn how to set up a simple real-time alarm platform using the job editing and scheduling functions of DataArts Factory, as well as other cloud services.