What's New

Updated on 2024/11/06 GMT+08:00

The tables below describe the functions released in each DataArts Studio version and the corresponding documentation updates. New features are launched in each region successively.

November 2024

No.

Feature

Description

Phase

Document

1

Open APIs

  • APIs for accessing DataArts Architecture are available.
  • APIs for accessing DataArts Quality are available.
  • APIs for accessing DataArts DataService are available.

Commercial use

DataArts Architecture API Overview

DataArts Quality API Overview

DataArts DataService API Overview

October 2024

No.

Feature

Description

Phase

Document

1

DataArts Factory

DataArts Studio supports offline processing migration jobs, cross-cluster delivery of data migration jobs, and common batch job migration capabilities. To use offline processing migration jobs, submit a service ticket to apply for the trustlist membership.

Open beta testing

Offline Processing Migration Job Development

2

DataArts Factory

DataArts Studio supports real-time data synchronization. This function allows you to synchronize data in some or all tables from one database to another in real time to ensure data consistency between the databases.
Real-time processing migration jobs are available in AP-Singapore and LA-Mexico City. (They will be available in other regions soon.) To use such jobs, submit a service ticket to apply for the trustlist membership.

Open beta testing

Real-Time Processing Migration Job Development

3

Version Mode Has Changed

The version mode of DataArts Studio has changed in some regions to provide flexible resource configuration and lightweight data governance capabilities.
  • Now you can purchase DataArts Studio instances of the starter, expert, or enterprise version.
  • The version mode change does not affect the DataArts Studio instances you have purchased before, which may be of the starter, basic, advanced, professional, or enterprise version.

Compared with the old version mode, the new version mode provides more favorable prices and more flexible resource scaling. If you want to experience the new version mode, you are advised to buy a new DataArts Studio instance, migrate service data from the original instance to the new instance by referring to DataArts Studio Data Migration Configuration, and then unsubscribe from the original instance.

Commercial use

Versions

Buying an Incremental Package for Job Node Scheduling Times/Day

Buying an Incremental Package for Technical Asset Quantity

Buying an Incremental Package for Data Model Quantity

June 2024

No.

Feature

Description

Phase

Document

1

DataArts Migration

Managing Agents became unavailable.

Commercial use

Managing Links

October 2023

No.

Feature

Description

Phase

Document

1

Instance management

When a DataArts Migration incremental package is purchased, the CDM cluster can be associated with multiple workspaces.

Commercial use

Buying a DataArts Migration Incremental Package

September 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

DataArts Studio supports binding Yarn queues to workspaces. Real-time and offline jobs are automatically distinguished and submitted to their respective queues. The "MRS Resource Queue" parameter is added to five MRS job operator nodes (MRS Spark SQL, MRS Spark, MRS Hive SQL, MRS Spark Python, and MRS Flink Job).

Commercial use

MRS Spark SQL

MRS Spark

MRS Hive SQL

MRS Spark Python

2

DataArts Factory

  • PatchData supports discrete business dates.
  • Timeout retry can be configured for jobs in batches, and for individual job operator nodes.
  • The job operator will retry once by default if it fails.
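
The default retry behavior in the last item can be pictured with a short Python sketch. It is only an illustration under stated assumptions: `run_with_retry`, `make_flaky`, and the `max_retries` parameter are hypothetical names, not part of DataArts Studio.

```python
def run_with_retry(operator, max_retries=1):
    """Run `operator` (any zero-argument callable) and retry on failure.

    Hypothetical helper for illustration only. max_retries=1 mirrors the
    "retry once by default" behavior described above.
    Returns the operator result and the number of attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return operator(), attempts
        except Exception:
            # Give up once the first attempt and all retries have failed.
            if attempts > max_retries:
                raise


def make_flaky(failures):
    """Return a callable that fails `failures` times and then succeeds."""
    state = {"left": failures}

    def operator():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("operator failed")
        return "ok"

    return operator
```

With the default setting, an operator that fails once succeeds on the second attempt, while an operator that fails twice raises its error to the caller.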

Commercial use

Batch Job Monitoring > PatchData

Configuring Jobs

Configuring a Default Item > Default Retry Policy upon Job Operator Failure

3

DataArts Factory

  • A notification can be configured for a failed job that is rerun successfully.
  • When a job instance is rerun, the latest job version is used.
  • Jobs configured with retry can report an alarm upon the first failure.
  • Jobs with cross-cycle dependencies can skip waiting instances and run the most recent batch of job instances. This applies to jobs scheduled by minute or hour.

Commercial use

Managing Notifications > Configuring a Notification

Monitoring an Instance > Rerunning Job Instances

Configuring a Default Item > Alarm Upon First Job Operator Failure

Setting Up Scheduling for a Job > Cross-Cycle Dependencies

4

DataArts Architecture

  • When associating technical indicators, you can select atomic indicators.
  • Search is supported when you configure mappings on the page, including for dimension tables, physical tables, dimensions, and fact tables.
  • Automatic encoding is supported for data standard business objects and for L3.
  • In relational modeling, logical entities can be converted to physical tables.

Commercial use

Business Metrics

Creating a Physical Model

Managing the Configuration Center > Encoding Rules

Designing Physical Models > Converting a Logical Model to a Physical Model

5

DataArts Quality

  • When an exception table is set up, a suffix is added by default.
  • On the DataArts Quality O&M page, the job instance list has a new operator column and supports search and export.
  • Data quality reports can be downloaded after being exported.
  • Existing rule templates can be modified.

Commercial use

Creating Quality Jobs

Viewing Job Instances

Viewing Quality Reports

Creating Rule Templates

6

DataArts Quality

  • The concurrency can be manually adjusted for periodically scheduled quality jobs.
  • The quality rules interface supports searching based on business objects or logical entities and attribute names.
  • Data quality reports can refresh historical data.
  • The search box is case sensitive when creating quality tasks.

Commercial use

Viewing Job Instances

Creating Quality Jobs

Viewing Quality Reports

Creating Quality Jobs

7

DataArts Factory

  • A search box for job names is added to PatchData monitoring.
  • Yarn queues can be bound to workspaces, and real-time and offline jobs are automatically distinguished and submitted to their respective queues. The MRS resource queue can be configured for job operators (including MRS Spark SQL, MRS Spark, MRS Hive SQL, MRS Spark Python, and MRS Flink Job).

Commercial use

Monitoring PatchData

MRS Spark SQL

August 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

  • Multiple alarm reminders can be configured for tasks that run abnormally or fail, until the task is fixed.
  • When a failed job is rerun successfully, a job instance recovery notification is sent. Task owners can be selected in the monitoring message notification service.
  • For the Import GES operator, you can directly select the vertex/edge data set CSV file in the corresponding OBS bucket, or select the OBS path of the corresponding edge data set.
  • The "Run Cancel" alarm notification type is added.

Commercial use

Managing Notifications > Configuring a Notification

Managing Terminal Subscriptions

Import GES

Managing Notifications

2

DataArts DataService

When APIs are generated in configuration mode, request parameters can be copied so that multiple input binding parameters can match binding fields.

Commercial use

Generating an API Using Configuration

July 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

Jobs that specify baselines can be code reviewed.

Commercial use

Review Center

2

DataArts Quality

  • The data quality exception table generates logical links, which can be reflected in the job configuration and logs.
  • Data quality SQL rules support flexible configuration of multiple tables and parameters.
  • When rules are configured for multiple tables in DataArts Quality, the data ranges of different tables can be set independently.

Commercial use

Creating Quality Jobs

3

DataArts Quality

  • Business metrics, quality jobs, and comparison jobs in DataArts Quality support ClickHouse data sources.
  • DataArts Quality supports parameter passing.
  • Scheduling and running of quality jobs can be stopped in batches.
  • Quality job results can be sorted in ascending or descending order on the DataArts Quality O&M page.

Commercial use

Creating a Metric

Creating Quality Jobs

Scheduling Quality Jobs

Viewing Job Instances

4

DataArts Architecture

  • DataArts Architecture supports MySQL and Oracle data sources.
  • In relational modeling, data warehouse layers can be given custom names.
  • When you create a table in a physical model, table fields can be associated with logical attributes.
  • Compound metrics support year-on-year growth rate and month-on-month growth rate.
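
The two growth rates in the last item follow the same formula and differ only in the comparison period: year-on-year compares a value with the same period one year earlier, and month-on-month compares it with the previous month. The Python sketch below shows that arithmetic. The function name `growth_rate` is hypothetical and does not describe how DataArts Architecture computes its metrics.

```python
def growth_rate(current, previous):
    """Return the relative change from `previous` to `current` as a fraction.

    Year-on-year: `previous` is the value for the same period last year.
    Month-on-month: `previous` is the value for the preceding month.
    """
    if previous == 0:
        # The rate is undefined when the comparison period is zero.
        raise ValueError("previous period value must be non-zero")
    return (current - previous) / previous
```

For example, `growth_rate(120, 100)` corresponds to 20% growth, and `growth_rate(90, 100)` to a 10% decline.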

Commercial use

Data Sources

Creating a Physical Model

Creating and Publishing a Table

Creating Compound Metrics

5

DataArts Factory

  • Batch job monitoring, instance monitoring, and PatchData monitoring support removing dependencies on a single upstream instance.
  • A stop time and a stop-on-failure button are added for PatchData tasks.
  • When setting alarm notifications, you can filter out jobs that are not configured with a certain notification type and configure them in batches.
  • A description field is added for variables when you configure environment variables.

Commercial use

Monitoring an Instance

Monitoring a Batch Job > PatchData

Monitoring a Batch Job

Configuring Environment Variables > Configuration Method

6

DataArts Factory

  • Job priority can be set for PatchData.
  • You can configure the range of days of job instances that the alarm notifications configured in notification management monitor.
  • You can configure the number of days after which a waiting job instance expires. A job instance that has waited to run for longer than the configured number of days is canceled.
  • MRS Spark Python tasks support writing Python code online (in both Spark Python scripts and job operators).

Commercial use

Configuring a Default Item > Setting PatchData Priority

Configuring a Default Item > Historical Job Instance Alarm Policy

Configuring a Default Item > Historical Job Instance Cancellation Policy

MRS Spark Python

June 2023

No.

Feature

Description

Phase

Document

1

Instance management

Allows users to pin a workspace to the top and delete a workspace.

Commercial use

Creating a Workspace in Simple Mode

2

DataArts Factory

  • Added parameter value preview for Pipeline operator script parameters.
  • Task count statistics can be viewed as charts in the O&M overview.

Commercial use

Developing a Pipeline Job

Overview

3

DataArts Factory

In enterprise mode, when publishing scripts/job tasks, you can designate approvers for approval.

  • All administrators and deployers under the workspace can be designated as approvers.
  • Each release must have an approver assigned.
  • Approver information can be maintained through approver management.

Commercial use

Releasing a Job Task

Releasing a Script Task

4

DataArts Factory

  • You can view the scheduling configuration information of the job on the job monitoring details page.
  • In the job dependency graph, a job's dependency files can be downloaded by dependency name.
  • EL expressions support using DateUtil to get the quarter of a date.
  • Processing and scheduling policies applied after a dependent job fails can be set in batches.
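
For the DateUtil item above, this page does not give the EL method name, so the following is only a plain Python sketch of what "the quarter of a date" means. The helper name `quarter_of` is hypothetical.

```python
from datetime import date


def quarter_of(d):
    """Return the calendar quarter (1 to 4) that the date `d` falls in."""
    # Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
    return (d.month - 1) // 3 + 1
```

Under this definition, a date in June belongs to quarter 2 and a date in December to quarter 4.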

Commercial use

Monitoring a Batch Job

Viewing a Job Dependency Graph

DateUtil Embedded Objects

Configuring Jobs

5

DataArts Factory

  • Jobs can be exported to OBS paths.
  • A job parameter preview function is added.
  • Subjob and For Each nodes support configuring whether the job node name changes synchronously.
  • PatchData supports batch concurrency.

Commercial use

Exporting and Importing a Job

Developing a Pipeline Job

Configuring a Default Item

Batch Job Monitoring: PatchData

6

Management Center

  • When editing the data connection, you do not need to enter the password again.
  • When importing a resource, you can select OBS or Local Upload.

Commercial use

Creating a Data Connection

Migrating Resources

May 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

  • The version submission system keeps the latest 100 version records.
  • Partitions can be created when you create DLI tables.
  • DataArts Studio supports early freezing of job instances.
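
The "latest 100 version records" retention in the first item can be pictured as a bounded history in which the oldest record is dropped when a new one arrives. The Python sketch below illustrates that idea only. `MAX_VERSIONS` and `submit_versions` are hypothetical names, and the sketch does not describe how DataArts Studio stores versions.

```python
from collections import deque

MAX_VERSIONS = 100  # Retention limit taken from the item above.


def submit_versions(count, limit=MAX_VERSIONS):
    """Simulate submitting `count` versions and return the records kept."""
    # A deque with maxlen discards the oldest entry once the limit is reached.
    history = deque(maxlen=limit)
    for version_number in range(1, count + 1):
        history.append(version_number)
    return list(history)
```

After 150 simulated submissions, the history holds versions 51 through 150.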

Commercial use

Submitting a Version and Unlocking the Script

Creating a Table

Batch Job Monitoring: Job Instances

2

Enterprise Mode

Enterprise mode is added to isolate the development and production environments, enabling a secure and standardized code release management and control process.

  • You can create a workspace in enterprise mode or upgrade one from simple mode to enterprise mode.
  • Job tasks can be released to the production environment.
  • Script tasks can be released to the production environment.

Open beta testing

Enterprise Mode

Releasing a Job Task

Releasing a Script Task

3

DataArts Factory

  • Multiple uncommitted versions can be saved.
  • The scheduling cycle supports minute-level configuration in discrete hours.
  • Single-task jobs support the shortcut key Ctrl + S to save.
  • The SQL editor style can be configured.

Commercial use

Developing an SQL Script

Setting Up Scheduling for a Job

Developing a Single-Task Job

Job Development Process

4

DataArts Factory

  • The O&M overview is optimized.
  • Jobs can be filtered by job type.
  • PatchData monitoring supports filtering by operator and creation time.
  • When creating a DLI table, the OBS directory can be automatically created.

Commercial use

Overview

Monitoring a Batch Job

Monitoring PatchData

Creating a Table

5

DataArts Factory

  • You can go directly to the job monitoring page by right-clicking a job in the job tree.
  • Global search is supported.
  • The API for querying the job instance list supports exact match.
  • SQL script execution results can be viewed through the download center.

Commercial use

Going to Monitor Job page

Script Development Process

Viewing a Job Instance List

Download Center

6

DataArts Factory

  • For a generated job instance that is waiting to run, you can set in the default item configuration whether the instance runs with the latest job version after a new job version is released.
  • You can set in the default item configuration whether the time spent waiting to run counts toward the timeout duration.
  • Job scheduling supports forcing a job to run with priority.

Commercial use

Synchronization of Job Version by Waiting Instance

Exclude Waiting Time from Instance Timeout Duration

Monitoring an Instance

April 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

  • Single-task jobs can be associated with quality jobs.
  • DataArts Studio supports Python3 scripts.
  • A long script name can be easily copied after you find the job. Unsubmitted or unscheduled jobs are identified by color.
  • The job dependency pages support searching and copying by dependency name.

Commercial use

Developing a Single-Task Job

Developing a Python Script

Job Development Process

Viewing a Job Dependency Graph

2

DataArts Factory

  • The script parameter page is optimized.
  • The job dependency graph can be viewed from the job tree directory.
  • Flink SQL supports custom templates.
  • The open job API provides the user who last modified the job.

Commercial use

Script Development Process

Viewing a Job Dependency Graph

Configuring a Template

Creating a Job

3

DataArts Factory

  • Natural cycle scheduling is the default option for new DataArts Studio instances.
  • Script and job approval is added. When submitting a task version, you can designate reviewers for approval.
  • Batch job monitoring supports filtering by scheduling method and scheduling cycle.
  • DataArts Studio supports single-task streaming Flink SQL.

Commercial use

Setting Up Scheduling for a Job

Review Center

Monitoring a Batch Job

Creating a Job

March 2023

No.

Feature

Description

Phase

Document

1

DataArts Factory

  • With cross-cycle dependencies, jobs can skip unexecuted batches that block them.
  • The Spark version can be selected for DLI Spark nodes.
  • A public execution user can be set for a workspace.

Commercial use

Setting Up Scheduling for a Job

DLI Spark

Configuring a Scheduling Identity

2

DataArts Factory

  • You can set the maximum number of nodes that can run concurrently in a workspace.
  • When you select upstream and downstream jobs for PatchData, complete job dependencies are displayed.
  • Notification policies can be configured for jobs whose failures are ignored.

Commercial use

Configuring the Number of Concurrently Running Nodes

Monitoring a Batch Job

Configuring a Default Item

February 2023

No.

Feature

Description

Phase

Document

1

Management Center

  • MRS Hive data connection supports LDAP authentication.
  • The password is now an optional parameter when you edit a connection.
  • During resource migration, you can upload resources from OBS or a local file.

Commercial use

Creating Data Connections

Migrating Resources

December 2022

No.

Feature

Description

Phase

Document

1

DataArts Migration

  • MRS Hudi can be used as a data source.
  • MRS ClickHouse can be used as a data source.
  • The first row can be used as the header row during CSV file migration.
  • Display of column names can be enabled during CSV file migration.
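
To illustrate what treating the first row as the header row means for a CSV file, the following Python sketch uses the standard `csv` module. It is a generic example, not DataArts Migration code, and the helper name `read_with_header` is hypothetical.

```python
import csv
import io


def read_with_header(text):
    """Parse CSV text, using its first row as column names.

    Returns the column names and the remaining rows as dictionaries.
    """
    # DictReader takes the first row as field names when none are given.
    reader = csv.DictReader(io.StringIO(text))
    names = reader.fieldnames
    rows = list(reader)
    return names, rows
```

Given the text `id,name` followed by the rows `1,a` and `2,b`, the column names are `id` and `name`, and only the two data rows are returned as records.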

Commercial use

Supported Data Sources

Table/File Migration Jobs

From OBS

August 2022

No.

Feature

Description

Phase

Document

1

DataArts Migration

CDM is interconnected with Tag Management Service (TMS). You can filter CDM clusters by tag.

Commercial use

Managing Cluster Tags

July 2022

No.

Feature

Description

Phase

Document

1

Service name change

The service name was changed from Data Lake Governance Center (DGC) to DataArts Studio.

Commercial use

DataArts Studio

April 2022

No.

Feature

Description

Phase

Document

1

DataArts Migration

Clusters cannot be stopped. (When a cluster is stopped, its resources may be occupied, and the cluster may become unavailable.)

Commercial use

Managing Clusters

December 2021

No.

Feature

Description

Phase

Document

1

DataArts Architecture

Custom attributes were added in "Configuration Center > Functions".

Commercial use

Configuration Center

November 2021

No.

Feature

Description

Phase

Document

1

DataArts Quality

  • Rule templates can be exported and imported.
  • Quality jobs can be exported and imported.
  • Comparison jobs can be exported and imported.
  • The scoring system can be customized for quality reports.

Commercial use

DataArts Quality Overview

2

DataArts Factory

  • The UI style was reconstructed to improve experience and visual effect.
  • During job development, you can right-click a node and select Test from Current Node.
  • Job parameters can be displayed in masks.
  • During SQL script development, data tables can be read and SQL statements can be generated.
  • A maximum of 1,000 data records can be displayed during DLI SQL script execution.
  • The script development and data development tabs can be dragged to adjust their positions.
  • During script development and data development, browser data can be cached to prevent data loss caused by misoperations.

Commercial use

DataArts Factory Overview

September 2021

No.

Feature

Description

Phase

Document

1

DataArts Migration

Scenario migration became unavailable.

Commercial use

Related content was removed from the documentation.

July 2021

No.

Feature

Description

Phase

Document

1

DataArts Architecture

Configuration item Data Standard Allows Duplicate Names was added to the Functions tab page of Configuration Center.

Commercial use

Configuration Center

2

DataArts Migration

  • Incremental data of Oracle, SQL Server, and MySQL databases can be extracted in batches by ID.
  • Data sources not supported were removed from the official website.
  • The repair API of Hive was optimized.

Commercial use

Supported Data Sources

May 2021

No.

Feature

Description

Phase

Document

1

DataArts Factory

  • The View Reference option is available in the shortcut menu when a connection, script, or resource is right-clicked.
  • The Copy Name option is available in the shortcut menu when a script or job is right-clicked (the name can contain a maximum of 64 characters).
  • Shortcut keys are available for the SQL editor during script development.
  • The script execution results can be displayed on multiple pages. You can query, filter, and copy multiple result pages.
  • During job development, ECSs to be configured on the Open/Close Resource node can be searched.
  • The Edit CDM Job option is available in the shortcut menu when a CDM Job node is right-clicked.
  • During script and job development, a message is displayed when multiple users are editing the same object at the same time.
  • The Submit Version button was replaced by the Save and Submit button, and the button position changed.
  • Jobs on the Monitor Job page can be filtered by priority.
  • Instances on the Monitor Instance page can be sorted by Planned Start Time, Actual Start Time, End Time, and Running Duration, and can be filtered by Status. The Version column was added.

Commercial use

DataArts Factory

2

DataArts Migration

DataArts Migration supports the import and export of some data sources.

Commercial use

Supported Data Sources

February 2021

No.

Feature

Description

Phase

Document

1

DataArts Migration

  • When GaussDB(DWS) serves as the destination, DataArts Migration supports both insert into and update.
  • DataArts Migration supports read and write operations on Data Lake Insight (DLI) foreign tables.
  • MySQL synchronization supports the following modes for destination tables: insert into, update, and overwrite.

Commercial use

Supported Data Sources

2

DataArts Factory

  • A job whose schedule cycle is Minute can depend on a job whose schedule cycle is Day.

Commercial use

DataArts Factory

3

DataArts Architecture

  • Time fields can be set based on common filters.

Commercial use

DataArts Architecture

January 2021

No.

Feature

Description

Phase

Document

1

DataArts Migration

  • DataArts Migration supports the Dameng database.

Commercial use

Supported Data Sources

2

DataArts Factory

  • The display of job dependency graphs is optimized. You can view the complete dependency between upstream and downstream jobs, and drag and zoom in/out the graph to display the job relationship clearly.
  • The version submission function is added for jobs and scripts to distinguish jobs (scripts) in the development state from those in the scheduling state. During formal scheduling, the latest submitted version is used in scenarios such as job dependency, instance rerunning, and PatchData.

Commercial use

DataArts Factory

3

DataArts Quality

  • User-defined jobs can be bound to tables and dimensions, and scoring settings are supported.
  • Rule templates can be brought online or offline and can be migrated in batches.
  • The DataArts Quality overview function is enhanced, and job statistics information is supplemented.

Commercial use

DataArts Quality