Overview

Updated at: Mar 25, 2021 GMT+08:00

Challenges Faced by Enterprise Digital Transformation

Enterprises often face challenges in the following aspects when managing data:

  • Governance
    • Inconsistent data system standards hinder data sharing and exchange between different departments.
    • There are no effective search tools to help service personnel locate the data they need.
    • Metadata fails to define data in business terms that data consumers are familiar with, making the data difficult to understand.
    • There are no good methods to evaluate and control data quality, making the data hard to trust.
  • Operations
    • Data analysts and decision makers need efficient data operations, but there is no platform that can keep up with the growing and diversified demands for analytics and reporting.
    • Repeated development of the same data wastes time, slows down delivery, and produces redundant copies of the data. Inconsistent data standards waste resources and drive up costs.
  • Innovation
    • Data silos prevent data from being shared across departments or domains.
    • Most enterprises primarily use data for analytical or reporting purposes. Enterprises still have a long way to go to achieve widespread, true data-driven innovation.

What Is DGC?

Data Lake Governance Center (DGC) is a one-stop data operations platform that drives digital transformation. It allows you to perform many operations, such as integrating and developing data, designing data standards, controlling data quality, creating data services, and managing data assets. Incorporating big data storage, computing, and analytical engines, DGC can also be used to construct industry knowledge libraries and help your enterprise build an intelligent end-to-end data system. This system can eliminate data silos, unify data standards, speed up data monetization, and accelerate your enterprise's digital transformation.

Figure 1 shows the DGC architecture.

Figure 1 DGC architecture
  • Data Integration

    Migrate batch data, integrate real-time data, synchronize databases in real time, and ingest data from 20+ heterogeneous data sources. You can integrate a single table or an entire database, as well as data that is generated incrementally or on a recurring basis.

  • Data Design

    Plan the data architecture, customize models, unify data standards, visualize data modeling, and label data. Data Design defines how data will be processed and utilized to solve business problems and helps you make informed decisions.

  • Data Development

    Build a big data processing center, create data models, integrate data, develop scripts, and orchestrate workflows.

  • Data Quality Control (DQC)

    Monitor data quality in real time throughout the data lifecycle and receive real-time notifications of abnormal events.

  • Data Assets

    Manage your metadata to make sense of your data assets. A data map shows the lineage of your data and allows you to have a global view of your data assets. Data search, operations, and monitoring are more intelligent than before.

  • Data Lake Mall (DLM)

    Develop, test, and deploy your data services. Ensure agile response to data service needs, easier data retrieval, better experience for data consumers, higher efficiency, and better monetization of data assets.

  • Data Security

    Discover sensitive data; grade, classify, and protect your data; implement access control; encrypt data during transmission and storage; identify data risks; and audit compliance. Data Security is an efficient tool to establish a security risk warning mechanism and improve your enterprise's overall data protection capability, securing your data while making your data more accessible.

  • Intelligent Data Lake

    Integrate diversified data engines, HUAWEI CLOUD data lakes, cloud database services, and traditional enterprise data warehouses, such as Data Lake Insight (DLI), GaussDB (DWS), Oracle, and Greenplum.

DGC Specifications

Table 1 DGC specifications

| Specification | Starter | Basic | Advanced | Professional | Enterprise |
|---|---|---|---|---|---|
| DGC data integration node: node quantity | 1 | 1 | 1 | 1 | 1 |
| DGC data integration node: name | cdm.medium | cdm.medium | cdm.large | cdm.xlarge | cdm.xlarge |
| DGC data integration node: vCPUs | 4 | 4 | 8 | 16 | 16 |
| DGC data integration node: memory | 8 GB | 8 GB | 16 GB | 32 GB | 32 GB |
| DGC data integration node: baseline/max. bandwidth | 0.4/1.5 Gbit/s | 0.4/1.5 Gbit/s | 0.8/3 Gbit/s | 4/10 Gbit/s | 4/10 Gbit/s |
| DGC data integration node: concurrent jobs | 20 | 20 | 30 | 100 | 100 |
| Job scheduling times/day (including data development jobs, data quality monitoring jobs, and metadata collection jobs) | 5,000 | 20,000 | 40,000 | 80,000 | 200,000 |
| Data objects that can be governed (including tables and schemas in which you can implement metadata collection, data modeling, and quality monitoring) | Not supported | 1,000 | 2,000 | 4,000 | 10,000 |
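
To put the bandwidth figures in Table 1 into perspective, the following minimal sketch estimates how long a migration would take if network bandwidth were the only bottleneck. The bandwidth values are copied from Table 1; the function name `transfer_hours`, the decimal gigabyte convention, and the assumption that the link is fully utilized are illustrative and not part of DGC. Actual throughput also depends on the data sources, file sizes, and job concurrency.

```python
from typing import Tuple

# (baseline, maximum) bandwidth in Gbit/s per node type, copied from Table 1.
NODE_BANDWIDTH_GBITS = {
    "cdm.medium": (0.4, 1.5),
    "cdm.large": (0.8, 3.0),
    "cdm.xlarge": (4.0, 10.0),
}


def transfer_hours(data_gb: float, node: str) -> Tuple[float, float]:
    """Return (hours at baseline bandwidth, hours at maximum bandwidth).

    Hypothetical helper: assumes decimal gigabytes (1 GB = 8 Gbit) and a
    fully utilized link, so the results are best-case estimates only.
    """
    baseline, maximum = NODE_BANDWIDTH_GBITS[node]
    gigabits = data_gb * 8
    return gigabits / baseline / 3600, gigabits / maximum / 3600


if __name__ == "__main__":
    slow, fast = transfer_hours(1000, "cdm.medium")  # roughly 1 TB
    print(f"cdm.medium: {fast:.1f} to {slow:.1f} hours")
```

Under these assumptions, moving about 1 TB through a cdm.medium node takes between roughly one and a half and five and a half hours, which is one way to judge whether a larger node type is worth considering.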

Table 2 Components supported by DGC

| Component | Starter | Basic | Advanced | Professional | Enterprise |
|---|---|---|---|---|---|
| Data Integration-CDM | Supported | Supported | Supported | Supported | Supported |
| Data Integration-DIS | Supported. You need to buy a DIS incremental package. | Supported. You need to buy a DIS incremental package. | Supported. You need to buy a DIS incremental package. | Supported. You need to buy a DIS incremental package. | Supported. You need to buy a DIS incremental package. |
| Management Center | Supported | Supported | Supported | Supported | Supported |
| Data Design | Not supported | Supported | Supported | Supported | Supported |
| Data Development | Supported | Supported | Supported | Supported | Supported |
| Data Quality Control (DQC) | Not supported | Supported | Supported | Supported | Supported |
| Data Assets | Not supported | Supported | Supported | Supported | Supported |
| Data Lake Mall (DLM) | Not supported | Supported | Supported | Supported | Supported |

How Do I Select a DGC Version?

| Version | Application Scenario |
|---|---|
| Starter | A primary data lake project with no full-time data development engineers and no data governance needs |
| Basic | One or two full-time data development engineers, and up to 1,000 data tables |
| Advanced | Five to ten full-time data development engineers, clear data standards and efficient data quality management, and up to 2,000 data tables |
| Professional | Large or medium enterprises with a team of 10 to 30 full-time data development engineers and well-designed systems |
| Enterprise | Large enterprises and enterprises with multiple branches |
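
The scenarios above are qualitative, so it can help to cross-check a choice against the quotas in Table 1. The following sketch returns the smallest version whose daily job scheduling quota and governable data object quota cover a given workload. The quotas are copied from Table 1; the function name `smallest_version` and the decision rule itself are assumptions for illustration, not an official sizing method, and team size and organizational structure still need to be weighed separately.

```python
from typing import Optional

# (version, max job schedulings per day, max governable data objects),
# copied from Table 1. Starter does not support data governance, so its
# data object quota is modeled here as 0.
VERSIONS = [
    ("Starter", 5_000, 0),
    ("Basic", 20_000, 1_000),
    ("Advanced", 40_000, 2_000),
    ("Professional", 80_000, 4_000),
    ("Enterprise", 200_000, 10_000),
]


def smallest_version(jobs_per_day: int, data_objects: int) -> Optional[str]:
    """Return the first (smallest) version whose quotas cover the workload.

    Hypothetical helper: returns None when even Enterprise is too small.
    """
    for name, max_jobs, max_objects in VERSIONS:
        if jobs_per_day <= max_jobs and data_objects <= max_objects:
            return name
    return None


if __name__ == "__main__":
    print(smallest_version(3_000, 500))     # Basic
    print(smallest_version(50_000, 1_500))  # Professional
```

In the second call, the data object count alone would fit the Advanced version, but the daily scheduling volume exceeds its quota, so the sketch moves up to Professional.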
