Updated on 2024-12-27 GMT+08:00

Application Scenarios

Data Warehouse Migration

The data warehouse is an important data analysis system for enterprises. As service volume grows, self-built data warehouses often cannot keep up with service requirements because of limited scalability and high costs. As an enterprise-class data warehouse on the cloud, GaussDB(DWS) offers high performance, low cost, and easy scalability, meeting the requirements of the big data era.

Figure 1 Data warehouse migration

Advantages

  • Seamless migration

    GaussDB(DWS) provides tools for easy migration from widely used data analysis systems such as Teradata, Oracle, MySQL, SQL Server, PostgreSQL, Greenplum, and Impala.

  • Compatible with conventional data warehouses

    GaussDB(DWS) supports the SQL:2003 standard and stored procedures. It is compatible with some Oracle syntax and data structures, and can be seamlessly interconnected with typical BI tools, reducing service migration effort.

  • Secure and reliable

    GaussDB(DWS) supports data encryption and connects to Database Security Service (DBSS) to ensure data security on the cloud. In addition, GaussDB(DWS) supports automatic full and incremental data backups, improving data reliability.

Converged Big Data Analysis

Data has become the most important asset. Enterprises must be able to integrate their data resources and build big data platforms to mine the full value of their data. In predictive analysis use cases, massive volumes of data must be processed. GaussDB(DWS) delivers the needed processing power to handle these intense compute scenarios.

Figure 2 Converged big data analysis

Advantages

  • Unified Analysis Entrance

    GaussDB(DWS) SQL acts as a unified entry point for upper-layer applications, enabling developers to access all data using SQL.

  • Real-Time Interactive Analysis

    Analysts can obtain actionable information from the big data platform in real time.

  • Auto Scaling

    Adding nodes allows you to easily expand into PB-range capacity while enhancing query and analysis performance of the system.

Enhanced ETL + Real-Time BI Analytics

The data warehouse is the pillar of the BI system for collecting, storing, and analyzing massive volumes of data. It powers business decision analysis for the finance, education, mobile Internet, and Online to Offline (O2O) industries.

Advantages

  • Data Migration

    Data can be imported from multiple data sources, both in batches and in real time.

  • High Performance

    Cost-effective PB-level data storage, with correlation analysis across trillions of data records returning within seconds.

  • Real-Time

    Real-time consolidation of service data to produce actionable insights for operational decision-making.

Figure 3 Enhanced ETL + real-time BI analysis

Real-Time Data Analytics

In the mobile Internet field, processing and analyzing massive amounts of data in real time is crucial to extracting the full value of that data. GaussDB(DWS) offers fast data import and query capabilities that speed up data analysis, allowing for real-time ingestion, processing, and value generation.

Figure 4 Real-time data analysis

Advantages

  • Real-Time Import of Streaming Data

    Data from Internet applications can be written into GaussDB(DWS) in real time after being processed by the stream computing and AI services.

  • Real-Time Monitoring and Prediction

    Data analysis and prediction support device monitoring, control, optimization, supply, self-diagnosis, and self-healing.

  • Converged AI Analysis

    The results that AI services produce from image and text data analysis can be correlated with other service data on GaussDB(DWS).

Lakehouse

  • Seamless access to the data lake
    • GaussDB(DWS) interconnects with Hive Metastore for metadata management, so you can directly access the table definitions in the data lake. You only need to create an external schema; you do not need to create foreign tables.
    • The following data formats are supported: ORC and Parquet.
  • Convergent query
    • Hybrid query of any data in the data lake and warehouse is supported.
    • Query results are sent directly to the warehouse or data lake. No data needs to be transferred or copied.
  • Excellent query performance
    • High-quality query plans and efficient execution engines
    • Precise load management methods

Real-Time Write

DWS 3.0 uses the H-Store storage engine to store micro-batch data locally and synchronizes it to Object Storage Service (OBS) at regular intervals. This enables high-throughput real-time writes and updates, as well as large-scale data writes.

Data is written and computed in real time, and can then be used for dashboard statistics, analysis, monitoring, risk control, and recommendations.
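
The write path described in this section (buffer micro-batches locally, then synchronize them to object storage at regular intervals) can be pictured with a small toy model. This is a conceptual sketch only and does not represent the H-Store implementation: the class name `MicroBatchBuffer`, the `sync_interval` parameter, and the in-memory lists standing in for local storage and OBS are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MicroBatchBuffer:
    """Toy model: accept small writes locally, sync them in bulk on a timer."""

    sync_interval: float                         # seconds between syncs (assumed knob)
    local: list = field(default_factory=list)    # stands in for local storage
    remote: list = field(default_factory=list)   # stands in for object storage
    last_sync: float = 0.0

    def write(self, rows, now):
        """Accept a micro-batch at once; sync only if the interval has elapsed."""
        self.local.extend(rows)
        if now - self.last_sync >= self.sync_interval:
            self.sync(now)

    def sync(self, now):
        """Move everything buffered locally to the remote store in one batch."""
        self.remote.extend(self.local)
        self.local.clear()
        self.last_sync = now

    def read_all(self):
        """Readers see synced rows and not-yet-synced rows together."""
        return self.remote + self.local


buf = MicroBatchBuffer(sync_interval=10.0)
buf.write(["r1", "r2"], now=1.0)   # buffered locally only
buf.write(["r3"], now=12.0)        # interval elapsed, so the buffer is synced
buf.write(["r4"], now=15.0)        # buffered locally again
```

In this model many small writes are absorbed cheaply by the local buffer, while the remote store only receives occasional large batches. That is the general reason micro-batching helps throughput on object storage.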

Service Isolation and Ultimate Elasticity with Multiple VWs (Storage-Compute Decoupling)

  • Virtual warehouses (VWs) isolate service loads more effectively than soft isolation methods, using VM-level hard isolation to minimize the impact of one service on another.
  • Multiple classic VWs and multiple elastic VWs are supported.

  • Classic VWs are used to isolate services.
    • VWs can be deployed based on service needs, with different services bound to different VWs. Classic VWs allow table creation.
    • Resources are isolated between VWs so that services do not affect each other.
    • Data is shared between VWs in real time.
    • The performance ceiling for a single SQL statement within the MPP architecture is determined by the size of a fixed VW.
    • Fixed VWs are optimized for consistent workloads and low-latency operations, such as real-time data access and processing. The size of fixed VWs can be proactively planned to accommodate anticipated service fluctuations.
  • Concurrent expansion through elastic VW
    • In high-concurrency scenarios, elastic VWs are dynamically created to handle queued services. These VWs support read and write operations, but not table creation.
    • Elastic VWs automatically handle queued queries.
    • Elastic VWs seamlessly absorb queued queries to enhance service concurrency.
    • As demand subsides, elastic VWs are automatically decommissioned.
    • Elastic VWs offer on-demand resource allocation, with the flexibility for users to define upper limits.
    • Despite their dynamic nature, elastic VWs maintain the same specifications as fixed VWs, ensuring consistent SQL statement performance.
    • Elastic VWs use usage-based billing.
    • Elastic VWs are suitable for handling sporadic and cyclical workloads.

For example, if a customer has multiple service departments, each can be assigned a classic VW to isolate resources. Suppose Service 1 uses a three-node VW and Service 2 uses a four-node VW, and Service 1 has peak hours from 10:00 to 12:00. Elastic VWs can then be configured to scale out for Service 1 during peak hours and be destroyed afterward.
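
The scaling behavior in this example can be illustrated with a simple sizing rule. This is a hypothetical model, not the scheduling policy GaussDB(DWS) actually uses: the function name `elastic_vws_needed`, the per-VW capacity of 20 queries, the upper limit of 3 elastic VWs, and the hourly queue depths are all assumptions chosen for illustration.

```python
import math


def elastic_vws_needed(queued_queries, queries_per_vw, max_elastic_vws):
    """Return how many elastic VWs a simple policy would run for a given queue.

    Enough VWs are created to absorb the queue, capped at the user-defined
    upper limit; with no queued queries, all elastic VWs are released.
    """
    if queries_per_vw <= 0:
        raise ValueError("queries_per_vw must be positive")
    if queued_queries <= 0:
        return 0
    wanted = math.ceil(queued_queries / queries_per_vw)
    return min(wanted, max_elastic_vws)


# Hypothetical queue depth for Service 1, keyed by hour of day.
queue_by_hour = {9: 0, 10: 35, 11: 80, 12: 10, 13: 0}

plan = {
    hour: elastic_vws_needed(queued, queries_per_vw=20, max_elastic_vws=3)
    for hour, queued in queue_by_hour.items()
}
```

Under these assumed numbers, no elastic VWs run outside the peak window, the count rises with the queue during the peak, and the upper limit bounds the count (and, with usage-based billing, the cost) even when the queue would justify more.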