Updated on 2024-05-21 GMT+08:00

What Is HTAP?

Hybrid Transactional and Analytical Processing (HTAP) instances are based on open-source ClickHouse. They use a column-based storage engine and Single Instruction, Multiple Data (SIMD) parallel computing to improve query performance in large-scale data analysis, especially on large, wide tables.

HTAP instances free you from independently maintaining data extraction and synchronization links, reduce data management costs, and provide simple and efficient real-time data analysis capabilities.

Overview

An HTAP instance can be used as a standby database of a GaussDB(for MySQL) instance and provides high-performance data analysis capabilities. Data is synchronized to the HTAP instance in real time. You can perform online transaction processing and online data analysis on your GaussDB(for MySQL) DB instance.

Supported Regions

HTAP instances are only available in the following regions:

  • CN North-Beijing4
  • CN East-Shanghai1
  • CN South-Guangzhou
  • AP-Singapore

Architecture

HTAP instances are deployed on ECSs and use extreme SSDs or ultra-high I/O disks.

You can enable binlog on your GaussDB(for MySQL) instance to synchronize data and operations to HTAP instances. The operations include inserting, deleting, modifying, and querying tables and changing table structures. After data is synchronized to an HTAP instance, you can access the HTAP instance through its private IP address or EIP for data analysis.

Figure 1 Architecture

Features

  • Multi-Version Concurrency Control (MVCC) and transaction-level read consistency

    You can choose one of four isolation levels by configuring parameters when you create a data synchronization task.

    • READ_UNCOMMITTED: Uncommitted data can be read, so transaction consistency cannot be ensured.
    • READ_COMMITTED: Only the most recently committed data is read, which ensures read consistency.
    • QUERY_SNAPSHOT: Snapshot queries avoid data deduplication and merging, which provides high query performance while ensuring read consistency.
    • QUERY_RAW: All raw data is returned, including the different versions of data that has been deleted or updated.
  • Quick deduplication

    Based on snapshots, data is quickly deduplicated to improve query performance.

  • Data compression for storage

    In HTAP instances, data is compressed for storage by default, which reduces storage costs.

  • Parallel data synchronization

    In the initial full data synchronization phase, data is automatically sliced based on data statistics, and parallel processing improves synchronization performance. You can set the number of concurrent threads when creating a database for synchronization.

  • Table definition rewriting

    When creating a synchronization task, you can rewrite table definitions to further improve analysis and query performance. The items that can be modified include ORDER BY, PARTITION BY, SAMPLE BY, PRIMARY KEY, TTL, and COLUMNS.

  • Table filtering based on a blacklist and a whitelist

    When creating a synchronization task, you can use a whitelist to select the tables to synchronize or a blacklist to exclude tables.

  • Binlog

    When a database has multiple data synchronization tasks, the tasks share one binlog to reduce network resource consumption.

  • Higher stability for data replication

    Most GaussDB(for MySQL) DDLs are supported for synchronization. The character set of the source database can be automatically converted to the UTF-8 character set of the destination database.

  • Various data types

    All data types of GaussDB(for MySQL) are supported. For details, see Data Type Conversion.

  • Aggregation of multiple data sources

    Data in multiple GaussDB(for MySQL) databases can be synchronized to the same HTAP instance.

  • Enhanced security

    User account information is encrypted for storage.
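
To illustrate the difference between QUERY_RAW and a deduplicated read as described in the feature list above, the following Python sketch models row versions in memory. It is a conceptual illustration only and does not show how an HTAP instance is implemented: the RowVersion fields and both helper functions are hypothetical names introduced for this example.

```python
from typing import List, NamedTuple


class RowVersion(NamedTuple):
    """One stored version of a source row (hypothetical model)."""
    key: int        # primary key of the source row
    version: int    # assumed to increase with every change to the row
    value: str
    deleted: bool   # True if this version records a delete


def raw_view(versions: List[RowVersion]) -> List[RowVersion]:
    """Return every stored version, including updated and deleted ones."""
    return list(versions)


def deduplicated_view(versions: List[RowVersion]) -> List[RowVersion]:
    """Keep only the newest version of each key and drop deleted rows."""
    latest = {}
    for row in versions:
        current = latest.get(row.key)
        if current is None or row.version > current.version:
            latest[row.key] = row
    live_rows = (row for row in latest.values() if not row.deleted)
    return sorted(live_rows, key=lambda row: row.key)


versions = [
    RowVersion(1, 1, "alice", False),
    RowVersion(1, 2, "alice-updated", False),  # update of key 1
    RowVersion(2, 1, "bob", False),
    RowVersion(2, 2, "bob", True),             # delete of key 2
    RowVersion(3, 1, "carol", False),
]

print(len(raw_view(versions)))
print([(row.key, row.value) for row in deduplicated_view(versions)])
```

In this model, the raw view returns all five versions, while the deduplicated view returns only the current rows for keys 1 and 3.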

Billing

Table 1 Billing items

Billing Item            Description
HTAP instance           Yearly/monthly or pay-per-use
Storage space           Pay-per-use. If you select the storage space when purchasing an HTAP instance, the storage will be billed by the hour.
Public network traffic  GaussDB(for MySQL) instances are accessible from both private and public networks, but only the traffic from public networks is billed.

Table 2 Specifications billing description for pay-per-use HTAP instances

Specification      Region                                                        Single (USD/Hour)  Primary/Standby (USD/Hour)
4 vCPUs | 16 GB    CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  0.37               0.74
4 vCPUs | 16 GB    AP-Singapore                                                  0.544              1.088
8 vCPUs | 32 GB    CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  0.75               1.50
8 vCPUs | 32 GB    AP-Singapore                                                  1.088              2.176
16 vCPUs | 64 GB   CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  1.49               2.98
16 vCPUs | 64 GB   AP-Singapore                                                  2.176              4.352
32 vCPUs | 128 GB  CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  2.98               5.96
32 vCPUs | 128 GB  AP-Singapore                                                  4.352              8.704
64 vCPUs | 256 GB  CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  5.96               11.92
64 vCPUs | 256 GB  AP-Singapore                                                  8.704              17.408
88 vCPUs | 352 GB  CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  8.19               16.18
88 vCPUs | 352 GB  AP-Singapore                                                  11.968             23.936

Table 3 Storage billing for pay-per-use HTAP instances

Storage         Region                                                        Single (USD/GB/Hour)  Primary/Standby (USD/GB/Hour)
Ultra-high I/O  CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  0.00022               0.00044
Ultra-high I/O  AP-Singapore                                                  0.00028               0.00056
Extreme SSD     CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou  0.00065               0.0013
Extreme SSD     AP-Singapore                                                  0.001                 0.002
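
As a worked example of the pay-per-use prices in Table 2 and Table 3, the following Python sketch estimates the cost of one instance. It assumes that the total is simply (instance price + storage price × GB) × hours, excludes public network traffic, and copies only a few rows from the tables. The dictionary and function names are hypothetical, and "CN" is shorthand used here for the three Chinese mainland regions that share a price.

```python
from decimal import Decimal

# Prices copied from Table 2 (USD/Hour); only a few rows are included.
INSTANCE_USD_PER_HOUR = {
    ("4 vCPUs | 16 GB", "CN", "single"): Decimal("0.37"),
    ("4 vCPUs | 16 GB", "CN", "primary/standby"): Decimal("0.74"),
    ("4 vCPUs | 16 GB", "AP-Singapore", "single"): Decimal("0.544"),
    ("4 vCPUs | 16 GB", "AP-Singapore", "primary/standby"): Decimal("1.088"),
}

# Prices copied from Table 3 (USD/GB/Hour); only a few rows are included.
STORAGE_USD_PER_GB_HOUR = {
    ("Ultra-high I/O", "CN", "single"): Decimal("0.00022"),
    ("Extreme SSD", "CN", "single"): Decimal("0.00065"),
    ("Extreme SSD", "AP-Singapore", "primary/standby"): Decimal("0.002"),
}


def estimate_cost(spec, storage_type, region, deployment, storage_gb, hours):
    """Estimate pay-per-use cost in USD (assumed formula, no discounts)."""
    instance_rate = INSTANCE_USD_PER_HOUR[(spec, region, deployment)]
    storage_rate = STORAGE_USD_PER_GB_HOUR[(storage_type, region, deployment)]
    hourly_total = instance_rate + storage_rate * storage_gb
    return hourly_total * hours


cost = estimate_cost(
    "4 vCPUs | 16 GB", "Extreme SSD", "CN", "single",
    storage_gb=100, hours=24,
)
print(cost)
```

With these inputs, the hourly rate is 0.37 + 0.00065 × 100 = 0.435 USD, so 24 hours cost 10.44 USD.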