Updated on 2025-12-11 GMT+08:00

TaurusDB Kernel Overview

Introduction

As enterprise workloads grow, database read/write pressure intensifies. Traditional open-source MySQL hits scalability limits. For example, adding read replicas requires full data replication, which takes a long time and is costly. To address these issues, Huawei provides TaurusDB, an enterprise-grade cloud-native database fully compatible with MySQL. It decouples compute from storage and supports up to 128 TB of storage per instance. With TaurusDB, a failover can be completed within seconds. It provides the high availability and superior performance of a commercial database at the price of an open-source database.

Product Architecture

The TaurusDB architecture consists of three layers. From bottom to top, they are:

  1. Storage node layer: This layer is built on Huawei Data Function Virtualization (DFV) storage, a high-performance, high-reliability distributed storage system that is vertically integrated with databases. It provides distributed, strongly consistent storage, ensuring data reliability and horizontal scalability with data durability of at least 99.999999999% (11 nines). Storage clusters are deployed in pools to improve storage utilization and support a data-centric, full-stack data service architecture.
  2. Storage abstraction layer: This layer is key to database performance. It connects to the DFV storage pool below and exposes storage semantics to the layer above, enabling efficient storage scheduling. Table file operations are abstracted into operations on distributed storage.
  3. SQL parsing layer: This layer is fully compatible with open-source MySQL 8.0, so you can migrate your workloads from MySQL to TaurusDB using MySQL-native syntax and tools, saving time and effort. Beyond full MySQL compatibility, TaurusDB comes with an optimized kernel and a hardened system.
Figure 1 Product architecture

Key Features

TaurusDB is a cloud-native database that decouples compute from storage, allowing all compute nodes to share the same storage data. This is the biggest difference between TaurusDB and open-source MySQL.

Its key features include:

  • Adding read replicas does not require full data replication. The time needed to add a read replica is independent of the data volume; a read replica can be added in minutes (about 7 to 10 minutes). An instance supports one primary node and up to 15 read replicas.
  • Data synchronization between the primary node and read replicas does not require binlog replication. Only redo metadata needs to be synchronized. Read replicas read redo logs from the DFV storage pool and replay the redo logs to synchronize data.

    TaurusDB uses a shared storage architecture, but the primary node and read replicas still need to synchronize redo metadata, which can cause data delays between them. For details, see Self-Healing of Read Replicas upon a Replication Latency.

  • Data is automatically backed up to OBS, which ensures data security.
  • You can create proxy instances for read/write splitting, which reduces read load on the primary node.
  • You can enable SQL Explorer to record all executed SQL statements, which facilitates fault locating.

Functions

Table 1 Functions

Parallel query

  • Parallel Query: Parallel query (PQ) reduces the processing time of analytical queries to meet the low-latency requirements of enterprise-grade applications.

Compute pushdown

  • Near Data Processing: For data-intensive queries, operations such as column extraction, aggregate computation, and condition filtering are pushed down to multiple nodes at the distributed storage layer for parallel execution.

  • LIMIT...OFFSET Pushdown: If a SELECT statement contains LIMIT N and OFFSET P clauses, their processing is pushed down to the engine layer, speeding up queries.
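For example, a deep-offset pagination query of the following shape benefits from this pushdown (the orders table and its columns are hypothetical):

```sql
-- With LIMIT...OFFSET pushdown, the engine layer skips the first 100000
-- index entries itself instead of handing every row back to the SQL layer.
SELECT order_id, amount
FROM orders
ORDER BY order_id
LIMIT 10 OFFSET 100000;
```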

Query optimization

  • Statement Outline: Statement Outline uses MySQL optimizer hints and index hints to stabilize the execution plans of specific statements.
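The hints that an outline binds to a statement use standard MySQL 8.0 syntax, for example (table and index names are illustrative):

```sql
-- Optimizer hint: disable Index Condition Pushdown for table t in this query.
SELECT /*+ NO_ICP(t) */ * FROM t WHERE a = 1 AND b > 10;

-- Index hint: force the optimizer to use a specific index.
SELECT * FROM t FORCE INDEX (idx_a) WHERE a = 1;
```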

  • Conversion of IN Predicates Into Subqueries: The optimizer can convert certain large IN predicates into IN subqueries, improving the performance of complex queries.

  • Backward Index Scan: Backward Index Scan avoids sorting by scanning an index in reverse order. However, it cannot be combined with some other optimizations, such as Index Condition Pushdown (ICP), so performance can drop once the optimizer selects Backward Index Scan. TaurusDB adds a switch to avoid this issue.
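In standard MySQL 8.0, the plan looks as follows (schema illustrative); when the reverse scan is chosen, EXPLAIN reports "Backward index scan" in the Extra column:

```sql
CREATE TABLE events (
  id INT PRIMARY KEY,
  created_at DATETIME,
  KEY idx_created (created_at)
);

-- The optimizer can read idx_created in reverse to satisfy ORDER BY ... DESC
-- without a filesort.
EXPLAIN SELECT id FROM events ORDER BY created_at DESC LIMIT 10;
```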

  • DISTINCT Optimization for Multi-Table Joins: To improve DISTINCT query efficiency for multi-table joins, TaurusDB adds pruning to the optimizer to remove unnecessary scanning branches.

DDL optimization

  • Parallel Index Creation: When database hardware resources are idle, you can use parallel index creation to speed up DDL execution, shortening the window during which subsequent DML operations can be blocked.

  • DDL Fast Timeout: TaurusDB lets you set an MDL wait timeout for ALTER TABLE, CREATE INDEX, and DROP INDEX operations, so that a DDL statement that cannot acquire its lock in time fails quickly instead of blocking subsequent DML operations.
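Community MySQL exposes a comparable knob, lock_wait_timeout, which bounds how long a statement waits for metadata locks (the TaurusDB-specific parameter name may differ; table and index names are illustrative):

```sql
-- Community MySQL: wait at most 5 seconds for metadata locks in this session.
SET SESSION lock_wait_timeout = 5;

-- If the table's MDL cannot be acquired within 5 seconds, the DDL fails
-- with a lock wait timeout error instead of queueing behind long transactions.
ALTER TABLE events ADD INDEX idx_id_created (id, created_at);
```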

  • Non-blocking DDL: Non-blocking DDL allows new transactions to access a table even while the MDL-X lock cannot be acquired, keeping the overall service stable.

  • Progress Queries for Creating Secondary Indexes: This function displays the progress of time-consuming index creation operations even when Performance Schema is disabled.

Transaction optimization

  • Idle Transaction Disconnection: TaurusDB can proactively terminate idle transactions, with separate parameters controlling different transaction types. When an idle transaction times out, it is automatically rolled back and its connection is closed.

  • Diagnosis on Large Transactions: TaurusDB can diagnose large transactions. When a large transaction is detected, an alarm is generated so that you can commit the transaction promptly.

  • Hot Row Update: TaurusDB optimizes updates to hot rows. The optimization can be enabled automatically or manually; once enabled, hot rows can be updated efficiently.

Partitioned tables

  • Subpartitioning: Compared with MySQL Community Edition, TaurusDB provides richer partitioning capabilities, including subpartitioning.

  • LIST DEFAULT HASH: LIST DEFAULT HASH supports two partition types at the same level: LIST partitions for enumerated values, plus a hash-partitioned DEFAULT partition for rows that match no LIST value.
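A sketch of what such a table can look like (illustrative syntax and names; check the TaurusDB syntax reference for the exact form):

```sql
-- Rows with listed region values go to p1/p2; all remaining rows fall into
-- the default partition, which is split into 4 hash subdivisions.
CREATE TABLE orders_by_region (
  order_id BIGINT,
  region VARCHAR(32)
)
PARTITION BY LIST COLUMNS (region) (
  PARTITION p1 VALUES IN ('east', 'west'),
  PARTITION p2 VALUES IN ('north', 'south'),
  PARTITION pd DEFAULT PARTITIONS 4
);
```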

  • INTERVAL RANGE: An INTERVAL RANGE partitioned table extends a RANGE partitioned table. When inserted data exceeds the range of all existing partitions, the database automatically creates a new partition based on the rules specified in the INTERVAL clause.
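A sketch of such a table (illustrative syntax and names; the exact form of the INTERVAL clause is in the TaurusDB syntax reference):

```sql
-- Start with one monthly partition; inserting a row beyond the last
-- partition's upper bound creates a new monthly partition automatically.
CREATE TABLE sales (
  sale_id BIGINT,
  sale_date DATETIME
)
PARTITION BY RANGE COLUMNS (sale_date) INTERVAL (MONTH, 1) (
  PARTITION p0 VALUES LESS THAN ('2024-02-01')
);
```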

  • Partition-level MDL: TaurusDB introduces partition-level metadata locks (MDL), refining the lock granularity of a partitioned table from the table level to the partition level. With partition-level MDL enabled, DML operations and certain DDL operations (such as adding or dropping partitions) on different partitions can run concurrently, greatly improving cross-partition concurrency.

Backup and restoration

  • Flashback Query: TaurusDB supports undo log-based table-level flashback query, for scenarios such as accidental data deletion, data errors caused by software defects, and audits that require tracing historical data.
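A sketch of the query shape, assuming an AS OF TIMESTAMP clause (illustrative; the exact TaurusDB flashback syntax is in the syntax reference, and the table name is hypothetical):

```sql
-- Read the table as it was 10 minutes ago, provided the undo logs for that
-- point in time are still retained.
SELECT * FROM orders AS OF TIMESTAMP DATE_SUB(NOW(), INTERVAL 10 MINUTE);
```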

Binlog management

  • Fast Binlog Positioning: If you pull binlogs from a TaurusDB instance using auto-positioning, locating the correct binlog position can be time-consuming when there are many unread binlogs. With fast binlog positioning enabled, the time required is greatly reduced.

Security and encryption

  • Dynamic Data Masking: Dynamic data masking is a security feature that masks sensitive data before query results are returned to an application. TaurusDB lets you define masking rules for specified databases, tables, and columns.

Recycle bin

  • Database/Table Recycle Bin: TaurusDB supports a database/table recycle bin. After a table or database is deleted, it is temporarily moved to the __recyclebin__ database under a new name.

Multi-tenancy

  • Multi-tenancy: TaurusDB provides multi-tenancy to maximize database resource utilization. Data is isolated among tenants, and each tenant can access only its own data.

Other functions

  • Column Compression: To reduce the storage occupied by data pages and lower costs, TaurusDB provides the ZLIB and ZSTD algorithms for fine-grained column compression. Based on the compression ratio and performance you need, you can select either algorithm to compress large columns that are not frequently accessed.
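A sketch of defining a compressed column (illustrative syntax; see the TaurusDB syntax reference for the exact column attribute, and the table is hypothetical):

```sql
-- Compress a large, rarely read column with ZSTD; hot columns stay plain.
CREATE TABLE documents (
  doc_id BIGINT PRIMARY KEY,
  title VARCHAR(255),
  body LONGTEXT COMPRESSED=zstd
);
```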

  • Cold Data Preloading for Read Replicas: While a TaurusDB cluster instance is running, the primary node monitors its least recently used (LRU) list and synchronizes active data pages (pages read from storage or moved within the cache pool) to read replicas. The read replicas preload these pages into their cache pools, improving the cache hit ratio and reducing performance jitter after a read replica is promoted to primary.

  • Self-Healing of Read Replicas upon a Replication Latency: Although the primary node and read replicas share the same storage data, they still need to communicate regularly so that the data cached on the read replicas stays consistent with the primary node.