What Is HTAP of Standard Edition?
Hybrid Transactional and Analytical Processing (HTAP) is a data architecture that handles both online transactional processing (OLTP) and online analytical processing (OLAP) workloads.
It uses the column-based storage engine and Single Instruction Multiple Data (SIMD) for parallel compute. In massive data analysis scenarios, HTAP analysis provided by TaurusDB frees you from having to independently maintain data extraction and synchronization links. It reduces data management costs and provides simple and efficient real-time data analysis.
HTAP of Standard Edition is developed based on the open-source StarRocks.
Product Architecture
HTAP instances are deployed on ECSs and use extreme SSDs or ultra-high I/O disks.
You can enable binlog for your TaurusDB instance to synchronize data and operations to HTAP instances. Synchronized operations include inserting table, deleting tables, and changing table structures. After data is synchronized to an HTAP instance, you can access the HTAP instance through its private IP address for data analysis.
An HTAP instance of the standard edition provides frontend (FE) and backend (BE) nodes. The FE nodes manage metadata, manage client connections, and plan and schedule queries. Each FE node stores and maintains a complete metadata backup in the memory to ensure data consistency between FE nodes. The BE nodes are for data storage and SQL computing.
There are three roles for FE nodes.
- The fe-leader nodes read and write metadata. The fe-follower and fe-observer nodes can only read metadata and route write requests for metadata to the fe-leader nodes. The fe-leader nodes update the metadata and synchronize the metadata changes to the fe-follower and fe-observer nodes.
- The fe-follower nodes can only read metadata.
- The fe-observer nodes synchronize and replay logs from the fe-leader nodes to update metadata. The fe-observer nodes are used to increase query concurrency of a cluster.
Main Features
- Massively Parallel Processing (MPP) architecture
- High performance
It supports vectorized engines and CBO optimizers and excels in queries for large and wide tables and multi-table join operations.
- Standard SQL
- Data compression for storage
Column-based storage and data compression greatly reduce your storage costs for any given set of conditions.
- Aggregation of multiple data sources
Data in multiple TaurusDB databases can be synchronized to a given HTAP instance.
Precautions
- To use HTAP of Standard Edition, submit an application by choosing Service Tickets > Create Service Ticket in the upper right corner of the management console.
- When you query data in an HTAP instance, the names of databases, tables, views, users, and roles are case sensitive, but the names of columns and partitions are case insensitive.
- Tables in a TaurusDB database can only be synchronized when primary keys are defined for the tables.
- Some DDL statements executed on TaurusDB instances cannot be synchronized to HTAP instances, which may cause synchronization failures or data inconsistencies.
The DDL statements that can be synchronized and cannot be synchronized are as follows:
- DDL statements that can be synchronized
Table 1 DDL statements that can be synchronized DDL Name
SQL Example
Creating a table
CREATE TABLE tbl_name (c_id int not null, c_d_id integer not null, primary key (c_id));
Dropping a table
DROP TABLE tbl_name;
Renaming a table
RENAME TABLE tbl_name to new_tbl_name;
ALTER TABLE tbl_name RENAME TO new_tbl_name;
Clearing table data
TRUNCATE TABLE tbl_name;
Altering table comments
ALTER TABLE tbl_name COMMENT='test';
Adding a column (non-primary key column)
ALTER TABLE tbl_name ADD c_varchar varchar(2000) AFTER c_tinytext;
Deleting a column (non-primary key column)
ALTER TABLE tbl_name DROP c_vchar;
Changing the type and sequence of a column (non-primary key column)
ALTER TABLE tbl_name CHANGE c_vchar c_vchar varchar(2000) default 'test' AFTER c_tinytext;
CAUTION:The column name and default value cannot be changed.
ALTER TABLE tbl_name MODIFY c_vchar varchar(2100) default 'test' AFTER c_tinytext;
CAUTION:The default value cannot be changed.
- DDL statements that cannot be synchronized
After a database synchronization task, only tables and data can be synchronized. Operations for databases, tablespaces, indexes, foreign keys, functions, stored procedures, triggers, partitions (DELETE operations), primary keys (INSERT/DELETE/ALTER operations), transactions, users, roles, privileges, and events cannot be synchronized.
Table 2 list index-related operations that cannot be synchronized.
Table 2 Index-related operations that cannot be synchronized DDL Name
SQL Example
Adding an index
ALTER TABLE tbl_name ADD INDEX name;
Renaming an index
ALTER TABLE tbl_name RENAME INDEX old_index_name TO new_index_name;
Dropping an index
DROP INDEX name ON table;
Adding a full-text index
CREATE FULLTEXT INDEX name ON table(column);
Adding a spatial index
ALTER TABLE geom ADD SPATIAL INDEX(g);
Modifying the type of an index
ALTER TABLE tbl_name DROP INDEX i1, ADD INDEX i1(key_part,...) USING BTREE;
Adding an index constraint
ALTER TABLE tbl_name ADD CONSTRAINT UNIQUE USING BTREE (column);
ALTER TABLE tbl_name ADD CONSTRAINT UNIQUE USING HASH(column);
Table 3 list partitioned table-related operations that cannot be synchronized
Table 3 Partitioned table-related operations that cannot be synchronized DDL Name
SQL Example
Analyzing a table partition
ALTER TABLE {db}.tp ANALYZE PARTITION p0;
Checking a table partition
ALTER TABLE {db}.tp CHECK PARTITION p0;
Optimizing a table partition
ALTER TABLE {db}.tp OPTIMIZE PARTITION p0;
Re-building a table partition
ALTER TABLE {db}.tp REBUILD PARTITION p0;
Repairing a table partition
ALTER TABLE {db}.tp REPAIR PARTITION p0;
Creating a database
CREATE DATABASE ddl_test_2;
Modifying a row format
ALTER TABLE tbl_name ROW_FORMAT = row_format;
Setting persistent table statistics
ALTER TABLE tbl_name STATS_PERSISTENT=0, STATS_SAMPLE_PAGES=20,STATS_AUTO_RECALC=1, ALGORITHM=INPLACE, LOCK=NONE;
Setting a table character set
ALTER TABLE tbl_name CHARACTER SET = charset_name;
Converting a table character set
ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;
Optimizing a table
OPTIMIZE TABLE tbl_name;
Rebuilding a table using the FORCE option
ALTER TABLE tbl_name FORCE;
Rebuilding a table without data
ALTER TABLE tbl_name ENGINE=InnoDB;
Renaming a tablespace
ALTER TABLESPACE tablespace_name RENAME TO new_tablespace_name;
Adding a table partition
ALTER TABLE {db}.tp ADD PARTITION (PARTITION p3 VALUES LESS THAN (2006) );
Setting the default character set and verification rules for a table
ALTER TABLE tbl_name DEFAULT CHARACTER SET = utf8 COLLATE = utf8_general_ci;
Table creation statements cannot contain CHECK or table options.
During data synchronization, operations in Table 4 may cause data inconsistency between HTAP instances and TaurusDB instances. You should avoid these operations.
They do not affect data query and analysis on HTAP instances.
Table 4 DDL operations that result in data inconsistency DDL Name
SQL Example
Deleting a primary key
ALTER TABLE tbl_name DROP PRIMARY KEY;
Adding a primary key
ALTER TABLE {db}.t1 ADD PRIMARY KEY (id);
Adding a primary key and deleting a primary key
ALTER TABLE tbl_name DROP PRIMARY KEY, ADD PRIMARY KEY (column);
Setting a primary key to NULL
ALTER TABLE tbl_name MODIFY COLUMN key_column_name data_type;
Changing the type of a primary key
ALTER TABLE tbl_name MODIFY COLUMN key_column_name data_type not null;
Adding a foreign key
ALTER TABLE tbl1 ADD CONSTRAINT fk_name FOREIGN KEY index (col1)REFERENCES tbl2(col2) referential_actions;
Deleting a foreign key
ALTER TABLE tbl DROP FOREIGN KEY fk_name;
Adding a column
NOTE:Common columns can be added. If columns contain the following default values, they cannot be added.
- Double quotation marks (")
- Functions, character strings, and identifiers that cannot be found in HTAP instances
ALTER TABLE tbl_name ADD COLUMN column_name column_definition
c VARCHAR(10) DEFAULT (CONCAT('1', '2'));
Setting the default value of a column
NOTE:If columns contain the following default values, you cannot reset default values for the columns.
- Double quotation marks (")
- Functions, character strings, and identifiers that cannot be found in HTAP instances
ALTER TABLE tbl_name ALTER COLUMN col SET DEFAULT literal;
Changing NULL in tables to NOT NULL
ALTER TABLE tbl_name MODIFY COLUMN column_name data_type NOT NULL;
Changing the column name and type at the same time
ALTER TABLE t1 CHANGE b b1 VARCHAR(100);
Changing the name of a column
ALTER TABLE t1 RENAME COLUMN a TO b;
Creating a table without a primary key
ALTER TABLE t1 ADD COLUMN (c2 INT GENERATED ALWAYS AS (c1 + 1)STORED);
Adding a STORED derived column
ALTER TABLE {db}.t1 ADD COLUMN (st2 INT GENERATED ALWAYS AS (c2 + 2)STORED), ALGORITHM=COPY;
Adding a VIRTUAL derived column
ALTER TABLE t1 ADD COLUMN (c2 INT GENERATED ALWAYS AS (c1 + 1)VIRTUAL);
Dropping a table partition
ALTER TABLE {db}.tp DROP PARTITION p4;
Discarding a table partition
ALTER TABLE {db}.tp DISCARD PARTITION p2 TABLESPACE;
Importing a table partition
ALTER TABLE {db}.tp IMPORT PARTITION p2 TABLESPACE;
Truncating a table partition
ALTER TABLE {db}.tp TRUNCATE PARTITION p2;
Truncating a partitioned table
TRUNCATE {db}.tp;
Coalescing table partitions
ALTER TABLE {db}.tp_hash COALESCE PARTITION 2;
Reorganizing table partitions
ALTER TABLE {db}.tp REORGANIZE PARTITION p0,p1,p2,p3 INTO ( ... );
Exchanging table partitions
ALTER TABLE {db}.tp EXCHANGE PARTITION p0 WITH TABLE {db}.tp2;
Removing a table partition
ALTER TABLE {db}.tp REMOVE PARTITIONING;
Using a REPLACE clause
CREATE OR REPLACE TABLE;
- DDL statements that can be synchronized
- The names of the databases and tables to be synchronized cannot contain Chinese characters.
- To improve performance, you can use the following methods to optimize queries:
- Simplify SQL statements by reducing invalid calculations, deleting unused fields, and avoiding SELECT.
- Instead of querying all columns, delete those that are unnecessary.
- Tables to be synchronized use the OLAP engine and primary key model by default.
- After connecting to a standard HTAP instance, run the following command to view the databases synchronized from TaurusDB to the standard HTAP instance and the synchronization status.
As this command queries the binlog information in TaurusDB, it may consume the hourly query quota. You are advised to run this command at most once a minute.
show sync job;
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot