Subscribing to Hybrid Data Warehouse Binlog
Binlog Usage
The HStore table within the GaussDB(DWS) hybrid data warehouse offers a Binlog feature to facilitate the capture of database events. This enables the export of incremental data to third-party systems like Flink. By consuming Binlog data, you can synchronize upstream and downstream data, improving data processing efficiency.
Unlike traditional MySQL Binlog, which logs all database changes and focuses on data recovery and replication. The GaussDB(DWS) hybrid data warehouse Binlog is optimized for real-time data synchronization, recording DML operations—Insert, Delete, Update, and Upsert—while excluding DDL operations.
Constraints and Limitations
- Binlog is currently in beta and is only available for commercial use on a restricted basis. To use Binlog, contact technical support.
- Binlog recording for HStore and HStore-op operations is only supported in versions 8.3.0.100 and later.
- Binlog requires a primary key, an HStore or HStore-opt table, and supports only hash distribution.
- Binlog tables log DML operations like insert, delete, and update (upsert), excluding DDL operations.
- Binlog tables do not support insert overwrite, altering distribution columns, enabling Binlog on temporary tables, or partition operations like exchange, merge, or split.
- Users can perform certain DDL operations (ADD COLUMN, DROP COLUMN, SET TYPE, VACUUM FULL, TRUNCATE),
but these will reset incremental data and synchronization details.
- The system waits for Binlog consumption before further scaling. The default wait duration is 1 hour, which can be set through binlog_consume_timeout. Timeouts or errors will fail the scaling process.
- Binlog tables are backed up as standard HStore tables. Post-restoration, you must restart data synchronization as incremental data and sync details are reset.
Binlog Formats and Principles
Field |
Type |
Meaning |
---|---|---|
sync_point |
BIGINT |
Binlog system field, which indicates the synchronization point. In common GTM mode, the value is unique and ordered. |
event_sequence |
BIGINT |
Binlog system field, which indicates the sequence of operations of the same transaction type. |
type |
CHAR |
Binlog system field, which indicates the operation type of the current record. The options are as follows:
|
user_column_1 |
User Column |
User-defined data column |
... |
... |
... |
usert_column_n |
User Column |
User-defined data column |
- For each UPDATE (or UPSERT-triggered update), two Binlog types—BEFORE_UPDATE and AFTER_UPDATE—are created. The BEFORE_UPDATE Binlog verifies the accuracy of data processed by external components like Flink.
- During UPDATE and DELETE operations, the GaussDB(DWS) hybrid data warehouse generates BEFORE_UPDATE and DELETE Binlogs without querying or populating all user columns, enhancing database import efficiency.
- Enabling Binlog for an HStore table in the GaussDB(DWS) hybrid data warehouse is in fact the process of creation of a supplementary table. This table includes three system columns sync_point, event_sequence, and type, and a value column that serializes all user columns.
Enabling Binlog
1 2 3 4 5 6 7 8 9 10 |
CREATE TABLE hstore_binlog_source ( c1 INT PRIMARY KEY, c2 INT, c3 INT ) WITH ( ORIENTATION = COLUMN, enable_hstore=true, enable_binlog=on, binlog_ttl = 86400 ); |
- Binlog recording starts only after a synchronization point is registered for the task, not during the initial data import. (Once Flink is activated, it cyclically acquires the synchronization point and incremental data.)
- The binlog_ttl parameter is optional, defaulting to 86,400 seconds. If a registered synchronization point not undergo incremental synchronization within this TTL, it will be cleared. Subsequently, Binlogs before the oldest synchronization point are asynchronously deleted to free up space.
Run the ALTER command to enable the binlog function for an existing HStore table.
1 2 3 4 5 6 7 8 9 |
CREATE TABLE hstore_binlog_source ( c1 INT PRIMARY KEY, c2 INT, c3 INT ) WITH ( ORIENTATION = COLUMN, enable_hstore=true ); ALTER TABLE hstore_binlog_source SET (enable_binlog=on); |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot