Updated on 2025-05-29 GMT+08:00

RCR UB-Tree

RCR UB-Tree Multi-Version Management

The multi-version management of row consistency read (RCR) B-tree is based on data row levels. XIDs are recorded in data rows, which increases the key size, and the index expands by 5% to 20%. The latest and historical versions are on the B-tree, and the index does not record the undo information. Keys are inserted or deleted in the sequence of key + TID. Tuples with the same index column are sorted based on their TIDs as the second sorting keyword, and Xmin and Xmax are added to the end of the key. During index splitting, multi-version information is migrated with key migration.

RCR UB-Tree Visibility Mechanism

RCR UB-tree visibility is determined by xmin/xmax on the key, which is similar to xmin/xmax on Astore heap table data rows.

Adding, Deleting, Modifying, and Querying RCR UB-Tree

  • Insert: The insertion logic of UB-tree is basically not changed, except that you need to directly obtain the transaction information and fill in the xmin column during index insertion.
  • Delete: The index deletion process is added to the UB-tree. The main procedure of index deletion is similar to that of index insertion. That is, obtain the transaction information, fill in the xmax column, and update active_tuple_count on pages. If the value of active_tuple_count is reduced to 0, the system attempts to recycle the page.
  • Update: In Ustore, data updates require different processing on UB-tree index columns compared to Astore. Data updates include index column updates and non-index column updates. Figure 1 illustrates the UB-tree processing during data updates.
    Figure 1 UB-tree processing during data updates

    Figure 1 shows the differences between UB-tree updates in index columns and non-index columns.

    • In the case of non-index column updates, the index does not change. The index tuple still points to the data tuple inserted for the first time. No data tuple is inserted to the Uheap. Instead, the Uheap modifies the current data tuple and saves historical data to the undo segment.
    • When the index column is updated, a new index tuple is inserted into UB-tree and points to the same data linepointer and data tuple. To scan the historical data, you need to read it from the undo segment.
  • Scan: When reading data, you can use index to speed up scanning. UB-tree supports multi-version management and visibility check of index data. The visibility check at the index layer improves the performance of index scan and index-only scan.

    For index scan:

    • If the index column contains all columns to be scanned (index-only scan), binary search is performed on indexes based on the scan conditions. If a tuple that meets the conditions is found, data is returned.
    • If the index column does not contain all columns to be scanned (index scan), binary search is performed on indexes based on the scan conditions to find TIDs of the tuples that meet the conditions, and then the corresponding data tuples are found in data tables based on the TIDs. See the following figure.

RCR UB-Tree Space Management

Currently, Astore indexes depend on AutoVacuum and FSM for space management. The space may not be recycled in a timely manner. However, Ustore indexes use the UB-tree recycle queue (URQ) to manage idle index space. The URQ contains two circular queues: potential empty page queue and available empty page queue. Completing space management of indexes in a DML process can effectively alleviate the sharp space expansion caused during the DML process. Index recycle queues are separately stored in FSM files corresponding to the B-tree indexes.

As shown in the preceding figure, the index page flow in the URQ is as follows:

  1. From an empty page a potential queue

    The index page tail column records the number of active tuples (activeTupleCount) on the page. During the DML process, all tuples on a page are deleted, that is, when activeTupleCount is set to 0, the index page is placed in the potential queue.

  1. From the potential queue to an available queue

    The flow from a potential queue to an available queue mainly achieves an income and expense balance for the potential queue and ensure that pages are available for the available queue. That is, after an index empty page is used up in an available queue, at least one index page is transferred from a potential queue to the available queue. Besides, if a new index page is added to a potential queue, at least one index page can be removed from the potential queue and inserted into the available queue.

  2. From the available queue to an empty page

    When an empty index page is obtained during index splitting, the system first searches an available queue for an index page that can be reused. If such index page is found, it is directly reused. If no index page can be reused, physical page expansion is performed.