Updated on 2023-10-23 GMT+08:00

Data Damage Detection and Repair Functions

  • gs_verify_data_file(verify_segment bool)

    Description: Checks whether files in the current database of the current instance are lost. The verification only checks whether intermediate segments of a table's main data file are lost. The verify_segment parameter defaults to false, in which case segment-page table data files are not verified. If it is set to true, only segment-page table files are verified. By default, only the initial user, users with the sysadmin permission, and users with the O&M administrator attribute in O&M mode can call this function and view the result. Other users can do so only after being granted the required permissions.

    The returned result is as follows:

    • Non-segment-page table: rel_oid and rel_name indicate the table OID and table name of the corresponding file, and miss_file_path indicates the relative path of the lost file.
    • Segment-page table: All tables are stored in the same file, so rel_oid and rel_name cannot identify a specific table. If the first file of a segment-page table is damaged, the subsequent files such as .1 and .2 are not checked. For example, if 3, 3.1, and 3.2 are damaged, only the damage to 3 is detected. If fewer than five segment-page files exist, the files that have not been generated are also reported. For example, if only files 1 and 2 exist, files 3, 4, and 5 are also reported during segment-page file detection. Of the following examples, the first checks a non-segment-page table and the second checks a segment-page table.

    Parameter description:

    • verify_segment

      Specifies the range of files to be checked. false indicates that non-segment-page tables are verified. true indicates that segment-page tables are verified.

      The value can be true or false (default value).

    Return type: record

    Example:

    Verify a non-segment-page table.

    openGauss=# select * from gs_verify_data_file();
    node_name         | rel_oid |  rel_name    |  miss_file_path
    ------------------+---------+--------------+------------------
    dn_6001_6002_6003 |   16554 |     test     | base/16552/24745

    Verify a segment-page table.

    openGauss=# select * from gs_verify_data_file(true);
         node_name     | rel_oid | rel_name | miss_file_path
    -------------------+---------+----------+----------------
     dn_6001_6002_6003 |       0 | none     | base/16573/2
  • gs_repair_file(tableoid Oid, path text, timeout int)

    Description: Repairs a file based on the input parameters. This function can be run only on a primary DN whose connection to the standby DN is normal. Set the parameters based on the OID and path returned by the gs_verify_data_file function. For a segment-page table, the table OID can be any value from 0 to 4294967295, because the internal verification determines from the file path whether a file belongs to a segment-page table and does not use the table OID. If the repair is successful, true is returned. If the repair fails, the failure cause is displayed. By default, only the initial user, users with the sysadmin permission, and users with the O&M administrator attribute in O&M mode on the primary DN can call this function. Other users can call it only after being granted the required permissions.

    1. If a file on a DN is damaged, a PANIC-level verification error is reported when the DN is promoted to primary, and the promotion fails. This is expected behavior.
    2. A file that exists but has a size of 0 is not repaired. To repair it, delete the 0-size file first and then run the repair.
    3. A file can be deleted only after its file descriptor has been closed. To close the descriptor, restart the process or perform a primary/standby switchover.

    Parameter description:

    • tableoid

      OID of the table corresponding to the file to be repaired. Set this parameter based on the rel_oid column in the list returned by the gs_verify_data_file function.

      Value range: OID ranging from 0 to 4294967295. Note: A negative value will be forcibly converted to a non-negative integer.

    • path

      Path of the file to be repaired. Set this parameter based on the miss_file_path column in the list returned by the gs_verify_data_file function.

      Value range: a string

    • timeout

      Specifies how long to wait for the standby DN to replay. Before the file can be repaired, the standby DN must replay up to the corresponding position on the current primary DN. Set this parameter based on the replay duration of the standby DN.

      Value range: 60s to 3600s.
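
    The note on negative OID values can be illustrated with a short sketch. The source does not state the exact conversion rule, so the 32-bit unsigned wraparound used here is an assumption, and as_oid is a hypothetical helper name rather than part of the database.

```python
# Assumption: a negative input is reinterpreted as an unsigned 32-bit value,
# which keeps every result inside the documented range 0 to 4294967295.
OID_MODULUS = 2 ** 32


def as_oid(value: int) -> int:
    """Map any integer onto the documented OID range by wrapping modulo 2**32."""
    return value % OID_MODULUS


if __name__ == "__main__":
    for raw in (16554, 0, -1):
        print(raw, "->", as_oid(raw))
```

    Under this assumption a mistyped negative OID silently becomes a large valid one, so it is safer to copy rel_oid directly from the gs_verify_data_file output.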

    Return type: Boolean

    Example:

    openGauss=# select * from gs_repair_file(16554,'base/16552/24745',360);
    gs_repair_file
    ----------------
    t
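
    As a minimal sketch of how a row returned by gs_verify_data_file can be turned into a gs_repair_file call, the helper below checks the documented value ranges before building the statement. The function name build_repair_file_sql and the default timeout of 360 are illustrative assumptions. The helper only builds the SQL text and does not connect to a database.

```python
# Documented value ranges for gs_repair_file parameters.
OID_MAX = 4294967295
TIMEOUT_MIN_S = 60
TIMEOUT_MAX_S = 3600


def build_repair_file_sql(rel_oid: int, miss_file_path: str, timeout_s: int = 360) -> str:
    """Build a gs_repair_file statement from one gs_verify_data_file row."""
    if not 0 <= rel_oid <= OID_MAX:
        raise ValueError(f"tableoid out of range: {rel_oid}")
    if not TIMEOUT_MIN_S <= timeout_s <= TIMEOUT_MAX_S:
        raise ValueError(f"timeout must be between 60 and 3600 seconds: {timeout_s}")
    # Double any single quote so the path stays a valid SQL string literal.
    quoted_path = miss_file_path.replace("'", "''")
    return f"select * from gs_repair_file({rel_oid},'{quoted_path}',{timeout_s});"


if __name__ == "__main__":
    # Values taken from the gs_verify_data_file example in this section.
    print(build_repair_file_sql(16554, "base/16552/24745", 360))
```

    For the values from the example, the helper produces the same statement as shown above. Rejecting an out-of-range timeout on the client side avoids a round trip to the primary DN.
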
  • local_bad_block_info()

    Description: Displays page damage information of the instance. A record is added when a page read from the disk fails CRC verification. By default, only the initial user, users with the sysadmin permission, users with the monitoring administrator attribute, users with the O&M administrator attribute in O&M mode, and monitoring users can view the information. Other users can view the information only after being granted the required permissions.

    In the displayed information, file_path indicates the relative path of the damaged file. If the table is a segment-page table, logical information is displayed instead of the actual physical file information. block_num indicates the number of the damaged page in the file. Page numbers start from 0. check_time indicates the time when the page damage was detected. repair_time indicates the time when the page was repaired.

    Return type: record

    Example:

    openGauss=# select * from local_bad_block_info();
    node_name         | spc_node | db_node | rel_node | bucket_node | fork_num | block_num |    file_path     |          check_time           |          repair_time
    ------------------+----------+---------+----------+-------------+----------+-----------+------------------+-------------------------------+-------------------------------
    dn_6001_6002_6003 |     1663 |   16552 |    24745 |          -1 |        0 |         0 | base/16552/24745 | 2022-01-13 20:19:08.385004+08 | 2022-01-13 20:19:08.407314+08
    
  • local_clear_bad_block_info()

    Description: Deletes the records of repaired pages from local_bad_block_info, that is, records whose repair_time is not empty. By default, only the initial user, users with the sysadmin permission, users with the O&M administrator attribute in O&M mode, and monitoring users can call this function. Other users can call it only after being granted the required permissions.

    Return type: Boolean

    Example:

    openGauss=# select * from local_clear_bad_block_info();
    result
    --------
    t
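
    The clearing rule can be modeled in a few lines. This is only an illustration of the documented behavior, not the database implementation. The in-memory row format and the function name split_repaired are assumptions, and the second row is hypothetical data.

```python
def split_repaired(records):
    """Split bad-block rows into (kept, cleared) by the documented rule:
    rows whose repair_time is not empty are cleared, all others are kept."""
    kept = [row for row in records if row.get("repair_time") is None]
    cleared = [row for row in records if row.get("repair_time") is not None]
    return kept, cleared


records = [
    # Row taken from the local_bad_block_info example in this section.
    {"file_path": "base/16552/24745", "block_num": 0,
     "repair_time": "2022-01-13 20:19:08.407314+08"},
    # Hypothetical row for a page that has not been repaired yet.
    {"file_path": "base/16552/24745", "block_num": 7, "repair_time": None},
]

kept, cleared = split_repaired(records)

if __name__ == "__main__":
    print("kept block numbers:", [row["block_num"] for row in kept])
    print("cleared block numbers:", [row["block_num"] for row in cleared])
```

    Records of pages that are still damaged stay visible after the call, so the view remains usable as a list of outstanding repairs.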

  • gs_verify_and_tryrepair_page(path text, blocknum oid, verify_mem bool, is_segment bool)

    Description: Verifies a specified page of the instance. By default, only the initial user, users with the sysadmin permission, and users with the O&M administrator attribute in O&M mode on the primary DN can call this function. Other users can call it only after being granted the required permissions.

    In the command output, disk_page_res indicates the verification result of the page on the disk, mem_page_res indicates the verification result of the page in the memory, and is_repair indicates whether the repair function is triggered during the verification. t indicates that the page is repaired, and f indicates that the page is not repaired.

    Note: If a page on a DN is damaged, a verification error at the PANIC level is reported when the DN is promoted to primary. The DN cannot be promoted to primary, which is normal. Damaged pages of hash bucket tables cannot be repaired.

    Parameter description:

    • path

      Path of the damaged file. Set this parameter based on the file_path column in the output of local_bad_block_info.

      Value range: a string

    • blocknum

      Page number of the damaged file. Set this parameter based on the block_num column in the output of local_bad_block_info.

      Value range: OID ranging from 0 to 4294967295. Note: A negative value will be forcibly converted to a non-negative integer.

    • verify_mem

      Specifies whether to verify a specified page in the memory. If this parameter is set to false, only pages on the disk are verified. If this parameter is set to true, pages in the memory and on the disk are verified. If a page on the disk is damaged, the system verifies the basic information of the page in the memory and flushes the page to the disk to restore the page. If a page is not found in the memory during memory page verification, the page on the disk is read through the memory API. During this process, if the disk page is faulty, the remote read automatic repair function is triggered.

      Value range: The value is of a Boolean type and can be true or false.

    • is_segment

      Specifies whether the table is a segment-page table. Set this parameter based on the value of bucket_node in the output of local_bad_block_info. If bucket_node is -1, the table is not a segment-page table, so set is_segment to false. If bucket_node is not -1, set is_segment to true.

      Value range: The value is of a Boolean type and can be true or false.

    Return type: record

    Example:

    openGauss=# select * from gs_verify_and_tryrepair_page('base/16552/24745',0,false,false);
    node_name         |       path       | blocknum |        disk_page_res         | mem_page_res | is_repair
    ------------------+------------------+----------+------------------------------+--------------+-----------
    dn_6001_6002_6003 | base/16552/24745 |        0 | page verification succeeded. |              | f
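
    The mapping from a local_bad_block_info row to the arguments of gs_verify_and_tryrepair_page can be sketched as follows. The helper names is_segment_table and build_verify_page_sql and the dictionary row format are assumptions made for illustration. The code only builds the SQL text.

```python
def is_segment_table(bucket_node: int) -> bool:
    """Documented rule: bucket_node of -1 means the table is not a segment-page table."""
    return bucket_node != -1


def build_verify_page_sql(row: dict, verify_mem: bool = False) -> str:
    """Build a gs_verify_and_tryrepair_page statement from one bad-block row."""
    is_segment = is_segment_table(row["bucket_node"])
    # Double any single quote so the path stays a valid SQL string literal.
    path = row["file_path"].replace("'", "''")
    return (
        "select * from gs_verify_and_tryrepair_page("
        f"'{path}',{row['block_num']},"
        f"{str(verify_mem).lower()},{str(is_segment).lower()});"
    )


if __name__ == "__main__":
    # Values taken from the local_bad_block_info example in this section.
    row = {"file_path": "base/16552/24745", "block_num": 0, "bucket_node": -1}
    print(build_verify_page_sql(row))
```

    Deriving is_segment from bucket_node in one place keeps the verification call and any later gs_repair_page call consistent for the same page.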

  • gs_repair_page(path text, blocknum oid, is_segment bool, timeout int)

    Description: Repairs a specified page of the instance. This function can be run only on a primary DN whose connection to the standby DN is normal. If the page is repaired, true is returned. If an error occurs during the repair, an error message is displayed. By default, only the initial user, users with the sysadmin permission, and users with the O&M administrator attribute in O&M mode on the primary DN can call this function. Other users can call it only after being granted the required permissions.

    Note: If a page on a DN is damaged, a verification error at the PANIC level is reported when the DN is promoted to primary. The DN cannot be promoted to primary, which is normal. Damaged pages of hash bucket tables cannot be repaired.

    Parameter description:

    • path

      Path of the damaged page. Set this parameter based on the file_path column in local_bad_block_info or the path column in the gs_verify_and_tryrepair_page function.

      Value range: a string

    • blocknum

      Number of the damaged page. Set this parameter based on the block_num column in local_bad_block_info or the blocknum column in the gs_verify_and_tryrepair_page function.

      Value range: OID ranging from 0 to 4294967295. Note: A negative value will be forcibly converted to a non-negative integer.

    • is_segment

      Specifies whether the table is a segment-page table. Set this parameter based on the value of bucket_node in local_bad_block_info. If bucket_node is -1, the table is not a segment-page table, so set is_segment to false. If bucket_node is not -1, set is_segment to true.

      Value range: The value is of a Boolean type and can be true or false.

    • timeout

      Specifies how long to wait for the standby DN to replay. Before the page can be repaired, the standby DN must replay up to the position of the current primary DN. Set this parameter based on the replay duration of the standby DN.

      Value range: 60s to 3600s.

    Return type: Boolean

    Example:

    openGauss=# select * from gs_repair_page('base/16552/24745',0,false,60);
    result
    --------
    t
  • gs_verify_urq(index_oid oid, queue_type text, blocknum bigint, verify_type text)

    Description: Verifies the correctness of the index recycling queue or the performance of obtaining index pages from the recycling queue.

    Parameter description:

    • index_oid (UB-tree index OID)
    • queue_type (queue type)

      empty queue: potential queue.

      free queue: available queue.

      single page: single page of the queue

    • blocknum (page number)

      If the queue type is single page, all tuples on the page specified by blocknum are verified. The value range is [0, queue file size/8192).

      If the queue type is empty queue or free queue and blocknum is not set to 0, all tuples on all pages of this queue are verified. If blocknum is set to 0, page tuples are not verified.

    • verify_type (verification type)

      physics: verifies the correctness of the physical structure of the queue.

      performance: verifies the performance of obtaining pages from the recycling queue.

    Return type: record

    Table 1 gs_verify_urq parameters

    Category         | Parameter   | Type   | Description
    -----------------+-------------+--------+------------------------------------------------------------------------------------
    Input parameter  | index_oid   | oid    | UB-tree index OID.
    Input parameter  | queue_type  | text   | Queue type.
                     |             |        | • empty queue: potential queue
                     |             |        | • free queue: available queue
                     |             |        | • single page: single page of the queue
    Input parameter  | blocknum    | bigint | Page number.
    Input parameter  | verify_type | text   | Specifies the verification type:
                     |             |        | • physics: verifies the physical structure of the queue.
                     |             |        | • performance: verifies the performance of obtaining pages from the recycling queue.
    Output parameter | verify_code | text   | Error code
    Output parameter | detail      | text   | Error description

    Example:

    openGauss=# select * from gs_verify_urq(16387,'free queue',1,'physics');
     verify_code | detail
    -------------+--------

    Currently, this interface only supports USTORE index tables and does not support partition local indexes.
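
    The documented blocknum range for the single page queue type, [0, queue file size/8192), can be checked before the function is called. The sketch below assumes that the queue file size is known to the caller and that a partial trailing page is not addressable (floor division). The helper names are hypothetical.

```python
PAGE_SIZE = 8192  # page size in bytes, taken from the documented range formula


def single_page_blocknum_range(queue_file_size: int) -> range:
    """Return the valid blocknum values for queue_type 'single page'."""
    if queue_file_size < 0:
        raise ValueError("queue file size cannot be negative")
    # Assumption: floor division, so a partial trailing page is excluded.
    return range(0, queue_file_size // PAGE_SIZE)


def is_valid_single_page_blocknum(blocknum: int, queue_file_size: int) -> bool:
    """Check a blocknum against the half-open range [0, size/8192)."""
    return blocknum in single_page_blocknum_range(queue_file_size)


if __name__ == "__main__":
    size = 3 * PAGE_SIZE  # hypothetical queue file holding three pages
    print(list(single_page_blocknum_range(size)))
    print(is_valid_single_page_blocknum(3, size))
```

    Because the range is half-open, the largest valid page number is one less than the page count.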

  • gs_repair_urq(index_oid oid)

    Description: Rebuilds the recycling queues (the potential queue and the available queue) of an index. If the repair is successful, the message "reinitial the recycle queue of index relation successfully" is displayed.

    Parameter description:

    • index_oid (UB-tree index OID)

    Return type: text

    Example:

    openGauss=# select * from gs_repair_urq(16387);
                               result
    ------------------------------------------------------------
     reinitial the recycle queue of index relation sucessfully.

    Currently, this interface only supports USTORE index tables and does not support partition local indexes.