Updated on 2024-11-29 GMT+08:00

What Should I Do If a File System Error Is Reported and a Core Dump Occurs During Process Startup and Part Loading After a ClickHouseServer Instance Node Is Power Cycled?

Symptom

After the node is power cycled, the ClickHouseServer instance fails to restart. The relevant log entries are shown below.

A core dump occurs during the restart. The key error messages are as follows:
2023.09.11 15:34:49.085595 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:338] : (version 23.3.2.1, build id: 86C97F3EED917A2F2D9A691B4FB845F860FE7FF2) (from thread 29814) (no query) Received signal Aborted (6)
2023.09.11 15:34:49.085636 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:354] : 
2023.09.11 15:34:49.085662 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:367] : Stack trace: 0x7f2ed263a207 0x7f2ed263b8f8 0xb97032b 0x7f2ed30c7b83 0x7f2ed30c7b18 0x16de788c 0x151ccd63 0x151cf0ea 0x151cf77d 0xb7b8958 0xb7bc720 0x7f2ed29d8dd5 0x7f2ed2701ead
2023.09.11 15:34:49.085739 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 3. gsignal @ 0x36207 in /usr/lib64/libc-2.17.so
2023.09.11 15:34:49.085775 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 4. __GI_abort @ 0x378f8 in /usr/lib64/libc-2.17.so
2023.09.11 15:34:49.085820 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 5. terminate_handler() @ 0xb97032b in /opt/AA/BB/Bigdata/FusionInsight_ClickHouse_8.3.0/install/FusionInsight-ClickHouse-v23.3.2.37-lts/clickhouse/bin/clickhouse
2023.09.11 15:34:49.085854 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 6. std::__terminate(void (*)()) @ 0x99b83 in /opt/AA/BB/Bigdata_func/comp/ck/lib_lemmagen.so
2023.09.11 15:34:49.085875 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 7. std::terminate() @ 0x99b18 in /opt/AA/BB/Bigdata_func/comp/ck/lib_lemmagen.so
2023.09.11 15:34:49.085898 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 8. DB::MergeTreeData::loadOutdatedDataParts(bool) @ 0x16de788c in /opt/AA/BB/Bigdata/FusionInsight_ClickHouse_8.3.0/install/FusionInsight-ClickHouse-v23.3.2.37-lts/clickhouse/bin/clickhouse
2023.09.11 15:34:49.085920 [ 30174 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:371] : 9. DB::BackgroundSchedulePoolTaskInfo::execute() @ 0x151ccd63 in /opt/AA/BB/Bigdata/FusionInsight_ClickHouse_8.3.0/install/FusionInsight-ClickHouse-v23.3.2.37-lts/clickhouse/bin/clickhouse

The core dump is caused by the following file system error, which is raised while the process iterates over a part directory during part loading:
2023.09.11 15:34:49.084809 [ 28762 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:280] : (version 23.3.2.1, build id: 86C97F3EED917A2F2D9A691B4FB845F860FE7FF2) (from thread 29814) Terminate called for uncaught exception:
2023.09.11 15:34:49.084883 [ 28762 ] {} <Fatal> BaseDaemon [BaseDaemon.cpp:291] : std::exception. Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in directory_iterator::directory_iterator(...): Structure needs cleaning ["/srv/AA/BB/clickhouse/data1/clickhouse/store/b0b/b0b1f040-4bdb-4584-9be6-782e81fafeae/202309_46191_47131_657"]

Procedure

  1. Log in to the ClickHouseServer instance node that fails to restart and search /var/log/Bigdata/clickhouse/clickhouseServer/clickhouse-server.log for "Structure needs cleaning" to locate the damaged part directory.
  2. Go to the damaged part directory reported in the log (in this example, /srv/AA/BB/clickhouse/data1/clickhouse/store/b0b/b0b1f040-4bdb-4584-9be6-782e81fafeae/202309_46191_47131_657) and clear the 202309_46191_47131_657 directory. If it cannot be cleared, clear the upper-level directory (b0b1f040-4bdb-4584-9be6-782e81fafeae) instead.
  3. Log in to FusionInsight Manager and restart the ClickHouseServer instance.
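
The log search in step 1 can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function names are hypothetical, and the regular expression assumes the message format shown in the log excerpt above (the path in square brackets and double quotes after "Structure needs cleaning"). The script only reports paths and does not delete anything.

```python
import re
from pathlib import PurePosixPath

# Assumed format, taken from the log excerpt above:
#   ... Structure needs cleaning ["/path/to/part_directory"]
DAMAGED_PART_RE = re.compile(r'Structure needs cleaning \["([^"]+)"\]')


def find_damaged_parts(log_lines):
    """Return the unique damaged part directories, in order of first appearance."""
    found = []
    for line in log_lines:
        match = DAMAGED_PART_RE.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


def upper_level_directory(part_dir):
    """Return the upper-level directory, the fallback target named in step 2."""
    return str(PurePosixPath(part_dir).parent)


def find_damaged_parts_in_file(log_path):
    """Scan a log file; undecodable bytes are replaced rather than raising."""
    with open(log_path, errors="replace") as log_file:
        return find_damaged_parts(log_file)


if __name__ == "__main__":
    # Log path as given in step 1 of the procedure.
    log_path = "/var/log/Bigdata/clickhouse/clickhouseServer/clickhouse-server.log"
    for part_dir in find_damaged_parts_in_file(log_path):
        print("damaged part directory:", part_dir)
        print("upper-level directory: ", upper_level_directory(part_dir))
```

Run the script on the affected node with an account that can read the log, then carry out steps 2 and 3 manually using the paths it prints.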