Skip to content

TiFlash failed to restart with "Invalid page id, entry not exist [page_id=...]" #9714

@JaySon-Huang

Description

@JaySon-Huang

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

[2024/12/10 00:04:02.093 +08:00] [INFO] [BaseDaemon.cpp:1312] ["Received termination signal (Terminated)"] [source=Application] [thread_id=977]
...
[2024/12/10 00:04:03.027 +08:00] [INFO] [Server.cpp:1589] ["Shutting down storages."] [thread_id=1]
...
[2024/12/10 00:04:03.050 +08:00] [INFO] [LocalIndexerScheduler.cpp:170] ["Removed 0 tasks, keyspace_id=4294967295 table_id=451"] [thread_id=1]
[2024/12/10 00:04:03.050 +08:00] [INFO] [RegionTable.cpp:152] ["remove table from RegionTable success, keyspace=4294967295 table_id=451"] [thread_id=1]
...
[2024/12/10 00:04:03.065 +08:00] [INFO] [LocalIndexerScheduler.cpp:69] ["LocalIndexerScheduler is destroying. Waiting scheduler and tasks to finish..."] [thread_id=1]
[2024/12/10 00:05:28.470 +08:00] [INFO] [DMFileV3IncrementWriter.cpp:104] ["Write incremental update for DMFile, local_path=/data1/gengliqi/tidb/data/tiflash-9000/data/t_451/stable/dmf_13172 dmfile_path=/data1/gengliqi/tidb/data/tiflash-9000/data/t_451/stable/dmf_13172 old_meta_version=0 new_meta_version=1"] [thread_id=403]
[2024/12/10 00:05:28.474 +08:00] [INFO] [DeltaMergeStore_InternalSegment.cpp:772] ["EnsureStableLocalIndex - Finish building index, dm_files=[dmf_13172(v=0)]"] [source="keyspace=4294967295 table_id=451 segmentEnsureStableLocalIndex source_segment=<segment_id=8822 epoch=4 range=[249948,499989)>"] [thread_id=403]
[2024/12/10 00:05:28.479 +08:00] [INFO] [DeltaMergeStore_InternalSegment.cpp:690] ["SegmentUpdateMeta - Finish, old_segment=<segment_id=8822 epoch=4 range=[249948,499989)> new_segment=<segment_id=8822 epoch=5 range=[249948,499989)>"] [source="keyspace=4294967295 table_id=451"] [thread_id=403]
[2024/12/10 00:05:28.479 +08:00] [INFO] [DeltaMergeStore.cpp:372] ["Release DeltaMerge Store start"] [source="keyspace=4294967295 table_id=451"] [thread_id=403]
[2024/12/10 00:05:28.479 +08:00] [INFO] [DeltaMergeStore.cpp:376] ["Release DeltaMerge Store end"] [source="keyspace=4294967295 table_id=451"] [thread_id=403]
[2024/12/10 00:05:28.479 +08:00] [INFO] [LocalIndexerScheduler.cpp:85] ["LocalIndexerScheduler is destroyed"] [thread_id=1]
[2024/12/10 00:05:28.918 +08:00] [INFO] [KVStore.cpp:443] ["Destroy KVStore"] [thread_id=1]
[2024/12/10 00:05:28.918 +08:00] [INFO] [ReadIndex.cpp:387] ["KVStore shutdown, deleting read index worker"] [thread_id=1]
[2024/12/10 00:05:28.919 +08:00] [INFO] [KVStore.cpp:445] ["Destroy KVStore Finished"] [thread_id=1]
[2024/12/10 00:05:28.924 +08:00] [INFO] [JointThreadAllocInfo.cpp:194] ["Stop collecting thread alloc metrics"] [thread_id=1]
[2024/12/10 00:05:28.931 +08:00] [INFO] [Server.cpp:1103] ["Unlink tiflash_instance_wrap.tmt"] [thread_id=1]
[2024/12/10 00:05:28.932 +08:00] [INFO] [Server.cpp:1072] ["Let tiflash proxy shutdown"] [thread_id=1]
[2024/12/10 00:05:28.932 +08:00] [INFO] [Server.cpp:1075] ["Wait for tiflash proxy thread to join"] [thread_id=1]
[2024/12/10 00:05:29.031 +08:00] [INFO] [Server.cpp:1077] ["tiflash proxy thread is joined"] [thread_id=1]
[2024/12/10 00:05:29.031 +08:00] [INFO] [<unknown>] ["shutting down"] [source=Application] [thread_id=1]
[2024/12/10 00:05:29.031 +08:00] [INFO] [BaseDaemon.cpp:303] ["Stop SignalListener thread"] [source=BaseDaemon] [thread_id=977]

-- next, tiflash fails to restart
[2024/12/10 00:05:29.502 +08:00] [INFO] [BaseDaemon.cpp:1174] ["Welcome to TiFlash"] [thread_id=1]
...
[2024/12/10 00:05:37.345 +08:00] [ERROR] [Exception.cpp:96] ["Code: 10015, e.displayText() = DB::Exception: Invalid page id, entry not exist [page_id=451.26] [resolve_id=451.26], e.what() = DB::Exception, Stack trace:
  0x55e8fe90b3b3    StackTrace::StackTrace() [tiflash+34321331]
                    dbms/src/Common/StackTrace.cpp:23
  0x55e8fe87f805    DB::Exception::Exception<DB::UInt128 const&, DB::UInt128&>(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::UInt128 const&, DB::UInt128&) [tiflash+33748997]
                    dbms/src/Common/Exception.h:46
  0x55e8fe87f0a1    DB::PS::V3::PageDirectory<DB::PS::V3::u128::PageDirectoryTrait>::getByIDImpl(DB::UInt128 const&, std::__1::shared_ptr<DB::PS::V3::PageDirectorySnapshot> const&, bool) const [tiflash+33747105]
                    dbms/src/Storages/Page/V3/PageDirectory.cpp:1180
  0x55e90520fe90    DB::PS::V3::PageStorageImpl::readImpl(unsigned long, unsigned long, std::__1::shared_ptr<DB::ReadLimiter> const&, std::__1::shared_ptr<DB::PageStorageSnapshot>, bool) [tiflash+144440976]
                    dbms/src/Storages/Page/V3/PageDirectory.h:463
  0x55e9051b5e75    DB::PageReaderImplNormal::read(unsigned long) const [tiflash+144072309]
                    dbms/src/Storages/Page/PageStorage.h:154
  0x55e903f03374    DB::DM::StableValueSpace::restore(DB::DM::DMContext&, unsigned long) [tiflash+124466036]
                    dbms/src/Storages/Page/PageStorage.cpp:551
  0x55e903e51c3f    DB::DM::Segment::restoreSegment(std::__1::shared_ptr<DB::Logger> const&, DB::DM::DMContext&, unsigned long) [tiflash+123739199]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:427
  0x55e903dd14df    DB::DM::DeltaMergeStore::DeltaMergeStore(DB::Context&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int, long, long, bool, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine>> const&, DB::DM::ColumnDefine const&, bool, unsigned long, std::__1::shared_ptr<std::__1::vector<DB::DM::LocalIndexInfo, std::__1::allocator<DB::DM::LocalIndexInfo>>>, DB::DM::DeltaMergeStore::Settings const&, DB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>*) [tiflash+123213023]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:311
  0x55e903dd2751    DB::DM::DeltaMergeStore::create(DB::Context&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int, long, long, bool, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine>> const&, DB::DM::ColumnDefine const&, bool, unsigned long, std::__1::shared_ptr<std::__1::vector<DB::DM::LocalIndexInfo, std::__1::allocator<DB::DM::LocalIndexInfo>>>, DB::DM::DeltaMergeStore::Settings const&, DB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>*) [tiflash+123217745]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:349
  0x55e904ba8d5d    DB::StorageDeltaMerge::getAndMaybeInitStore(DB::ThreadPoolImpl<DB::ThreadFromGlobalPoolImpl<false>>*) [tiflash+137727325]
                    dbms/src/Storages/StorageDeltaMerge.cpp:1824
  0x55e904baf445    DB::StorageDeltaMerge::getSchemaSnapshotAndBlockForDecoding(DB::TableDoubleLockHolder<false> const&, bool, bool) [tiflash+137753669]
                    dbms/src/Storages/StorageDeltaMerge.cpp:1184
  0x55e9059d458a    bool DB::atomicReadWrite<std::__1::vector<DB::RegionDataReadInfo, std::__1::allocator<DB::RegionDataReadInfo>>>(DB::AtomicReadWriteCtx&, DB::RegionPtrWithBlock const&, std::__1::vector<DB::RegionDataReadInfo, std::__1::allocator<DB::RegionDataReadInfo>>&, bool) [tiflash+152585610]
                    dbms/src/Storages/KVStore/Decode/PartitionStreams.cpp:162
  0x55e9059d1d0f    std::__1::optional<DB::DM::RaftWriteResult> DB::writeRegionDataToStorage<std::__1::vector<DB::RegionDataReadInfo, std::__1::allocator<DB::RegionDataReadInfo>>>(DB::Context&, DB::RegionPtrWithBlock const&, std::__1::vector<DB::RegionDataReadInfo, std::__1::allocator<DB::RegionDataReadInfo>>&, std::__1::shared_ptr<DB::Logger> const&) [tiflash+152575247]
                    dbms/src/Storages/KVStore/Decode/PartitionStreams.cpp:240
  0x55e9059d17b9    DB::RegionTable::writeCommittedByRegion(DB::Context&, DB::RegionPtrWithBlock const&, std::__1::vector<DB::RegionDataReadInfo, std::__1::allocator<DB::RegionDataReadInfo>>&, std::__1::shared_ptr<DB::Logger> const&, bool) [tiflash+152573881]
                    dbms/src/Storages/KVStore/Decode/PartitionStreams.cpp:455
  0x55e905944e2d    DB::Region::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, DB::TMTContext&) [tiflash+151997997]
                    dbms/src/Storages/KVStore/MultiRaft/RaftCommands.cpp:481
  0x55e90594bd37    DB::KVStore::handleWriteRaftCmdInner(DB::WriteCmdsView const&, unsigned long, unsigned long, unsigned long, DB::TMTContext&, std::__1::optional<DB::DM::RaftWriteResult>&) [tiflash+152026423]
                    dbms/src/Storages/KVStore/MultiRaft/RaftCommandsKVS.cpp:62
  0x55e90594bba4    DB::KVStore::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, unsigned long, DB::TMTContext&) [tiflash+152026020]
                    dbms/src/Storages/KVStore/MultiRaft/RaftCommandsKVS.cpp:41
  0x55e9058df385    HandleWriteRaftCmd [tiflash+151581573]
                    dbms/src/Storages/KVStore/FFI/ProxyFFI.cpp:98
  0x7f9972a764f9    _$LT$engine_store_ffi..observer..TiFlashObserver$LT$T$C$ER$GT$$u20$as$u20$raftstore..coprocessor..QueryObserver$GT$::post_exec_query::h5090a11627e80af3 [libtiflash_proxy.so+65496313]
                    contrib/tiflash-proxy/proxy_components/engine_store_ffi/src/observer.rs:155
  0x7f9973d5708d    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::apply_raft_cmd::h88d847f439f38129 [libtiflash_proxy.so+85291149]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:1554
  0x7f9973d69f7e    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::process_raft_cmd::hbb5bc0795bcc376e [libtiflash_proxy.so+85368702]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:1421
  0x7f9973d6c93c    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::handle_raft_committed_entries::h6c08a3f47d937d5e [libtiflash_proxy.so+85379388]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:1171
  0x7f9973d4661d    raftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_apply::h37b20d9fc18dab9a [libtiflash_proxy.so+85222941]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:4104
  0x7f9973d4aa64    raftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_tasks::h9336d0ec4f241ecd [libtiflash_proxy.so+85240420]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:4458
  0x7f9972bf39a8    _$LT$raftstore..store..fsm..apply..ApplyPoller$LT$EK$GT$$u20$as$u20$batch_system..batch..PollHandler$LT$raftstore..store..fsm..apply..ApplyFsm$LT$EK$GT$$C$raftstore..store..fsm..apply..ControlFsm$GT$$GT$::handle_normal::h4f77773aacd5b300 [libtiflash_proxy.so+67058088]
                    contrib/tiflash-proxy/components/raftstore/src/store/fsm/apply.rs:4757
  0x7f9972b0e513    batch_system::batch::Poller$LT$N$C$C$C$Handler$GT$::poll::h19a440dfc173ad0f [libtiflash_proxy.so+66118931]
                    contrib/tiflash-proxy/components/batch-system/src/batch.rs:422
  0x7f9972c634af    std::sys_common::backtrace::__rust_begin_short_backtrace::hafedf41f2f45404d [libtiflash_proxy.so+67515567]
                    /rustc/89e2160c4ca5808657ed55392620ed1dbbce78d1/library/std/src/sys_common/backtrace.rs:155
  0x7f9972cbe299    core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h7dcbd8571c961cd2 [libtiflash_proxy.so+67887769]
                    /rustc/89e2160c4ca5808657ed55392620ed1dbbce78d1/library/core/src/ops/function.rs:250
  0x7f997467fa26    std::sys::unix::thread::Thread::new::thread_start::hd01330a228cdd35d [libtiflash_proxy.so+94894630]
                    /rustc/89e2160c4ca5808657ed55392620ed1dbbce78d1/library/std/src/sys/unix/thread.rs:108
  0x7f996e889d22    start_thread [libc.so.6+564514]
  0x7f996e90ed40    __GI___clone3 [libc.so.6+1109312]"] [source="DB::DM::DeltaMergeStore::DeltaMergeStore(Context &, bool, const String &, const String &, KeyspaceID, TableID, ColumnID, bool, const ColumnDefines &, const ColumnDefine &, bool, size_t, LocalIndexInfosPtr, const Settings &, ThreadPool *)"] [thread_id=616]
[2024/12/10 00:05:37.345 +08:00] [ERROR] [RaftCommands.cpp:506] ["[region_id=119032 applied_term=6 applied_index=1720] catch exception: Invalid page id, entry not exist [page_id=451.26] [resolve_id=451.26], while applying `RegionTable::writeCommittedByRegion` on [term 6, index 1721], entries PUT|write|7480000000000001FFC35F728000000000FF01B7330000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01B7330000000000FA:PUT|write|7480000000000001FFC35F728000000000FF01BA7F0000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01BA7F0000000000FA:PUT|write|7480000000000001FFC35F728000000000FF01BE090000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01BE090000000000FA:PUT|write|7480000000000001FFC35F728000000000FF01E3420000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01E3420000000000FA:PUT|write|7480000000000001FFC35F728000000000FF01FE400000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01FE400000000000FA:PUT|write|7480000000000001FFC35F728000000000FF01FECE0000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF01FECE0000000000FA:PUT|write|7480000000000001FFC35F728000000000FF020F610000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF020F610000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0211440000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0211440000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0216600000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0216600000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0223100000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0223100000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0235E40000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0235E40000000000FA:PUT|write|7480000000000001FFC35F728000000000FF02362D0000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF02362D0000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0247B60000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0247B60000000000FA:PUT|write|7480000000000001FFC35F728000000000FF024D340000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF024D340000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0252570000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0252570000000000FA:PUT|write|7480000000000001FFC35F728000000000FF026DF80000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF026DF80000000000FA:PUT|write|7480000000000001FFC35F728000000000FF026F390000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF026F390000000000FA:PUT|write|7480000000000001FFC35F728000000000FF027C890000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF027C890000000000FA:PUT|write|7480000000000001FFC35F728000000000FF027FF00000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF027FF00000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0287D70000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0287D70000000000FA:PUT|write|7480000000000001FFC35F728000000000FF0288A40000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF0288A40000000000FA:PUT|write|7480000000000001FFC35F728000000000FF028C000000000000FAF9B1B5C6AE6BFFFD:DEL|lock|7480000000000001FFC35F728000000000FF028C000000000000FA"] [thread_id=616]

4. What is your TiFlash version? (Required)

nightly

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions