Skip to content

TiDB Down Occurs When TiDB Sets tiflash Replica During Upgrade #57863

@apollodafoni

Description

@apollodafoni

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. alter table tiflash replica
  2. During alter , use tiup upgrade from v8.4.0 to v8.5.0-pre

2. What did you expect to see? (Required)

upgrade success

3. What did you see instead (Required)

Image
[2024/12/02 10:14:20.198 +08:00] [ERROR] [advancer.go:400] ["listen task meet error, would reopen."] [error=EOF]
[2024/12/02 10:14:20.323 +08:00] [ERROR] [tso_client.go:366] ["[tso] update connection contexts failed"] [dc=global] [error="rpc error: code = Canceled desc = context canceled"]
[2024/12/02 10:19:37.115 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.217 +08:00] [ERROR] [controller.go:623] ["[resource group controller] token bucket rpc error"] [error="rpc error: code = Unavailable desc = not leader"]
[2024/12/02 10:19:37.220 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.271 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.324 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.375 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.427 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.531 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.573 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.634 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.738 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.842 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.970 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:38.037 +08:00] [ERROR] [client.go:252] ["[pd] request failed with a non-200 status"] [caller-id=pd-http-client] [name=GetMinResolvedTSByStoresIDs] [uri=/pd/api/v1/min-resolved-ts] [method=GET] [target-url=] [source=tikv-driver] [url=http://pd-1-peer:2379/pd/api/v1/min-resolved-ts] [status="500 Internal Server Error"] [body="[PD:apiutil:ErrRedirectToNotLeader]redirect to not leader"]
[2024/12/02 10:19:38.074 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:38.074 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:531] ["start status/rpc server error"] [error="accept tcp [::]:10080: use of closed network connection"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:526] ["http server error"] [error="http: Server closed"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:521] ["grpc server error"] [error="mux: server closed"]
[2024/12/02 10:21:09.947 +08:00] [ERROR] [advancer.go:400] ["listen task meet error, would reopen."] [error=EOF]
[2024/12/02 10:21:21.774 +08:00] [ERROR] [session.go:3935] ["set system variable max_allowed_packet error"] [error="[variable:1621]SESSION variable 'max_allowed_packet' is read-only. Use SET GLOBAL to assign the value"]
[2024/12/02 10:21:25.985 +08:00] [ERROR] [ddl_tiflash_api.go:538] ["updating TiFlash replica status err"] [category=ddl] [error="context canceled"] [tableID=131] [isPartition=true]

4. What is your TiDB version? (Required)

v8.4.0

Metadata

Metadata

Assignees

Labels

affects-7.1This bug affects the 7.1.x(LTS) versions.affects-7.5This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.component/ddlThis issue is related to DDL of TiDB.severity/majortype/bugThe issue is confirmed as a bug.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions