Closed
Labels
affects-8.1: This bug affects the 8.1.x (LTS) versions.
affects-8.5: This bug affects the 8.5.x (LTS) versions.
severity/major
type/bug: The issue is confirmed as a bug.
Description
Bug Report
We set lastSlowStoreCaptureTS to the time when we first detect the slow store.
pd/pkg/schedule/schedulers/evict_slow_store.go
Lines 104 to 120 in b486e21
func (conf *evictSlowStoreSchedulerConfig) readyForRecovery() bool {
	conf.RLock()
	defer conf.RUnlock()
	recoveryDurationGap := conf.RecoveryDurationGap
	failpoint.Inject("transientRecoveryGap", func() {
		recoveryDurationGap = 0
	})
	return uint64(time.Since(conf.lastSlowStoreCaptureTS).Seconds()) >= recoveryDurationGap
}

func (conf *evictSlowStoreSchedulerConfig) setStoreAndPersist(id uint64) error {
	conf.Lock()
	defer conf.Unlock()
	conf.EvictedStores = []uint64{id}
	conf.lastSlowStoreCaptureTS = time.Now()
	return conf.save()
}
If the recovery time is shorter than the jitter duration, readyForRecovery() already returns true by the time the store stops being slow, so we transfer leaders back immediately. Instead, we should wait for the recovery time after the store is no longer slow before transferring leaders back.
The first one uses a recovery time of 10m and the second 30m; the two look similar in this case.