ttl: fix the issue that TTL job may hang some time when shrink the delete worker count #55572
b47a900 to 3944d30
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #55572 +/- ##
================================================
+ Coverage 73.0193% 75.0851% +2.0658%
================================================
Files 1576 1582 +6
Lines 440864 453011 +12147
================================================
+ Hits 321916 340144 +18228
+ Misses 99271 92389 -6882
- Partials 19677 20478 +801
3944d30 to a60e2f5
…lete worker count
a60e2f5 to 49242d0
LGTM
pkg/ttl/ttlworker/task_manager.go
Outdated
// All rows are processed.
// We use `<=` instead of `==` to make the logic strong to make sure
// it also works when statistics are not accurate.
logger.Info("mark TTL task finished because all scanned rows are processed")
How about adding a warning for the case `t.statistics.ErrorRows.Load()+t.statistics.SuccessRows.Load() > t.statistics.TotalRows.Load()`, since we currently don't know when they are not equal?
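A minimal sketch of the check and the suggested warning, for illustration only. The `rowStats` type and the `allRowsProcessed` helper are hypothetical names, not the PR's code; only the three counter names come from the comment above. It assumes Go 1.19+ for `atomic.Uint64`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rowStats is a hypothetical stand-in for the task statistics
// referenced in the review comment.
type rowStats struct {
	TotalRows   atomic.Uint64
	SuccessRows atomic.Uint64
	ErrorRows   atomic.Uint64
}

// allRowsProcessed reports whether every scanned row has been handled.
// It uses `<=` so the task still finishes if the counters drift.
// overshoot is true when more rows were processed than were counted as
// scanned, which is the case the reviewer suggests warning about.
func allRowsProcessed(s *rowStats) (done bool, overshoot bool) {
	total := s.TotalRows.Load()
	processed := s.SuccessRows.Load() + s.ErrorRows.Load()
	return total <= processed, processed > total
}

func main() {
	s := &rowStats{}
	s.TotalRows.Store(500)
	s.SuccessRows.Store(498)
	s.ErrorRows.Store(3)

	done, overshoot := allRowsProcessed(s)
	if overshoot {
		fmt.Println("warn: processed rows exceed total rows; statistics may be inaccurate")
	}
	fmt.Println("finished:", done)
}
```

Returning the overshoot flag separately keeps the finish decision unchanged while letting the caller decide how loudly to log the mismatch.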
return true
}

if time.Since(t.result.time) > waitTaskProcessRowsTimeout {
Is two minutes long enough? Previously, `finished()` meant that both `scan` and `delete` were finished. If we are going to keep that behavior, it means the `delete` should have finished within 2 minutes, right?
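The timeout branch being questioned can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `finished` here is a free function with hypothetical parameters, and `waitTimeout` is set to the two minutes mentioned in the question rather than read from the real `waitTaskProcessRowsTimeout` constant.

```go
package main

import (
	"fmt"
	"time"
)

// waitTimeout is an assumed value taken from the "two minutes" in the
// question above; the real constant may differ.
const waitTimeout = 2 * time.Minute

// finished reports whether a task can be marked finished.
// total and processed are row counts; sinceScanDone is the time elapsed
// since the scan phase produced its result.
func finished(total, processed uint64, sinceScanDone time.Duration) bool {
	if total <= processed {
		// Every scanned row was deleted or failed: nothing left to wait for.
		return true
	}
	// Fallback: stop waiting once the deletes have had waitTimeout to drain.
	return sinceScanDone > waitTimeout
}

func main() {
	fmt.Println(finished(500, 500, 0))              // true: all rows processed
	fmt.Println(finished(500, 120, 30*time.Second)) // false: still waiting
	fmt.Println(finished(500, 120, 3*time.Minute))  // true: timed out
}
```

With this shape, the timeout only matters when the counters never catch up, so its length bounds how long a stuck task can hold up the job.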
Maybe I can increase it to 5 minutes. `delChan` is a channel with zero buffer size, and the scan worker is only marked as finished after the `delTask` has been emitted. That means we wait for at most `tidb_ttl_scan_batch_size` rows to be deleted. The default value of `tidb_ttl_scan_batch_size` is 500, which I think is enough for normal cases.
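The argument above rests on how an unbuffered Go channel behaves: a send does not return until a receiver has taken the value. A small sketch, assuming a single scan goroutine and a single delete loop; `delTask`, `scan`, and `runPipeline` are simplified hypothetical names, not the types in `pkg/ttl/ttlworker`.

```go
package main

import "fmt"

// delTask is a hypothetical batch of rows handed from the scan side
// to the delete side.
type delTask struct{ rows []int }

// scan emits rows in batches of at most batchSize over delCh.
// With an unbuffered channel, each send blocks until a receiver has
// taken the batch, so scan cannot return while a batch is unclaimed.
// batchSize must be positive.
func scan(rows []int, batchSize int, delCh chan<- delTask) {
	for start := 0; start < len(rows); start += batchSize {
		end := start + batchSize
		if end > len(rows) {
			end = len(rows)
		}
		delCh <- delTask{rows: rows[start:end]}
	}
	close(delCh)
}

// runPipeline wires scan to a single delete loop and returns how many
// rows were handled and the largest batch the delete side received.
func runPipeline(totalRows, batchSize int) (deleted, maxBatch int) {
	rows := make([]int, totalRows)
	delCh := make(chan delTask) // zero capacity, as described above
	go scan(rows, batchSize, delCh)
	for task := range delCh {
		if len(task.rows) > maxBatch {
			maxBatch = len(task.rows)
		}
		deleted += len(task.rows)
	}
	return deleted, maxBatch
}

func main() {
	deleted, maxBatch := runPipeline(1200, 500)
	fmt.Println("deleted:", deleted, "largest batch:", maxBatch)
}
```

Because nothing queues in the channel, the only rows still outstanding when the scan side finishes are those in the last batch a delete worker picked up, which is why the wait is bounded by the batch size.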
LGTM
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bb7133, YangKeao
3b299ac to 4b35fc9
4b35fc9 to 8cbad54
/retest
Signed-off-by: ti-chi-bot <[email protected]>
In response to a cherrypick label: new pull request created to branch
Signed-off-by: ti-chi-bot <[email protected]>
In response to a cherrypick label: new pull request created to branch
In response to a cherrypick label: new pull request created to branch
What problem does this PR solve?
Issue Number: close #55561
What changed and how does it work?
ErrorRows
Check List
Tests
Side effects
Documentation
Release note