tiancaiamao
Contributor

What problem does this PR solve?

Issue Number: close #59400

Problem Summary:

After a NotifyPrivilegeUpdate() call, the notification from etcd arrives with a lag.

During a batch of privilege modification operations, these laggy events mean that even after the workload is done,
TiDB still spends a long time handling them.
This uses 100% of one CPU core even though there are no other events.

What changed and how does it work?

Wait briefly to collect a batch of the laggy events, then handle them together.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Before:
image

After:
image

As you can see, with this commit the CPU usage drops to 0 immediately after the workload finishes.

  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 6, 2025

tiprow bot commented Mar 6, 2025

Hi @tiancaiamao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Mar 6, 2025

Codecov Report

Attention: Patch coverage is 36.00000% with 16 lines in your changes missing coverage. Please review.

Project coverage is 75.1911%. Comparing base (e6b4d95) to head (cfee950).
Report is 16 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #59934        +/-   ##
================================================
+ Coverage   72.9252%   75.1911%   +2.2658%     
================================================
  Files          1701       1749        +48     
  Lines        469722     481030     +11308     
================================================
+ Hits         342546     361692     +19146     
+ Misses       106059      96770      -9289     
- Partials      21117      22568      +1451     
Flag Coverage Δ
integration 49.2611% <36.0000%> (?)
unit 72.4978% <36.0000%> (+0.3613%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.6910% <ø> (ø)
parser ∅ <ø> (∅)
br 64.2184% <ø> (+19.3879%) ⬆️

func ignoreBatchData(ch clientv3.WatchChan) {
	ticker := time.NewTicker(5 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 1024; i++ {
Contributor

Does 1024 here mean maxBatchSize? Using a const might be more reasonable.

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Mar 6, 2025
@tiancaiamao
Contributor Author

/retest

tiprow bot commented Mar 6, 2025

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

select {
case resp, ok := <-ch:
	if !ok {
		return event
Member

Is it OK to return the event here?
The later use may be wrong: err := privReloadEvent(do.privHandle, &event)

Contributor Author

This function focuses only on the batching functionality, and the first input event is a valid one passed in by the caller.
The closing of the channel is handled by reloadPrivilegeLoad rather than by this function.
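
To illustrate why returning the event on a closed channel can be safe, here is a small sketch. It assumes a caller loop that does its own receive and checks for the close; the names batch and consume are hypothetical and a plain int channel stands in for the etcd watch channel. Receiving from a closed Go channel always yields ok == false, so the helper seeing the close does not hide it from the caller.

```go
package main

import "fmt"

// batch is a hypothetical helper: it drains whatever is already buffered in ch
// without blocking and returns the latest value, starting from first.
func batch(ch <-chan int, first int) int {
	last := first
	for {
		select {
		case v, ok := <-ch:
			if !ok {
				// Closed: return the last valid event. The close itself is
				// left for the caller to observe on its next receive.
				return last
			}
			last = v
		default:
			return last
		}
	}
}

// consume is a hypothetical caller loop. It records one handled event per
// batch and is the only place that reacts to the channel being closed.
func consume(ch <-chan int) []int {
	var handled []int
	for {
		v, ok := <-ch
		if !ok {
			return handled // close is handled here, not in batch
		}
		handled = append(handled, batch(ch, v))
	}
}

func main() {
	ch := make(chan int, 3)
	ch <- 1
	ch <- 2
	ch <- 3
	close(ch)
	fmt.Println(consume(ch)) // prints [3]
}
```

In this sketch the event returned by batch is always either the caller's own first event or a later one that was actually received, so the subsequent reload never operates on a zero value.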

@tiancaiamao tiancaiamao requested a review from wjhuang2016 March 7, 2025 06:08

ti-chi-bot bot commented Mar 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Defined2014, wjhuang2016

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm approved and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 7, 2025

ti-chi-bot bot commented Mar 7, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-03-06 03:42:19.67105816 +0000 UTC m=+499452.799977903: ☑️ agreed by Defined2014.
  • 2025-03-07 07:13:16.075371656 +0000 UTC m=+598509.204291386: ☑️ agreed by wjhuang2016.

@hawkingrei
Member

/retest

@Defined2014
Contributor

/retest

@ti-chi-bot ti-chi-bot bot merged commit 0a109e9 into pingcap:master Mar 8, 2025
26 checks passed
@tiancaiamao tiancaiamao deleted the issue59400 branch March 8, 2025 15:08
zeminzhou pushed a commit to zeminzhou/tidb that referenced this pull request May 6, 2025
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The load privilege event is laggy after the workload gone
4 participants