crazycs520
Contributor

What problem does this PR solve?

Issue Number: close #53428

Problem Summary: This PR fixes the situation mentioned in #53428 (comment).

Loading multiple schema diffs at a time causes information schema cache misses, which may in turn cause a 10x latency spike for stale-read queries.
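
To illustrate the problem, here is a minimal sketch of how caching only the final version of a batch of diffs leaves holes. It assumes a simplified cache keyed by exact schema version; `schemaCache`, `loadDiffs`, and `countMisses` are hypothetical names for illustration and are not TiDB's actual infoschema cache API.

```go
package main

import "fmt"

// schemaCache is a hypothetical, simplified cache keyed by schema version.
// It is an illustration only, not TiDB's actual infoschema cache.
type schemaCache struct {
	byVersion map[int64]struct{}
}

func newSchemaCache() *schemaCache {
	return &schemaCache{byVersion: make(map[int64]struct{})}
}

func (c *schemaCache) insert(version int64) {
	c.byVersion[version] = struct{}{}
}

func (c *schemaCache) has(version int64) bool {
	_, ok := c.byVersion[version]
	return ok
}

// loadDiffs simulates applying the diffs for versions from+1 through to.
// With cacheEach false, only the final version is cached (leaving holes);
// with cacheEach true, every intermediate version is cached as well.
func loadDiffs(c *schemaCache, from, to int64, cacheEach bool) {
	for v := from + 1; v <= to; v++ {
		if cacheEach || v == to {
			c.insert(v)
		}
	}
}

// countMisses counts the versions in [lo, hi] that are absent from the cache.
func countMisses(c *schemaCache, lo, hi int64) int {
	misses := 0
	for v := lo; v <= hi; v++ {
		if !c.has(v) {
			misses++
		}
	}
	return misses
}

func main() {
	sparse, dense := newSchemaCache(), newSchemaCache()
	sparse.insert(10)
	dense.insert(10)
	loadDiffs(sparse, 10, 13, false)
	loadDiffs(dense, 10, 13, true)
	fmt.Println("misses with holes:", countMisses(sparse, 10, 13))
	fmt.Println("misses when continuous:", countMisses(dense, 10, 13))
}
```

In this sketch, a reader that needs version 11 or 12 misses the sparse cache and would have to fall back to loading the snapshot schema from storage, while the continuous cache serves every version in the range.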

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: crazycs520 <[email protected]>
…se information schema cache miss

Signed-off-by: crazycs520 <[email protected]>

ti-chi-bot bot commented May 28, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels May 28, 2024
@crazycs520 crazycs520 marked this pull request as ready for review May 28, 2024 08:17
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels May 28, 2024

tiprow bot commented May 28, 2024

Hi @crazycs520. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: crazycs520 <[email protected]>
@crazycs520 crazycs520 requested a review from tiancaiamao May 28, 2024 08:24
@ti-chi-bot ti-chi-bot added needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels May 28, 2024
Signed-off-by: crazycs520 <[email protected]>

codecov bot commented May 28, 2024

Codecov Report

❌ Patch coverage is 88.23529% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.1719%. Comparing base (68d1295) to head (e16fac6).
⚠️ Report is 3041 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #53620        +/-   ##
================================================
+ Coverage   72.4357%   75.1719%   +2.7361%     
================================================
  Files          1507       1510         +3     
  Lines        430587     434170      +3583     
================================================
+ Hits         311899     326374     +14475     
+ Misses        99369      87334     -12035     
- Partials      19319      20462      +1143     
| Flag | Coverage Δ |
|---|---|
| integration | 50.6400% <64.7058%> (?) |
| unit | 72.0269% <88.2352%> (-0.0007%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
|---|---|
| dumpling | ∅ <ø> (∅) |
| parser | ∅ <ø> (∅) |
| br | 61.0813% <ø> (+19.0770%) ⬆️ |

Signed-off-by: crazycs520 <[email protected]>
Comment on lines +484 to +488
		if i != len(diffs)-1 {
			// If load multiple schema diffs, we need to insert each schema version into cache,
			// to make sure the schema version in cache is continuous, otherwise the cache will have holes,
			// then stale-read query will meet schema cache miss, then load snapshot schema from TiKV,
			// then the TiKV which store schema will become the hot spot.
Contributor

@D3Hunter D3Hunter May 28, 2024

This might make creating tables/databases (say, 1M tables), or maintenance windows in which the customer executes many DDLs, much slower, and it might take more memory. We need more testing on it.

Stale read does NOT always exist in customer scenarios, and even if it does, in this case we should load that schema metadata on demand. During the load phase the QPS might be lower, but after that it's OK. I think that's acceptable.

Contributor

He is fixing infoschema v1, FYI @D3Hunter

Contributor

D3Hunter commented May 28, 2024

/hold

We need to discuss the memory footprint and DDL performance impact of this change, and evaluate whether the change is necessary.

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2024
  actions := make([]uint64, 0, len(diffs))
  diffTypes := make([]string, 0, len(diffs))
- for _, diff := range diffs {
+ for i, diff := range diffs {
Contributor

Why not wrap the body of the old for loop in a function, and call it len(diffs)-1 times?

Like:

for _, diff := range diffs {
    builder := NewBuilder()
    builder.ApplyDiff()
    builder.Build()   // or BuildWithoutUpdateBundle
}

Contributor Author

I think it would be more complicated to do that.

Contributor

Not really. Splitting the code into small functions improves readability.

Contributor Author

I don't need multiple builders here; maybe it's better to extract an infoschema.clone function to use here.
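
For reference, the single-builder-plus-clone idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `infoSchema`, `schemaDiff`, `clone`, and `applyDiffs` are hypothetical, simplified names and do not reflect the real infoschema types or the code in this PR. Only the `i != len(diffs)-1` check mirrors the loop under review.

```go
package main

import "fmt"

// infoSchema is a hypothetical, heavily simplified schema snapshot.
type infoSchema struct {
	version int64
	tables  map[string]struct{}
}

// clone returns a copy whose table set is independent of the receiver.
func (s *infoSchema) clone() *infoSchema {
	tables := make(map[string]struct{}, len(s.tables))
	for name := range s.tables {
		tables[name] = struct{}{}
	}
	return &infoSchema{version: s.version, tables: tables}
}

// schemaDiff is a hypothetical diff that creates one table at a given version.
type schemaDiff struct {
	version     int64
	createTable string
}

// applyDiffs applies every diff to one working copy and returns one snapshot
// per schema version, so a cache fed with the result has no version holes.
func applyDiffs(base *infoSchema, diffs []schemaDiff) []*infoSchema {
	if len(diffs) == 0 {
		return nil
	}
	snapshots := make([]*infoSchema, 0, len(diffs))
	cur := base.clone()
	for i, d := range diffs {
		cur.tables[d.createTable] = struct{}{}
		cur.version = d.version
		if i != len(diffs)-1 {
			// Intermediate version: keep an isolated copy of the current state.
			snapshots = append(snapshots, cur.clone())
		}
	}
	// The last version reuses the working copy, so it needs no extra clone.
	return append(snapshots, cur)
}

func main() {
	base := &infoSchema{version: 1, tables: map[string]struct{}{"t0": {}}}
	diffs := []schemaDiff{{2, "t1"}, {3, "t2"}, {4, "t3"}}
	for _, s := range applyDiffs(base, diffs) {
		fmt.Printf("version %d has %d tables\n", s.version, len(s.tables))
	}
}
```

Note that each intermediate clone in this sketch copies the whole table set, which is the kind of memory and DDL-time cost raised in the /hold comment above. A real implementation would need to measure or avoid that cost.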


ti-chi-bot bot commented May 30, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tiancaiamao
Once this PR has been reviewed and has the lgtm label, please assign tangenta for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label May 30, 2024

ti-chi-bot bot commented May 30, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-05-30 12:46:42.384384093 +0000 UTC m=+2953356.141519666: ☑️ agreed by tiancaiamao.

@ti-chi-bot ti-chi-bot added the needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. label Jun 5, 2024

ti-chi-bot bot commented Sep 6, 2024

PR needs rebase.

@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2024
@ti-chi-bot ti-chi-bot bot added needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. and removed needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. labels Nov 13, 2024

ti-chi-bot bot commented Aug 21, 2025

@crazycs520: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-lightning-integration-test | e16fac6 | link | true | /test pull-lightning-integration-test |
| pull-integration-e2e-test | e16fac6 | link | true | /test pull-integration-e2e-test |
| pull-integration-realcluster-test-next-gen | e16fac6 | link | true | /test pull-integration-realcluster-test-next-gen |

Labels
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • needs-1-more-lgtm: Indicates a PR needs 1 more LGTM.
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • needs-cherry-pick-release-7.5: Should cherry pick this PR to release-7.5 branch.
  • needs-cherry-pick-release-8.1: Should cherry pick this PR to release-8.1 branch.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • release-note-none: Denotes a PR that doesn't merit a release note.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stale-read query latency 10x spike cause by information schema cache miss
5 participants