infoschema: fix issue of load multiple schema diff at a time will cause information schema cache miss #53620

crazycs520 · 2024-05-28T08:17:17Z

What problem does this PR solve?

Issue Number: close #53428

Problem Summary: This PR fixes the situation described in #53428 (comment).

Loading multiple schema diffs at a time causes information schema cache misses, which may in turn cause a 10x latency spike for stale-read queries.
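
To make the failure mode concrete, here is a minimal sketch of a version-keyed cache with a hole in it. It is a toy model, not TiDB's actual cache code: the `entry` and `cache` types, the `lookupByTS` function, and the rule that an older entry is trusted only when the next cached version is exactly one higher are all assumptions made for illustration.

```go
package main

import "fmt"

// entry is a hypothetical cached schema snapshot: a schema version plus the
// timestamp at which that version took effect.
type entry struct {
	version int64
	ts      uint64
}

// cache holds entries sorted by version, newest first (toy model).
type cache []entry

// lookupByTS returns the version valid at ts. An older entry is trusted only
// when its successor in the cache is exactly version+1; otherwise an unknown
// version may sit in the gap, so the lookup reports a miss.
func (c cache) lookupByTS(ts uint64) (int64, bool) {
	for i, e := range c {
		if e.ts > ts {
			continue // this version took effect after the requested timestamp
		}
		if i == 0 {
			return e.version, true // newest entry, nothing newer to rule out
		}
		if c[i-1].version == e.version+1 {
			return e.version, true // no gap between e and its successor
		}
		return 0, false // hole in the version sequence
	}
	return 0, false
}

func main() {
	// Versions 11..13 were loaded in one batch, but only 13 was cached.
	holed := cache{{13, 130}, {10, 100}}
	// Every version was cached as it was applied.
	full := cache{{13, 130}, {12, 120}, {11, 110}, {10, 100}}

	_, ok := holed.lookupByTS(105)
	fmt.Println("holed cache hit:", ok)
	v, ok := full.lookupByTS(105)
	fmt.Println("continuous cache hit:", ok, "version:", v)
}
```

In this model, a read at timestamp 105 misses on the holed cache because versions 11 and 12 are absent, while the continuous cache can answer with version 10.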

What changed and how does it work?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Signed-off-by: crazycs520 <[email protected]>

ti-chi-bot · 2024-05-28T08:17:20Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

tiprow · 2024-05-28T08:17:40Z

Hi @crazycs520. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov · 2024-05-28T08:48:59Z

Codecov Report

❌ Patch coverage is 88.23529% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.1719%. Comparing base (68d1295) to head (e16fac6).
⚠️ Report is 3041 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #53620        +/-   ##
================================================
+ Coverage   72.4357%   75.1719%   +2.7361%     
================================================
  Files          1507       1510         +3     
  Lines        430587     434170      +3583     
================================================
+ Hits         311899     326374     +14475     
+ Misses        99369      87334     -12035     
- Partials      19319      20462      +1143

Flag	Coverage Δ
integration	`50.6400% <64.7058%> (?)`
unit	`72.0269% <88.2352%> (-0.0007%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`∅ <ø> (∅)`
parser	`∅ <ø> (∅)`
br	`61.0813% <ø> (+19.0770%)`	⬆️

D3Hunter · 2024-05-28T14:26:45Z

pkg/domain/domain.go

+		if i != len(diffs)-1 {
+			// If load multiple schema diffs, we need to insert each schema version into cache,
+			// to make sure the schema version in cache is continuous, otherwise the cache will have holes,
+			// then stale-read query will meet schema cache miss, then load snapshot schema from TiKV,
+			// then the TiKV which store schema will become the hot spot.


This might make creating tables/databases (say, 1M tables), or maintenance windows in which a customer executes many DDLs, much slower, and it might take more memory. We need more testing on it.

Stale read does NOT always exist in customer scenarios, and even when it does, in this case we should load the schema meta on demand. During the load phase the QPS might be lower, but after that it's OK. I think that's acceptable.

He is fixing infoschema v1, FYI @D3Hunter
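
The on-demand alternative raised in this thread can be sketched as a cache that fills a missing version only when a reader asks for it. This is a simplified illustration under stated assumptions, not code from the PR: `snapshot`, `onDemandCache`, `newOnDemandCache`, and the `load` callback (standing in for a slow read from storage) are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// snapshot stands in for a fully built schema at one version (hypothetical).
type snapshot struct{ version int64 }

// onDemandCache fills missing versions lazily instead of at DDL-apply time.
type onDemandCache struct {
	mu    sync.Mutex
	items map[int64]*snapshot
	load  func(version int64) *snapshot // stands in for a slow read from storage
	loads int                           // number of slow loads performed
}

func newOnDemandCache(load func(int64) *snapshot) *onDemandCache {
	return &onDemandCache{items: make(map[int64]*snapshot), load: load}
}

// get returns the cached snapshot, loading and remembering it on a miss.
// Holding the lock across the load keeps the sketch short; it also means
// concurrent readers wait while a load is in progress.
func (c *onDemandCache) get(version int64) *snapshot {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.items[version]; ok {
		return s
	}
	c.loads++
	s := c.load(version)
	c.items[version] = s
	return s
}

func main() {
	c := newOnDemandCache(func(v int64) *snapshot { return &snapshot{version: v} })
	for i := 0; i < 3; i++ {
		c.get(11) // only the first call pays for the load
	}
	c.get(12)
	fmt.Println("slow loads:", c.loads)
}
```

The trade-off matches the comment above: the first reader of each missing version pays the load cost, and DDL application itself does no extra work.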

D3Hunter · 2024-05-28T14:27:35Z

/hold

We need to discuss the memory footprint and DDL performance of this change, and evaluate whether the change is necessary.

tiancaiamao · 2024-05-29T07:48:45Z

pkg/domain/domain.go

 	actions := make([]uint64, 0, len(diffs))
 	diffTypes := make([]string, 0, len(diffs))
-	for _, diff := range diffs {
+	for i, diff := range diffs {


Why not wrap the body of the old for loop in a function, and call it len(diffs)-1 times?

Like:

	for _, diff := range diffs {
		builder := NewBuilder()
		builder.ApplyDiff()
		builder.Build() // or BuildWithoutUpdateBundle
	}

I think it would be more complicated to do that.

Not really. Splitting the code into small functions improves readability.

I don't need multiple builders here; maybe it's better to extract an infoschema.clone function to use here.
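
One way to picture the clone-based approach discussed here is a loop that applies each diff to a working copy and snapshots the result per version. This is a toy sketch, not the PR's code: `schema`, `diff`, `clone`, `applyDiff`, and `applyAll` are hypothetical names, and a schema is reduced to a set of table names.

```go
package main

import "fmt"

// schema is a toy stand-in for an information schema: a version and a set of
// table names.
type schema struct {
	version int64
	tables  map[string]bool
}

// diff is a toy schema diff: one table created or dropped at a version.
type diff struct {
	version int64
	table   string
	drop    bool
}

// clone returns a copy that later diffs cannot mutate.
func (s *schema) clone() *schema {
	c := &schema{version: s.version, tables: make(map[string]bool, len(s.tables))}
	for name := range s.tables {
		c.tables[name] = true
	}
	return c
}

// applyDiff mutates s in place to reflect d.
func (s *schema) applyDiff(d diff) {
	if d.drop {
		delete(s.tables, d.table)
	} else {
		s.tables[d.table] = true
	}
	s.version = d.version
}

// applyAll applies diffs in order and returns one snapshot per version, so
// every intermediate version is available for caching.
func applyAll(base *schema, diffs []diff) []*schema {
	cur := base.clone()
	out := make([]*schema, 0, len(diffs))
	for _, d := range diffs {
		cur.applyDiff(d)
		out = append(out, cur.clone())
	}
	return out
}

func main() {
	base := &schema{version: 10, tables: map[string]bool{"t1": true}}
	snaps := applyAll(base, []diff{{11, "t2", false}, {12, "t1", true}})
	for _, s := range snaps {
		fmt.Println(s.version, len(s.tables))
	}
}
```

Note that this naive clone copies the whole table set for every diff, which is the kind of memory and CPU cost the hold comment above asks to evaluate; a real implementation would likely need to share unchanged data between versions.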

ti-chi-bot · 2024-05-30T12:46:40Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tiancaiamao
Once this PR has been reviewed and has the lgtm label, please assign tangenta for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

~~OWNERS~~ [tiancaiamao]
pkg/domain/OWNERS
pkg/infoschema/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2024-05-30T12:46:42Z

[LGTM Timeline notifier]

Timeline:

2024-05-30 12:46:42.384384093 +0000 UTC m=+2953356.141519666: ☑️ agreed by tiancaiamao.

ti-chi-bot · 2024-09-06T08:40:32Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot · 2025-08-21T14:22:10Z

@crazycs520: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-lightning-integration-test	`e16fac6`	link	true	`/test pull-lightning-integration-test`
pull-integration-e2e-test	`e16fac6`	link	true	`/test pull-integration-e2e-test`
pull-integration-realcluster-test-next-gen	`e16fac6`	link	true	`/test pull-integration-realcluster-test-next-gen`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

crazycs520 added 2 commits May 28, 2024 15:08

add test

73c30fd

Signed-off-by: crazycs520 <[email protected]>

infoschema: fix issue of load multiple schema diff at a time will cau…

02fd70d

…se information schema cache miss Signed-off-by: crazycs520 <[email protected]>

ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels May 28, 2024

crazycs520 marked this pull request as ready for review May 28, 2024 08:17

ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels May 28, 2024

refine

00120c1

Signed-off-by: crazycs520 <[email protected]>

crazycs520 mentioned this pull request May 28, 2024

stale-read query latency 10x spike cause by information schema cache miss #53428

Closed

crazycs520 requested a review from tiancaiamao May 28, 2024 08:24

ti-chi-bot added needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels May 28, 2024

make bazel_prepare

c8f3dba

Signed-off-by: crazycs520 <[email protected]>

lcwangchao reviewed May 28, 2024

add comment

e16fac6

Signed-off-by: crazycs520 <[email protected]>

D3Hunter reviewed May 28, 2024

ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2024

tiancaiamao reviewed May 29, 2024

tiancaiamao approved these changes May 30, 2024

ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label May 30, 2024

ti-chi-bot added the needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. label Jun 5, 2024

ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2024

ti-chi-bot bot added needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. and removed needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. labels Nov 13, 2024
