Skip to content

Conversation

joechenrh
Copy link
Contributor

@joechenrh joechenrh commented Apr 18, 2025

What problem does this PR solve?

Issue Number: close #60649

Problem Summary:

As title, this PR add support for fast multi-value index check.

The rationale can be found in the linked issue, let's just take a simple schema as example here:

CREATE TABLE t(i1 INT, i2 CHAR(16), j JSON, INDEX mvi((CAST(j AS UNSIGNED ARRAY))), PRIMARY KEY (i1, i2));

To check the data-consistency for index mvi, we need to read the below columns from table side and index side respectively

  • Table side: SELECT i1, i2, j FROM t USE INDEX()
  • Index side: SELECT i1, i2, _V$_mvi_0 FROM t USE INDEX(mvi)

Here _V$_mvi_0 is the hidden column generated by multi-valued index, and we use this column to scan the whole index.

After retrieving the data, we can calculate the checksum and detect the inconsistent record.

What changed and how does it work?

The changes in this PR can be divided into three parts:

Support for Fast Admin Checks on MV Indexes:

This PR enhances support for multi-valued (MV) indexes in FAST ADMIN CHECK TABLE with some restricted column types.

We implement separate logic to generate checksum SQL for MV indexes. Unlike normal indexes that use BIT_XOR to aggregate per-row checksums, we simply use SUM for MV indexes. It's necessary because calculating per-row
checksum for multi-valued index is impractical.

Optimizer Adaptations:

Previously, the usage of multi-valued index is limited to index merge. To support reading data from MV index via SQL, we add some new logic:

  • Before build data source, we manually remove the flags related to MV Index(IsMVIndex for index and Hidden for index columns) in table meta. So optimizer can treat this table as normal table.

Adjustments to the JSON_SUM_CRC32 Function:

To align index-side and table-side calculations, we modified the internal implementation of JSON_SUM_CRC32:

  • Deduplicate array elements before summation
  • Refine type conversion logic to ensure CRC32 value consistency
  • Restrict the usage of this function only in internal SQL
  • Apply MOD to calculated crc32 value to prevent overflow

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 18, 2025
Copy link

tiprow bot commented Apr 18, 2025

Hi @joechenrh. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

codecov bot commented Apr 18, 2025

Codecov Report

❌ Patch coverage is 60.95764% with 212 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.1749%. Comparing base (46543df) to head (b1b5ccc).
⚠️ Report is 76 commits behind head on master.

⚠️ Current head b1b5ccc differs from pull request most recent head 337a8f4

Please upload reports for the commit 337a8f4 to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #60650        +/-   ##
================================================
+ Coverage   72.8056%   73.1749%   +0.3693%     
================================================
  Files          1823       1828         +5     
  Lines        494595     498374      +3779     
================================================
+ Hits         360093     364685      +4592     
+ Misses       112624     111727       -897     
- Partials      21878      21962        +84     
Flag Coverage Δ
integration 42.0324% <60.9576%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.8700% <ø> (ø)
parser ∅ <ø> (∅)
br 46.3183% <ø> (-0.0310%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@joechenrh
Copy link
Contributor Author

/retest

Copy link

tiprow bot commented Aug 13, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Ruihao Chen <[email protected]>
Signed-off-by: Ruihao Chen <[email protected]>
Signed-off-by: Ruihao Chen <[email protected]>
Signed-off-by: Ruihao Chen <[email protected]>
@joechenrh
Copy link
Contributor Author

/retest

Copy link

tiprow bot commented Aug 18, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@joechenrh
Copy link
Contributor Author

/retest

Copy link

tiprow bot commented Aug 18, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Ruihao Chen <[email protected]>
Signed-off-by: Ruihao Chen <[email protected]>
@joechenrh
Copy link
Contributor Author

/retest

Copy link

tiprow bot commented Aug 19, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Ruihao Chen <[email protected]>
@joechenrh
Copy link
Contributor Author

/retest

Copy link

tiprow bot commented Aug 19, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Ruihao Chen <[email protected]>
Signed-off-by: Ruihao Chen <[email protected]>
@@ -4248,6 +4249,38 @@ func getLatestVersionFromStatsTable(ctx sessionctx.Context, tblInfo *model.Table
return version
}

// buildTableForAdminCheckSQL make a copy of table info if necessary.
// Since we want to utilize MVIndex to read the data in FAST ADMIN CHECK, to prevent from adding
// many hacky code in the planner module, we craete a new table info with all MVIndex-related meta
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: Ruihao Chen <[email protected]>
Copy link

ti-chi-bot bot commented Aug 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hawkingrei
Once this PR has been reviewed and has the lgtm label, please assign xuhuaiyu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Aug 20, 2025
Copy link

ti-chi-bot bot commented Aug 20, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-08-20 03:35:06.435458696 +0000 UTC m=+410914.378634212: ☑️ agreed by hawkingrei.

Signed-off-by: Ruihao Chen <[email protected]>
Copy link

ti-chi-bot bot commented Sep 4, 2025

@joechenrh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-lightning-integration-test 2ca2adc link true /test pull-lightning-integration-test
pull-unit-test-ddlv1 2ca2adc link true /test pull-unit-test-ddlv1
pull-br-integration-test 2ca2adc link true /test pull-br-integration-test
idc-jenkins-ci-tidb/check_dev 337a8f4 link true /test check-dev
idc-jenkins-ci-tidb/check_dev_2 337a8f4 link true /test check-dev2
idc-jenkins-ci-tidb/unit-test 337a8f4 link true /test unit-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make fast admin check table support multi-value index
2 participants