@tangenta tangenta commented Jun 27, 2025

What problem does this PR solve?

Issue Number: ref #61702

Problem Summary:

This PR supports adding an index on TiDB running in the SYSTEM keyspace.

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    -- system keyspace tidb
    mysql> create table t (a int);
    Query OK, 0 rows affected (0.14 sec)
    
    mysql> insert into t values (1);
    Query OK, 1 row affected (0.04 sec)
    
    mysql> alter table t add index idx(a);
    Query OK, 0 rows affected (6.41 sec)
    
    mysql> admin check table t;
    Query OK, 0 rows affected (0.02 sec)
    -- user keyspace tidb
    mysql> create table t (a int);
    Query OK, 0 rows affected (0.12 sec)
    
    mysql> insert into t values (1);
    Query OK, 1 row affected (0.04 sec)
    
    mysql> alter table t add index idx(a);
    Query OK, 0 rows affected (2.44 sec)
    
    mysql> admin check table t;
    Query OK, 0 rows affected (0.00 sec)
    mysql> select id,state,step,task_key from mysql.tidb_global_task union select id,state,step,task_key from mysql.tidb_global_task_history;
    +-------+---------+------+-------------------------+
    | id    | state   | step | task_key                |
    +-------+---------+------+-------------------------+
    |     1 | succeed |   -2 | SYSTEM/ddl/backfill/116 |
    | 30001 | succeed |   -2 | ks1/ddl/backfill/116    |
    +-------+---------+------+-------------------------+
    2 rows in set (0.00 sec)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added the labels release-note-none (denotes a PR that doesn't merit a release note) and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) Jun 27, 2025

tiprow bot commented Jun 27, 2025

Hi @tangenta. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


codecov bot commented Jun 27, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 25 lines in your changes missing coverage. Please review.

Project coverage is 74.8263%. Comparing base (1962f22) to head (e986f89).
Report is 22 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #62045        +/-   ##
================================================
+ Coverage   72.9246%   74.8263%   +1.9016%     
================================================
  Files          1739       1786        +47     
  Lines        482820     494656     +11836     
================================================
+ Hits         352095     370133     +18038     
+ Misses       109169     101418      -7751     
- Partials      21556      23105      +1549     
Flag         Coverage Δ
integration  48.8415% <2.0000%> (?)
unit         72.1587% <50.0000%> (+0.0020%) ⬆️


Components  Coverage Δ
dumpling    52.7804% <ø> (ø)
parser      ∅ <ø> (∅)
br          62.1822% <ø> (+15.8463%) ⬆️

Comment on lines 1070 to 1078
distTaskSessPool := do.sysSessionPool
if keyspace.IsRunningOnUser() {
	sp, err := do.GetKSSessPool(keyspace.System)
	if err != nil {
		return err
	}
	distTaskSessPool = sp
}
taskManager := storage.NewTaskManager(distTaskSessPool)

Contributor

This cannot be merged now: the current task still runs on its owner keyspace. We can merge it when we are ready to switch to the DXF service.

Contributor Author

What else do we need to do? I can run add-index job normally after this PR.

Contributor

seems ok for add-index

Contributor

If we enable it, the user ks DXF will also try to get tasks/subtasks from the SYSTEM ks.

Contributor

We shouldn't change this everywhere; only change the places where add-index uses TaskManager.

Member

For a classical kernel, it should use advancenSessionPool
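
The direction this thread converges on (route only the add-index submission to the SYSTEM keyspace, and keep the local pool for the classic kernel and for SYSTEM itself) might be sketched as below. This is a minimal, self-contained illustration: `instance`, `sessionPool`, `crossKeyspacePool`, and `addIndexTaskPool` are hypothetical names invented for the sketch and are not TiDB's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

const systemKeyspace = "SYSTEM"

// sessionPool is a hypothetical stand-in for a session pool bound to one keyspace.
type sessionPool struct{ keyspace string }

// instance describes the running node; every field is illustrative.
type instance struct {
	nextGen   bool         // false models the classic kernel
	keyspace  string       // keyspace this node serves
	localPool *sessionPool // pool bound to the node's own keyspace
}

var errCrossKeyspace = errors.New("cross-keyspace pool unavailable on classic kernel or for the current keyspace")

// crossKeyspacePool returns a pool for another keyspace, or an error when the
// request makes no sense (classic kernel, or the node's own keyspace).
func (in *instance) crossKeyspacePool(ks string) (*sessionPool, error) {
	if !in.nextGen || ks == in.keyspace {
		return nil, errCrossKeyspace
	}
	return &sessionPool{keyspace: ks}, nil
}

// addIndexTaskPool picks the pool used only where add-index submits its task.
// Other users of the shared task manager are left untouched in this sketch.
func (in *instance) addIndexTaskPool() (*sessionPool, error) {
	if in.nextGen && in.keyspace != systemKeyspace {
		return in.crossKeyspacePool(systemKeyspace)
	}
	return in.localPool, nil
}

func main() {
	user := &instance{nextGen: true, keyspace: "ks1", localPool: &sessionPool{keyspace: "ks1"}}
	pool, err := user.addIndexTaskPool()
	fmt.Println(pool.keyspace, err)
}
```

Keeping the choice inside one add-index-specific helper means a user-keyspace node does not start polling SYSTEM for every task type, which is the concern raised above.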

Signed-off-by: tangenta <[email protected]>
Contributor

@D3Hunter D3Hunter left a comment

please add a real-tikv test like this one to make sure we do submit and run on DXF service on SYSTEM KS https://github.com/pingcap/tidb/pull/62023/files#diff-4ca8746e7f966fd0f1bc3257edfb8208f38f354c3a33d91c3591eb2d1eaa7376R30

@D3Hunter
Contributor

failed to add index on SYSTEM keyspace

mysql> show config where name='keyspace-name';
+------+----------------------+---------------+--------+
| Type | Instance             | Name          | Value  |
+------+----------------------+---------------+--------+
| tidb | 192.168.206.185:5000 | keyspace-name | SYSTEM |
+------+----------------------+---------------+--------+
1 row in set (0.03 sec)

mysql>
mysql> create table sys(id int);
Query OK, 0 rows affected (0.10 sec)

mysql> insert into sys values(1);
Query OK, 1 row affected (0.02 sec)

mysql> alter table sys add index(id);
ERROR 1105 (HY000): cross keyspace session manager is not available in classic kernel or current keyspace
mysql>

Signed-off-by: tangenta <[email protected]>
store := ddlObj.store
sessPool := ddlObj.sessPool
taskKS := s.task.Keyspace
if keyspace.IsRunningOnSystem() && taskKS != config.GetGlobalKeyspaceName() {

Contributor

Suggested change
- if keyspace.IsRunningOnSystem() && taskKS != config.GetGlobalKeyspaceName() {
+ if taskKS != config.GetGlobalKeyspaceName() {

taskKS != config.GetGlobalKeyspaceName() implies it's running on SYSTEM

Contributor Author

@tangenta tangenta Jun 30, 2025

I think it is better to make it explicit to avoid unexpected executions. (Make kerneltype.NextGen() and SYSTEM explicit)

Contributor

maybe add a comment about why we still state it explicitly even though taskKS != config.GetGlobalKeyspaceName() implies it's running on SYSTEM

Member

If you want to avoid unexpected executions, it's better to use intest.Assert to let it panic in test environment
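
The two positions in this thread (an explicit runtime check versus an assertion that only fires in tests) can be combined, as in the rough sketch below. `intest.Assert` is the TiDB helper named in the comment above; since this example has to be self-contained, `assertInTest` and the `inTest` flag are hypothetical stand-ins for it, and `needsCrossKeyspace` is an invented name.

```go
package main

import "fmt"

// inTest is a hypothetical switch standing in for a test-only build or env flag.
var inTest = false

// assertInTest panics on a violated condition only when inTest is set;
// outside tests it is a no-op.
func assertInTest(cond bool, msg string) {
	if inTest && !cond {
		panic("assertion failed: " + msg)
	}
}

// needsCrossKeyspace reports whether a task owned by taskKS has to be run
// with resources of a keyspace other than the node's own (currentKS).
func needsCrossKeyspace(taskKS, currentKS string) bool {
	if taskKS == currentKS {
		return false
	}
	// A differing task keyspace is only expected on SYSTEM. Rather than
	// repeating that in the condition, surface violations loudly in tests.
	assertInTest(currentKS == "SYSTEM", "cross-keyspace task seen outside SYSTEM")
	return true
}

func main() {
	fmt.Println(needsCrossKeyspace("ks1", "SYSTEM"))
	fmt.Println(needsCrossKeyspace("ks1", "ks1"))
}
```

The trade-off is the one discussed above: the assertion documents the invariant and catches violations in CI, while production behavior follows the simpler condition.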

}

  // NewSessionPool creates a new Session pool.
- func NewSessionPool(resPool *pools.ResourcePool) *Pool {
+ func NewSessionPool(resPool util.SessionPool) *Pool {

Contributor

In nextgen, since we are not registering this cross-keyspace session with the SessionManager, it will not block GC while scanning KV?

Contributor Author

Yes, it is a risk. Not sure if the GC worker can execute across keyspaces.

Contributor

they won't AFAIK

@tangenta
Contributor Author

/retest


tiprow bot commented Jun 30, 2025

@tangenta: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest


@D3Hunter
Contributor

Please add your manual test results for when you:

  • add index on SYSTEM ks
  • add index on user ks

In your manual test, please also give the result of select id,state,step,task_key from mysql.tidb_global_task union select id,state,step,task_key from mysql.tidb_global_task_history; so we know we submit the task to the right keyspace.
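
To check the query result mechanically, the owning keyspace can be read off the task_key prefix. The sketch below assumes the key layout seen in the sample output in the PR description (`SYSTEM/ddl/backfill/116`, `ks1/ddl/backfill/116`), that is, `<keyspace>/<rest>`; that layout is inferred from those two rows only, and `keyspaceOfTaskKey` is a hypothetical helper, not part of TiDB.

```go
package main

import (
	"fmt"
	"strings"
)

// keyspaceOfTaskKey returns the leading path segment of a task key such as
// "ks1/ddl/backfill/116". The boolean is false when the key has no
// "<keyspace>/<rest>" shape.
func keyspaceOfTaskKey(taskKey string) (string, bool) {
	ks, rest, found := strings.Cut(taskKey, "/")
	if !found || ks == "" || rest == "" {
		return "", false
	}
	return ks, true
}

func main() {
	for _, key := range []string{"SYSTEM/ddl/backfill/116", "ks1/ddl/backfill/116"} {
		ks, ok := keyspaceOfTaskKey(key)
		fmt.Println(key, "->", ks, ok)
	}
}
```

A key without a keyspace prefix is reported as not matching rather than guessed at, since this sketch knows nothing about how such keys look on a classic kernel.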

Contributor

@D3Hunter D3Hunter left a comment

rest lgtm

@ti-chi-bot ti-chi-bot bot added the labels approved and needs-1-more-lgtm (indicates a PR needs 1 more LGTM) Jul 1, 2025
Signed-off-by: tangenta <[email protected]>
@D3Hunter
Contributor

D3Hunter commented Jul 1, 2025

/hold

I am running on SYSTEM, but got an error on ks1

mysql> show config where name='keyspace-name';
+------+------------------+---------------+--------+
| Type | Instance         | Name          | Value  |
+------+------------------+---------------+--------+
| tidb | 192.168.2.4:5000 | keyspace-name | SYSTEM |
+------+------------------+---------------+--------+
1 row in set (0.03 sec)

mysql> create table t(a int);
Query OK, 0 rows affected (0.17 sec)

mysql> insert into t values(1);
Query OK, 1 row affected (0.01 sec)

mysql> alter table t add index(a);
ERROR 8028 (HY000): Information schema is changed during the execution of the statement(for example, table definition may be updated by other DDL ran in parallel). If you see this error often, try increasing `tidb_max_delta_schema_count`. [try again later]
[2025/07/01 11:41:54.454 +08:00] [INFO] [manager.go:328] ["task executor started"] [keyspaceName=ks1] [task-id=1] [type=backfill] [concurrency=4] [remaining-slots=6]
[2025/07/01 11:41:54.778 +08:00] [INFO] [task_executor.go:359] ["failed to get step executor"] [keyspaceName=ks1] [task-id=1] [task-type=backfill] [error="[schema:1146]Table '(Schema ID 2).(Table ID 114)' doesn't exist"] [errorVerbose="[schema:1146]Table '(Schema ID 2).(Table ID 114)' doesn't exist\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\t/Users/jujiajia/go/pkg/mod/github.com/pingcap/[email protected]/normalize.go:177\ngithub.com/pingcap/tidb/pkg/ddl.getTableInfo\n\t/Users/jujiajia/code/pingcap/tidb/pkg/ddl/table.go:433\ngithub.com/pingcap/tidb/pkg/ddl.getTableByTxn.func1\n\t/Users/jujiajia/code/pingcap/tidb/pkg/ddl/job_scheduler.go:668\ngithub.com/pingcap/tidb/pkg/kv.RunInNewTxn\n\t/Users/jujiajia/code/pingcap/tidb/pkg/kv/txn.go:132\ngithub.com/pingcap/tidb/pkg/ddl.getTableByTxn\n\t/Users/jujiajia/code/pingcap/tidb/pkg/ddl/job_scheduler.go:661\ngithub.com/pingcap/tidb/pkg/ddl.(*backfillDistExecutor).newBackfillStepExecutor\n\t/Users/jujiajia/code/pingcap/tidb/pkg/ddl/backfilling_dist_executor.go:179\ngithub.com/pingcap/tidb/pkg/ddl.(*backfillDistExecutor).GetStepExecutor\n\t/Users/jujiajia/code/pingcap/tidb/pkg/ddl/backfilling_dist_executor.go:258\ngithub.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*BaseTaskExecutor).createStepExecutor\n\t/Users/jujiajia/code/pingcap/tidb/pkg/disttask/framework/taskexecutor/task_executor.go:357\ngithub.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*BaseTaskExecutor).Run\n\t/Users/jujiajia/code/pingcap/tidb/pkg/disttask/framework/taskexecutor/task_executor.go:328\ngithub.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*Manager).startTaskExecutor.func2\n\t/Users/jujiajia/code/pingcap/tidb/pkg/disttask/framework/taskexecutor/manager.go:339\ngithub.com/pingcap/tidb/pkg/util.(*WaitGroupWrapper).RunWithLog.func1\n\t/Users/jujiajia/code/pingcap/tidb/pkg/util/wait_group_wrapper.go:181\nruntime.goexit\n\t/Users/jujiajia/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_arm64.s:1223"]

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Jul 1, 2025
Signed-off-by: tangenta <[email protected]>
@tangenta
Contributor Author

tangenta commented Jul 2, 2025

/retest

@D3Hunter
Contributor

D3Hunter commented Jul 2, 2025

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Jul 2, 2025
tangenta added 2 commits July 2, 2025 16:05
Signed-off-by: tangenta <[email protected]>
Signed-off-by: tangenta <[email protected]>
@D3Hunter
Contributor

D3Hunter commented Jul 2, 2025

/retest

ti-chi-bot bot commented Jul 2, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, wjhuang2016

@ti-chi-bot ti-chi-bot bot added the lgtm label and removed the needs-1-more-lgtm label (indicates a PR needs 1 more LGTM) Jul 2, 2025

ti-chi-bot bot commented Jul 2, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-07-01 03:12:16.901340431 +0000 UTC m=+1364589.624519414: ☑️ agreed by D3Hunter.
  • 2025-07-02 10:05:29.149956161 +0000 UTC m=+1475781.873135143: ☑️ agreed by wjhuang2016.

@ti-chi-bot ti-chi-bot bot merged commit c56481e into pingcap:master Jul 2, 2025
27 checks passed
@D3Hunter D3Hunter mentioned this pull request Aug 12, 2025
65 tasks