Skip to content

Analyze's analyzeColumnsPushDownV2 create too much concurrency #61785

@hawkingrei

Description

@hawkingrei

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

The issue is that customers have observed higher I/O consumption when the analyze operation reaches the index, compared to when it analyzes regular tables. (The analyze status contains sensitive information, so it will not be included here.)

Image

The root cause of the issue lies in improper coding practices. When we perform the analyze operation, we create multiple concurrent tasks to execute it. However, within these concurrently spawned goroutines, we further create additional concurrency. This nested concurrency results in an actual level of parallelism that is significantly higher than we anticipated.

CREATE TABLE `test` (
  `c1` binary(16) NOT NULL,
  `c2` tinyint(1) NOT NULL DEFAULT '0',
  `c3` int NOT NULL,
  `c4` varchar(48) COLLATE utf8mb4_general_ci NOT NULL,
  `c5` varchar(512) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `c6` enum('A','B','C') COLLATE utf8mb4_general_ci DEFAULT NULL,
  `c7` int unsigned NOT NULL DEFAULT '0',
  `c8` int unsigned NOT NULL DEFAULT '0',
  `c9` tinyint(1) GENERATED ALWAYS AS (`c7` > 0) VIRTUAL NOT NULL,
  `c10` int DEFAULT NULL,
  `c11` datetime(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
  `c12` datetime(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
  PRIMARY KEY (`c1`) /*T![clustered_index] CLUSTERED */,
  KEY `idx_c4_c2_c9_c3_c12_c5_c6` (`c4`,`c2`,`c9`,`c3`,`c12`,`c5`,`c6`),
  KEY `idx_c4_c2_c9_c12_c5_c6` (`c4`,`c2`,`c9`,`c12`,`c5`,`c6`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

analyze table chat_session all columns ;

show analyze status

+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+
| Table_schema | Table_name | Partition_name | Job_info                                                                                                        | Processed_rows | Start_time          | End_time            | State    | Fail_reason | Instance       | Process_ID | Remaining_seconds | Progress | Estimated_total_rows |
+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+
| test         | test       |                | analyze ndv for index idx_c4_c2_c9_c12_c5_c6                                                                    | 0              | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null>      | 127.0.0.1:4000 | <null>     | <null>            | <null>   | <null>               |
| test         | test       |                | analyze ndv for index idx_c4_c2_c9_c3_c12_c5_c6                                                                 | 0              | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null>      | 127.0.0.1:4000 | <null>     | <null>            | <null>   | <null>               |
| test         | test       |                | analyze table all indexes, columns c1, c2, c3, c4, c5, c6, c7, c9, c12 with 256 buckets, 100 topn, 1 samplerate | 0              | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null>      | 127.0.0.1:4000 | <null>     | <null>            | <null>   | <null>               |
+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+

You will see that it will create two task about analyze ndv for index.

the problem is here.

The first creation of concurrency

// Start workers with channel to collect results.
taskCh := make(chan *analyzeTask, buildStatsConcurrency)
resultsCh := make(chan *statistics.AnalyzeResults, 1)
for range buildStatsConcurrency {
e.wg.Run(func() { e.analyzeWorker(taskCh, resultsCh) })
}

The second creation of concurrency

AnalyzeExec.analyzeWorker -> analyzeColumnsPushDownEntry -> analyzeColumnsPushDownV2

https://github.com/pingcap/tidb/blob/master/pkg/executor/analyze_col_v2.go#L105-L107

The third creation of concurrency

var subIndexWorkerWg = NewAnalyzeResultsNotifyWaitGroupWrapper(resultsCh)
subIndexWorkerWg.Add(statsConcurrncy)
for range statsConcurrncy {
subIndexWorkerWg.Run(func() { e.subIndexWorkerForNDV(taskCh, resultsCh) })
}
for _, task := range tasks {

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

Metadata

Metadata

Assignees

No one assigned

    Labels

    affects-7.5This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.report/customerCustomers have encountered this bug.severity/moderatesig/plannerSIG: Plannertype/bugThe issue is confirmed as a bug.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions