Description
Feature Request
Background: after we switched to chunk encoding in the RPC, the gRPC packet size became quite large.
A 96M region may amplify to 900M-1G when using chunk encoding.
This accounts for some OOM issues.
To alleviate the OOM cases, we have to keep gRPC packets from growing too large.
One way is coprocessor streaming; the other is coprocessor paging.
Streaming has some known issues and lacks maintenance, so it is deprecated (#20759).
We have an internal doc with the details. The conclusion is that we'll switch to coprocessor paging and make it the default protocol.
Describe the feature you'd like:
This issue tracks the progress of development and testing.
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy:
Done
- the previous research docs (internal)
- *: implement chunk rpc encoding for unistore #35114
- *: support paging protocol on unistore #35244
- enable coprocessor paging by default #35273
- test the tidb repo unit test cases with real tikv with paging enabled, result here
- testkit: add a WithTiKV flag to support run unit test on real TiKV #35647
- executor: fix a nil point when @@tidb_enable_collect_execution_info is off and cop cache is on #35839
- *: support coprocessor cache for paging protocol #35787
- config: hide and remove query feedback config #35798
- store/copr: set a smaller channel size for coprocessor task in the keep order case #36047
- store/mockstore/unistore: fix several issues of coprocessor paging in unistore #36147
- *: add tidb_min_paging_size system variable #36107
- store/copr: adjust the cop cache admission process time for paging #36157
- store/copr: set low concurrency for keep order coprocessor requests #35975
- store/copr: use non-buffered channel for coprocessor response #35988
- telemetry: add telemetry for tidb_enable_paging #36261
- util/paging: choose min paging size default value as 128, and max value as 8192 #36331
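
To make the bounds from #36331 concrete, here is a minimal Go sketch of how a page size could grow between the minimum (128) and maximum (8192) named above. Only those two bounds come from this issue; the doubling factor and the names `growPagingSize` and `pagingSchedule` are assumptions for illustration, not the actual util/paging code.

```go
package main

import "fmt"

// The two bounds come from the issue (#36331); the growth factor is an
// assumption made for this sketch.
const (
	minPagingSize uint64 = 128
	maxPagingSize uint64 = 8192
	growFactor    uint64 = 2 // hypothetical: page size doubles after each page
)

// growPagingSize returns the page size to request next, capped at maxPagingSize.
func growPagingSize(size uint64) uint64 {
	size *= growFactor
	if size > maxPagingSize {
		return maxPagingSize
	}
	return size
}

// pagingSchedule lists the page sizes requested, starting from minPagingSize,
// until the cap is reached.
func pagingSchedule() []uint64 {
	sizes := []uint64{minPagingSize}
	for sizes[len(sizes)-1] < maxPagingSize {
		sizes = append(sizes, growPagingSize(sizes[len(sizes)-1]))
	}
	return sizes
}

func main() {
	fmt.Println(pagingSchedule()) // [128 256 512 1024 2048 4096 8192]
}
```

Under these assumptions a small first page keeps short queries cheap, while the cap bounds how many rows a single response can carry, which is what limits the packet size.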
TODO
- Fix the following issues found when enabling paging by default in PR #35244
- There are several kinds of coprocessor requests, including DAG and others. admin check table/index and analyze do not use DAG, so they don't support paging.
- Found a bug, "In dynamic partition mode distsql request key ranges is not sorted" #35242: for partitioned tables in dynamic pruning mode, the key ranges are not sorted, so rebuilding cop tasks and calculating the remaining key ranges gives a wrong result.
- cop cache for paging is not well supported.
- When paging is enabled or disabled, the explain result (related to IndexLookup) differs, so some test cases need to be updated.
- QueryFeedback is broken when coprocessor paging is enabled.
- mocktikv doesn't support paging, and some test cases still use mocktikv, so if we enable paging by default, those test cases need to be updated.
- Reading from INFORMATION_SCHEMA.CLUSTER_XXX tables uses coprocessor requests to query tidb instances, and the tidb coprocessor doesn't support paging.
- Test the tidb repo unit test cases with real tikv with paging enabled
- Fix cop cache for paging protocol #35786
- Make QueryFeedback deprecated and remove the switch #35790
- Fix the channel size and distsql concurrency for paging #34849
- Test paging with https://github.com/tikv/copr-test
- add a @@tidb_min_paging_size system variable for testing #36106
- Performance test comparing coprocessor paging enabled vs. disabled
- sysbench using benchbot
- tpcc using benchbot
- tpch using benchbot
- Coprocessor cache strategy needs update when working with paging #36156
- Add telemetry for tidb_enable_paging #36260 @XuHuaiyu
- Experiment to choose the best value for page size (or self-adapting) #36328
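
The unsorted key ranges problem listed above (#35242) can be illustrated with a small Go sketch. It assumes that after each page the client resumes from a key below which everything has been scanned, and that the remaining ranges are found with a binary search. The `keyRange` type and the functions `remainingRanges` and `sortRanges` are hypothetical names for this example, not the store/copr implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// keyRange is a half-open interval [Start, End) over string keys
// (hypothetical type for this sketch).
type keyRange struct{ Start, End string }

// remainingRanges returns the part of ranges not yet scanned, given that an
// ascending scan has covered every key below resumeKey. It assumes ranges are
// sorted by Start and do not overlap.
func remainingRanges(ranges []keyRange, resumeKey string) []keyRange {
	// Binary search for the first range that ends beyond resumeKey.
	// This is only meaningful when the input is sorted.
	i := sort.Search(len(ranges), func(i int) bool { return ranges[i].End > resumeKey })
	rest := append([]keyRange(nil), ranges[i:]...)
	if len(rest) > 0 && rest[0].Start < resumeKey {
		rest[0].Start = resumeKey // trim the partially scanned range
	}
	return rest
}

// sortRanges returns a copy of ranges ordered by Start.
func sortRanges(ranges []keyRange) []keyRange {
	sorted := append([]keyRange(nil), ranges...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })
	return sorted
}

func main() {
	unsorted := []keyRange{{"m", "p"}, {"a", "c"}, {"e", "h"}}
	// Sorted input: [{f h} {m p}]
	fmt.Println(remainingRanges(sortRanges(unsorted), "f"))
	// Unsorted input: [{f h}], the {m p} range is silently lost.
	fmt.Println(remainingRanges(unsorted, "f"))
}
```

In this sketch the unsorted input drops a range that was never scanned, which is the kind of wrong result the TODO item describes; sorting the ranges before building the tasks avoids it.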