-
Notifications
You must be signed in to change notification settings - Fork 6k
Description
Enhancement
Refactoring
The current plan/core
pkg is quite huge not as slim as we expected. The hybrid placement of logicalOp, physicalOp, property, task, logical-rewrite, build-phase, binder, cost, exhaustion, etc makes the hierarchy complicated. The boundary of them is also not as clear as we desired. I concluded that there is some reason for this phenomenon.
-
One is if Golang's structure wants to implement member functions with itself as a receiver, including the interface implementation, which can only be defined in the same pkg where the structure is defined. (here clearly the pkg is
core
) stackoverflow ref- eg:
LogicalSort
is defined in core pkg somewhere. If you want to implement (p *LogicalSort) buildKeyInfo for the sake of interface implementation or just member function of them, this implementation can only be done in the samecore
pkg, because Golang only allows local type as a receiver. Given buildKeyInfo is logical-rewrite-related code, it should be put into another part/component/pkg of core logic for clarity, that calls for the interface simplification or elimination of self-receiver's function.
- eg:
-
Numerous utility and facility functions related to the core logic are defined and used in the
core
package. These include trace, hashcode, cost mode, selectivity, task, enumeration, and more. Moving or adding a new feature outside thecore
package may result in many internal structures or functions being exported out as well. -
The biggest block is the import cycle problem. Say we create a new pkg called core2 and we want to add a new feature inside. The new feature couldn't work well without the current core context support, so by some means (exported original
core
structure or interface, using function pointer or something), we got a dependency fromcore2
->core
. However, this feature must also be called fromcore
, causecore
is a current unified portal for the entire optimization phase, so we got a dependency fromcore
->core2
. As a result, hops, golang's import cycle is formed. The final solution to the problem is Back to integrate them all. Eventually, the core pkg becomes larger and larger.
current core pkg should be arranged as:
* /core
* /base (base interface definition, physical, logical, task, etc)
* /operator
* /baseimpl
* /logicalimpl
* /physicalimpl
* /stats (the basic stats structure is defined in pkg/statistics, here is mainly about cross-est, row count derivation, row size, and selectivity, if possible, we can change the name from old cardinality to stats)
* /cost (cost main contains the detail about how the cost model works, the formula inferred across different op, etc)
* /dump (explain info related, plan replayer related, flatten plan related)
* /rewrite (tidb currently does not have a clear part of this, but it should, like a constant prop, order eliminate, etc)
* /rule (currently all the kinds of logical rules, all of them are treated as always-optimal)
* /util (currently all the glue code should be extracted with effort to here)
Tips about resolving import cycle
- if
A
depends onX
, andB
depends onX
, while currentX
is defined in file ofA
orB
.- you can extract
X
logic out as util usage
- you can extract
- if pkg
A
depends on pkgB
's function, and pkgB
systematically depends onA
. eg:executor
depends onplanner
is reasonable, while ifplanner
dependsexecutor
's function for some tricky handling. This is not will designed.- Anyway, you can resort the function pointer assignment in
executor
pkg's init function, init(){ planner.Fxxx = Fxx in executor} - If the dependency is only used as the field type for some new structure definition, you can use interface{} instead here.
- Anyway, you can resort the function pointer assignment in
- for the others, you should rethink about how you strong dependency is come. And move those hard parts to the correct place.
- Last but not least, if in one of your refactoring, the changes call for more changes endlessly, you should find the minimum set across Tips 1 & 2, and split them as the first part. Then we go back to Tips 3 if dependency still exists.
Process
- move logical optimizing trace logic out of core pkg planner: move logical optimizing trace logic out of core pkg #52161
- planner: refine cop task as a capital one for latter pkg move planner: refine cop task as a capital one for latter pkg move #52506
- planner: refine mppTask as capital one for latter pkg move planner: refine mppTask as capital one for latter pkg move #52491
- planner: move physical opt and cost misc to util and split plan interface planner: move physical opt and cost misc to util and split plan interface #52224
- planner: move base plan related output of core pkg and make it well-packaged planner: move base plan related output of core pkg and make it well-pkged #52529
- planner: remove internal pkg and move base code to certain place planner: remove internal pkg and move base code to certain place #52620
- planner: rename base plan implemention's pkg from base to baseImpl planner: rename base plan implemention's pkg from base to baseImpl #52659
- planner: move debugtracer logic and handle col definition to util pkg planner: move debugtracer logic and handle col defition to util pkg #52681