Skip to content

Proposal: introduce system variable to ignore hash join in TiDB #46695

@coderplay

Description

@coderplay

Background

  • For most OLTP queries, hash join is not selected as it normally takes longer to fetch very few rows.
  • MySQL doesn’t support hash join before MySQL 8.0.
  • In 8.0 or higher, MySQL provides an optimizer switch () to disable in session/global level.
  • There is undocumented parameter (_hash_join_enabled) in Oracle to disable hash join.

It makes sense we introduce the mechanism to control this optimizer behavior at session or cluster level.

User case

  • As an application developer, I’m very confident the queries don’t benefit from hash join. However, at times, bad plans were generated in which a hash join was selected. I will config the connection pool, and initialize the connections with “set tidb_opt_enable_hash_join=no”.
  • As a DBA, I’m aware the application is OLTP workload, which works properly in MySQL ~5.7. I’m very sure the SQLs don’t need hash join. I’ve seen cases in the past. bad plans were generated in which a hash join was selected. After confirming with application developers, I will disable hash join in cluster level with “set global tidb_opt_enable_hash_join=no”

Functional Specification

Introduce variable to disable hash join in session/global

  • Name: tidb_opt_enable_hash_join
  • Scope: session / global
  • Applies to set_var(): YES
  • Default value: YES
  • Possible values: YES /NO

Behavior

  • With tidb_opt_enable_hash_join=NO, Optimizer ignores hash join when generating execution plan.
  • Optimizer hints overrides the behavior of the variable. If tidb_opt_enable_hash_join is set to “NO”, hash join is still enforced for the queries with hint “HASH_JOIN()”, “HASH_JOIN_PROBE()” or “HASH_JOIN_BUILD()”. In this case, lower level control makes exceptions possible. (The same behavior as MySQL)
  • It’s possible queries come with hints which could lead to no join type available. For example
set tidb_opt_enable_hash_join=NO 
select /*+ no_merge_join(), no_index_join(), no_index_hash_join()/ … 

It’s not acceptable SQL fails caused by optimizer variable changes, as the scope of variables is wide. We have to avoid no available plan because of this variable.

In this case, we shall keep the hints working and ignore the behavior change caused by this variable. Query still generates a plan with hash_join, and appends messages to warnings.

variable tidb_opt_enable_hash_join=NO is ignored due to no join type available.

Metadata

Metadata

Assignees

Labels

affects-6.5This bug affects the 6.5.x(LTS) versions.affects-7.1This bug affects the 7.1.x(LTS) versions.sig/plannerSIG: Plannertype/enhancementThe issue or PR belongs to an enhancement.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions