Support executing DDL when cluster is in BDR mode #48519

@okJiang

Description

Solution

DDL definition

In BDR mode, only safe DDL can be synchronized to downstream clusters while the business is running.
Unsafe DDL can only be synchronized when the related applications are paused.
We refer to safe DDL and unsafe DDL together as managed DDL.
All other DDL is unmanaged DDL, which can be executed freely on any cluster, regardless of the BDR mode.

Circular synchronization

Bi-directional replication (BDR) can cause circular synchronization. So that users can still execute DDL normally, we divide BDR mode into two modes:
  • restricted mode: the user can execute safe DDL on the Primary cluster (explained in the next section), and the DDL is synchronized to the Secondary clusters.
  • unrestricted mode: the user can execute all managed DDL on each Local_only cluster (explained in the next section) separately, and the DDL is not synchronized to other clusters.
Note:
unrestricted mode requires the related applications to be paused. It is a temporary mode, and we recommend switching back to restricted mode after completing the necessary DDL.

BDR role

To implement restricted mode and unrestricted mode, we introduce four BDR roles:
  • None: BDR mode is disabled.
  • Primary: can execute safe DDL and unmanaged DDL from users. Usually, the Primary does not receive DDL from CDC.
  • Secondary: cannot execute DDL from users, except unmanaged DDL. Usually, the Secondary only executes DDL from CDC.
  • Local_only: can execute all DDL from users. However, DDL from a Local_only cluster is not synchronized by CDC.

How to switch mode?

restricted mode

  1. Set one cluster to Primary
  2. Set other clusters to Secondary

unrestricted mode

  1. Set all clusters to Local_only
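
The switching rules above can be sketched as a small check over the roles of all clusters in a deployment. This is only an illustration, not TiDB code: the `BDRRole` enum, the `detect_mode` function, and the "disabled" and "invalid" results are assumptions introduced here. In particular, the issue does not say how a mixed configuration (for example, two Primary clusters) should be treated, so the sketch simply reports it as invalid.

```python
from enum import Enum


class BDRRole(Enum):
    # Values mirror the arguments of `admin set bdr role`.
    NONE = "none"
    PRIMARY = "primary"
    SECONDARY = "secondary"
    LOCAL_ONLY = "local_only"


def detect_mode(roles):
    """Classify the roles of all clusters into a BDR mode (hypothetical helper)."""
    roles = list(roles)
    if not roles:
        return "invalid"
    # Every cluster has role None: BDR mode is disabled.
    if all(r is BDRRole.NONE for r in roles):
        return "disabled"
    # unrestricted mode: all clusters are Local_only.
    if all(r is BDRRole.LOCAL_ONLY for r in roles):
        return "unrestricted"
    # restricted mode: exactly one Primary, every other cluster is Secondary.
    primaries = sum(r is BDRRole.PRIMARY for r in roles)
    secondaries = sum(r is BDRRole.SECONDARY for r in roles)
    if primaries == 1 and primaries + secondaries == len(roles):
        return "restricted"
    # Anything else is an assumption of this sketch: treat it as misconfigured.
    return "invalid"
```

For example, `detect_mode([BDRRole.PRIMARY, BDRRole.SECONDARY])` yields `"restricted"`, and a deployment where every cluster is `BDRRole.LOCAL_ONLY` yields `"unrestricted"`.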

Implementation

SQL

  • admin set bdr role (none|primary|secondary|local_only);
    • need admin privilege
  • admin show bdr role;
    • need admin privilege
    • > admin show bdr role;
      +----------+
      | BDR ROLE |
      +----------+
      | none     |
      +----------+

BDR Mode (kernel)

  • List all DDL statements and classify each as safe, unsafe, or unmanaged.
    • For DDL executed on the Primary, check whether it is safe DDL or unmanaged DDL. If it is, the DDL is executed; otherwise, it is rejected.
    • For DDL executed on a Secondary, check whether it is unmanaged DDL. If it is, the DDL is executed; otherwise, it is rejected.
  • How does CDC check the BDR role of the source cluster?
    • Add a field to tidb_ddl_job.
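
One way to picture the CDC side is a filter over DDL job records that carry the role of the cluster that ran them. The sketch below is hypothetical: the issue only says that a field is added to tidb_ddl_job, so the field name `bdr_role`, the `DDLJob` record, and the `should_replicate` function are all assumptions. The set of roles whose DDL may be synchronized follows the role table in this issue (None and Primary).

```python
from dataclasses import dataclass

# Roles whose DDL may be synchronized by CDC, taken from the role table
# in this issue: None and Primary are syncable, Secondary and Local_only are not.
SYNCABLE_ROLES = {"none", "primary"}


@dataclass
class DDLJob:
    query: str
    # Hypothetical name for the new field in tidb_ddl_job: the BDR role
    # of the source cluster when the DDL was executed.
    bdr_role: str


def should_replicate(job: DDLJob) -> bool:
    """Return True if CDC should forward this DDL job downstream."""
    return job.bdr_role.lower() in SYNCABLE_ROLES


jobs = [
    DDLJob("ALTER TABLE t ADD COLUMN c INT", "primary"),
    DDLJob("ALTER TABLE t DROP COLUMN c", "local_only"),
]
# Only the job recorded under the Primary role is kept for replication.
to_sync = [job.query for job in jobs if should_replicate(job)]
```

Recording the role in the job itself, rather than asking the cluster at replication time, would let CDC decide per DDL even if the role changes later; whether the actual design works this way is not stated in the issue.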

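| Role       | Safe DDL (from user) | Unsafe DDL (from user) | Unmanaged DDL (from user) | All DDL (from CDC) | Can be synced by CDC |
|------------|----------------------|------------------------|---------------------------|--------------------|----------------------|
| None       | Y                    | Y                      | Y                         | Y                  | Y                    |
| Primary    | Y                    | N                      | Y                         | Y                  | Y                    |
| Secondary  | N                    | N                      | Y                         | Y                  | N                    |
| Local_only | Y                    | Y                      | Y                         | Y                  | N                    |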
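
The execution columns of the table can be encoded directly as a lookup, which may make the kernel-side check easier to reason about. This is a minimal sketch under stated assumptions, not the TiDB implementation: the names `USER_DDL_ALLOWED` and `can_execute` are hypothetical, and the DDL class ("safe", "unsafe", or "unmanaged") is taken as an input because the issue does not list which statements fall into each class.

```python
DDL_CLASSES = {"safe", "unsafe", "unmanaged"}

# For each role, the classes of user-issued DDL that may be executed.
# Each entry is one row of the role table (the three "from user" columns).
USER_DDL_ALLOWED = {
    "none": {"safe", "unsafe", "unmanaged"},
    "primary": {"safe", "unmanaged"},
    "secondary": {"unmanaged"},
    "local_only": {"safe", "unsafe", "unmanaged"},
}


def can_execute(role: str, ddl_class: str, from_cdc: bool = False) -> bool:
    """Return True if a DDL of the given class may run on a cluster with this role."""
    if role not in USER_DDL_ALLOWED:
        raise ValueError(f"unknown BDR role: {role!r}")
    if ddl_class not in DDL_CLASSES:
        raise ValueError(f"unknown DDL class: {ddl_class!r}")
    # Per the "All DDL (from CDC)" column, DDL arriving from CDC is
    # accepted under every role.
    if from_cdc:
        return True
    return ddl_class in USER_DDL_ALLOWED[role]
```

With this encoding, `can_execute("primary", "unsafe")` is `False`, while `can_execute("secondary", "safe", from_cdc=True)` is `True`, matching the Primary and Secondary rows.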

Work progress
