Add `cluster_id` to the `mysql.tidb` #59476

## Feature Request

**Describe the feature you'd like:**

External software sometimes needs the `CLUSTER_ID` to determine whether two TiDB instances belong to the same cluster. For example, CDC can use it to detect that the upstream and downstream are the same cluster and return an error to the user.

The `CLUSTER_ID` is already generated and stored in PD, but TiDB does not expose it.

**Describe alternatives you've considered:**

I've considered two workarounds that can be used until this feature is implemented:

### Workaround 1: Use `CLUSTER_INFO.SERVER_ID`

Currently, `CLUSTER_INFO.SERVER_ID` can be used to decide whether two servers are from the same cluster. For example, given a `server_id` from SERVER1, you can run `SELECT count(1) FROM information_schema.cluster_info WHERE server_id = ?` on SERVER2: a nonzero count suggests that the two servers are in the same cluster.

However, a cluster may have many servers, and the probability of a `SERVER_ID` conflict between two different clusters is not negligible. For example, with two clusters of 100 servers each, the probability that they share no server ID is:

$$
\frac{\binom{2^{22} - 1 - 100}{100}}{\binom{2^{22} - 1}{100}} \approx 99.7618\%
$$

so a conflict occurs with a probability of about 0.24%. With 200 servers in each cluster, the probability of no conflict is:

$$
\frac{\binom{2^{22} - 1 - 200}{200}}{\binom{2^{22} - 1}{200}} \approx 99.0508\%
$$

which leaves about a 0.95% chance of a conflict.
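The figures above can be checked numerically. The following is a minimal sketch that assumes server IDs are drawn uniformly, without repetition inside a cluster, from the $2^{22} - 1$ possible values; the function name `no_conflict_probability` is made up for this illustration.

```python
# Assumption: server IDs are uniform over 2**22 - 1 values and unique
# within a cluster. The function name is hypothetical.
ID_SPACE = (1 << 22) - 1


def no_conflict_probability(servers_per_cluster: int) -> float:
    """Probability that two clusters of equal size share no server ID.

    Equals C(N - k, k) / C(N, k), evaluated as a product of k ratios
    so that no huge binomial coefficient has to be built.
    """
    n, k = ID_SPACE, servers_per_cluster
    probability = 1.0
    for i in range(k):
        probability *= (n - k - i) / (n - i)
    return probability


for size in (100, 200):
    p = no_conflict_probability(size)
    print(f"{size} servers: no conflict {p:.4%}, conflict {1 - p:.4%}")
```

The product form avoids intermediate values with hundreds of digits, at the cost of a small floating-point rounding error that does not matter at this precision.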

### Workaround 2: Sort and Hash `CLUSTER_INFO.SERVER_ID`

Because a single `SERVER_ID` may conflict, we can instead combine the information from all servers: run `SELECT SERVER_ID FROM information_schema.cluster_info ORDER BY SERVER_ID` and hash the result. This should produce an ID that is unique to each cluster. However, the ID changes whenever the cluster scales in or out, so a hash computed during scaling may be inconsistent and cause bugs.

Neither workaround is elegant.
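A rough sketch of this sort-and-hash idea follows. The issue does not name a hash function, so SHA-256 is an assumption here, as are the function name `cluster_fingerprint` and the sample server IDs. The list of IDs stands in for the result of the query above.

```python
import hashlib


def cluster_fingerprint(server_ids):
    """Hash the sorted server IDs into one hex digest (SHA-256 assumed)."""
    ordered = sorted(server_ids)
    # Join with a separator so that [1, 23] and [12, 3] hash differently.
    payload = ",".join(str(server_id) for server_id in ordered).encode("ascii")
    return hashlib.sha256(payload).hexdigest()


# Made-up server IDs for illustration only.
before = cluster_fingerprint([1201, 87, 40512])
same_members = cluster_fingerprint([40512, 1201, 87])
after_scale_out = cluster_fingerprint([1201, 87, 40512, 99])

print(before == same_members)     # row order does not matter
print(before == after_scale_out)  # adding a server changes the fingerprint
```

The second comparison shows the weakness described above: two observers that query the cluster on either side of a scaling operation compute different IDs for the same cluster.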

**Teachability, Documentation, Adoption, Migration Strategy:**

1. Add a new row to the `mysql.tidb`: `CLUSTER_ID xxxxx`
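With such a row in place, an external tool could compare the two values directly. The sketch below is only an illustration of the proposal: the row name `cluster_id` does not exist until the feature is implemented, the helper names are hypothetical, and the cursors are assumed to be Python DB-API cursors from any MySQL-compatible driver.

```python
# Assumption: the proposed row is stored under VARIABLE_NAME = 'cluster_id'.
# This row does not exist until the feature request is implemented.
CLUSTER_ID_QUERY = (
    "SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'cluster_id'"
)


def read_cluster_id(cursor):
    """Return the cluster ID seen through a DB-API cursor, or None if absent."""
    cursor.execute(CLUSTER_ID_QUERY)
    row = cursor.fetchone()
    return None if row is None else row[0]


def same_cluster(upstream_cursor, downstream_cursor):
    """Compare the cluster IDs reported by two connections."""
    upstream_id = read_cluster_id(upstream_cursor)
    downstream_id = read_cluster_id(downstream_cursor)
    if upstream_id is None or downstream_id is None:
        raise LookupError("cluster_id row not found in mysql.tidb")
    return upstream_id == downstream_id
```

A tool such as CDC could call `same_cluster` once at startup and refuse to run when it returns `True`.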
