Description
Feature Request
Describe the feature you'd like:
External software sometimes needs the `CLUSTER_ID` to determine whether two TiDB instances belong to the same cluster. For example, CDC could use it to detect that the upstream and downstream are the same cluster and return an error to the user.
The `CLUSTER_ID` is already generated and stored in PD, but it is not exposed on the TiDB side.
Describe alternatives you've considered:
I've considered two workarounds that could be used until this feature is implemented:
Workaround 1: Use CLUSTER_INFO.SERVER_ID
Currently, `CLUSTER_INFO.SERVER_ID` can be used to decide whether two servers are from the same cluster. For example, given a server_id taken from SERVER1, you can run `SELECT count(1) FROM information_schema.cluster_info WHERE server_id = ?` on SERVER2 to check whether the two servers belong to the same cluster.
However, a cluster may have many servers, and the probability of a server ID conflict is not small enough to ignore. For example, if I have two clusters and each of them has 100 servers, the probability of a conflicting server ID is:
When I have 200 servers, the probability is:
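As a rough way to reason about such probabilities, here is a minimal sketch. It assumes server IDs are drawn independently and uniformly from a fixed-size ID space, which may not match how TiDB actually allocates them. The function names and the `HYPOTHETICAL_ID_SPACE` value are illustrative assumptions, not values taken from TiDB, and the output is not meant to reproduce the figures referred to above.

```python
def false_match_probability(other_cluster_size: int, id_space: int) -> float:
    """Chance that one given server ID also appears among
    `other_cluster_size` IDs drawn uniformly from `id_space` values."""
    return 1.0 - (1.0 - 1.0 / id_space) ** other_cluster_size


def cross_cluster_collision_probability(size_a: int, size_b: int, id_space: int) -> float:
    """Approximate chance that any ID in cluster A equals any ID in cluster B.

    Approximation: each of the size_a * size_b pairs is treated as independent.
    """
    return 1.0 - (1.0 - 1.0 / id_space) ** (size_a * size_b)


if __name__ == "__main__":
    # Placeholder only: substitute the real size of the SERVER_ID space.
    HYPOTHETICAL_ID_SPACE = 2 ** 32
    for size in (100, 200):
        p = cross_cluster_collision_probability(size, size, HYPOTHETICAL_ID_SPACE)
        print(f"{size} servers per cluster: {p:.3e}")
```

The first function models the Workaround 1 check itself (one ID looked up in another cluster). The second models the chance that any pair of servers across two clusters shares an ID.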
Workaround 2: Sort and Hash CLUSTER_INFO.SERVER_ID
Because a single `SERVER_ID` may conflict, we can combine the information from all servers and hash it. Run `SELECT SERVER_ID FROM information_schema.cluster_info ORDER BY SERVER_ID` and hash the result. This should produce a unique ID for each cluster. However, the ID changes whenever the cluster scales in or out, so a hash computed while the cluster is scaling may not match one computed before or after.
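The sort-and-hash idea can be sketched as follows. This assumes the server IDs have already been fetched with the query above. `cluster_fingerprint` is a hypothetical helper name, and SHA-256 is an arbitrary choice of hash for illustration.

```python
import hashlib
from typing import Iterable


def cluster_fingerprint(server_ids: Iterable[int]) -> str:
    """Hash the sorted server IDs into one hex digest.

    Sorting makes the result independent of the order in which rows arrive.
    """
    ordered = sorted(server_ids)
    # Join with a separator so that [1, 23] and [12, 3] hash differently.
    joined = ",".join(str(server_id) for server_id in ordered)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    before = cluster_fingerprint([3, 1, 2])
    after_scale_out = cluster_fingerprint([3, 1, 2, 4])
    # The fingerprint changes as soon as a server joins or leaves.
    print(before == after_scale_out)
```

The fingerprint is stable across row order but not across membership changes, which is the weakness noted above.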
Neither workaround is particularly elegant.
Teachability, Documentation, Adoption, Migration Strategy:
- Add a new row to the `mysql.tidb` table: `CLUSTER_ID xxxxx`
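
If such a row were added, an external tool could compare clusters by reading it from both sides. A minimal sketch, assuming the row is stored under `VARIABLE_NAME = 'cluster_id'` (the exact key name and casing are an assumption) and that the caller supplies DB-API cursors from a MySQL driver that uses `%s` placeholders. `read_cluster_id` and `same_cluster` are hypothetical helper names.

```python
from typing import Optional

# Assumed key for the proposed row; the name finally chosen may differ.
CLUSTER_ID_KEY = "cluster_id"


def read_cluster_id(cursor) -> Optional[str]:
    """Return the cluster ID stored in mysql.tidb, or None if the row is absent."""
    cursor.execute(
        "SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = %s",
        (CLUSTER_ID_KEY,),
    )
    row = cursor.fetchone()
    return None if row is None else str(row[0])


def same_cluster(upstream_cursor, downstream_cursor) -> bool:
    """Compare the cluster IDs seen through two connections."""
    upstream_id = read_cluster_id(upstream_cursor)
    downstream_id = read_cluster_id(downstream_cursor)
    if upstream_id is None or downstream_id is None:
        # Without both IDs, no safe conclusion can be drawn.
        raise LookupError("cluster ID row not found in mysql.tidb")
    return upstream_id == downstream_id
```

Raising on a missing row, rather than returning `False`, keeps an older TiDB version without the row from being mistaken for a different cluster.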