Conversation

OliverS929
Contributor

@OliverS929 OliverS929 commented Apr 23, 2025

What problem does this PR solve?

Issue Number: close #12153

What is changed and how it works?

Add $k8s_cluster and $tidb_cluster as optional filters to the expressions in the DM Grafana panels.
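
The PR text does not say how the panel expressions were edited. As a hedged illustration only, the sketch below shows one way the two matchers from this PR's diff could be prepended to each `metric{...}` selector in an expression. The helper name `add_cluster_matchers` and the regex are hypothetical; selectors without braces, or with `{` or `}` inside a matcher value, are left untouched by this simplified version.

```python
import re

# Matchers copied from the diff in this PR; everything else here is hypothetical.
CLUSTER_MATCHERS = 'k8s_cluster=~"$k8s_cluster",tidb_cluster=~"$tidb_cluster"'

# A metric name immediately followed by "{", the existing matcher list, and "}".
SELECTOR = re.compile(r'([A-Za-z_:][A-Za-z0-9_:]*)\{([^{}]*)\}')


def add_cluster_matchers(expr: str) -> str:
    """Prepend the cluster matchers to every `metric{...}` selector in expr."""

    def rewrite(match: re.Match) -> str:
        name, existing = match.group(1), match.group(2).strip()
        if "k8s_cluster" in existing:
            return match.group(0)  # already filtered, leave the selector untouched
        merged = f"{CLUSTER_MATCHERS},{existing}" if existing else CLUSTER_MATCHERS
        return f"{name}{{{merged}}}"

    return SELECTOR.sub(rewrite, expr)


before = 'dm_worker_task_state{job_id=~"$job_id",task=~"$task",source_id=~"$source"}'
after = add_cluster_matchers(before)
print(after)
```

Running the helper a second time on its own output changes nothing, which makes it safe to apply to a dashboard file that has already been partly updated.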

Check List

Tests

  • [ ] Unit test
  • [ ] Integration test
  • [ ] Manual test (add detailed scripts or steps below)
  • [x] No code

Questions

Will it cause performance regression or break compatibility?

No, it should not.

Do you need to update user documentation, design documentation or monitoring documentation?

No obvious relation to user documentation or designs.

Release note

None

@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 23, 2025
@OliverS929
Contributor Author

/hold

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Apr 23, 2025
@OliverS929 OliverS929 changed the title from "Add $k8s_cluster and $tidb_cluster as potential filter to meet the" to "DM: Add $k8s_cluster and $tidb_cluster as potential filter to Grafana panels." Apr 23, 2025
@OliverS929
Contributor Author

/retest

@OliverS929
Contributor Author

OliverS929 commented Apr 23, 2025

/cc @D3Hunter @GMHDBJD @lance6716 Please take a look and help review this PR.

@OliverS929
Contributor Author

/cc @kaaaaaaang @joyjoyhong

@ti-chi-bot ti-chi-bot bot requested a review from kaaaaaaang April 23, 2025 11:00
Contributor

ti-chi-bot bot commented Apr 23, 2025

@OliverS929: GitHub didn't allow me to request PR reviews from the following users: joyjoyhong.

Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @kaaaaaaang @joyjoyhong

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Contributor

@lance6716 lance6716 left a comment

Can you paste the manual test result? I'm not sure if there are other problems

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Apr 23, 2025
@OliverS929
Contributor Author

OliverS929 commented Apr 23, 2025

Can you paste the manual test result? I'm not sure if there are other problems

Sure, can do. I don't think these filters can be used in a self-managed DM cluster, though, since those two filters are not available there. Are you referring to manual testing in a self-managed cluster, just in case these modifications break anything?

@lance6716
Contributor

I see that issue #12153 was opened because of TiDB Cloud monitoring problems. I think you can open the Grafana of TiDB Cloud and manually edit some panel expressions to see whether $k8s_cluster and $tidb_cluster work as expected.
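
For a manual check like this, it can help to see the concrete query a panel would send. The sketch below is illustrative only: it substitutes example values for the dashboard variables and builds a URL for Prometheus's instant-query endpoint (`/api/v1/query`). The helper names, the base URL `http://localhost:9090`, and the `.*` values are assumptions, and Grafana's real interpolation (multi-value variables, formatting options) is more involved than `string.Template`.

```python
from string import Template
from urllib.parse import urlencode


def render_expr(expr: str, variables: dict) -> str:
    """Replace $name placeholders with concrete values (simplified interpolation)."""
    return Template(expr).substitute(variables)


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint, /api/v1/query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Panel expression shortened from the diff in this PR.
panel_expr = (
    'dm_worker_task_state{k8s_cluster=~"$k8s_cluster",'
    'tidb_cluster=~"$tidb_cluster",task=~"$task"}'
)
# Assumed values: ".*" stands in for an "All" selection on each variable.
values = {"k8s_cluster": ".*", "tidb_cluster": ".*", "task": ".*"}

promql = render_expr(panel_expr, values)
url = instant_query_url("http://localhost:9090", promql)  # hypothetical address
print(promql)
print(url)
```

The rendered query can be pasted into the Grafana Explore view or the Prometheus expression browser to confirm that it still returns series.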

Contributor

@D3Hunter D3Hunter left a comment

rest lgtm

@@ -208,7 +208,7 @@
 "targets": [
 {
 "exemplar": true,
-"expr": "dm_worker_task_state{job_id=~\"$job_id\",task=~\"$task\",source_id=~\"$source\"}",
+"expr": "dm_worker_task_state{k8s_cluster=~\"$k8s_cluster\",tidb_cluster=~\"$tidb_cluster\", job_id=~\"$job_id\",task=~\"$task\",source_id=~\"$source\"}",
Contributor

TiDB uses k8s_cluster="$k8s_cluster", tidb_cluster="$tidb_cluster"; maybe unify with that.

Contributor Author

Yes, I am aware of that practice, and I chose the k8s_cluster=~\"$k8s_cluster\" approach on purpose because it is more flexible than k8s_cluster="$k8s_cluster". Prometheus treats a missing label as an empty value, so when a series has no k8s_cluster label, a regex matcher whose pattern can match the empty string (for example, when the variable expands to .*) still selects that series. An exact matcher against a non-empty "$k8s_cluster" value never matches such series, so the query returns nothing.
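
The difference between the two matcher styles can be sketched with a small model of Prometheus label matching. This is a simplified illustration, not Prometheus code: the `matches` helper and the sample series are hypothetical, Python's `re` stands in for the RE2 engine Prometheus uses, and `.*` is assumed to be what the dashboard variable expands to when no specific cluster is selected.

```python
import re


def matches(series_labels: dict, label: str, op: str, value: str) -> bool:
    """Evaluate one label matcher against a series (simplified model)."""
    # A label that is absent behaves like a label with an empty value.
    actual = series_labels.get(label, "")
    if op == "=":
        return actual == value
    if op == "=~":
        # Regex matchers are fully anchored, hence fullmatch rather than search.
        return re.fullmatch(value, actual) is not None
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical series: one without cluster labels, one with them.
self_managed = {"job_id": "1", "task": "t1"}
cloud = {"k8s_cluster": "prod-k8s", "tidb_cluster": "db-1", "task": "t1"}

# Regex matcher with a pattern that can match the empty string: both series match.
print(matches(self_managed, "k8s_cluster", "=~", ".*"))       # True
print(matches(cloud, "k8s_cluster", "=~", ".*"))              # True

# Exact matcher with a non-empty value: the series without the label is dropped.
print(matches(self_managed, "k8s_cluster", "=", "prod-k8s"))  # False
print(matches(cloud, "k8s_cluster", "=", "prod-k8s"))         # True
```

Note that the regex form only helps while the pattern can match the empty string; once a specific cluster name is selected, series without the label are filtered out under either style.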

@OliverS929
Contributor Author

I see that issue #12153 was opened because of TiDB Cloud monitoring problems. I think you can open the Grafana of TiDB Cloud and manually edit some panel expressions to see whether $k8s_cluster and $tidb_cluster work as expected.

Sure, I will check with @kaaaaaaang and see whether we can get some screenshots. Is there any particular panel you would like us to verify?

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 25, 2025
@OliverS929
Contributor Author

/cc @kennytm

@ti-chi-bot ti-chi-bot bot requested a review from kennytm April 25, 2025 03:02
Contributor

ti-chi-bot bot commented Apr 25, 2025

@kaaaaaaang: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Contributor

ti-chi-bot bot commented Apr 27, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kaaaaaaang, kennytm, lance6716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Apr 27, 2025
Contributor

ti-chi-bot bot commented Apr 27, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-04-23 13:17:22.45241965 +0000 UTC m=+447986.264210030: ☑️ agreed by lance6716.
  • 2025-04-27 12:36:02.150279246 +0000 UTC m=+791105.962069627: ☑️ agreed by kennytm.

@OliverS929
Contributor Author

image
image
Here are two screenshots from the updated Grafana panels of my own DM cluster deployment, showing that the PromQL queries still work as expected with those two labels added.
/cc @D3Hunter @lance6716

@ti-chi-bot ti-chi-bot bot requested review from D3Hunter and lance6716 April 27, 2025 14:47
@OliverS929
Contributor Author

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 27, 2025
@ti-chi-bot ti-chi-bot bot merged commit 1080da9 into pingcap:master Apr 27, 2025
26 checks passed
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DM: Need $k8s_cluster and $tidb_cluster as filter in grafana setup.
5 participants