-
Notifications
You must be signed in to change notification settings - Fork 3.8k
fix(ingest-limits): Get usage from owing instances only #16937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💻 Deploy preview deleted. |
[recheck_period: <duration> | default = 10s] | ||
|
||
# The number of partitions to use for the ring. | ||
# CLI flag: -ingest-limits-frontend.num-partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit worried that someone configures this for ingest-limits
and forgets to configure it for ingest-limits-frontend
. Perhaps we could share the configuration here? I'm not sure what's best...
Can we remove |
|
249a12c
to
0e9d003
Compare
d9b6cf5
to
0558080
Compare
pkg/limits/frontend/ring_test.go
Outdated
instances := make([]ring.InstanceDesc, len(clients)) | ||
|
||
for i := 0; i < len(test.expectedAssignedPartitionsRequest); i++ { | ||
expectedStreamUsageReq := (*logproto.GetStreamUsageRequest)(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do (the whole block of code rather than the specific line)? I'm not sure I follow here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad that you point that out. The whole loop needs an overhaul here. This setup was written when the two requests types matched one to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I delete this, the tests still pass 😕
clients[i] = &mockIngestLimitsClient{
expectedAssignedPartitionsRequest: test.expectedAssignedPartitionsRequest[i],
getAssignedPartitionsResponse: test.getAssignedPartitionsResponses[i],
expectedStreamUsageRequest: test.expectedStreamUsageRequest[i],
getStreamUsageResponse: test.getStreamUsageResponses[i],
t: t,
}
One of the tests is panicking in
|
It's because of |
1c3b77e
to
744b63d
Compare
} | ||
} | ||
|
||
// Set up the mock clients for the stream usage requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but I still don't understand this? The tests pass without it. If you want to put in nils
, wouldn't this work?
diff --git a/pkg/limits/frontend/ring_test.go b/pkg/limits/frontend/ring_test.go
index e7621d0cf..b3f2b7a48 100644
--- a/pkg/limits/frontend/ring_test.go
+++ b/pkg/limits/frontend/ring_test.go
@@ -106,7 +106,7 @@ func TestRingStreamUsageGatherer_GetStreamUsage(t *testing.T) {
Tenant: "test",
StreamHashes: []uint64{1}, // Hash 1 maps to partition 1
},
- expectedAssignedPartitionsRequest: []*logproto.GetAssignedPartitionsRequest{{}, {}},
+ expectedAssignedPartitionsRequest: []*logproto.GetAssignedPartitionsRequest{nil, nil},
getAssignedPartitionsResponses: []*logproto.GetAssignedPartitionsResponse{{
AssignedPartitions: map[int32]int64{
1: time.Now().Add(-time.Second).UnixNano(),
@@ -116,15 +116,17 @@ func TestRingStreamUsageGatherer_GetStreamUsage(t *testing.T) {
1: time.Now().UnixNano(),
},
}},
- expectedStreamUsageRequest: []*logproto.GetStreamUsageRequest{{}, {
- Tenant: "test",
- StreamHashes: []uint64{1},
- }},
- getStreamUsageResponses: []*logproto.GetStreamUsageResponse{{}, {
- Tenant: "test",
- ActiveStreams: 1,
- Rate: 10,
- }},
+ expectedStreamUsageRequest: []*logproto.GetStreamUsageRequest{
+ nil, {
+ Tenant: "test",
+ StreamHashes: []uint64{1},
+ }},
+ getStreamUsageResponses: []*logproto.GetStreamUsageResponse{
+ nil, {
+ Tenant: "test",
+ ActiveStreams: 1,
+ Rate: 10,
+ }},
expectedResponses: []GetStreamUsageResponse{{
Addr: "instance-1",
Response: &logproto.GetStreamUsageResponse{
@@ -148,17 +150,11 @@ func TestRingStreamUsageGatherer_GetStreamUsage(t *testing.T) {
t: t,
expectedAssignedPartitionsRequest: test.expectedAssignedPartitionsRequest[i],
getAssignedPartitionsResponse: test.getAssignedPartitionsResponses[i],
+ expectedStreamUsageRequest: test.expectedStreamUsageRequest[i],
+ getStreamUsageResponse: test.getStreamUsageResponses[i],
}
}
- // Set up the mock clients for the stream usage requests.
- // Note: Not every client will have a stream usage request because the
- // requested stream hashes are owned only by a subset of partition consumers.
- for i := range test.expectedStreamUsageRequest {
- clients[i].(*mockIngestLimitsClient).expectedStreamUsageRequest = test.expectedStreamUsageRequest[i]
- clients[i].(*mockIngestLimitsClient).getStreamUsageResponse = test.getStreamUsageResponses[i]
- }
-
// Set up the instances for the ring.
for i := range len(clients) {
instances[i] = ring.InstanceDesc{
This commit fixes a bug where limits were incorrect after the change in #16937. This bug occurred because we stopped asking each limit instance if it knew about all stream hashes, and instead starting asking each limit instance if it knew about the stream hashes for its assigned partitions. This broke the logic in the frontend that was taking the intersection of all responses to calculate if a stream was unknown. It should now take the union instead.
This commit fixes a bug where limits were incorrect after the change in #16937. This bug occurred because we stopped asking each limit instance if it knew about all stream hashes, and instead starting asking each limit instance if it knew about the stream hashes for its assigned partitions. This broke the logic in the frontend that was taking the intersection of all responses to calculate if a stream was unknown. It should now take the union instead.
Contains backports of these commits: ``` 2cde9b1 fix: skip streams over limits in dry-run mode (#17114) 805125c fix: fix a bug where limits were incorrect after #16937 (#17224) d106042 fix: fix bug where expectedStreamUsageRequest was not used (#17223) 69aeda1 feat: add tests for enforcing limits in distributors (#17124) ```
What this PR does / why we need it:
This PR updates the stream usage retrieval routine in the ingest-limits-frontend component to query only ingest-limits instances that own the requested streams. The purpose of this optimization is to streamline traffic between these components to necessary minimum per
ExceedLimits
request.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Checklist
CONTRIBUTING.md
guide (required)feat
PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.md
deprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR