Skip to content

Commit 37defbb

Browse files
committed
[A76] Clarify behaviors discovered during implementation
I discovered a few behaviors that I'm not sure are desired (although I don't think they are harmful either), and that may cause confusion for implementors, so definitely worth noting: 1. If multiple concurrent RPCs use the same picker, and the picker has no endpoint in connecting state, then both may trigger a connection attempt, resulting in multiple endpoints leaving Idle. This happens until the picker is updated to update `picker_has_endpoint_in_connecting_state`. One thing we could do to avoid this, at least in Go, is to atomically set `picker_has_endpoint_in_connecting_state` to true when we trigger a connection attempt. 2. If we queued the pick because there was no endpoint in `Ready` state and there was endpoints in `Idle` or connecting state, then when the endpoint transitions to Ready, we may end up triggering a connection attempt again on the next pick for a given RPC. 3. The behavior is not specified when multiple endpoints have the same hash key. Looking at Envoy's code, I think there is no deliberate choice of endpoint in that case, so we could leave it unspecified, but I think it's worth clarifying.
1 parent dea04cb commit 37defbb

File tree

1 file changed

+17
-3
lines changed

1 file changed

+17
-3
lines changed

A76-ring-hash-improvements.md

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -157,11 +157,19 @@ if !using_random_hash {
157157
return ring[first_index].picker->Pick(...);
158158
```
159159

160-
This behavior ensures that a single RPC does not cause more than one endpoint to
160+
This behavior ensures that a single RPC does not cause more than two endpoint to
161161
exit `IDLE` state at a time, and that a request missing the header does not
162162
incur extra latency in the common case where there is already at least one
163163
endpoint in `READY` state. It converges to picking a random endpoint, since each
164-
request may eventually cause a random endpoint to go from `IDLE` to `READY`.
164+
request may cause one or two random endpoints to go from `IDLE` to `READY`: one
165+
if it randomly picks an `IDLE` endpoint, and one if the pick was queued and an
166+
endpoint transitioned from `CONNECTING` to `READY`, but no other endpoint is
167+
currently `CONNECTING`.
168+
169+
Note that because triggering a connection attempt and transitioning to
170+
`CONNECTING` is not synchronous, it is possible for multiple endpoints to exit
171+
`IDLE` state if no endpoint is `CONNECTING` and multiple RPCs that use a random
172+
hash are made concurrently on the same picker.
165173

166174
### Explicitly setting the endpoint hash key
167175

@@ -178,6 +186,10 @@ endpoint attribute to the value of [LbEndpoint.Metadata][LbEndpoint.Metadata]
178186
`envoy.lb` `hash_key` field, as described in [Envoy's documentation for the ring
179187
hash load balancer][envoy-ringhash].
180188

189+
If multiple endpoints have the same `hash_key`, then which endpoint is chosen is
190+
unspecified, and the implementation is free to choose any of the endpoint with a
191+
matching `hash_key`.
192+
181193
### Temporary environment variable protection
182194

183195
Explicitly setting the request hash key will be gated by the
@@ -230,7 +242,9 @@ considered the following alternative solutions:
230242

231243
## Implementation
232244

233-
Implemented in C-core in https://github.com/grpc/grpc/pull/38312.
245+
Implemented in:
246+
- C-core in https://github.com/grpc/grpc/pull/38312.
247+
- Go: https://github.com/grpc/grpc-go/pull/8159
234248

235249
[A42]: A42-xds-ring-hash-lb-policy.md
236250
[A61]: A61-IPv4-IPv6-dualstack-backends.md

0 commit comments

Comments
 (0)