-
Notifications
You must be signed in to change notification settings - Fork 65
Fix input plane attempt retry handling. #3565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
> Checks for functions with a high McCabe complexity https://docs.astral.sh/ruff/rules/complex-structure/ Motivated by bugs like the one introduced in #3456 and fixed in #3565. Signed-off-by: David Xia <[email protected]>
@cursor review |
7772425
to
41f9c58
Compare
Use [parameterized pytests][1] to be DRY. [1]: https://docs.pytest.org/en/stable/example/parametrize.html Signed-off-by: David Xia <[email protected]>
# AttemptAwait will return a failure until this is 0. It is decremented by 1 each time AttemptAwait is called. | ||
self.attempt_await_failures_remaining = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored unit tests to not require this attribute.
@@ -91,7 +322,7 @@ def test_map(client, servicer, slow_put_inputs): | |||
|
|||
@pytest.mark.parametrize("slow_put_inputs", [False, True]) | |||
@pytest.mark.timeout(120) | |||
def test_map_inputplane(client, servicer, slow_put_inputs): | |||
def test_map_input_plane(client, servicer, slow_put_inputs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change and below are unrelated and simply fix camel-casing of "input plane" to be consistent.
assert servicer.attempt_await_failures_remaining == 0 | ||
|
||
|
||
def test_retry_limit_on_internal_error(client: Client, servicer: MockClientServicer, monkeypatch): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -35,25 +32,3 @@ def test_lookup_foo(client: Client, servicer: MockClientServicer): | |||
assert f.remote() == "foo" | |||
assert f._get_metadata().input_plane_url is not None | |||
assert f._get_metadata().input_plane_region == "us-east" | |||
|
|||
|
|||
def test_retry_on_internal_error(client: Client, servicer: MockClientServicer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I manually tested many of the unit test cases against prod. The client behaved as I expected. |
), | ||
], | ||
) | ||
def test_remote_input_plane( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reduce the duration of some of these tests?
60.53s call test/function_test.py::test_remote_input_plane[long running]
7.43s call test/function_test.py::test_remote_input_plane[honor user retry policy]
7.41s call test/function_test.py::test_remote_input_plane[internal failure does not use up user retries]
Describe your changes
We have a bug in which we create excessive input plane attempts due to accidentally submitting AttemptRetry on every AttemptAwait timeout of 55s. We haven't caught it because most input plane calls take less than a minute. This will be patched server-side.
Also fixes a separate bug where internal failures can eat into client specified retries.
SVC-863