-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix: invalid master_last_io_seconds_ago metric during stable sync #4892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: invalid master_last_io_seconds_ago metric during stable sync #4892
Conversation
src/server/protocol_client.cc
Outdated
if (!input_str.empty()) | ||
break; | ||
RETURN_ON_ERR(Recv(sock_.get(), io_buf)); | ||
TouchIoTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also factor out the lines 260-269 into the Recv call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But there is a call of TouchIoTime() at 268. This looks correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it is correct but there is code duplication -you can replace these lines with Recv and TouchIoTime calls like we do here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/server/replica.cc
Outdated
res.master_last_io_sec = (ProactorBase::GetMonotonicTimeNs() - last_io_time) / 1000000000UL; | ||
|
||
uint64_t current_time = ProactorBase::GetMonotonicTimeNs(); | ||
// Handle the case when last_io_time is greater than current time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets change the comment to this:
// last_io_time is derived above by reading last_io_time_
from all the flows, by accessing them from a foreign thread, see the loop above. As a result some threads may have last_io_time_
bigger than our current time, so we fix it here.
This way you explain why this happens, not just what you do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/server/protocol_client.cc
Outdated
VLOG(2) << "Read master response of " << *size_res << " bytes"; | ||
ec = Recv(sock_.get(), buffer); | ||
if (ec) { | ||
LOG(WARNING) << "Socket error " << ec; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you already do this inside Recv()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/server/protocol_client.cc
Outdated
|
||
TouchIoTime(); | ||
buffer->CommitWrite(*size_res); | ||
VLOG(3) << "Read master response"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this vlog is not helpful now and can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with small nits
fixes: #4884
Fixed overflow on calculating time difference.