-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix(hset_family): Ensure empty hash sets are removed #4873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f72ff33
to
1b3f3cd
Compare
e860e4e
to
8b3c8e7
Compare
src/server/hset_family.cc
Outdated
} | ||
|
||
// If the set is empty, FindWithCleanup will have removed the key | ||
if (sm->Empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If FindWithCleanup removed the key, is sm still valid or was it deallocated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, oddly enough in the debugger the pointer is still valid although its size is 0 and entries_
table is empty although the block of memory seems to have been freed. It doesn't seem safe though.
I will look into this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in tests the pointer is valid because of UB and the OS just hasn't reused the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest version of the code, the map is not accessed after delete because delete is performed during scope exit of the function.
src/server/hset_family.cc
Outdated
|
||
bool ContainsWithCleanup(HSetCleanupCtx ctx, const std::string_view field) { | ||
const bool found = ctx.sm->Contains(field); | ||
ctx.DelKeyIfEmpty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a blocking function as RecordJournal can preempt
- Lets rename this function to have suffix Blocking
- If the key was deleted we should not access sm anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I am not wrong with number 2, we need to find a way to enforce that callers will not access sm after calling this functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we set sm
to nullptr if the key is deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach could be to consume the input pointer (by always setting to null) and return the string map pointer wrapped in a type such as variant or optional to signify that the string map is no longer valid, but nothing guarantees that the caller will not attempt to use the returned pointer without a check.
When a search operation is performed on a hash set, expired fields are removed as a side effect. If at the end of such an operation the hash set becomes empty, its key is removed from the database. Signed-off-by: Abhijat Malviya <[email protected]>
8b3c8e7
to
c91cea7
Compare
src/server/hset_family.cc
Outdated
return OpStatus::OK; | ||
} | ||
|
||
struct HSetCleanupCtx { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code needs more improvement.
- We pass too many variables into struct, some of them can be deduced from others
- The interface a bit awkward - why
sm_ptr
is double pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a double pointer to signal the user that the pointer may be invalidated as discussed in #4873 (comment)
If the key is deleted, then we set the pointer to null. As mentioned in #4873 (comment) though, it is tricky to avoid accidental usage.
src/server/hset_family.cc
Outdated
|
||
if (it == sm->end()) | ||
return OpStatus::KEY_NOTFOUND; | ||
if (const auto it = FindWithCleanupBlocking( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the attempt to wrap everything into a single call but I think it makes things more confusing on the caller side.
If we found something - we for sure do not need to clean up.
if not - we should check if sm->Empty()
and perform the deletion. So the deletion - can be a separate function
with its arguments but pushing this code down into FindWithCleanupBlocking
makes the code here more confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially in this PR I had the approach to do an empty check + delete using RAII as in f72ff33 - but we discussed later and to make sure the key is always deleted for future commands I combined the check and deletion into one.
I will look into changing to make things simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to simplify it a bit, it can be made simpler by arming the delete object by hand but that leaves the possibility of new code not ensuring delete of empty keys cc @adiholden
1d93270
to
b418619
Compare
Signed-off-by: Abhijat Malviya <[email protected]>
b418619
to
5381f48
Compare
|
||
struct KeyCleanup { | ||
using CleanupFuncT = std::function<void(std::string_view)>; | ||
explicit KeyCleanup(CleanupFuncT func, const std::string_view key_view) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, string_view
is not mutable
so const
is redundant here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to use const
with views when I remember because even string_view
can be made to point to some other string in the function body and const
protects against that.
It's probably not something that happens often so I am fine to remove it, but I used const
deliberately here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhh yeah that makes sense! nvm
bool armed{false}; | ||
}; | ||
|
||
void DeleteKey(DbSlice& db_slice, const OpArgs& op_args, std::string_view key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move this function inside KeyCleanup
. You only call it once as "freestanding" and I think we might even have a bug there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept it as a function because it is used twice.
auto it = db_slice.FindMutable(op_args.db_cntx, key).it; | ||
|
||
db_slice.Del(op_args.db_cntx, it); | ||
DeleteKey(db_slice, op_args, key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DeleteKey
also writes to the journal right ? Why didn't we do that before ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure but earlier in the PR discussion it was mentioned we should write to journal when we delete the key. Should we not do this in this case?
This is in the HGetGeneric -> OpGetAll
code path. Will it be replicated by itself, considering it is a read command?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure but earlier in the PR discussion it was mentioned we should write to journal when we delete the key. Should we not do this in this case?
Yes we should but I am skeptical before those changes it seems that we did not write explicitly and we do now.
I will need to check at the flow and get back to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow that was an actual bug. Mind if we also add a replication test for this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Plz follow up with:
- Change the log for snapshot as per Adi's comment
When a search operation is performed on a hash set, expired fields are removed as a side effect.
If at the end of such an operation the hash set becomes empty, its key is removed from the database.
fixes #4856