wasm/alloc: zero fill memory #15063
Conversation
This is assumed by wasmtime for these memory APIs.
Signed-off-by: Tyler Rockwood <[email protected]>
Force push: Fix release build
Force push: Remove a debug build check that was causing asan false positives
 * don't have to touch all the bytes.
 */
-void deallocate(heap_memory);
+void deallocate(heap_memory, size_t used_amount);
This amount is user configurable and in theory could be tens of MB; I guess that is too expensive to memset in a single go? Do we need to do this in an async-friendly way?
Like chunk-wise with yields interspersed or something?
yeah
Does N have a concrete upper bound?
4GiB; in practice it's probably much lower (the default is 2MiB), but for example golang (not tinygo) requires 50MB in my tests for a non-trivial program (seems like a lot, but 🤷)
I have an idea to do this in a fairly lightweight way, but I will do it in a follow-up PR, not this one.
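For reference, a minimal sketch of what chunk-wise zeroing with interspersed yields could look like on top of Seastar. The chunk size, the function name, and the use of seastar::coroutine::maybe_yield() as the yield point are illustrative assumptions; this is not necessarily the approach the follow-up PR will take.

#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>
#include <seastar/coroutine/maybe_yield.hh>

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical chunk size; the right value would come out of a benchmark.
constexpr size_t zero_fill_chunk = 128 * 1024;

// Zero `used_amount` bytes in chunks, yielding to the reactor between
// chunks so a very large memset does not monopolize the task quota.
seastar::future<> zero_fill_chunked(uint8_t* data, size_t used_amount) {
    for (size_t offset = 0; offset < used_amount; offset += zero_fill_chunk) {
        size_t len = std::min(zero_fill_chunk, used_amount - offset);
        std::memset(data + offset, 0, len);
        co_await seastar::coroutine::maybe_yield();
    }
}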
i would assume that as long as the memset(0) is on a contiguous region it is very fast. i dunno tho what the limit is where it becomes costly enough to be a concern.
My suggestion is to get this in, then I'll create a benchmark
It's also page aligned, which should help
lgtm
allocated = allocator.allocate(req);
ASSERT_TRUE(allocated.has_value());
EXPECT_THAT(allocated, Optional(HeapIsZeroed()));
question: what's the purpose of this second allocation?
It's reused, so this ensures memory is zeroed when it is returned "dirty" to the allocator.
ah i see
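For context, a matcher like the HeapIsZeroed used in the test above could look roughly like the following. This is a hypothetical sketch, assuming heap_memory exposes data (a smart pointer to the bytes) and size members, as the diff further down suggests.

#include <gmock/gmock.h>

#include <algorithm>
#include <cstdint>

// Hypothetical gmock matcher: succeeds only if every byte of the heap
// memory is zero.
MATCHER(HeapIsZeroed, "heap memory is entirely zero filled") {
    const uint8_t* begin = arg.data.get();
    return std::all_of(
      begin, begin + arg.size, [](uint8_t b) { return b == 0; });
}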
/ci-repeat
might want input from noah regarding the big memset, but lgtm
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/41544#018bf50b-15f3-49ce-8ef3-b8632c06513c
lgtm
src/v/wasm/allocator.cc (Outdated)
-void heap_allocator::deallocate(heap_memory m) {
+void heap_allocator::deallocate(heap_memory m, size_t used_amount) {
+    std::memset(m.data.get(), 0, std::min(used_amount, m.size));
defensive programming and all, yay. but it also reads as if more could have been used than the buffer size. is that a thing? maybe used_amount corresponds to an amount that perhaps spans instances of heap_memory? that would explain it i think
You're right, I should just remove the std::min; it's overly defensive.
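For reference, with the cap dropped the zeroing on deallocation presumably reduces to something like the sketch below. This is based only on the lines visible in the diff; the assumption that callers always pass used_amount <= m.size, and whatever the allocator then does with the returned buffer, are elided.

void heap_allocator::deallocate(heap_memory m, size_t used_amount) {
    // Callers report how much of the heap they actually touched, so only
    // that prefix has to be zeroed before the memory can be handed out again.
    std::memset(m.data.get(), 0, used_amount);
    // ... return the (now zeroed) memory to the allocator's pool (elided).
}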
wasmtime assumes that memory is zero filled, so ensure that heap memory is zero filled. We also force callers to tell the allocator how much of that heap was actually used, so that deallocation can memset only the changed memory; this is an optimization for large memories when only part of the memory is used.
Signed-off-by: Tyler Rockwood <[email protected]>
Adds tests to ensure memory is always zero filled.
Signed-off-by: Tyler Rockwood <[email protected]>
Force push: Remove overly defensive cap on the memset
rubber stamp
@oleiman @dotnwat on my local machine:
Presumably arm is a little slower, so I suspect most initial transforms will be fine in terms of budget, but there are other things going on in this task when we dealloc an instance, so I can make it async.
Wasmtime assumes that memory returned from allocators is zero filled. We need to do this for our custom allocators too.
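A hypothetical caller-side sketch of that contract: the code tearing down a Wasm instance reports how much of the heap was actually written, so deallocate() only has to zero the dirty prefix. The names run_and_release, run_instance, and allocation_request, and the exact signatures, are illustrative assumptions based on the snippets in this review rather than the engine's actual API.

// Hypothetical setup-and-teardown path for a single instance.
void run_and_release(heap_allocator& allocator, allocation_request req) {
    auto mem = allocator.allocate(req); // returns an optional-like value
    if (!mem.has_value()) {
        return; // allocation failed; error handling elided
    }
    // Run the instance; used_bytes is however much of the heap it wrote.
    size_t used_bytes = run_instance(*mem); // run_instance is hypothetical
    // Hand the heap back along with the used amount so deallocate() only
    // has to zero the dirty prefix, not the whole (possibly large) heap.
    allocator.deallocate(std::move(*mem), used_bytes);
}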
Backports Required
Release Notes