-
Notifications
You must be signed in to change notification settings - Fork 6.6k
Add option to limit max prefetching in MultiScan #13920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Status s; | ||
s.PermitUncheckedError(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems all the usage of Status is local. I would recommend to just define the status on the place we use it, instead of define one at the top of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor feedback. The rest looks good.
s = table_->LookupAndPinBlocksInCache<Block_kData>( | ||
|
||
// Check if we would exceed the prefetch size limit with this block | ||
uint64_t block_size = data_block_handle.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This doesn't account for the trailer. Maybe use BlockSizeWithTrailer()?
include/rocksdb/options.h
Outdated
// prefetch size. When the limit is exceeded, iterator will return | ||
// Status::PrefetchLimitReached(). | ||
// | ||
// Note that this limit is per file and is on compressed block size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify in the comment that the prefetch happens only once, to distinguish it from ReadOptions::readahead_size which applies anytime the iterator does IO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
// Check if we would exceed the prefetch size limit with this block | ||
uint64_t block_size = data_block_handle.size(); | ||
total_prefetch_size += block_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be done only if the block doesn't exist in the block cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, we want to limit memory usage of pinned blocks too.
Summary: Add a new option
MultiScanArgs::max_prefetch_size
that limits the memory usage of per file pinning of prefetched blocks. Note that this only accounts for compressed block size. This is intended to be a stopgap until we implement some kind of global prefetch manager that limits the global multiscan memory usage.Test plan: new unit test
./block_based_table_reader_test --gtest_filter="*MultiScanPrefetchSizeLimit/*"