Skip to content

Conversation

krhancoc
Copy link
Contributor

The multiscan operation uses the prepare function prefetch and pin all the required blocks for its provided scan ranges. However, this could cause large amount of memory being used. To restrict this, we introduce the PrefetchRateLimiter which allows users to bound how much is prefetched to ensure we do not OOM the host.

Test Plan
Added additional tests.

@meta-cla meta-cla bot added the CLA Signed label Aug 28, 2025
@facebook-github-bot
Copy link
Contributor

@krhancoc has imported this pull request. If you are a Meta employee, you can view this in D81548706.

@facebook-github-bot
Copy link
Contributor

@krhancoc has imported this pull request. If you are a Meta employee, you can view this in D81548706.

@facebook-github-bot
Copy link
Contributor

@krhancoc has imported this pull request. If you are a Meta employee, you can view this in D81548706.

@krhancoc krhancoc requested review from anand1976 and cbi42 September 4, 2025 16:44
@@ -1345,6 +1410,14 @@ void BlockBasedTableIterator::FindBlockForwardInMultiScan() {
}
// Move to the next pinned data block
ResetDataIter();
if (multi_scan_->prefetch_rate_limiter) {
size_t releasing =
multi_scan_->pinned_data_blocks[multi_scan_->cur_data_block_idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinned_data_blocks[multi_scan_->cur_data_block_idx] would not be valid at this point I think, since it would've been transferred to the data block iter. Or am I missing something?

virtual bool release(size_t bytes) = 0;
};

class DefaultPrefetchRateLimiter : public PrefetchRateLimiter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This need not be declared in a public header file. RocksDB typically just exposes an allocator, like NewDefaultPrefetchRateLimiter().

multi_scan_->pinned_data_blocks[multi_scan_->cur_data_block_idx]
.GetValue()
->size();
multi_scan_->prefetch_rate_limiter->release(releasing);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd do this in a cleanup function registered with block_iter_ (which is derived from Cleanable) so that the release happens whenever block_iter_ is reset.

// Lets make sure we are rate limited on how many blocks to prepare
if (multiscan_opts->prefetch_rate_limiter) {
auto blocks = multiscan_opts->GetMutablePrefetchRateLimiter().acquire(
table_, index_iter_->value().handle.size(), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write last parameter as /*all_or_nothing=*/true (Google C++ style guide - https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments)

@anand1976
Copy link
Contributor

Its not clear how the proposed interface would support prefetching across multiple levels. For example, say we have L1, L2 and L3. Suppose L1's Prepare() gets called first. It could potentially exhaust the prefetch quota and L2/L3 cannot do any prefetching. I understand that we want to keep the initial implementation simple for now that works for single level use cases, but the interface should be extensible to support other use cases in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants