-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[ENH] ScoutLogs issues a HEAD if possible. #5376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
ScoutLogs Uses Cache With HEAD Optimization for Manifest Verification This PR introduces a cache validation mechanism for Key Changes• Added the Affected Areas• This summary was automatically generated by @propel-code-bot |
pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> { | ||
Err(StorageError::NotImplemented) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TestCoverage]
This NotImplemented
error will cause cache verification to always fail for local storage, preventing tests from exercising the cache-hit path. You could implement this using the existing etag_for_bytes
helper to make local storage tests more realistic.
pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> { | |
Err(StorageError::NotImplemented) | |
} | |
pub async fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, StorageError> { | |
match self.get(key).await { | |
Ok(bytes) => { | |
let current_etag = Self::etag_for_bytes(&bytes); | |
Ok(¤t_etag == e_tag) | |
} | |
Err(StorageError::NotFound { .. }) => Ok(false), | |
Err(e) => Err(e), | |
} | |
} |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> { | ||
Err(StorageError::NotImplemented) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
Similar to the LocalStorage
implementation, returning NotImplemented
here prevents testing the cache verification path for this storage backend. The underlying object_store
crate supports head
requests which return an ETag, so this could be implemented.
pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> { | |
Err(StorageError::NotImplemented) | |
} | |
pub async fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, StorageError> { | |
match self.object_store.head(&object_store::path::Path::from(key)).await { | |
Ok(meta) => Ok(meta.e_tag == Some(e_tag.0.clone())), | |
Err(object_store::Error::NotFound { .. }) => Ok(false), | |
Err(e) => Err(e.into()), | |
} | |
} |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
Do we want to consider racing reads here? It seems like that could be useful |
e76ace7
to
3aa12eb
Compare
if !log_reader.verify(cached).await.unwrap_or_default() { | ||
cached_manifest_and_e_tag.take(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The use of unwrap_or_default()
here effectively treats any error during manifest verification as a cache miss. While this is a safe fallback, it hides potentially important errors (e.g., network issues, S3 permissions) that could indicate a deeper problem. Logging these errors would improve observability and help diagnose issues that might cause frequent cache misses.
if !log_reader.verify(cached).await.unwrap_or_default() { | |
cached_manifest_and_e_tag.take(); | |
} | |
match log_reader.verify(cached).await { | |
Ok(true) => (), // All good, manifest is fresh. | |
Ok(false) => { | |
// Stale manifest, invalidate. | |
cached_manifest_and_e_tag.take(); | |
} | |
Err(err) => { | |
tracing::warn!( | |
"Failed to verify cached manifest for collection {}: {}. Falling back to full fetch.", | |
collection_id, | |
err | |
); | |
cached_manifest_and_e_tag.take(); | |
} | |
} |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
Context for Agents
[**BestPractice**]
The use of `unwrap_or_default()` here effectively treats any error during manifest verification as a cache miss. While this is a safe fallback, it hides potentially important errors (e.g., network issues, S3 permissions) that could indicate a deeper problem. Logging these errors would improve observability and help diagnose issues that might cause frequent cache misses.
```suggestion
match log_reader.verify(cached).await {
Ok(true) => (), // All good, manifest is fresh.
Ok(false) => {
// Stale manifest, invalidate.
cached_manifest_and_e_tag.take();
}
Err(err) => {
tracing::warn!(
"Failed to verify cached manifest for collection {}: {}. Falling back to full fetch.",
collection_id,
err
);
cached_manifest_and_e_tag.take();
}
}
```
⚡ **Committable suggestion**
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
File: rust/log-service/src/lib.rs
Line: 1151
3aa12eb
to
568b89c
Compare
Offline discussion documented here: Racing reads will only help in the case that both ops race with a write that invalidates the cache. |
c6171dd
to
bbc6610
Compare
@@ -4758,7 +4758,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" | |||
checksum = "fc2f4eb4bc735547cfed7c0a4922cbd04a4655978c09b54f1f7b228750664c34" | |||
dependencies = [ | |||
"cfg-if", | |||
"windows-targets 0.52.6", | |||
"windows-targets 0.48.5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is cargo lock change here intentional?
unrelated but maybe we should consider bump our dependencies in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was from a rebase. Will fix.
Description of changes
This PR changes scout logs to consult the cache on ScoutLogs. If the
manifest was recently in the cache, wal3/rls will perform a HEAD
operation to fetch the object into cache.
This PR contains tests written by Claude.
Test plan
CI
Migration plan
N/A
Observability plan
N/A
Documentation Changes
N/A