[ENH] ScoutLogs issues a HEAD if possible. #5376

rescrv · 2025-08-28T23:03:40Z

Description of changes

This PR changes ScoutLogs to consult the manifest cache. If the
manifest was recently in the cache, wal3/rls performs a HEAD
operation to confirm the cached copy is still current instead of
fetching the full object again.

This PR contains tests written by Claude.

Test plan

CI

Migration plan

N/A

Observability plan

N/A

Documentation Changes

N/A

github-actions · 2025-08-28T23:03:48Z

propel-code-bot · 2025-08-28T23:05:51Z

ScoutLogs Uses Cache With HEAD Optimization for Manifest Verification

This PR introduces a cache validation mechanism for ScoutLogs, allowing the log service to consult a cached manifest and verify its freshness via an S3 HEAD operation (etag check) instead of loading the entire manifest. The change extends S3, admission-controlled S3, and storage abstraction layers with a new confirm_same interface, which provides an etag consistency check without downloading the full object. When a cached manifest+etag is found, the server calls HEAD to verify its validity before using it, falling back to a full manifest fetch if verification fails. Corresponding test coverage is added for S3, manifest, log-reader, and service endpoints to ensure correct integration and observability.
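The verify-then-fall-back flow described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code from the PR: `MemStore`, `get_with_e_tag`, `load_manifest`, and the `String` error type are hypothetical stand-ins, and only the `confirm_same` name and the `ETag` type name come from the discussion.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the storage ETag type.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ETag(pub String);

/// Hypothetical in-memory object store (key -> bytes and etag).
/// It only mimics the shape of the real storage layer.
#[derive(Default)]
pub struct MemStore {
    objects: HashMap<String, (Vec<u8>, ETag)>,
    /// Counts full downloads so the saving from the cheap check is visible.
    pub full_fetches: usize,
}

impl MemStore {
    pub fn put(&mut self, key: &str, bytes: &[u8], e_tag: &str) {
        self.objects
            .insert(key.to_string(), (bytes.to_vec(), ETag(e_tag.to_string())));
    }

    /// Metadata-only comparison, standing in for a HEAD request.
    /// A missing object is reported as "not the same", not as an error.
    pub fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, String> {
        Ok(self
            .objects
            .get(key)
            .map(|(_, current)| current == e_tag)
            .unwrap_or(false))
    }

    /// Full download of the object together with its etag.
    pub fn get_with_e_tag(&mut self, key: &str) -> Result<(Vec<u8>, ETag), String> {
        self.full_fetches += 1;
        self.objects
            .get(key)
            .cloned()
            .ok_or_else(|| format!("not found: {}", key))
    }
}

/// Use the cached manifest when the cheap check confirms it is current;
/// otherwise (stale, missing, or check failed) fall back to a full fetch.
pub fn load_manifest(
    store: &mut MemStore,
    key: &str,
    cached: Option<(Vec<u8>, ETag)>,
) -> Result<(Vec<u8>, ETag), String> {
    if let Some((bytes, e_tag)) = cached {
        if store.confirm_same(key, &e_tag).unwrap_or(false) {
            return Ok((bytes, e_tag));
        }
    }
    store.get_with_e_tag(key)
}
```

In this sketch a second call that passes the previously returned manifest and etag leaves `full_fetches` unchanged, while a call made after the object has been overwritten triggers one more full fetch.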

Key Changes

• Added the Storage::confirm_same() method (with backend implementations) to verify that a provided etag matches the current file (manifest) in storage without fetching the whole file
• Updated ScoutLogs logic in rust/log-service/src/lib.rs to prefer using cached manifests, verifying them with HEAD/etag checks, and falling back to a fresh manifest fetch if verification or cache miss occurs
• Introduced Manifest::head() and LogReader::verify() for lightweight manifest freshness checks via etag
• Wired through S3 (S3Storage), admission-controlled S3, storage abstraction, and implemented test stubs (and appropriate NotImplemented for local/object_store backends)
• Added comprehensive k8s-integration and unit tests for HEAD/etag behavior, including edge cases (e.g., stale cache, missing files, error handling)
• Updated Cargo.lock and resolved minor package version drifts

Affected Areas

• rust/log-service/src/lib.rs: logic for manifest fetching and caching in ScoutLogs
• rust/wal3/src/reader.rs, manifest.rs: new methods for etag verification and manifest loading
• rust/storage/src/lib.rs, s3.rs, admissioncontrolleds3.rs, local.rs, object_store.rs: storage backend implementations for confirm_same/etag logic and passthrough
• Tests: integration and unit tests in storage, manifest, log-reader, and log-service modules
• Cargo.lock: dependency tree maintenance

This summary was automatically generated by @propel-code-bot

propel-code-bot · 2025-08-28T23:11:16Z

rust/storage/src/local.rs

+    pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> {
+        Err(StorageError::NotImplemented)
+    }


[TestCoverage]

This NotImplemented error will cause cache verification to always fail for local storage, preventing tests from exercising the cache-hit path. You could implement this using the existing etag_for_bytes helper to make local storage tests more realistic.

Suggested change

-    pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> {
-        Err(StorageError::NotImplemented)
-    }
+    pub async fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, StorageError> {
+        match self.get(key).await {
+            Ok(bytes) => {
+                let current_etag = Self::etag_for_bytes(&bytes);
+                Ok(&current_etag == e_tag)
+            }
+            Err(StorageError::NotFound { .. }) => Ok(false),
+            Err(e) => Err(e),
+        }
+    }

⚡ Committable suggestion

Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
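To illustrate the idea behind this suggestion, here is a small self-contained sketch of a content-derived etag check. Everything in it is an assumption made for illustration: `LocalSketch` is a hypothetical in-memory stand-in for local storage, and this `etag_for_bytes` uses the standard library hasher, which may well differ from the real helper.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the storage ETag type.
#[derive(Debug, PartialEq, Eq)]
pub struct ETag(pub String);

/// Hypothetical in-memory stand-in for a local storage backend.
#[derive(Default)]
pub struct LocalSketch {
    files: HashMap<String, Vec<u8>>,
}

impl LocalSketch {
    /// Derive an etag from the content itself (illustrative hash only).
    fn etag_for_bytes(bytes: &[u8]) -> ETag {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        ETag(format!("{:016x}", hasher.finish()))
    }

    /// Store the bytes and return the etag a caller would cache.
    pub fn put(&mut self, key: &str, bytes: &[u8]) -> ETag {
        self.files.insert(key.to_string(), bytes.to_vec());
        Self::etag_for_bytes(bytes)
    }

    /// Recompute the etag from the stored bytes and compare it with the
    /// caller's copy. A missing file counts as "not the same".
    pub fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, String> {
        match self.files.get(key) {
            Some(bytes) => Ok(&Self::etag_for_bytes(bytes) == e_tag),
            None => Ok(false),
        }
    }
}
```

One trade-off worth noting: like the suggested change, this approach reads the full content to recompute the etag, so it makes the cache-hit path testable but does not give the cost saving of a real HEAD request.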

propel-code-bot · 2025-08-28T23:11:17Z

rust/storage/src/object_store.rs

+    pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> {
+        Err(StorageError::NotImplemented)
+    }


[BestPractice]

Similar to the LocalStorage implementation, returning NotImplemented here prevents testing the cache verification path for this storage backend. The underlying object_store crate supports head requests which return an ETag, so this could be implemented.

Suggested change

-    pub async fn confirm_same(&self, _: &str, _: &ETag) -> Result<bool, StorageError> {
-        Err(StorageError::NotImplemented)
-    }
+    pub async fn confirm_same(&self, key: &str, e_tag: &ETag) -> Result<bool, StorageError> {
+        match self.object_store.head(&object_store::path::Path::from(key)).await {
+            Ok(meta) => Ok(meta.e_tag == Some(e_tag.0.clone())),
+            Err(object_store::Error::NotFound { .. }) => Ok(false),
+            Err(e) => Err(e.into()),
+        }
+    }

HammadB · 2025-08-28T23:13:18Z

Do we want to consider racing reads here? It seems like that could be useful.

propel-code-bot · 2025-09-08T17:07:15Z

rust/log-service/src/lib.rs

+            if !log_reader.verify(cached).await.unwrap_or_default() {
+                cached_manifest_and_e_tag.take();
            }


[BestPractice]

The use of unwrap_or_default() here effectively treats any error during manifest verification as a cache miss. While this is a safe fallback, it hides potentially important errors (e.g., network issues, S3 permissions) that could indicate a deeper problem. Logging these errors would improve observability and help diagnose issues that might cause frequent cache misses.

Suggested change

-            if !log_reader.verify(cached).await.unwrap_or_default() {
-                cached_manifest_and_e_tag.take();
-            }
+            match log_reader.verify(cached).await {
+                Ok(true) => (), // All good, manifest is fresh.
+                Ok(false) => {
+                    // Stale manifest, invalidate.
+                    cached_manifest_and_e_tag.take();
+                }
+                Err(err) => {
+                    tracing::warn!(
+                        "Failed to verify cached manifest for collection {}: {}. Falling back to full fetch.",
+                        collection_id,
+                        err
+                    );
+                    cached_manifest_and_e_tag.take();
+                }
+            }

File: rust/log-service/src/lib.rs Line: 1151
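The difference between the two handling styles discussed in this comment can be shown in isolation. The sketch below is hypothetical and standalone: `CacheDecision`, `decide_silently`, and `decide_and_report` are names invented for illustration, and the returned message stands in for what a real service would pass to its logger.

```rust
/// What to do with the cached manifest after a verification attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum CacheDecision {
    Keep,
    Invalidate,
}

/// Mirrors `unwrap_or_default()`: bool defaults to false, so an error is
/// indistinguishable from a stale cache entry and leaves no trace.
pub fn decide_silently<E>(verified: Result<bool, E>) -> CacheDecision {
    if verified.unwrap_or_default() {
        CacheDecision::Keep
    } else {
        CacheDecision::Invalidate
    }
}

/// Same decision, but the error case also yields a message that the caller
/// could log before falling back to a full fetch.
pub fn decide_and_report<E: std::fmt::Display>(
    verified: Result<bool, E>,
) -> (CacheDecision, Option<String>) {
    match verified {
        Ok(true) => (CacheDecision::Keep, None),
        Ok(false) => (CacheDecision::Invalidate, None),
        Err(err) => (
            CacheDecision::Invalidate,
            Some(format!("failed to verify cached manifest: {}", err)),
        ),
    }
}
```

Both functions reach the same decision for every input; the only difference is that the second surfaces the error so repeated verification failures can be noticed.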

rescrv · 2025-09-09T21:36:36Z

Offline discussion documented here: Racing reads will only help in the case that both ops race with a write that invalidates the cache.

Sicheng-Pan · 2025-09-10T23:03:01Z

Cargo.lock

@@ -4758,7 +4758,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "fc2f4eb4bc735547cfed7c0a4922cbd04a4655978c09b54f1f7b228750664c34"
 dependencies = [
 "cfg-if",
- "windows-targets 0.52.6",
+ "windows-targets 0.48.5",


Is the Cargo.lock change here intentional?

Unrelated, but maybe we should consider bumping our dependencies in the future.

rescrv · Sep 10, 2025

This was from a rebase. Will fix.

rust/log-service/src/lib.rs

rust/storage/src/lib.rs

propel-code-bot bot reviewed Aug 28, 2025

rescrv force-pushed the rescrv/scout-logs-uses-head branch from e76ace7 to 3aa12eb on September 8, 2025 16:54

propel-code-bot bot reviewed Sep 8, 2025

rescrv force-pushed the rescrv/scout-logs-uses-head branch from 3aa12eb to 568b89c on September 9, 2025 21:35

blacksmith-sh bot deleted a comment from rescrv Sep 9, 2025

rescrv requested a review from Sicheng-Pan September 9, 2025 22:52

rescrv added 2 commits September 9, 2025 16:54

[ENH] Scout logs issues a HEAD for cached manifests.

71640b2

more test k8s

bbc6610

rescrv force-pushed the rescrv/scout-logs-uses-head branch from c6171dd to bbc6610 on September 9, 2025 23:57

blacksmith-sh bot deleted a comment from rescrv Sep 10, 2025

Sicheng-Pan reviewed Sep 10, 2025

rust/log-service/src/lib.rs (outdated, resolved)

Sicheng-Pan reviewed Sep 10, 2025

rust/storage/src/lib.rs (resolved)

rescrv requested a review from Sicheng-Pan September 11, 2025 21:44

Sicheng-Pan approved these changes Sep 15, 2025

docs

8f303b1

rescrv merged commit 214864d into main Sep 15, 2025
58 checks passed

rescrv deleted the rescrv/scout-logs-uses-head branch September 15, 2025 18:23

github-actions bot commented Aug 28, 2025

Reviewer Checklist

Testing, Bugs, Errors, Logs, Documentation

System Compatibility

Quality
