arrow_reader_row_filter benchmark doesn't capture page cache improvements #7460

- Part of https://github.com/apache/arrow-rs/issues/7456
- Part of https://github.com/apache/arrow-rs/issues/7363

We are trying to improve the performance of row filter application, and part of that effort is a benchmark that we can use to guide optimization.

```shell
cargo bench --all-features --bench arrow_reader_row_filter
```

However, as shown in https://github.com/apache/arrow-rs/pull/7428, we have a case where an end-to-end query in DataFusion shows a performance benefit, but the benchmark does not show the same improvement.

This ticket tracks figuring out why the benchmark doesn't show an improvement even when the end-to-end query does.


> Interesting, the decoder cache doesn't seem to help much on my test machine (which is some crappy gcp VM). I couldn't reproduce the results listed on [#7363 (comment)](https://github.com/apache/arrow-rs/issues/7363#issuecomment-2816670842) 🤔

> Thank you @alamb, it seems there is no obvious improvement compared to main. This branch only improves PointLookup for the 1000000-line big data set compared to the original better-decode.

> I agree, we need to find how to mock clickbench result from arrow-rs side.

_Originally posted by @zhuqi-lucas in https://github.com/apache/arrow-rs/issues/7428#issuecomment-2838960224_
            
