Experiment with reusing blobs in WAL after flush #173

@marvin-j97

Description

Currently, blobs written to the WAL are rewritten to a blob file when the memtable is flushed. That gives us a write amplification of slightly more than 2 (the extra comes from the pointers stored in the SSTs). That is pretty good compared to something like 7-10 if the blobs were compacted over and over again.

However, something like LMDB actually has a write amp of close to 1 for very large values, because it does not write to a WAL.
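
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The function names and the 16-byte pointer size are assumptions for illustration only, and the model ignores frame headers, pointer rewrites during SST compaction, and later blob GC.

```rust
/// Write amplification = bytes written to disk / bytes of user data.
fn write_amplification(user_bytes: u64, bytes_written: u64) -> f64 {
    bytes_written as f64 / user_bytes as f64
}

/// Current scheme: the blob is written to the WAL, then again to a blob
/// file on flush, plus a small pointer in the SST.
fn current_scheme(blob_len: u64, pointer_len: u64) -> f64 {
    write_amplification(blob_len, blob_len + blob_len + pointer_len)
}

/// Proposed scheme: the blob is written once to the WAL, which later
/// becomes the blob file; only the pointer is written in addition.
fn reuse_wal(blob_len: u64, pointer_len: u64) -> f64 {
    write_amplification(blob_len, blob_len + pointer_len)
}
```

For a 1 MB blob and an assumed 16-byte pointer, `current_scheme(1_000_000, 16)` comes out just above 2 and `reuse_wal(1_000_000, 16)` just above 1.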

The question is: can we reuse the blobs in the WAL?
This technique is used in https://github.com/topling/toplingdb

Basically:

  1. Write blob frames (the format defined in https://github.com/fjall-rs/value-log) directly into WAL
  2. Reference those blobs in the memtable
  3. On rotation, somehow register the WAL as a blob file - this will take a bit of care so that it all works correctly with value-log's recovery and everything
  4. At that point, the WAL file would not be added to the Journal Manager's GC list because it is now governed by value-log
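
The steps above could look roughly like this in-memory sketch. Everything here is hypothetical: `BlobHandle`, `Wal`, `ValueLog`, `Engine` and the `[u32 length][payload]` frame layout are made up for illustration and are not the actual value-log format or fjall API. A real WAL would also need to record keys and checksums for recovery.

```rust
use std::collections::BTreeMap;

/// Hypothetical pointer to a blob: which file holds it, and where.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlobHandle {
    file_id: u64,
    offset: u64,
    len: u32,
}

/// Hypothetical WAL; `buf` stands in for the on-disk file.
struct Wal {
    file_id: u64,
    buf: Vec<u8>,
}

impl Wal {
    fn new(file_id: u64) -> Self {
        Self { file_id, buf: Vec::new() }
    }

    /// Step 1: append a blob frame (assumed layout: u32 LE length + payload).
    fn append_blob(&mut self, value: &[u8]) -> BlobHandle {
        let offset = self.buf.len() as u64;
        let len = value.len() as u32;
        self.buf.extend_from_slice(&len.to_le_bytes());
        self.buf.extend_from_slice(value);
        BlobHandle { file_id: self.file_id, offset, len }
    }
}

/// Stand-in for the value log: sealed blob files keyed by file ID.
#[derive(Default)]
struct ValueLog {
    files: BTreeMap<u64, Vec<u8>>,
}

impl ValueLog {
    /// Step 3: adopt a rotated WAL as a blob file without copying blobs.
    fn register(&mut self, wal: Wal) {
        self.files.insert(wal.file_id, wal.buf);
    }

    fn read(&self, h: &BlobHandle) -> Option<&[u8]> {
        let file = self.files.get(&h.file_id)?;
        let start = h.offset as usize + 4; // skip the length prefix
        file.get(start..start + h.len as usize)
    }
}

struct Engine {
    wal: Wal,
    memtable: BTreeMap<Vec<u8>, BlobHandle>,
    value_log: ValueLog,
    next_file_id: u64,
}

impl Engine {
    fn new() -> Self {
        Self {
            wal: Wal::new(0),
            memtable: BTreeMap::new(),
            value_log: ValueLog::default(),
            next_file_id: 1,
        }
    }

    /// Steps 1 + 2: write the blob to the WAL, keep only a handle in the memtable.
    fn insert(&mut self, key: &[u8], value: &[u8]) {
        let handle = self.wal.append_blob(value);
        self.memtable.insert(key.to_vec(), handle);
    }

    /// Steps 3 + 4: hand the old WAL to the value log (so journal GC never
    /// sees it) and return the pointers that would be flushed into an SST.
    fn rotate(&mut self) -> BTreeMap<Vec<u8>, BlobHandle> {
        let id = self.next_file_id;
        self.next_file_id += 1;
        let old = std::mem::replace(&mut self.wal, Wal::new(id));
        self.value_log.register(old);
        std::mem::take(&mut self.memtable)
    }
}
```

After `rotate()`, the handles that were in the memtable still resolve through `ValueLog::read`, because the file ID and offsets did not change; only the ownership of the file did.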

The disadvantage is that newly written blobs are stored out of order, in a log+index kind of fashion, so range reads may suffer. When performing garbage collection, however, we can write a new blob file in key order. Though it's questionable how we will be able to sort a hijacked WAL file like that without too much IO or memory usage.
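
One possible shape for such a GC pass, purely as a sketch: walk the key-sorted index and copy each live blob into a new file. The `Span` type and `rewrite_in_key_order` function are hypothetical, and byte slices stand in for files. This keeps memory low (one blob at a time), but pays with one random read per blob on the old file, which is exactly the IO trade-off in question.

```rust
use std::collections::BTreeMap;

/// Hypothetical blob location inside a file: byte offset and payload length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    offset: usize,
    len: usize,
}

/// Copies all blobs referenced by `index` from `old` into a new buffer,
/// in key order. Blobs that are no longer referenced are simply skipped.
fn rewrite_in_key_order(
    old: &[u8],
    index: &BTreeMap<Vec<u8>, Span>,
) -> (Vec<u8>, BTreeMap<Vec<u8>, Span>) {
    let mut new_file = Vec::new();
    let mut new_index = BTreeMap::new();

    // A BTreeMap iterates in ascending key order, so no extra sort is needed.
    for (key, span) in index {
        // Random read from the old (unordered) file.
        let blob = &old[span.offset..span.offset + span.len];
        let new_span = Span { offset: new_file.len(), len: span.len };
        // Sequential append to the new (ordered) file.
        new_file.extend_from_slice(blob);
        new_index.insert(key.clone(), new_span);
    }

    (new_file, new_index)
}
```

In a real implementation the index would come from scanning the SST pointers rather than an in-memory map, and the old file would only be dropped once all pointers have been updated.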
