bad_alloc + segfault with 10k partitions #3809

@grsmith-projects

Description

Version & Environment

Redpanda version: 21.11.5
OS: Kubernetes, using the :latest image tag
Config: defaults from the custom config example; the only change was storage_read_buffer_size set to 32768
Kernel: Linux 4.19.0-0.bpo.5-amd64 #1 SMP Debian 4.19.37-4~bpo9+1 (2019-06-19) x86_64 GNU/Linux

We have 6 fairly beefy nodes with 384 CPUs and 24 SSDs (2 TB each).

Each of the 24 pods has 40G of RAM allocated, with 32G for Redpanda itself.
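For context on the memory numbers that follow: Seastar splits the memory given to the process evenly across its shards (one reactor per core), so each shard only ever sees a fraction of the 32G. A quick sanity check, where the shard count per pod is an assumption (it is not stated above), chosen so the result lines up with the 2G figure in the error below:

```python
# Back-of-envelope per-shard memory check. POD_MEMORY_GIB comes from the
# report above; ASSUMED_SHARDS_PER_POD is an assumption, not a reported value.
POD_MEMORY_GIB = 32          # memory reserved for Redpanda in each pod
ASSUMED_SHARDS_PER_POD = 16  # assumed cores (and therefore shards) per pod

per_shard_gib = POD_MEMORY_GIB / ASSUMED_SHARDS_PER_POD
print(f"~{per_shard_gib:.1f} GiB available to each shard")  # ~2.0 GiB
```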

What went wrong?

ERROR 2022-02-15 03:31:53,794 [shard 0] seastar_memory - Dumping seastar memory diagnostics
Used memory: 2G
Free memory: 0B
Total memory: 2G

libc++abi: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc Aborting on shard 0.

Segmentation fault on shard 0.
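To make the failure mode a bit more concrete, here is a purely illustrative calculation of how many partition replicas a single 2G shard can hold; the per-replica overhead and replica placement are assumptions, not measurements from this cluster:

```python
# Illustrative only: everything except the read buffer size (set to 32768 in
# the config above) and the 2G shard budget (from the diagnostics dump) is an
# assumed, made-up number.
SHARD_BUDGET_BYTES = 2 * 1024**3             # per-shard memory reported by shard 0
READ_BUFFER_BYTES = 32 * 1024                # storage_read_buffer_size
ASSUMED_PER_REPLICA_OVERHEAD = 1 * 1024**2   # assumed extra state per replica (appenders, indexes, Raft)

per_replica = READ_BUFFER_BYTES + ASSUMED_PER_REPLICA_OVERHEAD
print(f"~{SHARD_BUDGET_BYTES // per_replica} replicas fit in one shard's budget")
```

Whether a given shard actually exhausts its budget depends on how the 10,000 partitions (times the replication factor) land across shards and on the real per-replica footprint, but the arithmetic shows how quickly a fixed 2G slice is consumed as partition counts grow.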

What should have happened instead?

We were looking to scale to 25k partitions; we were able to scale

How to reproduce the issue?

  1. Create a cluster of the size described above.
  2. Once the cluster is confirmed healthy, start running a load test against it (using custom tools with various payload sizes).
  3. The load test targets a single topic.
  4. Increase that topic to 10,000 partitions (a sketch of one way to do this follows the list).
  5. Watch the cluster crash.
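The report does not say exactly how the partition count was raised in step 4; below is a minimal sketch of one way to do it with kafka-python. The topic name and bootstrap address are placeholders, not values from the report:

```python
# Hypothetical illustration of step 4: growing the load-test topic to 10,000
# partitions via the Kafka admin API (topic name and address are made up).
from kafka.admin import KafkaAdminClient, NewPartitions

admin = KafkaAdminClient(bootstrap_servers="redpanda.example.svc:9092")
admin.create_partitions({"load-test-topic": NewPartitions(total_count=10_000)})
admin.close()
```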
