[BUG] observing peer can't change a role #5330

@timofeevmd

Description

OS and Environment

longevity cluster: Linux x86, k8s, 5 peers, 2.9 vCPU/6RAM on each peer

GIT commit hash

f348b9a

Minimum working example / Steps to reproduce

While running performance tests to find the maximum capacity, I encountered transaction queue buildup on one of the peers.

Based on data from Grafana, logs, and profiling tools, I assume the problem is related to errors that occur when the role of a specific peer is changed after each Iroha round.

To obtain a comprehensive analysis of the current state of the product, three performance tests were conducted:

  1. With the PARCA profiler enabled
  2. With the profiler disabled and logging level set to INFO
    Test start time: 2025-02-26 11:53:36
    Test end time: 2025-02-26 11:58:36

After the test was completed, we observed only peer synchronization.
The issue with queue desynchronization was observed on peers iroha2-1 and iroha2-4, both of which hold the observing peer role.

Test result:

  3. With the profiler disabled and logging level set to TRACE
    Test result:

Since the issue appeared in all three tests with identical characteristics, we can assume that the data from the different tests is consistent.

Full logs are attached to this ticket.

During the tests, the issue was observed on peers iroha2-0 and iroha2-2.
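
The queue buildup described above could be checked with a small helper. This is a minimal sketch, not part of the test tooling: the function name `lagging_peers`, its thresholds, and the input shape (a mapping of peer name to current queue size, e.g. collected manually from Grafana) are all assumptions, and the sample numbers are made up for illustration rather than taken from the tests.

```python
from statistics import median


def lagging_peers(queue_sizes, tolerance=0.5, min_backlog=100):
    """Return the peers whose queue is far above the cluster median.

    queue_sizes: dict of peer name -> queue size (hypothetical input shape).
    tolerance:   allowed relative excess over the median (0.5 = 50%).
    min_backlog: absolute excess required, to ignore noise on small queues.
    """
    if not queue_sizes:
        return []
    baseline = median(queue_sizes.values())
    # A peer is flagged only if it exceeds both the relative and absolute margin.
    threshold = max(baseline * (1 + tolerance), baseline + min_backlog)
    return sorted(peer for peer, size in queue_sizes.items() if size > threshold)


# Illustrative values only, NOT measurements from the tests in this issue.
sample = {
    "peer-0": 120,
    "peer-1": 9800,
    "peer-2": 110,
    "peer-3": 130,
    "peer-4": 10400,
}

if __name__ == "__main__":
    print(lagging_peers(sample))
```

Comparing against the median rather than the mean keeps one or two overloaded peers from shifting the baseline, which matters in a 5-peer cluster.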

PREPARATION LONGEVITY ENV

Access to standard monitoring tools

On the perf generator

git clone https://github.com/soramitsu/iroha2-perf.git &&
cd iroha2-perf &&
git checkout iroha/2_0_0-rc_1/keypair &&
cd performance-generator/ &&
mvn -N io.takari:maven:wrapper &&
./mvnw gatling:test -Dgatling.simulationClass=simulation.transactions.standard.TransferAssetSimulation -DtargetURL= -DremoteLogin= -DremotePassword= -Dintensity=67000 -DrampDuration=300 -DmaxDuration=301

Actual result

Peers do not synchronize, errors occur when a peer's role changes, and a message about forced container shutdown appears.

Expected result

Peers synchronize, and no errors occur when a peer's role changes.

Logs

log files:

parca profile

Who can help to reproduce?

@timofeevmd @RamilMus

Notes

No response

Labels

Bug (Something isn't working), Performance (non-functional)