
Data and loads are not well distributed in my benchmark #7349

@jaltabike

Description


Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    If possible, provide a recipe for reproducing the error.

I tried to insert tuples into TiDB using YCSB (Yahoo! Cloud Serving Benchmark).
My TiDB settings were as follows:

  • I used 9 AWS EC2 instances (each instance is i3.2xlarge: 8 vCPUs, 61 GiB RAM, 1 x 1900 GB NVMe disk)
    . node#1~#3: 1 x TiDB & 1 x PD were deployed on each instance (i.e., total 3 x TiDBs & 3 x PDs)
    . node#4~#9: 1 x TiKV was deployed on each instance (i.e., total 6 x TiKVs)
  • tidb_version: v2.0.5
  • I used the default TiDB config, with one exception:
    . prepared_plan_cache: enabled: true
  • I used the default PD config.
  • I used the default TiKV config, with the following exceptions:
    . raftstore: sync-log: false
    . rocksdb: bytes-per-sync: "1MB"
    . rocksdb: wal-bytes-per-sync: "512KB"
    . raftdb: bytes-per-sync: "1MB"
    . raftdb: wal-bytes-per-sync: "512KB"
    . storage: scheduler-concurrency: 1024000
    . storage: scheduler-worker-pool-size: 6
    . *: max-write-buffer-number: 10
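
For reference, the overrides above would look roughly like this in tikv.toml. This is a sketch: reading "*" as "all column families" and placing max-write-buffer-number under each per-CF section is my interpretation, so verify the exact key paths against the v2.0.5 config template.

```toml
[raftstore]
sync-log = false

[rocksdb]
bytes-per-sync = "1MB"
wal-bytes-per-sync = "512KB"

# "*: max-write-buffer-number: 10" read as: set it for every column family
[rocksdb.defaultcf]
max-write-buffer-number = 10

[rocksdb.writecf]
max-write-buffer-number = 10

[rocksdb.lockcf]
max-write-buffer-number = 10

[raftdb]
bytes-per-sync = "1MB"
wal-bytes-per-sync = "512KB"

[raftdb.defaultcf]
max-write-buffer-number = 10

[storage]
scheduler-concurrency = 1024000
scheduler-worker-pool-size = 6
```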
  • I created one table like follows:
    CREATE TABLE usertable (
    YCSB_KEY VARCHAR(255) PRIMARY KEY,
    FIELD0 TEXT, FIELD1 TEXT,
    FIELD2 TEXT, FIELD3 TEXT,
    FIELD4 TEXT, FIELD5 TEXT,
    FIELD6 TEXT, FIELD7 TEXT,
    FIELD8 TEXT, FIELD9 TEXT
    );

And YCSB settings were as follows:

  • I used the YCSB "load" command to insert tuples into TiDB
  • I used the following config for the YCSB "load" phase:
    . maxexecutiontime: 3600 (i.e., insert tuples for 3600 seconds)
    . target: 32000 (i.e., insert tuples at a rate of 32000 insertions per sec)
    . threads: 512 (i.e., use 512 client threads to insert tuples)
    . db.batchsize: 100 & jdbc.autocommit: false (i.e., commit every 100 inserts)
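
These load settings imply a rough data volume, sketched below. The sketch assumes the YCSB defaults fieldcount=10 and fieldlength=100 bytes were in effect (they match the 10 TEXT columns in the schema) and that TiKV kept its default 3 replicas; actual on-disk size will differ due to key/row encoding, compression, and RocksDB space amplification.

```python
# Back-of-envelope estimate of the data volume this YCSB load generates.
target_ops_per_sec = 32_000    # YCSB "target" setting
max_execution_time_s = 3_600   # YCSB "maxexecutiontime" setting
field_count = 10               # assumed YCSB default, matches the 10 TEXT columns
field_length = 100             # assumed YCSB default field size in bytes
replicas = 3                   # assumed TiKV default replica count

total_rows = target_ops_per_sec * max_execution_time_s
raw_bytes_per_row = field_count * field_length  # payload only, key excluded
raw_gb = total_rows * raw_bytes_per_row / 1e9
replicated_gb = raw_gb * replicas

print(f"rows inserted:   {total_rows:,}")           # 115,200,000
print(f"raw payload:     ~{raw_gb:.0f} GB")         # ~115 GB
print(f"with 3 replicas: ~{replicated_gb:.0f} GB")
```

So even at the raw-payload level, a balanced cluster would need to absorb on the order of a hundred gigabytes spread over 6 TiKV stores within the hour.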
  2. What did you expect to see?
    I expected the data and loads to be well distributed.

  3. What did you see instead?
    Data and loads were not well distributed. I think the performance (TPS and latency) was limited by this.

Data were not well distributed, as follows:

  • Store size => the store size of tikv_1 and tikv_7 was 3 times larger than that of the others
  • Leader and region distribution => it was unbalanced across stores

Loads on the TiKVs were not well distributed either, as follows:
(Overall, tikv_1 seemed to be overloaded.)

  • CPU => tikv_1 used about 200% more CPU (tikv_7 about 100% more) than the other tikvs
  • Load => the load on tikv1 (= tikv_1) was twice that of the others
  • IO util => the IO util of tikv1 (= tikv_1) was six times, and that of tikv5 (= tikv_7) twice, that of the others
  • Network traffic inbound => inbound traffic of tikv1 and tikv5 was twice that of the others
  • Network traffic outbound => outbound traffic of tikv1 was much larger than that of the others
  • Region average written keys => most keys seemed to be written to tikv_1
  • Scheduler pending commands => most scheduler pending commands were issued on tikv_1
  4. Questions
    Why were data not well distributed in my situation?
    Why were loads (CPU, IO, network) not well distributed in my situation?
    What is "scheduler pending commands"? Is it related to the unbalanced load or data?
    Can a region's size exceed region-max-size? The region-size graph showed an average region size of 20.3 GiB, which is far too big! It seems there was one very large region. If there was such a region, why did it not split?
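On the last question, for reference: to my understanding, region splitting in TiKV 2.x is governed by the [coprocessor] section of tikv.toml. A sketch of the relevant settings follows, with what I believe are the defaults; the key names and values should be verified against the config template shipped with your version.

```toml
[coprocessor]
# A region is split once it grows past region-max-size,
# into new regions of roughly region-split-size each.
region-max-size = "144MB"
region-split-size = "96MB"
```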

  5. What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?
    v2.0.5

Labels: type/question (The issue belongs to a question.)