Description
Please answer these questions before submitting your issue. Thanks!
- What did you do?
If possible, provide a recipe for reproducing the error.
I tried to insert tuples into TiDB using YCSB (Yahoo! Cloud Serving Benchmark).
My TiDB settings were as follows:
- I used 9 AWS EC2 instances (each instance is i3.2xlarge: 8 vCPUs, 61 GiB RAM, 1 x 1900 GB NVMe disk)
. node#1~#3: 1 x TiDB & 1 x PD were deployed on each instance (i.e., total 3 x TiDBs & 3 x PDs)
. node#4~#9: 1 x TiKV was deployed on each instance (i.e., total 6 x TiKVs)
- tidb_version: v2.0.5
- I used the default TiDB config. The only exception is as follows:
. prepared_plan_cache: enabled: true
- I used the default PD config.
- I used the default TiKV config. Exceptions are as follows (see the tikv.toml sketch after the table definition below):
. raftstore: sync-log: false
. rocksdb: bytes-per-sync: "1MB"
. rocksdb: wal-bytes-per-sync: "512KB"
. raftdb: bytes-per-sync: "1MB"
. raftdb: wal-bytes-per-sync: "512KB"
. storage: scheduler-concurrency: 1024000
. storage: scheduler-worker-pool-size: 6
. *: max-write-buffer-number: 10
- I created one table as follows:
CREATE TABLE usertable (
YCSB_KEY VARCHAR(255) PRIMARY KEY,
FIELD0 TEXT, FIELD1 TEXT,
FIELD2 TEXT, FIELD3 TEXT,
FIELD4 TEXT, FIELD5 TEXT,
FIELD6 TEXT, FIELD7 TEXT,
FIELD8 TEXT, FIELD9 TEXT
);
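For reference, the TiKV exceptions above correspond roughly to the following tikv.toml fragment (a minimal sketch; I have expanded the "*" line as max-write-buffer-number for each RocksDB/RaftDB column family):

[raftstore]
sync-log = false

[rocksdb]
bytes-per-sync = "1MB"
wal-bytes-per-sync = "512KB"

# "*": max-write-buffer-number applied to each column family
[rocksdb.defaultcf]
max-write-buffer-number = 10
[rocksdb.writecf]
max-write-buffer-number = 10
[rocksdb.lockcf]
max-write-buffer-number = 10

[raftdb]
bytes-per-sync = "1MB"
wal-bytes-per-sync = "512KB"
[raftdb.defaultcf]
max-write-buffer-number = 10

[storage]
scheduler-concurrency = 1024000
scheduler-worker-pool-size = 6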
And YCSB settings were as follows:
- I used the YCSB "load" command to insert tuples into TiDB
- I used the following config for the YCSB "load":
. maxexecutiontime: 3600 (i.e., insert tuples for 3600 seconds)
. target: 32000 (i.e., insert tuples at a rate of 32000 insertions per second)
. threads: 512 (i.e., use 512 client threads to insert tuples)
. db.batchsize: 100 & jdbc.autocommit: false (i.e., commit every 100 inserts)
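Concretely, the load phase went through the YCSB JDBC binding; a roughly equivalent invocation looks like this (a sketch only: the workload file, driver class, JDBC URL, and credentials are placeholders, not the exact command I ran):

bin/ycsb load jdbc -s \
  -P workloads/workloada \
  -p db.driver=com.mysql.jdbc.Driver \
  -p db.url="jdbc:mysql://<tidb-host>:4000/test" \
  -p db.user=root \
  -p db.passwd="" \
  -p db.batchsize=100 \
  -p jdbc.autocommit=false \
  -p maxexecutiontime=3600 \
  -target 32000 \
  -threads 512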
- What did you expect to see?
I expected the data and loads to be well distributed.
- What did you see instead?
Data and loads were not well distributed. I think the performance (TPS and latency) was limited by this.
Data were not well distributed as follows:
- Store size => the store size of tikv_1 and tikv_7 was 3 times larger than that of the others

- Leader and region distribution => It was unbalanced


Loads for TiKV were not well distributed as follows:
(Overall, tikv1 (=tikv_1) seemed to be overloaded.)
- CPU => tikv_1 used about 200% more CPU than the other TiKVs, and tikv_7 about 100% more

- Load => the load received by tikv1 (=tikv_1) was about twice that of the others

- IO util => the IO util of tikv1 (=tikv_1) was about six times that of the others, and that of tikv5 (=tikv_7) about twice

- Network traffic Inbound => the inbound traffic of tikv1 and tikv5 was about twice that of the others

- Network traffic Outbound => the outbound traffic of tikv1 was much larger than that of the others

- Region average written keys => most of the keys seemed to be written to tikv_1

- Scheduler pending commands => most of the scheduler pending commands came from tikv_1

- Questions
Why were data not well-distributed in my situation?
Why were loads (CPU, IO, Network) not well-distributed in my situation?
What is "scheduler pending commands"? Is it related to unbalanced load or data?
Can the region size exceed region-max-size? The following graph shows an average region size of 20.3 GiB, which is far too big! It seems there was one very large region. If there was such a region, why did it not split?
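For reference, the per-store and per-region state behind the graphs above can be inspected with pd-ctl and the TiDB status port; a rough sketch (the addresses and the database name "test" are placeholders, and the exact output fields may differ in v2.0.5):

# Per-store leader/region counts and store sizes as seen by PD
pd-ctl -u http://<pd-host>:2379 store

# All regions known to PD, to check whether usertable is split at all
pd-ctl -u http://<pd-host>:2379 region

# Regions covering usertable, via the TiDB status API (if available in v2.0.5)
curl http://<tidb-host>:10080/tables/test/usertable/regions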

- What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?
v2.0.5