Skip to content

Conversation

zhexuany
Copy link
Contributor

@zhexuany zhexuany commented Jan 28, 2018

Below is the benchmark, I am not sure why the ~~~memory usage does not decrease.~~~ (we just reuse the object. Memory allocation is still original. Using sync.Pool can only reduce gc pressure. )
I am inverstigating this. In additon, I will add different workload benchmark test later.

name              old time/op    new time/op    delta
SysbenchSelect-8    46.8ms ± 1%    46.5ms ± 5%  -0.55%  (p=0.044 n=18+20)
Parse-8              147ms ± 3%     145ms ± 2%  -1.61%  (p=0.000 n=20+17)

name              old alloc/op   new alloc/op   delta
SysbenchSelect-8    18.4MB ± 0%    18.4MB ± 0%    ~     (p=0.057 n=17+18)
Parse-8             51.7MB ± 0%    51.7MB ± 0%    ~     (p=0.072 n=19+20)

name              old allocs/op  new allocs/op  delta
SysbenchSelect-8      184k ± 0%      184k ± 0%    ~     (all equal)
Parse-8               707k ± 0%      707k ± 0%    ~     (all equal)

@zhexuany zhexuany self-assigned this Jan 28, 2018
@zhexuany zhexuany added the WIP label Jan 28, 2018
@zhexuany
Copy link
Contributor Author

@coocood Here is another example using sync.Pool can benefit on reducing cpu usage.

@coocood
Copy link
Member

coocood commented Jan 28, 2018

@zhexuany That's because the toDecimal function is never called.

@coocood
Copy link
Member

coocood commented Jan 28, 2018

@zhexuany
It doesn't make sense that reducing one small allocation improves noticeable performance if there are hundreds of allocations already.

@shenli
Copy link
Member

shenli commented Jan 28, 2018

We need to discuss this in detail. Could you open a Google Doc and put all the things in it? @zhexuany

@zhexuany
Copy link
Contributor Author

Here is benchmark result for different workloads.

SysbenchSelect1024-8    4.74ms ±10%    4.62ms ± 8%  -2.53%  (p=0.000 n=39+36)
SysbenchSelect256-8     1.21ms ±13%    1.20ms ±16%    ~     (p=0.307 n=40+40)
SysbenchSelect128-8      644µs ±11%     611µs ±13%  -5.16%  (p=0.000 n=38+39)
ParseInt4096-8          82.2ms ±11%    85.2ms ±18%  +3.70%  (p=0.012 n=39+40)
ParseInt1024-8          21.6ms ±12%    22.3ms ±13%    ~     (p=0.063 n=39+39)
ParseInt512-8           10.6ms ±14%    11.1ms ±15%  +4.72%  (p=0.000 n=40+39)
ParseInt256-8           5.04ms ± 9%    5.35ms ± 7%  +6.21%  (p=0.000 n=39+40)

name                  old alloc/op   new alloc/op   delta
SysbenchSelect1024-8    1.84MB ± 0%    1.84MB ± 0%  +0.00%  (p=0.022 n=32+37)
SysbenchSelect256-8      459kB ± 0%     459kB ± 0%    ~     (all equal)
SysbenchSelect128-8      229kB ± 0%     229kB ± 0%    ~     (all equal)
ParseInt4096-8          28.4MB ± 0%    28.4MB ± 0%  +0.01%  (p=0.000 n=39+24)
ParseInt1024-8          7.09MB ± 0%    7.10MB ± 0%  +0.02%  (p=0.000 n=40+40)
ParseInt512-8           3.55MB ± 0%    3.55MB ± 0%  +0.04%  (p=0.000 n=39+40)
ParseInt256-8           1.77MB ± 0%    1.77MB ± 0%  +0.04%  (p=0.000 n=40+40)

name                  old allocs/op  new allocs/op  delta
SysbenchSelect1024-8     18.4k ± 0%     18.4k ± 0%    ~     (all equal)
SysbenchSelect256-8      4.61k ± 0%     4.61k ± 0%    ~     (all equal)
SysbenchSelect128-8      2.30k ± 0%     2.30k ± 0%    ~     (all equal)
ParseInt4096-8            360k ± 0%      360k ± 0%  +0.00%  (p=0.000 n=40+40)
ParseInt1024-8           90.1k ± 0%     90.1k ± 0%  +0.00%  (p=0.000 n=40+40)
ParseInt512-8            45.1k ± 0%     45.1k ± 0%  +0.00%  (p=0.000 n=40+40)
ParseInt256-8            22.5k ± 0%     22.5k ± 0%  +0.00%  (p=0.000 n=40+40

@ngaut
Copy link
Member

ngaut commented Jan 31, 2018

The benchmark results shows that it is not worth to use sync.pool.

@ngaut
Copy link
Member

ngaut commented Jan 31, 2018

I will close this PR. please feel free to reopen it if you have different thoughts.

@ngaut ngaut closed this Jan 31, 2018
@zhexuany zhexuany deleted the optimize_parser branch February 26, 2018 14:39
crazycs520 pushed a commit to crazycs520/tidb that referenced this pull request Jun 4, 2025
…_pr=95

* executor: fix a bug that global temporary table send cop request (pingcap#588…
* statistics: fix the panic when to async load stats with dropped index …
* executor: fix prepared protocol charset (pingcap#58872) (pingcap#58931)
* *: Update client-go and verify all read ts (pingcap#58909)
* integration test: fix test case "br_pitr" (pingcap#58876)
* session: add indexes for `mysql.analyze_jobs` (pingcap#58134) (pingcap#58355)
* ddl: fix args count for modify column (pingcap#58855) (pingcap#58858)
* planner: correct plan when scan tidb related cluster table with KeepOr…
* planner: Fix vector not truncated after CBO (pingcap#58809) (pingcap#58844)
* ddl: Fix vector index for high dimensional vectors (pingcap#58717) (pingcap#58835)
* ddl: Fix issue with concurrent update getting reverted by BackfillData…
* statistics: stats cache set default quota as 20%  (pingcap#58013) (pingcap#58817)
* executor: change the evaluation order of columns in `Update` and `Inse…
* statistics: add recover to protect background task (pingcap#58739) (pingcap#58767)
* ttl: fix the infinite waiting for delRateLimiter when `tidb_ttl_delete…
* ttl: reduce some warnings logs when locking TTL tasks (pingcap#58306) (pingcap#58783)
* ttl: retry the rows when del rate limiter returns error in delWorker (…
* ttl: reschedule task to other instances when shrinking worker (pingcap#57703) (pingcap#58778)
* ttl: fix the issue that one task losing heartbeat will block other tas…
* ttl: fix the issue that the task is not cancelled after transfering ow…
* ddl: fix job state overridden when concurrent updates don't overlap in…
* ttl: set the job history status to `cancelled` if it's removed in GC a…
* ttl: fix the timezone issue and panic in the caller of `getSession` (#…
* ddl: fix version syncer doesn't print who hasn't synced on partial syn…
* ttl: fix the issue that `DROP TABLE` / `ALTER TABLE` will keep job run…
* br/stream: allow pitr to create oversized indices (pingcap#58433) (pingcap#58527)
* ttl: set a result for timeout scan task during shrinking scan worker (…
* executor: fix time zone issue when querying slow log (pingcap#58455) (pingcap#58577)
* table: fix the issue that the default value for `BIT` column is wrong …
* statistics: temporarily skip handling errors for DDL events (pingcap#58609) (pingcap#58634)
* sessionctx: fix null max value to leading wrong warning (pingcap#57898) (pingcap#57935)
* planner: convert cartesian semi join with other nulleq condition to cr…
* planner: fix idxMergePartPlans forget to deal with RootTaskConds (pingcap#585…
* domain: change some stats log level as WARN (pingcap#58316) (pingcap#58555)
* planner: quickly get total count from index/column (pingcap#58365) (pingcap#58431)
* planner, expr: eval readonly user var during plan phase (pingcap#54462) (pingcap#58540)
* metrics: add col/idx name(s) for BackfillProgressGauge and BackfillTot…
* br: refactor test to use wait checkpoint method (pingcap#57612) (pingcap#58498)
* executor: reuse chunk in hash join v2 during restoring (pingcap#56936) (pingcap#58018)
* executor: fix goroutine leak when exceed quota in hash agg (pingcap#58078) (pingcap#58462)
* copr: fix the issue that busy threshold may redirect batch copr to fol…
* statistics: skip non-exicted table when to init stats (pingcap#58381) (pingcap#58394)
* planner: fix incorrectly using the schema for plan cache (pingcap#57964) (pingcap#58090)
* *: use DDL subscriber updating stats meta (pingcap#57872) (pingcap#58387)
* planner, runtime_filter: Remove redundant logs whose meaning can be di…
* statistics: remove dead code (pingcap#58412) (pingcap#58442)
* planner: Use/force to apply prefer range scan (pingcap#56928) (pingcap#58444)
* statistics: gc the statistics correctly after drop the database (pingcap#5730…
* ddl: Fixed partition interval from DayMinute to just Minute. (pingcap#57738) (pingcap#58019)
* executor: Enlarge the timeout for fetching TiFlash system tables (pingcap#579
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants