Make the allocator shim participate in LTO again #146232

bjorn3 · 2025-09-05T09:33:59Z

This is likely the cause of the perf regression in #145955. It also caused some functional regressions.

bjorn3 · 2025-09-05T09:34:18Z

Make the allocator shim participate in LTO again

bjorn3 · 2025-09-05T10:06:24Z

Forgot to revert the changes in exported_symbols_for_lto. This shouldn't affect the perf run other than possibly showing less of a performance improvement than it should give in the end.

rust-bors · 2025-09-05T11:56:35Z

☀️ Try build successful (CI)
Build commit: 5ab6398 (5ab63980021f7c1ae280eba3261d66240d594007, parent: ad85bc524b1ad696e42061ad8338d382dffbdbe5)

rustbot · 2025-09-05T13:47:15Z

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-09-05T13:47:18Z

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

lqd · 2025-09-05T14:26:31Z

This test reproduces some form of the other two issues: without this PR it fails with rust-lld: error: undefined hidden symbol: __rustc::__rg_oom. Can you add it to the PR?

//@ compile-flags: --crate-type cdylib -C lto 

use std::alloc::{GlobalAlloc, Layout};

struct MyAllocator;

unsafe impl GlobalAlloc for MyAllocator {
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
        todo!()
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
    }
}

#[global_allocator]
static GLOBAL: MyAllocator = MyAllocator;

> Forgot to revert the changes in exported_symbols_for_lto

You've since done this, IIUC.

bjorn3 · 2025-09-05T14:48:06Z

Thanks! Had to modify it slightly to work with compiletest.

> > Forgot to revert the changes in exported_symbols_for_lto
>
> You've since done this, IIUC.

Correct

compiler/rustc_codegen_ssa/src/back/write.rs

lqd · 2025-09-05T15:01:43Z

Otherwise this looks good to me and fixes the regressions, so that's great, thanks!

I'm not sure we care about the perf results, but they should be available in 3-4 hours. You can r=me at your preference.

rust-timer · 2025-09-05T18:34:23Z

Finished benchmarking commit (5ab6398): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.2%	[0.1%, 0.3%]	2
Improvements ✅ (primary)	-1.3%	[-27.1%, -0.3%]	229
Improvements ✅ (secondary)	-1.1%	[-46.7%, -0.0%]	264
All ❌✅ (primary)	-1.3%	[-27.1%, -0.3%]	229

Max RSS (memory usage)

Results (primary 1.8%, secondary -2.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	7.4%	[7.4%, 7.4%]	1
Regressions ❌ (secondary)	2.1%	[1.4%, 2.7%]	5
Improvements ✅ (primary)	-0.9%	[-1.4%, -0.5%]	2
Improvements ✅ (secondary)	-4.5%	[-6.5%, -2.5%]	9
All ❌✅ (primary)	1.8%	[-1.4%, 7.4%]	3

Cycles

Results (primary -13.2%, secondary -8.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.9%	[2.9%, 2.9%]	1
Improvements ✅ (primary)	-13.2%	[-24.6%, -2.4%]	6
Improvements ✅ (secondary)	-10.8%	[-44.4%, -1.6%]	6
All ❌✅ (primary)	-13.2%	[-24.6%, -2.4%]	6

Binary size

Results (primary 53.0%, secondary 59.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	70.9%	[70.9%, 71.0%]	3
Regressions ❌ (secondary)	118.8%	[118.8%, 118.8%]	1
Improvements ✅ (primary)	-0.9%	[-0.9%, -0.9%]	1
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	1
All ❌✅ (primary)	53.0%	[-0.9%, 71.0%]	4

Bootstrap: 467.829s -> 466.151s (-0.36%)
Artifact size: 390.58 MiB -> 387.87 MiB (-0.69%)

bjorn3 · 2025-09-05T19:10:17Z

It improves things even more than it previously regressed.

$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +nightly-2025-09-01  - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 1772392  5 sep 20:53 rust_out
-rw-rw-r-- 1 bjorn bjorn 2609776  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6979364  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6979324  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 3211728  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.o
$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +nightly  - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 1766032  5 sep 20:54 rust_out
-rw-rw-r-- 1 bjorn bjorn 2612612  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6982788  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6982752  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 3188680  5 sep 20:54 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.o
-rw-rw-r-- 1 bjorn bjorn    3484  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-crate.allocator.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn    3224  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-crate.allocator.rcgu.o
$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +5ab63980021f7c1ae280eba3261d66240d594007 - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 2264800  5 sep 20:55 rust_out
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 2619640  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6982380  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6982344  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn 3701008  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.o

I suspect what happened is that I didn't restore the check in fat LTO that ensures the allocator module is not used as the base into which all other modules are merged. The allocator module is not configured to be optimized, so we probably skipped all optimizations after the module merging pass of fat LTO. I've added a new commit to fix this.

@bors try @rust-timer queue
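
To illustrate the check described above, here is a minimal sketch of a base-module selection that skips the allocator shim. The `Module` and `ModuleKind` types, the `size` field used as a cost proxy, and the `pick_base_module` helper are all hypothetical stand-ins for illustration; this is not the actual rustc code.

```rust
/// Hypothetical stand-in for the kinds of modules taking part in fat LTO.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModuleKind {
    Regular,
    Allocator,
}

/// Hypothetical, simplified module record. `size` is an assumed proxy
/// for how expensive the module would be to merge into another one.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Module {
    name: String,
    kind: ModuleKind,
    size: usize,
}

/// Pick the module that all other modules get merged into.
///
/// Only regular modules are eligible, so the allocator shim (which is
/// not configured to be optimized) can never become the base. Among the
/// eligible modules, the largest one is chosen.
fn pick_base_module(modules: &[Module]) -> Option<&Module> {
    modules
        .iter()
        .filter(|m| m.kind == ModuleKind::Regular)
        .max_by_key(|m| m.size)
}
```

With this shape, a large allocator module is passed over in favour of a smaller regular one, and the function returns `None` when no regular module exists, leaving the caller to decide what to do in that case.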

bors · 2025-09-06T01:22:04Z

☔ The latest upstream changes (presumably #146255) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot · 2025-09-06T08:36:30Z

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

bjorn3 · 2025-09-06T08:37:00Z

Not sure how to easily add a test either.

@bors r=lqd

bors · 2025-09-06T08:37:02Z

📌 Commit 3851246 has been approved by lqd

It is now in the queue for this repository.

bors · 2025-09-06T10:59:24Z

⌛ Testing commit 3851246 with merge 831db11...

Make the allocator shim participate in LTO again

This is likely the cause of the perf regression in #145955. It also caused some functional regressions.

Fixes #146235
Fixes #146239

bors · 2025-09-06T12:09:15Z

💔 Test failed - checks-actions

bjorn3 · 2025-09-06T13:32:23Z

Added a missing //@ needs-crate-type: cdylib in the new test.

@bors r=lqd
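
For reference, a sketch of how the directive mentioned above sits next to the compile flags at the top of a UI test. The two `//@` lines are taken from this thread; the exact header of the merged test may differ. As an assumption made only so the sketch also runs outside compiletest, the allocator here delegates to `System` instead of using `todo!()` as in the reproducer.

```rust
//@ compile-flags: --crate-type cdylib -C lto
//@ needs-crate-type: cdylib

use std::alloc::{GlobalAlloc, Layout, System};

struct MyAllocator;

// Assumption: forward to the system allocator so the example is usable
// at runtime. The reproducer in this thread used `todo!()` in `alloc`
// and an empty `dealloc`, which is enough for a build-only test.
unsafe impl GlobalAlloc for MyAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: MyAllocator = MyAllocator;
```

The `needs-crate-type` directive makes compiletest skip the test on targets that cannot build a cdylib, which matches the "ignore" outcome reported for one job in the post-merge analysis.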

bors · 2025-09-06T13:32:25Z

📌 Commit 2cf94b9 has been approved by lqd

It is now in the queue for this repository.

bors · 2025-09-06T15:21:19Z

⌛ Testing commit 2cf94b9 with merge bea625f...

bors · 2025-09-06T18:36:27Z

☀️ Test successful - checks-actions
Approved by: lqd
Pushing bea625f to master...

github-actions · 2025-09-06T18:39:39Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 6d5caf3 (parent) -> bea625f (this PR)

Test differences

Show 5 test diffs

Stage 1

[ui] tests/ui/lto/lto-global-allocator.rs: [missing] -> pass (J0)

Stage 2

[ui] tests/ui/lto/lto-global-allocator.rs: [missing] -> pass (J1)
[ui] tests/ui/lto/lto-global-allocator.rs: [missing] -> ignore (skipping test as target does not support all of the crate types ["cdylib"]) (J2)

Additionally, 2 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard bea625f3275e3c897dc965ed97a1d19ef7831f01 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

x86_64-gnu-llvm-19-3: 6327.4s -> 7257.5s (14.7%)
dist-x86_64-apple: 8410.6s -> 7204.6s (-14.3%)
dist-apple-various: 4219.0s -> 3687.9s (-12.6%)
dist-aarch64-apple: 7069.4s -> 6361.9s (-10.0%)
aarch64-apple: 5645.9s -> 5088.4s (-9.9%)
dist-aarch64-msvc: 5196.0s -> 5651.3s (8.8%)
dist-i686-mingw: 9459.6s -> 8857.2s (-6.4%)
dist-sparcv9-solaris: 4946.9s -> 5257.6s (6.3%)
dist-x86_64-netbsd: 4951.7s -> 4642.1s (-6.3%)
aarch64-gnu: 6523.1s -> 6121.3s (-6.2%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-09-06T19:47:37Z

Finished benchmarking commit (bea625f): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.1%	[0.1%, 0.2%]	2
Improvements ✅ (primary)	-0.9%	[-1.9%, -0.3%]	221
Improvements ✅ (secondary)	-0.9%	[-2.2%, -0.0%]	260
All ❌✅ (primary)	-0.9%	[-1.9%, -0.3%]	221

Max RSS (memory usage)

Results (primary 1.2%, secondary -3.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.5%	[4.5%, 4.5%]	1
Regressions ❌ (secondary)	2.1%	[1.9%, 2.3%]	2
Improvements ✅ (primary)	-2.1%	[-2.1%, -2.1%]	1
Improvements ✅ (secondary)	-5.0%	[-7.1%, -2.2%]	9
All ❌✅ (primary)	1.2%	[-2.1%, 4.5%]	2

Cycles

Results (primary -2.2%, secondary -2.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.6%	[4.6%, 4.6%]	1
Improvements ✅ (primary)	-2.2%	[-3.0%, -1.9%]	5
Improvements ✅ (secondary)	-3.7%	[-9.5%, -2.0%]	9
All ❌✅ (primary)	-2.2%	[-3.0%, -1.9%]	5

Binary size

Results (primary -0.8%, secondary -1.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.8%	[-0.9%, -0.8%]	4
Improvements ✅ (secondary)	-1.7%	[-1.7%, -1.7%]	1
All ❌✅ (primary)	-0.8%	[-0.9%, -0.8%]	4

Bootstrap: 469.247s -> 468.001s (-0.27%)
Artifact size: 390.35 MiB -> 387.44 MiB (-0.74%)

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 5, 2025

rust-bors bot added a commit that referenced this pull request Sep 5, 2025

Auto merge of #146232 - bjorn3:lto_allocator_shim, r=<try>

5ab6398

Make the allocator shim participate in LTO again

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2025

bjorn3 force-pushed the lto_allocator_shim branch from 4ec9a9b to d0e65a9 Compare September 5, 2025 10:04

bjorn3 mentioned this pull request Sep 5, 2025

Misc LTO cleanups #146209

Open

This was referenced Sep 5, 2025

rustc emits an unexpected _rdl symbols for WASM with lto=true #146235

Closed

Hidden symbol in nightly. #146239

Closed

Move LTO from the codegen coordinator thread to link_binary rust-lang/compiler-team#908

Open

bjorn3 marked this pull request as ready for review September 5, 2025 13:47

rustbot assigned fee1-dead Sep 5, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 5, 2025

lqd reviewed Sep 5, 2025

compiler/rustc_codegen_ssa/src/back/write.rs

bjorn3 force-pushed the lto_allocator_shim branch from e10f5b6 to e072d7d Compare September 5, 2025 15:02

rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 5, 2025

Make the allocator shim participate in LTO again

0271359

bjorn3 force-pushed the lto_allocator_shim branch from 9623f42 to 3851246 Compare September 6, 2025 08:36

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2025

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 6, 2025

bjorn3 and others added 2 commits September 6, 2025 13:31

Add test that __rg_oom doesn't get internalized during LTO

9239d14

Co-Authored-By: Rémy Rakic <[email protected]>

Ensure fat LTO doesn't merge everything into the allocator module

2cf94b9

bjorn3 force-pushed the lto_allocator_shim branch from 3851246 to 2cf94b9 Compare September 6, 2025 13:31

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 6, 2025

bors merged commit bea625f into rust-lang:master Sep 6, 2025
11 checks passed

rustbot added this to the 1.91.0 milestone Sep 6, 2025

bjorn3 deleted the lto_allocator_shim branch September 6, 2025 18:36

lqd mentioned this pull request Sep 7, 2025

hidden symbol `_RNvCsi3tQlMvAKpb_7___rustc8___rg_oom' isn't defined #146287

Open
