tgross35
Contributor

@tgross35 tgross35 commented Aug 5, 2025

The baseline Armv8.0 ISA doesn't have single-instruction atomics, but in
practice most hardware is at least Armv8.1-A (2014), which adds them as
part of the LSE feature. As a performance optimization for these cases,
GCC and LLVM have the -moutline-atomics flag, which turns atomic
operations into calls to symbols like __aarch64_cas1_acq. These can do
runtime feature detection and use the LSE instructions if available,
falling back to more portable load-exclusive/store-exclusive loops
otherwise.
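
As a rough illustration of that dispatch pattern (not the actual compiler-builtins or libgcc code), the portable Rust sketch below models an outlined compare-and-swap helper. The names `HAVE_LSE_MODEL` and `cas_u8_acq_model` are hypothetical. A real helper would use a single LSE instruction on the fast path and a load-exclusive/store-exclusive loop on the fallback; here both paths use std atomics so the example runs on any target.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};

/// Hypothetical stand-in for the runtime flag that startup code would set.
static HAVE_LSE_MODEL: AtomicBool = AtomicBool::new(false);

/// Models an outlined 1-byte compare-and-swap with acquire ordering.
/// Returns the value observed in `target` before the operation.
fn cas_u8_acq_model(expected: u8, desired: u8, target: &AtomicU8) -> u8 {
    if HAVE_LSE_MODEL.load(Ordering::Relaxed) {
        // Fast path: stands in for a single LSE compare-and-swap instruction.
        match target.compare_exchange(expected, desired, Ordering::Acquire, Ordering::Acquire) {
            Ok(old) | Err(old) => old,
        }
    } else {
        // Fallback: stands in for a load-exclusive/store-exclusive retry loop.
        loop {
            match target.compare_exchange_weak(
                expected,
                desired,
                Ordering::Acquire,
                Ordering::Acquire,
            ) {
                Ok(old) => return old,
                // The stored value really differs: report it without storing.
                Err(old) if old != expected => return old,
                // Spurious failure (like a lost exclusive reservation): retry.
                Err(_) => continue,
            }
        }
    }
}
```

Both branches return the same result for the same inputs; only the mechanism differs, which is the point of the outlined helpers.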

Since the recent 3b50253 ("compiler-builtins: plumb LSE support
for aarch64 on linux"), our builtins support this LSE optimization, and
since 6936bb9 ("Dynamically enable LSE for aarch64 rust provided
intrinsics"), std sets the runtime LSE-availability flag as part of its
startup code. The first commit in this PR makes this work on all
platforms built with outline-atomics, not just Linux.

Thus, enable outline-atomics by default on Android, FreeBSD, OpenBSD,
Windows, Fuchsia, and Apple platforms that don't have LSE in the baseline.
The feature is already enabled on Linux. Platform-specific details are
included in each commit message.

The current behavior (inline load-exclusive/store-exclusive loops) can
still be selected by setting -Ctarget-feature=-outline-atomics. Setting
-Ctarget-feature=+lse or a -Ctarget-cpu that includes LSE will use the
single-instruction atomics without the call overhead.
Example: https://rust.godbolt.org/z/dsdrzszoe
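
To see which of these modes a given build selected, a small check along the following lines could help. It is only a sketch: `atomics_mode` and `bump` are hypothetical helpers, and it assumes `outline-atomics` is visible through `cfg(target_feature)` in the same way the std change in this PR uses it, which may not hold on a stable toolchain.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical helper: reports how atomics are expected to be lowered
/// for the flags this crate was compiled with.
fn atomics_mode() -> &'static str {
    if !cfg!(target_arch = "aarch64") {
        "not aarch64"
    } else if cfg!(target_feature = "lse") {
        "inline LSE instructions"
    } else if cfg!(target_feature = "outline-atomics") {
        // Assumption: this cfg is set when the target enables the feature.
        "outlined helper calls"
    } else {
        "inline load-exclusive/store-exclusive loops"
    }
}

/// An operation whose lowering depends on the mode above.
/// Returns the counter value from before the increment.
fn bump(counter: &AtomicU64) -> u64 {
    counter.fetch_add(1, Ordering::AcqRel)
}
```

Building the same code with `-Ctarget-feature=+lse` and then with `-Ctarget-feature=-outline-atomics`, and comparing the assembly for `bump`, should show the difference described above.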

Link: https://learn.arm.com/learning-paths/servers-and-cloud-computing/lse/intro/
Original Clang outline-atomics benchmarks: https://reviews.llvm.org/D91157#2435844

try-job: aarch64-apple
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: dist-x86_64-freebsd
try-job: test-various

@rustbot
Collaborator

rustbot commented Aug 5, 2025

r? @ChrisDenton

rustbot has assigned @ChrisDenton.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-compiler-builtins Area: compiler-builtins (https://github.com/rust-lang/compiler-builtins) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 5, 2025
@rustbot
Collaborator

rustbot commented Aug 5, 2025

These commits modify compiler targets.
(See the Target Tier Policy.)

@tgross35
Contributor Author

tgross35 commented Aug 5, 2025

Whoops

r? @ghost

@tgross35 tgross35 marked this pull request as draft August 5, 2025 02:30
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 5, 2025
rust-bors bot added a commit that referenced this pull request Aug 5, 2025
[experiment] enable outline-atomics on more aarch64 platforms

try-job: arm-android
try-job: dist-android
try-job: dist-x86_64-freebsd
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: aarch64-msvc-1
try-job: aarch64-msvc-2
try-job: dist-aarch64-msvc
rust-bors bot added a commit that referenced this pull request Aug 5, 2025
[experiment] enable outline-atomics on more aarch64 platforms

try-job: arm-android
try-job: dist-android
try-job: dist-x86_64-freebsd
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: aarch64-msvc-1
try-job: aarch64-msvc-2
try-job: dist-aarch64-msvc
@tgross35 tgross35 force-pushed the more-outline-atomics branch from a25e101 to e20add7 Compare September 5, 2025 19:39
@tgross35 tgross35 changed the title [experiment] enable outline-atomics on more aarch64 platforms Enable outline-atomics by default on more aarch64 platforms Sep 5, 2025
rust-bors bot added a commit that referenced this pull request Sep 5, 2025
Enable outline-atomics by default on more aarch64 platforms

try-job: aarch64-apple
try-job: aarch64-msvc-1
try-job: aarch64-msvc-2
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various
try-job: dist-x86_64-freebsd
rust-bors bot added a commit that referenced this pull request Sep 5, 2025
Enable outline-atomics by default on more aarch64 platforms

try-job: aarch64-apple
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: dist-x86_64-freebsd
try-job: test-various-*
@tgross35 tgross35 force-pushed the more-outline-atomics branch from e20add7 to 5b91b55 Compare September 5, 2025 19:47
@tgross35
Contributor Author

tgross35 commented Sep 5, 2025

@bors2 try

rust-bors bot added a commit that referenced this pull request Sep 5, 2025
Enable outline-atomics by default on more aarch64 platforms

try-job: aarch64-apple
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: dist-x86_64-freebsd
try-job: test-various
@rust-bors

rust-bors bot commented Sep 5, 2025

☀️ Try build successful (CI)
Build commit: 09a8dba (09a8dba552fc10cf0f87e6af03c30037287492e7, parent: 9cd272dc85320e85a8c83a1a338870de52c005f3)

@tgross35 tgross35 changed the title Enable outline-atomics by default on more aarch64 platforms Enable outline-atomics by default on more AArch64 platforms Sep 5, 2025
Build outline atomic symbols and enable the startup code whenever std is
built with `outline-atomics`, rather than only on Linux. Since this is
no longer Linux-specific, also rename the `compiler-builtins` module.

Darwin and the `-sim` targets already have a baseline with LSE. Enable
`outline-atomics` on all other Apple targets here.

Windows has a similar flag `/forceInterlockedFunctions`, which uses
names such as `_InterlockedAdd64_rel`.

Per LLVM commit c5e7e64 ("[AArch64][Clang][Linux] Enable
out-of-line atomics by default.") [1], Clang enables these on Android.
Thus, do the same in Rust.

[1]: llvm/llvm-project@c5e7e649d537067de

Clang does not currently have this enabled on FreeBSD, but there doesn't
seem to be any specific reason not to. Thus, enable it here.

Clang has done this by default on Fuchsia since LLVM commit 1a963d3
("[Driver] Make -moutline-atomics default for aarch64-fuchsia targets")
[1], so do the same here.

[1]: llvm/llvm-project@1a963d3

Clang has recently started doing this on OpenBSD, as of LLVM commit
5d774ec8d183 ("[Driver] Enable outline atomics for OpenBSD/aarch64") [1].
Thus, do the same here.

[1]: llvm/llvm-project@5d774ec
@tgross35 tgross35 force-pushed the more-outline-atomics branch from 5b91b55 to 4c4f34f Compare September 6, 2025 00:46
@tgross35
Contributor Author

tgross35 commented Sep 6, 2025

@bors2 try

rust-bors bot added a commit that referenced this pull request Sep 6, 2025
Enable `outline-atomics` by default on more AArch64 platforms

try-job: aarch64-apple
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: dist-x86_64-freebsd
try-job: test-various
@tgross35 tgross35 marked this pull request as ready for review September 6, 2025 00:49
@rustbot
Collaborator

rustbot commented Sep 6, 2025

r? @davidtwco

rustbot has assigned @davidtwco.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 6, 2025
@tgross35
Contributor Author

tgross35 commented Sep 6, 2025

Cc target maintainers:

Zulip discussion: #t-compiler > outline-atomics on non-Linux targets

Contributor

@madsmtm madsmtm left a comment

Looked through it as best I could, but this is a fair bit outside my comfort zone.

Is there a test or two I could run to verify whether this works on-device?

target_arch = "aarch64",
target_feature = "outline-atomics",
not(feature = "compiler-builtins-c")
))]
#[used]
#[unsafe(link_section = ".init_array.90")]
Contributor

I'm pretty sure you need something like:

#[cfg_attr(target_vendor = "apple", unsafe(link_section = "__DATA,__mod_init_func,mod_init_funcs"))]

For this to work on Apple platforms.

Contributor Author

I believe LLVM turns init_array into whatever is applicable for the platform, but that would indeed be good to confirm. We do use init_array in one other location:

#[unsafe(link_section = ".init_array.00099")]

target_arch = "aarch64",
target_feature = "outline-atomics",
not(feature = "compiler-builtins-c")
))]
#[used]
#[unsafe(link_section = ".init_array.90")]
Contributor

I know it's pre-existing, but why the 90 in .init_array.90? Is that for a specific ordering?

Contributor Author

90 is a priority, and per https://maskray.me/blog/2021-11-07-init-ctors-init-array it should be <= 100 (though I have no idea if that limit is actually meaningful). I don't think there is any significance to this number other than being less than the 99 value used in the args constructor.
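
For readers unfamiliar with the mechanism, a minimal sketch of a prioritized `.init_array` constructor follows. It is an illustration, not the code in this PR: `CTOR_RAN`, `early_init`, `EARLY_INIT`, and `ctor_ran` are hypothetical names, and the registration is gated to Linux because other object formats may need a different section name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Set by the constructor below (hypothetical name for illustration).
static CTOR_RAN: AtomicBool = AtomicBool::new(false);

/// Runs before `main` when its address is placed in `.init_array`.
extern "C" fn early_init() {
    CTOR_RAN.store(true, Ordering::Relaxed);
}

// Register the function pointer. The numeric suffix is the priority;
// lower values are expected to run earlier than higher ones.
// `#[used]` keeps the otherwise unreferenced static from being discarded.
#[cfg(target_os = "linux")]
#[used]
#[unsafe(link_section = ".init_array.90")]
static EARLY_INIT: extern "C" fn() = early_init;

/// Reports whether the constructor ran before this call.
fn ctor_ran() -> bool {
    CTOR_RAN.load(Ordering::Relaxed)
}
```

On Linux, `ctor_ran()` should already return `true` at the top of `main`; on targets where the static is compiled out it stays `false`.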

I'll add a comment.

#[cfg(all(target_arch = "aarch64", target_os = "linux", not(feature = "compiler-builtins-c")))]
#[cfg(all(
target_arch = "aarch64",
target_feature = "outline-atomics",
Contributor

Shouldn't there also be a cfg(not(target_feature = "lse")) here, or am I misunderstanding how this should work?

Contributor Author

That's probably fair: the C implementations don't seem to gate this, but it would be nice to save the (small) startup time. The only downside I can think of would be if you build std with LSE then link applications that don't enable LSE (meaning they always get the slow fallback without checking), but I imagine this is rare.
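
The gating being discussed could look roughly like the following. This is a sketch rather than the code in this PR: `startup_lse_probe` is a hypothetical name, and std's `is_aarch64_feature_detected!` stands in for whatever detection the startup code actually performs.

```rust
/// Baseline lacks LSE: probe the CPU at runtime.
#[cfg(all(target_arch = "aarch64", not(target_feature = "lse")))]
fn startup_lse_probe() -> bool {
    std::arch::is_aarch64_feature_detected!("lse")
}

/// LSE is in the baseline (or this is not AArch64): there is nothing to
/// detect, so the answer is known at compile time and costs nothing at startup.
#[cfg(not(all(target_arch = "aarch64", not(target_feature = "lse"))))]
fn startup_lse_probe() -> bool {
    cfg!(all(target_arch = "aarch64", target_feature = "lse"))
}
```

The downside mentioned above shows up here as well: the `cfg` is evaluated when this code is compiled, not when a downstream application is, so the two can disagree about whether LSE is in the baseline.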

@tgross35
Contributor Author

tgross35 commented Sep 6, 2025

> Is there a test or two I could run to verify whether this works on-device?

It's easy enough to check that outline atomics are used by inspecting the generated assembly, as in https://rust.godbolt.org/z/PWqeKP3W5. Verifying that feature detection works, so that LSE actually gets used, is trickier. The easiest approach is to call something that uses out-of-line atomics (like the example in that godbolt link), step through it with a debugger, and check that it reaches the atomic instruction at https://github.com/rust-lang/compiler-builtins/blob/a63d089c673aa9397d583c3cef506ad457c5f403/compiler-builtins/src/aarch64_linux.rs#L146-L148 rather than going to the fallback. Alternatively, add an extern static or similar that binds to the mangled name of HAVE_LSE_ATOMICS in that module, and print it to make sure it's nonzero.

(This requires our compiler-builtins; the steps would be similar but slightly different with the C versions.)
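
A small on-device probe in that spirit might look like the sketch below. It is an assumption-laden illustration: `cpu_has_lse` and `probe_cas` are hypothetical names, and the check reports what std's own feature detection sees, not the value of `HAVE_LSE_ATOMICS` inside compiler-builtins, so stepping through with a debugger is still needed to confirm which path the helper takes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Whether the running CPU reports LSE; always false off AArch64.
fn cpu_has_lse() -> bool {
    #[cfg(target_arch = "aarch64")]
    return std::arch::is_aarch64_feature_detected!("lse");
    #[cfg(not(target_arch = "aarch64"))]
    return false;
}

/// Never inlined, so it is easy to find and set a breakpoint on.
/// With outline atomics enabled, the compare-exchange here is expected
/// to become a call to an outlined helper.
#[inline(never)]
fn probe_cas(cell: &AtomicU32) -> bool {
    cell.compare_exchange(0, 1, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}
```

Printing `cpu_has_lse()` and then breaking on `probe_cas` gives a starting point: if the CPU reports LSE, the call from `probe_cas` should end up at the single atomic instruction rather than the fallback loop.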

@rust-bors

rust-bors bot commented Sep 6, 2025

☀️ Try build successful (CI)
Build commit: 7041613 (704161326f1460ec409add8d318ad1976f8f3eb5, parent: 6c699a37235700ab749e3f14147fe41d49c056e8)
