Enable outline-atomics by default on more AArch64 platforms #144938
r? @ChrisDenton rustbot has assigned @ChrisDenton. Use |
These commits modify compiler targets.
Whoops r? @ghost
[experiment] enable outline-atomics on more aarch64 platforms try-job: arm-android try-job: dist-android try-job: dist-x86_64-freebsd try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: aarch64-msvc-1 try-job: aarch64-msvc-2 try-job: dist-aarch64-msvc
a25e101
to
e20add7
Compare
Enable outline-atomics by default on more aarch64 platforms try-job: aarch64-apple try-job: aarch64-msvc-1 try-job: aarch64-msvc-2 try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: dist-aarch64-msvc try-job: dist-various try-job: dist-x86_64-freebsd
Enable outline-atomics by default on more aarch64 platforms try-job: aarch64-apple try-job: aarch64-msvc-* try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: dist-aarch64-msvc try-job: dist-various-* try-job: dist-x86_64-freebsd try-job: test-various-*
e20add7
to
5b91b55
Compare
@bors2 try
Enable outline-atomics by default on more aarch64 platforms try-job: aarch64-apple try-job: aarch64-msvc-* try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: dist-aarch64-msvc try-job: dist-various-* try-job: dist-x86_64-freebsd try-job: test-various
Build outline atomic symbols and enable the startup code whenever std is built with `outline-atomics`, rather than only on Linux. Since this is no longer Linux-specific, also rename the `compiler-builtins` module.
Darwin and the `-sim` targets already have a baseline with LSE. Enable `outline-atomics` on all other Apple targets here.
Windows has a similar flag `/forceInterlockedFunctions`, which uses names such as `_InterlockedAdd64_rel`.
Per LLVM commit c5e7e64 ("[AArch64][Clang][Linux] Enable out-of-line atomics by default.") [1], Clang enables these on Android. Thus, do the same in Rust. [1]: llvm/llvm-project@c5e7e649d537067de
Clang does not currently have this enabled on FreeBSD, but there doesn't seem to be any specific reason not to. Thus, enable it here.
Clang has done this by default since LLVM commit 1a963d3 ("[Driver] Make -moutline-atomics default for aarch64-fuchsia targets") [1], so do the same here. [1]: llvm/llvm-project@1a963d3
Clang has recently started doing this, as of LLVM commit 5d774ec8d183 ("[Driver] Enable outline atomics for OpenBSD/aarch64") [1]. Thus, do the same here. [1]: llvm/llvm-project@5d774ec
5b91b55
to
4c4f34f
Compare
@bors2 try
Enable `outline-atomics` by default on more AArch64 platforms try-job: aarch64-apple try-job: aarch64-msvc-* try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: dist-aarch64-msvc try-job: dist-various-* try-job: dist-x86_64-freebsd try-job: test-various
r? @davidtwco rustbot has assigned @davidtwco. Use |
Cc target maintainers:
Zulip discussion: #t-compiler > outline-atomics on non-Linux targets
Looked through it as best I could, but this is a fair bit outside my comfort zone.
Is there a test or two I could run to verify whether this works on-device?
    target_arch = "aarch64",
    target_feature = "outline-atomics",
    not(feature = "compiler-builtins-c")
))]
#[used]
#[unsafe(link_section = ".init_array.90")]
I'm pretty sure you need something like:
#[cfg_attr(target_vendor = "apple", unsafe(link_section = "__DATA,__mod_init_func,mod_init_funcs"))]
For this to work on Apple platforms.
I believe LLVM turns `init_array` into whatever is applicable for the platform, but that would indeed be good to confirm. We do use `init_array` in one other location:
rust/library/std/src/sys/args/unix.rs, line 120 in 6c699a3:
#[unsafe(link_section = ".init_array.00099")]
    target_arch = "aarch64",
    target_feature = "outline-atomics",
    not(feature = "compiler-builtins-c")
))]
#[used]
#[unsafe(link_section = ".init_array.90")]
I know it's pre-existing, but why the `90` in `.init_array.90`? Is that for a specific ordering?
90 is a priority, and per https://maskray.me/blog/2021-11-07-init-ctors-init-array it should be <= 100 (though I have no idea if that limit is actually meaningful). I don't think there is any significance to this number other than being less than the `99` value used in the args constructor.
I'll add a comment.
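To illustrate the ordering point above, here is a minimal sketch of two `.init_array` constructors with priorities 90 and 99. It assumes an ELF Linux target and a toolchain that accepts `#[unsafe(link_section = ...)]`. The names `ctor_90`, `ctor_99`, and `constructor_order` are hypothetical, and this is not the code from the PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hands out 1, 2, ... in the order constructors actually run.
static NEXT: AtomicUsize = AtomicUsize::new(1);
/// Position at which each constructor ran (0 means "never ran").
static ORDER_90: AtomicUsize = AtomicUsize::new(0);
static ORDER_99: AtomicUsize = AtomicUsize::new(0);

#[allow(dead_code)]
extern "C" fn ctor_90() {
    ORDER_90.store(NEXT.fetch_add(1, Ordering::SeqCst), Ordering::SeqCst);
}

#[allow(dead_code)]
extern "C" fn ctor_99() {
    ORDER_99.store(NEXT.fetch_add(1, Ordering::SeqCst), Ordering::SeqCst);
}

// Declared in the "wrong" order on purpose: the linker sorts the
// `.init_array.N` sections by priority, so source order should not matter.
#[cfg(target_os = "linux")]
#[used]
#[unsafe(link_section = ".init_array.99")]
static CTOR_99: extern "C" fn() = ctor_99;

#[cfg(target_os = "linux")]
#[used]
#[unsafe(link_section = ".init_array.90")]
static CTOR_90: extern "C" fn() = ctor_90;

/// Returns the recorded run positions of the (90, 99) constructors.
fn constructor_order() -> (usize, usize) {
    (
        ORDER_90.load(Ordering::SeqCst),
        ORDER_99.load(Ordering::SeqCst),
    )
}
```

On Linux, the priority-90 constructor is expected to record an earlier position than the priority-99 one. On other targets the statics are compiled out and both positions stay at 0.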
#[cfg(all(target_arch = "aarch64", target_os = "linux", not(feature = "compiler-builtins-c")))]
#[cfg(all(
    target_arch = "aarch64",
    target_feature = "outline-atomics",
Shouldn't there also be a `cfg(not(target_feature = "lse"))` here, or am I misunderstanding how this should work?
That's probably fair: the C implementations don't seem to gate this, but it would be nice to save the (small) startup time. The only downside I can think of would be if you build std with LSE then link applications that don't enable LSE (meaning they always get the slow fallback without checking), but I imagine this is rare.
It's easy enough to check that outline atomics are used, just by inspecting the code at https://rust.godbolt.org/z/PWqeKP3W5. Verifying that feature detection works so LSE gets used is trickier though. The easiest thing is to call something that will use out-of-line atomics (like the example from that godbolt), step through with a debugger, and ensure it is hitting the atomic instruction at this bit of code https://github.com/rust-lang/compiler-builtins/blob/a63d089c673aa9397d583c3cef506ad457c5f403/compiler-builtins/src/aarch64_linux.rs#L146-L148 rather than going to the fallback. Alternatively, add an extern static or something that binds to the mangled name of (requires our compiler-builtins, would be similar but slightly different if using the C versions)
The baseline Armv8.0 ISA doesn't have atomics instructions, but in
practice most hardware is at least Armv8.1-A (2014), which includes
single-instruction atomics as part of the LSE feature. As a performance
optimization for these cases, GCC and LLVM have the `-moutline-atomics`
flag to turn atomic operations into calls to symbols like
`__aarch64_cas1_acq`. These can do runtime feature detection and use the
LSE instructions if available, falling back to more portable
load-exclusive/store-exclusive loops.
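The dispatch described above can be sketched in portable Rust. This is only a model under stated assumptions: `HAVE_LSE`, `init_cpu_features`, and `outlined_fetch_add` are hypothetical names rather than symbols from `compiler-builtins`, and both branches use `std` atomics as stand-ins for the LSE instruction and the load-exclusive/store-exclusive loop that a real helper would contain.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Hypothetical stand-in for the flag that startup code sets once LSE
/// support has been detected.
static HAVE_LSE: AtomicBool = AtomicBool::new(false);

/// Which branch the helper took (only for illustration).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Route {
    Lse,
    LlSc,
}

#[cfg(target_arch = "aarch64")]
fn detect_lse() -> bool {
    std::arch::is_aarch64_feature_detected!("lse")
}

#[cfg(not(target_arch = "aarch64"))]
fn detect_lse() -> bool {
    // LSE does not exist on other architectures.
    false
}

/// Models the startup hook: detect once, then publish the result.
fn init_cpu_features() {
    HAVE_LSE.store(detect_lse(), Ordering::Relaxed);
}

/// Models the shape of an outlined atomic helper: check the flag, then
/// take either the fast path or the retry loop.
fn outlined_fetch_add(cell: &AtomicU64, val: u64) -> (u64, Route) {
    if HAVE_LSE.load(Ordering::Relaxed) {
        // A real helper would issue a single LSE instruction here.
        (cell.fetch_add(val, Ordering::Relaxed), Route::Lse)
    } else {
        // A real helper would use a load-exclusive/store-exclusive loop;
        // a weak compare-exchange loop models the same retry structure.
        let mut cur = cell.load(Ordering::Relaxed);
        loop {
            match cell.compare_exchange_weak(
                cur,
                cur.wrapping_add(val),
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(prev) => return (prev, Route::LlSc),
                Err(actual) => cur = actual,
            }
        }
    }
}
```

Both routes return the previous value and leave the same result in memory. Only the cost differs, which is why the flag needs to be set before the first atomic operation runs.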

Since the recent 3b50253 ("compiler-builtins: plumb LSE support
for aarch64 on linux") our builtins support this LSE optimization, and
since 6936bb9 ("Dynamically enable LSE for aarch64 rust provided
intrinsics"), std will set the flag as part of its startup code. The first
commit in this PR configures this to work on all platforms built with
`outline-atomics`, not just Linux.

Thus, enable `outline-atomics` by default on Android, FreeBSD, OpenBSD,
Windows, Fuchsia, and Apple platforms that don't have LSE in the baseline.
The feature is already enabled on Linux. Platform-specific details are
included in each commit message.

The current implementation can still be accessed by setting
`-Ctarget-feature=-outline-atomics`. Setting `-Ctarget-feature=+lse` or
a relevant CPU will use the single-instruction atomics without the call
overhead. https://rust.godbolt.org/z/dsdrzszoe
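As a rough illustration of the compile-time side of this, the sketch below reports whether LSE is part of the baseline the crate was compiled for. `AtomicsBaseline`, `atomics_baseline`, and `describe` are hypothetical names. The sketch only inspects the `lse` target feature and does not try to detect whether `outline-atomics` itself is enabled.

```rust
/// What the compiler may assume about LSE for this build.
#[derive(Debug, PartialEq, Clone, Copy)]
enum AtomicsBaseline {
    /// LSE is guaranteed at compile time (for example via
    /// `-Ctarget-feature=+lse`), so atomics can be emitted inline
    /// as single instructions.
    InlineLse,
    /// LSE is not guaranteed at compile time, so atomics become either
    /// helper calls (with `outline-atomics`) or inline retry loops.
    NoBaselineLse,
}

/// Evaluated at compile time from the target configuration.
const fn atomics_baseline() -> AtomicsBaseline {
    if cfg!(all(target_arch = "aarch64", target_feature = "lse")) {
        AtomicsBaseline::InlineLse
    } else {
        AtomicsBaseline::NoBaselineLse
    }
}

/// Human-readable summary, e.g. for a diagnostic log line.
fn describe(baseline: AtomicsBaseline) -> &'static str {
    match baseline {
        AtomicsBaseline::InlineLse => "LSE in baseline: inline single-instruction atomics",
        AtomicsBaseline::NoBaselineLse => "no baseline LSE: helper calls or inline retry loops",
    }
}
```

Building the same crate with and without `-Ctarget-feature=+lse` on an AArch64 target should flip the reported variant.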
Link: https://learn.arm.com/learning-paths/servers-and-cloud-computing/lse/intro/
Original Clang outline-atomics benchmarks: https://reviews.llvm.org/D91157#2435844
try-job: aarch64-apple
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-windows-gnullvm
try-job: dist-aarch64-apple
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: dist-x86_64-freebsd
try-job: test-various