Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86-64: Some extern "C" functions use U64x2 on a softfloat target, which makes no sense #758

Closed
RalfJung opened this issue Feb 15, 2025 · 16 comments · Fixed by #759
Closed

Comments

@RalfJung
Copy link
Member

Passing U64x2 across an extern "C" boundary on x86-64 requires SSE; it is hence not possible to pass those types when SSE is disabled, which is the case for softfloat targets. (See rust-lang/rust#116558 for more context on how ABI and SIMD target features interact.) This comes up, for instance, when building for x86_64-unknown-uefi.

Here are the places where a SIMD type is passed via extern "C":

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/float/conv.rs:395:1
    |
395 | / intrinsics! {
396 | |     #[arm_aeabi_alias = __aeabi_f2uiz]
397 | |     pub extern "C" fn __fixunssfsi(f: f32) -> u32 {
398 | |         float_to_unsigned_int(f)
...   |
443 | | }
    | |_- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: `#[warn(abi_unsupported_vector_types)]` on by default
    = note: this warning originates in the macro `intrinsics` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/float/conv.rs:395:1
    |
395 | / intrinsics! {
396 | |     #[arm_aeabi_alias = __aeabi_f2uiz]
397 | |     pub extern "C" fn __fixunssfsi(f: f32) -> u32 {
398 | |         float_to_unsigned_int(f)
...   |
443 | | }
    | |_- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: this warning originates in the macro `intrinsics` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/float/conv.rs:446:1
    |
446 | / intrinsics! {
447 | |     #[arm_aeabi_alias = __aeabi_f2iz]
448 | |     pub extern "C" fn __fixsfsi(f: f32) -> i32 {
449 | |         float_to_signed_int(f)
...   |
494 | | }
    | |_- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: this warning originates in the macro `intrinsics` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/int/sdiv.rs:168:1
    |
168 |   sdiv!(__udivti3, __divti3, u128, i128, win64_128bit_abi_hack);
    |   ------------------------------------------------------------- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: this warning originates in the macro `intrinsics` which comes from the expansion of the macro `sdiv` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/int/sdiv.rs:169:1
    |
169 |   smod!(__umodti3, __modti3, u128, i128, win64_128bit_abi_hack);
    |   ------------------------------------------------------------- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: this warning originates in the macro `intrinsics` which comes from the expansion of the macro `smod` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: this function definition uses a SIMD vector type that (with the chosen ABI) requires the `sse` target feature, which is not enabled
   --> src/macros.rs:260:13
    |
260 | /             extern $abi fn $name( $($argname: $ty),* )
261 | |                 -> $crate::macros::win64_128bit_abi_hack::U64x2
    | |_______________________________________________________________^ function defined here
    |
   ::: src/int/udiv.rs:7:1
    |
7   | / intrinsics! {
8   | |     #[maybe_use_optimized_c_shim]
9   | |     #[arm_aeabi_alias = __aeabi_uidiv]
10  | |     /// Returns `n / d`
...   |
106 | | }
    | |_- in this macro invocation
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #116558 <https://github.com/rust-lang/rust/issues/116558>
    = help: consider enabling it globally (`-C target-feature=+sse`) or locally (`#[target_feature(enable="sse")]`)
    = note: this warning originates in the macro `intrinsics` (in Nightly builds, run with -Z macro-backtrace for more info)
@tgross35
Copy link
Contributor

I believe

// Hack for LLVM expectations for ABI on windows. This is used by the
// `#[win64_128bit_abi_hack]` attribute recognized above
#[cfg(all(any(windows, target_os = "uefi"), target_pointer_width = "64"))]
pub mod win64_128bit_abi_hack {
#[repr(simd)]
pub struct U64x2([u64; 2]);
impl From<i128> for U64x2 {
fn from(i: i128) -> U64x2 {
use crate::int::DInt;
let j = i as u128;
U64x2([j.lo(), j.hi()])
}
}
impl From<u128> for U64x2 {
fn from(i: u128) -> U64x2 {
use crate::int::DInt;
U64x2([i.lo(), i.hi()])
}
}
}
was added to meet the C/llvm expectation that i128 on Windows is returned in xmm0. Since rust-lang/rust#134290 this should no longer be necessary so I believe that workaround for Windows can just be removed (cc @beetrees to check my logic here).

I don't understand the uefi situation though; is LLVM still expecting that builtins return via xmm0 even though we consider it no-sse2? In that case, removing the workaround for uefi would break things if done after rust-lang/rust#137094 lands. Or is something already broken now?

Cc @nicholasbishop

@RalfJung
Copy link
Member Author

is LLVM still expecting that builtins return via xmm0 even though we consider it no-sse2?

The UEFI target has +soft-float,-sse set so I don't think it can expect anything in xmm0.

@tgross35
Copy link
Contributor

I didn't realize #[repr(simd)] falls back to a non-sse representation when not available https://rust.godbolt.org/z/TMWMzrsaG. Makes sense, I guess the hack isn't actually getting used on UEFI then.

@nicholasbishop
Copy link
Contributor

Enabling that hack on uefi was done in #475 to fix u128 division: rust-lang/rust#86494

So it was certainly getting used on the x86_64 UEFI target in 2021, but perhaps other changes since then have made that no longer true?

@RalfJung
Copy link
Member Author

@nicholasbishop do you happen to remember what the difference was between the broken ABI and the fixed ABI?

It seems unlikely that either of them used xmm0, given the target features we are setting for LLVM.

@tgross35
Copy link
Contributor

To confirm, the target has been using soft float since it was introduced rust-lang/rust@88cf2a2.

@beetrees
Copy link
Contributor

The ABI is now correct on x86_64-pc-windows-msvc and x86_64-pc-windows-gnu both with and without the #[win64_128bit_abi_hack]. However, on x86_64-unknown-uefi, LLVM expects 128-bit integer builtins to return the result in integer registers (which is what happens when using #[win64_128bit_abi_hack]), whereas the Rust extern "C" ABI returns the result indirectly (compiler explorer).

@tgross35
Copy link
Contributor

tgross35 commented Feb 17, 2025

rust-lang/rust#132570 would probably be the best solution then. However, that doesn't seem too close to ready. I guess making our i128 ABI match for now is the next best option to allow removing the vector type.

@RalfJung
Copy link
Member Author

However, on x86_64-unknown-uefi, LLVM expects 128-bit integer builtins to return the result in integer registers

So we should emit an i128 return type? Or how do we force LLVM to do that?

i128 does not fit into a single register. So does it use two registers for the return value then?

@tgross35
Copy link
Contributor

So we should emit an i128 return type? Or how do we force LLVM to do that?

LLVM seems to give the desired behavior for for that returns i128 directly. It looks like Clang is returning { i64, i64 } so our ScalarPair may also work.

i128 does not fit into a single register. So does it use two registers for the return value then?

That seems to be what Clang is doing now, and what LLVM expects for UEFI and no-sse Windows. Which probably isn’t ideal.

An alternative we have is to make our ABI return indirect but use naked asm to forward the calling convention in builtins. But I think it would be simplest to just match what LLVM expects for now and open an issue upstream.

@bjorn3
Copy link
Member

bjorn3 commented Feb 18, 2025

The Windows calling convention only allows a single return register. Are you sure { i64, i64 } or i128 doesn't get returned indirectly (PassMode::Indirect) at the assembly level?

@RalfJung
Copy link
Member Author

RalfJung commented Feb 18, 2025

#759 removes the abi hack, I hope that should then fix this problem.

The Windows calling convention only allows a single return register. Are you sure { i64, i64 } or i128 doesn't get returned indirectly (PassMode::Indirect) at the assembly level?

When LLVM calls __udivti3 on the UEFI target, it looks like this:

	callq	__udivti3
	movq	%rax, (%rsi)
	movq	%rdx, 8(%rsi)

So, it seems to be passing something via rax and rdx, i.e., via two registers? See here for the full assembly.

This godbolt shared above also seems to confirm that LLVM uses two registers for U64x2.

@tgross35
Copy link
Contributor

tgross35 commented Feb 18, 2025

It looks like UEFI, Windows+softfloat, and linux+softfloat act exactly the same for i128, { i64, i64 }, and <2 x i64>: passing and returning both use a register pair on x86_64, but are passed and returned indirectly on i686 and i586. On x86_64 without -sse,+soft-float (i.e. hardfloat), <2 x i64> uses xmm but the others are unchanged. On i586 and i686, whether or not SSE is available makes no difference (from our definitions i686 should enable SSE but I don't know how LLVM handles these targets).

By Microsoft's ABI, i128 needs to be passed indirectly so this behavior violates the guidelines. I think this is something LLVM leaves up to us to get correct, however.

For libcalls, x86_64 Linux and UEFI the same CC as i128. Windows+softfloat passes indirectly (correct for Microsoft's ABI); however, it returns in a register pair which doesn't seem correct. I doubt there are many consumers of this specific calling convention outside of us using it for UEFI, but that seems like something LLVM should change.

I don't know what the libcall ABI is on i686/i586 because LLVM seems to emit a large inline assembly routine rather than calling out.

For reference: https://llvm.godbolt.org/z/fzP1xozGq

@tgross35
Copy link
Contributor

I opened llvm/llvm-project#127723 for the LLVM change. But I think that with rust-lang/rust#137094 our behavior should agree with what LLVM expects for libcalls as well as what Clang does, using our definition uefi = windows + softfloat.

@RalfJung
Copy link
Member Author

For reference: https://llvm.godbolt.org/z/fzP1xozGq

What is particularly confusing me here is ret_u128_withsse vs do_div_withsse on x86_64-pc-windows-msvc: the former seems to indicate that functions that return i128 pass that via two registers, rax and rdx. But the latter seems to indicate that when LLVM emits a __udivti3 call, it expects the return value in xmm0. That's plain incoherent, isn't it? The return type is the same in both cases, it's i128, so it should use the same ABI. Am I misunderstanding something?

@tgross35
Copy link
Contributor

What is particularly confusing me here is ret_u128_withsse vs do_div_withsse on x86_64-pc-windows-msvc: the former seems to indicate that functions that return i128 pass that via two registers, rax and rdx. But the latter seems to indicate that when LLVM emits a __udivti3 call, it expects the return value in xmm0. That's plain incoherent, isn't it? The return type is the same in both cases, it's i128, so it should use the same ABI. Am I misunderstanding something?

I think this may just be LLVM's expected behavior (I'll elaborate at llvm/llvm-project#127723)

tgross35 added a commit to tgross35/rust that referenced this issue Feb 20, 2025
Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on
Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 20, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 20, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
tgross35 added a commit to tgross35/rust that referenced this issue Feb 23, 2025
Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on
Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 23, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
tgross35 added a commit to tgross35/rust that referenced this issue Feb 23, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 24, 2025
Rollup merge of rust-lang#137297 - tgross35:update-builtins, r=tgross35

Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
RalfJung pushed a commit to RalfJung/miri that referenced this issue Feb 24, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang/rust#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this issue Feb 26, 2025
Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on
Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang/rust#116558
Link: rust-lang/compiler-builtins#758
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this issue Feb 26, 2025
Update `compiler-builtins` to 0.1.147

Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang/rust#116558
Link: rust-lang/compiler-builtins#758

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 6, 2025
Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on
Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 6, 2025
Removes an ABI hack that used `<2 x i64>` to return `i128` in `xmm0` on
Windows [1].

[1]: rust-lang/compiler-builtins#759
Link: rust-lang#116558
Link: rust-lang/compiler-builtins#758
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants