Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we / can we make MaybeUninit<T> always preserve all bytes of T (including padding)? #518

Open
RalfJung opened this issue Jul 26, 2024 · 53 comments

Comments

@RalfJung
Copy link
Member

It is a frequent source of confusion that MaybeUninit<T> is not just preserving all the underlying bytes of storage, but actually if T has padding then those bytes are lost on copies/moves of MaybeUninit<T>.

This is currently pretty much a necessary consequence of the promise that MaybeUninit<T> is ABI-compatible with T: some ABIs don't preserve the padding of T when it is passed to a function. However, this was not part of the intention with MaybeUninit at all, it is something we discovered later.

Maybe we should try to take this back, and make the guarantee only for types without padding?

I am not even sure why we made this a guarantee. We made the type repr(transparent) because for performance it is quite important that MaybeUninit<$int> becomes just an iN in LLVM. But that doesn't require a stable guarantee. And in fact it seems like it would almost always be a bug if the caller and callee disagree about whether the value has to be initialized. So I would be curious about real-world examples where this guarantee is needed.

@elichai
Copy link

elichai commented Jul 26, 2024

I'll add that a typed copy of an uninitialized variable is UB in C, so there's no need to promise any ABI for FFI compatibility,
So this leaves us with the "Rust" ABI which isn't stable anyway.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 26, 2024 via email

@Diggsey

This comment was marked as off-topic.

@RalfJung

This comment was marked as off-topic.

@carbotaniuman
Copy link

The PR that introduced the guarantees does not talk about padding, and it seems like that wasn't really understood back then. The t-lang minutes discussing this are lost to time and reorganizations, but it seems doubtful that such a consideration was raised. Discussions from 2018 raise a lack of real-world use cases for ABI compatibility, and I agree with such a sentiment in the present.

I don't this this would be approved nowadays, but I am incredibly apprehensive about removing it. There are few places in the Rust documentation that use always for guarantees like this, and the use cases for some weird FFI thunks or bindings would be nigh-impossible to properly test with crater or similar...

@Diggsey

This comment was marked as off-topic.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 26, 2024

@carbotaniuman I think we should consider removing it. If we can't come up with any legitimate usecase, I think we should definitely remove it. I don't like going back on a promise like this, but if we don't have a usecase that could be broken by taking back this promise, then the chances that someone is affected should be very slim.

@Diggsey thanks for explaining why you think this belongs in this thread. But I disagree. "MaybeUninit preserves provenance" is not relevant here. You will note that provenance does not appear in the issue description. Furthermore, provenance on CHERI works like it does everywhere else, so even if provenance were relevant, CHERI wouldn't change anything. It is true that you can write code with MaybeUninit that will work everywhere but not on CHERI; discussing that is off-topic here as the reasons are completely different from what this thread is about and also different from what you mentioned -- it is caused not by padding and not by provenance, but by capabilities. So please take this elsewhere, e.g. Zulip or a new issue where you explain why you think MaybeUninit's provenance behavior is incompatible with CHERI, but I'd prefer not to see yet another thread derailed by CHERI. I like CHERI and want to see it work in Rust, but our main task here is to figure out Rust for the existing targets we already support. CHERI support is a nice extra that I'll happily discuss, as long as it doesn't distract from our core task.

@carbotaniuman
Copy link

If we do agree to remove the guarantee, I expect it to break 0 uses in practice. My only other concern would be the performance impact of having to copy more bytes. It probably won't affect SIMD or buffers though, so I don't really think that's it's really an issue.

@bjorn3
Copy link
Member

bjorn3 commented Jul 26, 2024

I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.

@chorman0773
Copy link
Contributor

FTR, I use MaybeUninit<T> in the signatures of lccc's libatomic and libsfp ABI-level routines. This is because they get called from xlang's codegen, and xlang allows uninit for those operations. Though in these cases, they are types without padding in the signature.

However, for compatibility with gcc/clang, they have to expose an ABI equal to the rountines using primitives.

@chorman0773
Copy link
Contributor

(And in general, I agree with @carbotaniuman - unless crater is testing all kinds of targets, I'm betting it primarily tests x86_64, where aggregate-of-one-field will get passed the same way as that one field*, so without using miri-crater, the ABI checks won't be found by crater. If the code is used on something like arm32 though, it's going to be very visibility broken)

@RalfJung
Copy link
Member Author

Yes, this is super hard to test for. I wonder if it's worth having a blog post asking people whether they need this guarantee...

I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.

Yes, concretely the proposal would be:

  • T and MaybeUninit<T> always have the same size and alignment
  • they have the same ABI if T has no padding

Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.

@Ddystopia
Copy link

Is the only motivation to backing up on that promise is the fact that this is a frequent source of confusion? Which benefits except clarity can Rust gain?

@RalfJung
Copy link
Member Author

We never intended MaybeUninit<(u8, u16)> to have a padding byte that would be lost on copies. This is a complete accident. It's not just a source of confusion, it's not the semantics we want. It came up in #517 where it means that returning a MaybeUninit<T> from an atomic compare-exchange actually doesn't work since padding bytes still get lost so we won't end up having the same bit pattern as what is stored in the atomic location.

@jamesmunns
Copy link
Member

jamesmunns commented Jul 26, 2024

I'm pretty sure this is still the case, but it might be worth it to enumerate things that ARE still allowed for this wrt FFI/ABI concerns. My primary use of MaybeUninit<T> in FFI is for "outptr" usages (edit: specifically &mut MaybeUninit<T> or *mut MaybeUninit<T>), which seems to be still good (because we never copy/pass by value - the part that is discussed by this issue), but for folks like myself might be worth spelling out clearly/contrasting what is no longer allowed.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 26, 2024

Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.

In public communication we'll obviously spell out the details more than in internal discussion. ("Internal" not as in "private" but as in "among the team members and anyone else who's willing to participate".)

@chorman0773
Copy link
Contributor

Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.

I at the very least need target simd types as well - for floating-point types that aren't directly supported by rust (e.g. f2x64_t), I wrap them in a target-specific ABI type, which on x86_64, is mostly __m128.

@RalfJung
Copy link
Member Author

Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

But that case would be covered by "types without padding", or we could explicitly mention the stdarch SIMD types (since they are all powers of 2).

@chorman0773
Copy link
Contributor

Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

Not fully - you don't necessarily need to compile the rtlibs with lccc themselves, they're written in mostly portable rust, and quite deliberately. I'd like to be able to continue providing that guarantee.

Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.

You can also now see a formalization in reference#1545, as a note.

@RalfJung
Copy link
Member Author

This has caused an actual soundness bug now: rust-lang/rust#134713.

I think we should seriously consider restricting the ABI compatibility guarantee to scalar and SIMD types.

@chorman0773
Copy link
Contributor

The issue with that is that will make it hard to represent "Maybe Initialized" aggregate types in ABIs.
I already know I need this for fn() since I use this (via a MaybeValid1 wrapper), in place of void(*)(int) in signal. Depending on some other design details of my OS (which I'm currently in the process of implementing a wine-like compatibility layer for running programs for it in Rust), I might need this behaviour (either via the MaybeValid wrapper or via MaybeUninit itself) for at least one aggregate type (at the very least, if I want to keep the strong typing my OS's API has been exporting).

Footnotes

  1. MaybeValid is a repr(transparent) wrapper arround MaybeUninit<T> that disallows uninit bytes wherever T does as a safety invariant. In particular, the type is allowed to contain "Any Initialized Bit patttern" or any safe value of T, minus padding bytes of T.

@RalfJung
Copy link
Member Author

Yeah I don't think a by-value ABI-compatible "maybe init" aggregate is common enough to justify the constant stream of surprises and UB that this problem causes. I would suggest not designing your OS around such a facility.

@RalfJung RalfJung changed the title Should we / can we make MaybeUninit<T> always preserve all bytes of T? Should we / can we make MaybeUninit<T> always preserve all bytes of T (including padding)? Feb 23, 2025
@carbotaniuman
Copy link

I had a longer post here that I since felt was too confrontational, but my thoughts have changed and I do not believe that solely changing this is justifiable given that this is not a breaking change for soundness, but merely to make the use of the API better for users. Unsafe Rust already has multiple sharp edges (SB/TB, Box noalias, provenance), and I feel like this is not a particularly sharp one once users understand it.

I would also like to echo the alternative of a new type BikeshedMaybeUninitBagOfBits, or if desire is expressed to retake the better name for the more useful use-case, BikeshedMaybeUninitIgnoringPadding. This to me feels similar to the exposed provenance methods in that we may not like them, but they've ossified (in this case for nearly 6 years), so providing both a good option with the obvious semantics as well as a way to express the use-cases others may care about with regards to ABI is a good compromise. We could even hang this on an edition!

@ia0
Copy link

ia0 commented Feb 23, 2025

I might be missing something obvious, but isn't BikeshedMaybeUninitBagOfBits<T> the same as [MaybeUninit<u8>; size_of::<T>()]? This has limitations today because of generic_const_exprs (in particular you can't fix the stdlib soundness bug with this yet), but ultimately it seems to me the current definition of MaybeUninit seems the most expressive one (making it irrelevant to also have BikeshedMaybeUninitBagOfBits).

@saethlin
Copy link
Member

I would also like to echo the alternative of a new type BikeshedMaybeUninitBagOfBits

What properties does this type have?

@carbotaniuman
Copy link

BikeshedMaybeUninitBagOfBits is the hypothetical MaybeUninit that preserves all bytes of T like is being proposed in this issue. This was called bag of bits semantics in the past which is why the bikeshed is named that.

@chorman0773
Copy link
Contributor

chorman0773 commented Feb 23, 2025

I would suggest not designing your OS around such a facility.

The issue is that the design isn't simply "Whether or not to use MaybeUninit in signatures". The issue that comes up is "What do I have to do on the rust side to match this C API while being compatible with other design decisions of the OS". Much of the SCI (System Call Interface) surface for the OS takes two-pointer aggregates by value instead of by pointer. In the past, when signals were a native part of the OS (and not emulated atop uSEH), sigset_t was defined as a struct wrapping a [u64; 2]). I'd rather those decisions not make it impossible to define other APIs in Rust.
Berkely Sockets will also be fun given that SOCKET likely needs to be a struct of a handle + metadata. And I may need to handle uninitialized SOCKETs - or at the very least, MaybeValid<SOCKET>s.

@comex
Copy link

comex commented Feb 24, 2025

I might be missing something obvious, but isn't BikeshedMaybeUninitBagOfBits<T> the same as [MaybeUninit<u8>; size_of::<T>()]?

You would probably also want alignment based on T, at least if the goal is to match C struct layout.

@ia0
Copy link

ia0 commented Feb 24, 2025

You would probably also want alignment based on T, at least if the goal is to match C struct layout.

That's also expressible today AFAICT:

type AbstractBytes<T> = [MaybeUninit<u8>; size_of::<T>()];

#[repr(C)]
struct AlignedAbstractBytes<T>
where
    [(); size_of::<T>()]:,
{
    aligned: [T; 0],
    abstract_bytes: AbstractBytes<T>,
}

The idea is that Rust already provides a solution for "bag" (array, possibly within a C struct for alignment) and "bytes" (MaybeUninit<u8>), so you can have "bag of bytes" by composing types. And I was gonna say this is not the case for MaybeUninit<T>, but it seems I'm wrong and you can implement MaybeUninit<T> yourself. It's only a lang item for vocabulary purposes AFAICT.

So forget my previous comment and this one as well. This issue is not about language expressivity, it's about vocabulary types. Do we want to fix MaybeUninit to match how it's used (which as far as I'm aware of is done in incompatible ways: bag of bytes versus ABI compatibility)? Or do we want to fix users to match how MaybeUninit is defined (transparent union, thus ABI compatibility)?

@RalfJung
Copy link
Member Author

The issue that comes up is "What do I have to do on the rust side to match this C API while being compatible with other design decisions of the OS".

Most C types must be initialized, so I really don't understand your example here. Can you give the simplest self-contained C signature that you think cannot be bound on Rust without this ABI guarantee?

Do we want to fix MaybeUninit to match how it's used (which rust-lang/rust#134713 (comment) is done in incompatible ways: bag of bytes versus ABI compatibility)?

Do we have an example of in-the-wild usage that relies on the ABI compatibility?

@chorman0773
Copy link
Contributor

chorman0773 commented Feb 26, 2025

Most C types must be initialized, so I really don't understand your example here. Can you give the simplest self-contained C signature that you think cannot be bound on Rust without this ABI guarantee?

In general, I already use this to represent the signature of signal, with a MaybeValid<extern "C" fn(i32)> wrapper.
Also, signal is pretty much where this comes into play, since the actual "signature" of a sigaction is extern "C" fn(i32, siginfo_t *, ucontext_t* ), but to support exception->signal translation properly (In particular, allowing the program to call raise from a signal handler, when recursing into the exception handling entrypoint in userspace causes a program abort), I believe I actually need to pass the two parameters by value to a trampoline via the kernel (and thus cannot use something like Option) and then pass pointers. However, this won't always be fully initialized when SA_SIGINFO is not used - in particular, initializing the ucontext is an expensive operation, since it requires moving a lot of data from a kernelspace buffer to a userspace one.

(As mentioned, socket APIs also has fun when it comes to protocols, though I can't immediately come up with an API that has a glaring issue - unlike Lilium itself which passes certain aggregates by value all the time, C and POSIX seem to do it rarely. If there was ever a more comprehensive POSIX api implementation though, a lot of varargs functions might run into fun shenanigans when you start talking about Handles)

@carbotaniuman
Copy link

And I was gonna say this is not the case for MaybeUninit, but it seems I'm wrong and you can implement MaybeUninit yourself.

It is not currently possible to implement the semantics of MaybeUninit<T> in stable Rust. It requires a feature with a design that is in flux and very likely not on a path to stabilization.

@workingjubilee
Copy link
Member

@RalfJung Where is the documented guarantee that MaybeUninit preserves padding bytes?

@RalfJung
Copy link
Member Author

RalfJung commented Feb 27, 2025

In general, I already use this to represent the signature of signal, with a MaybeValid<extern "C" fn(i32)> wrapper.
Also, signal is pretty much where this comes into play, since the actual "signature" of a sigaction is extern "C" fn(i32, siginfo_t , ucontext_t ), but to support exception->signal translation properly (In particular, allowing the program to call raise from a signal handler, when recursing into the exception handling entrypoint in userspace causes a program abort), I believe I actually need to pass the two parameters by value to a trampoline via the kernel (and thus cannot use something like Option) and then pass pointers. However, this won't always be fully initialized when SA_SIGINFO is not used - in particular, initializing the ucontext is an expensive operation, since it requires moving a lot of data from a kernelspace buffer to a userspace one.

Sorry, I don't follow. Please strip all the detail that's unnecessary for this thread and focus on the actual question: passing some uninitialized data across the ABI. You seem to be saying you want the ucontext_t to be uninit. However, there is no ucontext_t argument in the signature you showed. There is a ucontext_t* argument. Do you want the pointer itself to be uninitialized? That seems entirely unnecessary? Just make it a null pointer, or whatever. Also, (thin) raw pointers are scalar types, so if you really want to do this, MaybeUninit<*const ucontext_t> and *const ucontext_t are actually still ABI-compatible under my proposed changed spec. I am also fairly sure that leaving that pointer uninitialized is UB in C.

If it's just about leaving the data the pointer points to uninitialized, then that's entirely off-topic for this discussion as the ABI does not care about that. *const T and *const U for arbitrary sized types T, U are already declared ABI-compatible and I am not suggesting changing that.

Where is the documented guarantee that MaybeUninit preserves padding bytes?

We don't document this. But people still think it is true, I've myself thought it is true, and it is quite reasonable to think it is true. It's also a much more useful guarantee than the ABI compatibility one. We're still looking for even a single use-case that needs the ABI-compatibility for non-scalar types, whereas we already had a standard library correctness bug due to the lack of padding byte preservation.

@chorman0773
Copy link
Contributor

So the signature I need to actually call looks more like extern "C" fn(i32, siginfo_t, ucontext_t) (note the lack of pointers). There would then be a rust trampoline that gets "called" from the exception handler, which takes the address of the latter two parameters before calling the signature. The case where ucontext_t will be uninitialized though, the C function that gets called won't actually have the two parameters.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 27, 2025

The case where ucontext_t will be uninitialized though, the C function that gets called won't actually have the two parameters.

Uh, then, I'd suggest you call it with a Rust function pointer / declaration that has the matching number of parameters? It's anyway UB if caller and callee disagree on the number of parameters.

@chorman0773
Copy link
Contributor

Its the same vein as calling main from __libc_start_main - actually an impossible task under strict ABI compatibility because you have to match one of 4 different signatures without knowing which signature it is. There are a lot of very low-level shenanigans that abuse abi details like this.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 27, 2025

If you are anyway already causing technical UB by having the signature not match, you shouldn't be worried about venturing into "unspecified" land here either. Just have a version of ucontext_t where every scalar field is wrapped in MaybeUninit; that can still be fully uninitialized and you don't need to rely on MaybeUninit<Aggregate> doing anything specific with the ABI. We don't currently guarantee that to be ABI-compatible (we don't have a structural congruence rule for ABI compatibility of repr(C) structs), but you're already breaking our ABI compatibility rules so that's "fine".

@workingjubilee
Copy link
Member

Did these standard library implementations surface this bug before or after you recommended changing the implementations of these things to use MaybeUninit?

@RalfJung
Copy link
Member Author

RalfJung commented Feb 27, 2025

That sounds like you are trying to insinuate something -- I don't know exactly what you are referring I don't know what I recommended when, but using T instead of MaybeUninit<T> would not have helped.

Let's not discuss the history of rust-lang/rust#134713 here please; there's already a separate issue for that. The fact is that that implementation made it in (neither authored nor reviewed by me), which is a pretty clear sign that "bag of bytes" semantics are (a) useful, and (b) easy to assume to already be the current semantics even if they are not.

@chorman0773
Copy link
Contributor

chorman0773 commented Feb 27, 2025

If you are anyway already causing technical UB by having the signature not match, you shouldn't be worried about venturing into "unspecified" land here either

Ffor the record here - the call that I'm worried about is going to be to the Rust trampoline - which does always have the same parameters. Then I call the actual sa_sighandler or sa_sigaction from that trampoline with appropriate pointers.

But in the case of handler, the ABI level is very well-defined (even if the Rust Level isn't). Whereas wrapping MaybeUninit<ucontext_t> and MaybeUninit<siginfo_t> is not quite as well-defined.

@chorman0773
Copy link
Contributor

Also I speak of a position of being involved in the discussion here - a third party having absolutely zero idea this is happening may have just as much cause to rely on a stable language guarantee. And given much of this code just won't even run in miri (miri won't even run winter-lily, which is my current project touching Lilium), this is probably in the realm of "Breaks silently, until it doesn't".

@Lokathor
Copy link
Contributor

Changing how pointers to MaybeUninit works does appear to be in the proposal, and MaybeUninit integers would also specifically not be affected, so I guess you're concerned that people are passing MaybeUninit<SomeStruct> by value to/from an extern "C" function? Do I have that right?

@RalfJung
Copy link
Member Author

RalfJung commented Feb 28, 2025

@chorman0773

Yes I am proposing to take back a documented language guarantee, and replace it with a different, more useful guarantee. I think long-term this will cause less harm. So I was asking if there's any cases where the ABI guarantee is useful or even needed.

I am still extremely confused about your example. You keep bringing up more and more concepts you're not explaining and there's too many parties calling each other so it's not even clear which call you are talking about when. When I think I understand is that there's a particular function call where the caller uses signature extern "C" fn(i32, siginfo_t, ucontext_t) (but you never even stated that signature! you stated a different one, and then later said that's not the real signature). But then based on some ambient information the caller might know that the callee is actually declared as extern "C" fn(i32, siginfo_t) and therefore you want to leave the last argument uninitialized. In other cases all 3 arguments exist and you want to actually pass the data, and then the ABI must of course match. But there's also a "Rust trampoline" involved somehow and now you completely lost me again.
There should be a single function call that matters, where caller and callee use different but ABI-compatible types. Please explain everything about that call, and leave away everything else.

Could you achieve your goals without the ABI guarantee (and without worsening performance)? And if yes, would that solution be any less "natural" than what you are currently doing? Frankly, based on what you said so far, any possible alternative seems more natural to me. ;)

@Lokathor
Copy link
Contributor

(sorry for any confusion Ralf, but my own question was directed at chorman)

@chorman0773
Copy link
Contributor

chorman0773 commented Feb 28, 2025

Ok, I'll restate what the flow is:

  • The Lilium Kernel has a core concept called an exception handler. This is invoked by the kernel (in much the same manner that a POSIX signal handler is invoked - asynchronously or synchronously, and assume same sort of ordering constraints on an asynchronous exception)
  • For certain exception types, the userspace runtime invokes a C signal handler. In order to do this, it first invokes a trampoline function. This trampoline always has signature extern "C" fn(i32, MaybeUninit<siginfo_t>, MaybeUinit<ucontext_t>) The exception handler knows when to initialize siginfo_t and ucontext_t but always passes them. This is because ucontext_t in particular is partially initialized with a handle needed to resume handling the previous exception.
    • The trampoline is invoked first because the exception handler is actually "Returning" to it. It needs to resume from the Exception Handling state so that the signal handler can call raise (if it was asynchronous), which in turn is translated to the synchronous entry point to exceptions, being ExceptionHandleSynchronous (calling this on a thread that is currently handling an exception causes the thread to exit with an unmanaged exception)
    • The Trampoline "call" is setup manually, so this relies on knowing (and matching) the ABI for MaybeUninit<siginfo_t> and MaybeUninit<ucontext_t>. This in turn requires an ABI guarantee for those types.
  • The trampoline then knows whether or not to call sa_sigaction (with a signature of unsafe extern "C" fn (i32, *mut siginfo_t, *mut ucontext_t)) or sa_sighandler (with a signature of unsafe extern "C" fn(i32)) (the trampoline also has a bunch of other setup to do to fully support the full set of sigaction options from POSIX). For efficiency reasons, the call is just merged into one without a branch (though this can easily be done from asm).

ucontext_t and siginfo_t are defined in another library that's also used by consumers of the API. The types there probably don't want to let the callee deinit things (especially in ucontext_t which is then read back by the trampoline, before the trampoline goes back into the Exceptionhandling context to resume handling the exception through other means).

@workingjubilee
Copy link
Member

@RalfJung You are the Pope of Rust, or at least the Pope of Rust Safety Models. Whenever you say something, even if you say something blatantly wrong, almost no one calls you out on it. You are the proxy author and reviewer of all std code because everyone is reading everything you are writing and thinking about it when writing unsafe code. You are repeatedly cited in these discussions. If you repeatedly say something wrong, you can convince other people it's true, simply by repeating the wrong thing.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 28, 2025 via email

@workingjubilee
Copy link
Member

workingjubilee commented Feb 28, 2025

That is a reasonable stance, honestly.

I just am not surprised incorrect code is written based on something you say again and again, and don't think it should be taken as evidence the confusion is that widespread if it might instead be your confusion spreading widely.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 28, 2025 via email

@comex
Copy link

comex commented Feb 28, 2025

@chorman0773

Ok, I'll restate what the flow is:

Okay. It sounds to me like even if MaybeUninit's ABI compatibility guarantee is restricted to primitive types and its ABI for other types is unspecified, you have some options:

  1. Define MaybeUninitSiginfo and MaybeUninitUcontext structs that are like siginfo_t and ucontext_t but wrap the individual fields in MaybeUninit.

    This would still be guaranteed ABI-compatible with siginfo_t/ucontext_t by-value parameters in C.

    Maybe this is an abstraction violation, but it sounds like it shouldn’t be the end of the world since this is such a special case.

  2. Change the trampoline to take siginfo_t and ucontext_t by pointer, i.e. extern "C" fn(i32, *const siginfo_t, *const ucontext_t).

    After all, it's not like this is some syscall interface where you can't pass parameters via the stack. siginfo_t and ucontext_t are both large structs that will be passed on the stack anyway. Is the issue that the trampoline itself is a stable ABI boundary you don't want to break (separately from the stable POSIX API)? If so, then I think you've already made a mistake. ucontext_t may need to grow over time to account for future architecture extensions, so it should never be passed by value across a stable ABI boundary.

However, I think we can do better. Even if MaybeUninit<T> is not guaranteed to be ABI-compatible with T, it definitely should be guaranteed to have a stable ABI (when passed to extern "C" functions). It shouldn't be something that can change on rustc upgrades.

That should be enough of a guarantee that you could keep doing what you're currently doing. Since you're setting up the call manually, all that matters is that there is some stable ABI, not what it is exactly. The only issue would be if (a) you are worried about breaking an existing stable ABI boundary (for your apparently still-in-development OS), and (b) the stable ABI rustc adopts for MaybeUninit<siginfo_t> or MaybeUninit<ucontext_t> doesn't match what rustc does today. But in practice it is going to match, because the structs were already being passed on the stack.

Beyond that, if MaybeUninit has a stable ABI, then we may as well define it as being ABI-compatible with something. For example, if T is a struct, MaybeUninit<T> could be documented as ABI-compatible with an equivalent struct where u8 fields are added to fill all padding bytes. (This would need to be fleshed out a bit more to deal with nested structs/unions/whatever, but the basic idea should work.) This would ensure that functions taking MaybeUninit by-value can still be called from C or whatever.

…With all that said, I am definitely sympathetic to the alternative view that MaybeUninit<T> ought to remain ABI-compatible with T. It's definitely surprising if they're incompatible. But I also don't want Rust to have to add yet another wrapper type.

@carbotaniuman
Copy link

carbotaniuman commented Feb 28, 2025

I think this breaking change is not justified by the stated goal, which is expressly not any inherent unsoundness, but just a desire to reduce a paper cut for unsafe Rust users. I think that's a great goal (and to be clear, I support it myself), but unsafe code is tricky, with many corner cases. Box noalias remains on the books today despite being a far larger footgun. We have not yet ruled out Stacked Borrows as an aliasing model! MaybeUninit not keeping padding may be surprising, but I think it can also be explained relatively easily and is a small footgun compared to other complexities.

In addition, this creates a special case in the ABI compatibility rules. Windows until a few months ago, passed MaybeUninit by value across FFI boundaries. Now, I'm 95% sure that all of those types are simple pointers, but we've now added a footgun in cases where they aren't. And any cases which relied on this ABI compatibility will likely break in weird and spectacular ways only at runtime - these are the types of cases likely to be underindexed on Crater.

Making MaybeUninit<T> preserve the padding bytes of T also comes with a performance downside - previously these padding bytes were garbage, and the compiler could not pass it across function boundaries (especially in extern "Rust", where this guarantee would definitely apply). Making the padding worthwhile would likely regress common use cases of storing or transferring a maybe uninitialized T in favor of use cases that do need this bag of bits functionality.

I think it's also disingenuous to act like the use cases here are contrived and useless - this is a documented language feature, not RUSTC_BOOTSTRAP. And yes, while niche use cases are indeed niche and some of us here may consider the stabilization to be a mistake, we (speaking as a community) did make the mistake.

I am also confused as to why my compatibility ideas seemed to have been (intentionally?) ignored. MaybeUninit is a vocabulary type today - you cannot write the semantics of it in library code. #[repr(transparent)] on unions has real problems with the design, and may not be stabilized in a "reasonable" amount of time. Migration to something else for the use cases that need it will be all but impossible, or users will just take the hit and write T instead, willfully invoking undefined behavior due to the lack of an alternative.

To me, MaybeUninitIgnoringPadding and MaybeUninitBagOfBits are different types with different use cases. What they may be named or how those capabilities are actually written are immaterial to me, what is important is that we do not throw away a core guaranteed capability without sufficient migration patterns and justification.

@comex
Copy link

comex commented Mar 2, 2025

Making MaybeUninit<T> preserve the padding bytes of T also comes with a performance downside - previously these padding bytes were garbage, and the compiler could not pass it across function boundaries (especially in extern "Rust", where this guarantee would definitely apply). Making the padding worthwhile would likely regress common use cases of storing or transferring a maybe uninitialized T in favor of use cases that do need this bag of bits functionality.

I think you will be hard-pressed to find a case where this makes a measurable difference, even if you specifically microbenchmark the function call.

I'm neutral regarding the rest of your post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests