
Windows: Coalesce wake-up events #4150

Draft · wants to merge 5 commits into master from madsmtm/coalesce-windows-wakeups
Conversation

madsmtm
Member

@madsmtm madsmtm commented Feb 27, 2025

Windows' message queue has a default limit of 10,000 messages, which EventLoopProxy::wake_up can end up exhausting from another thread if e.g. the main thread is busy processing some long-running operation.

To avoid that, we coalesce wake-up events with an Arc<AtomicBool>. This is explicitly and intentionally allowed by the documentation for EventLoopProxy::wake_up, and is also what's done on other platforms (although macOS/iOS use CFRunLoopSource instead of events, which does the coalescing automatically; I didn't find a similar abstraction in the Win32 API, though I'm not too familiar with it, so one might exist).
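The coalescing idea can be sketched roughly like this (a minimal illustration, not the PR's actual code; `WakeupCoalescer`, `request_wakeup`, and `acknowledge` are made-up names for this sketch):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Shared flag: `true` while a wake-up message has been posted
/// but not yet processed by the event loop.
struct WakeupCoalescer {
    has_pending_wakeup: Arc<AtomicBool>,
}

impl WakeupCoalescer {
    fn new() -> Self {
        Self { has_pending_wakeup: Arc::new(AtomicBool::new(false)) }
    }

    /// Called from any thread. Returns `true` if a message should
    /// actually be posted to the event loop's queue.
    fn request_wakeup(&self) -> bool {
        // `fetch_or` returns the previous value: if it was already `true`,
        // a wake-up is in flight and we skip posting another message.
        !self.has_pending_wakeup.fetch_or(true, Ordering::AcqRel)
    }

    /// Called on the event-loop thread when the wake-up message is
    /// handled, allowing the next `request_wakeup` to post again.
    fn acknowledge(&self) {
        self.has_pending_wakeup.store(false, Ordering::Release);
    }
}

fn main() {
    let c = WakeupCoalescer::new();
    assert!(c.request_wakeup()); // first call posts a message
    assert!(!c.request_wakeup()); // subsequent calls are coalesced
    assert!(!c.request_wakeup());
    c.acknowledge(); // event loop processed the wake-up
    assert!(c.request_wakeup()); // posting is allowed again
}
```

However many times `request_wakeup` races from other threads, at most one wake-up message sits in the queue at a time.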

If the user wants to handle a specific number of messages, they can use a bounded channel.
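For reference, a bounded channel from the standard library looks like this; `sync_channel` applies backpressure to the sender once the buffer is full, so a producer thread cannot outrun the event loop (a generic illustration, independent of winit):

```rust
use std::sync::mpsc::sync_channel;

fn main() {
    // A bounded channel with capacity 2: `send` blocks once full,
    // applying backpressure to the producing thread.
    let (sender, receiver) = sync_channel::<u64>(2);

    sender.send(1).unwrap();
    sender.send(2).unwrap();
    // `try_send` fails instead of blocking when the buffer is full.
    assert!(sender.try_send(3).is_err());

    assert_eq!(receiver.recv().unwrap(), 1);
    // A slot freed up, so sending succeeds again.
    assert!(sender.try_send(3).is_ok());
}
```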

See #3687 for further discussion of the problem.

Tested with the following program in a Windows 11 VM, and verified that PostMessageW errored before this PR and doesn't after it:

Test program

use std::error::Error;
use std::sync::mpsc::{channel, Receiver};

use winit::application::ApplicationHandler;
use winit::event_loop::{ActiveEventLoop, EventLoop};

#[path = "util/tracing.rs"]
mod tracing;

struct Application {
    receiver: Receiver<u64>,
}

impl ApplicationHandler for Application {
    fn can_create_surfaces(&mut self, _event_loop: &dyn ActiveEventLoop) {}

    fn window_event(
        &mut self,
        _event_loop: &dyn ActiveEventLoop,
        _window_id: winit::window::WindowId,
        _event: winit::event::WindowEvent,
    ) {
    }

    fn proxy_wake_up(&mut self, _event_loop: &dyn ActiveEventLoop) {
        println!("proxy_wake_up");
        for event in self.receiver.try_iter() {
            println!("User event: {event:?}");

            if event > 150000 {
                std::process::exit(0);
            }
        }
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    tracing::init();

    let event_loop = EventLoop::new()?;
    let proxy = event_loop.create_proxy();
    let (sender, receiver) = channel();

    std::thread::spawn(move || {
        std::thread::sleep(std::time::Duration::from_secs(1));

        let mut counter = 0;
        loop {
            proxy.wake_up();

            counter += 1;

            if counter > 150000 {
                sender.send(counter).unwrap();
                let mut wakeup_counter = 1;
                loop {
                    proxy.wake_up();
                    println!("Sent {wakeup_counter} WakeUp events");
                    wakeup_counter += 1;
                }
            }
        }
    });

    event_loop.run_app(Application { receiver })?;

    Ok(())
}

CC @amrbashir, could I get you to test this too?

@madsmtm madsmtm added B - bug Dang, that shouldn't have happened DS - windows labels Feb 27, 2025
@madsmtm madsmtm requested a review from notgull as a code owner February 27, 2025 06:32
@madsmtm madsmtm marked this pull request as draft February 27, 2025 06:32
@madsmtm madsmtm added this to the Version 0.31.0 milestone Feb 27, 2025
@madsmtm madsmtm added the S - platform parity Unintended platform differences label Feb 27, 2025
Comment on lines 747 to 758
if self.has_sent_wakeup_msg.fetch_or(true, Ordering::AcqRel) {
// Do not send a wakeup event if one has already been sent, but hasn't been processed
// yet. This prevents errors when the internal message queue fills up, and effectively
// coalesces wakeups.
tracing::trace!("avoided sending wake up, previous wake-up has yet to be processed");
return;
}
if unsafe { PostMessageW(self.target_window, PROXY_WAKEUP_MSG_ID.get(), 0, 0) } == 0 {
// _can_ technically fail, but realistically won't, since we've prevented the most
// common case (queue full) above.
tracing::error!("failed waking event loop: {}", std::io::Error::last_os_error());
}
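The snippet above shows the posting side. A sketch of the handling side, under the assumption that the flag is cleared before the user callback runs (the function name and signature here are invented for illustration, not taken from the PR):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Sketch of the message-handler side: the flag must be cleared *before*
// invoking the user's `proxy_wake_up`, so that a `wake_up` call racing
// with the handler still gets to post a fresh message rather than being
// silently coalesced into a wake-up that has already been delivered.
fn handle_wakeup_msg(has_sent_wakeup_msg: &Arc<AtomicBool>, proxy_wake_up: impl FnOnce()) {
    has_sent_wakeup_msg.store(false, Ordering::Release);
    proxy_wake_up();
}

fn main() {
    let flag = Arc::new(AtomicBool::new(true)); // a message was posted
    let mut called = false;
    handle_wakeup_msg(&flag, || called = true);
    assert!(called); // user callback ran
    assert!(!flag.load(Ordering::Acquire)); // ready for the next wake-up
}
```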
Member
I think we can just fix the thing properly? As in, given that the queue is full, the event loop will wake up anyway, so if we check the bool in the event loop as well, we'll be able to properly wake up the user.

In case of a PostMessageW failure here, we can just set a flag indicating that we need to wake up the user even though we haven't sent the message with PROXY_WAKEUP_MSG_ID; thus, it should generally work.

Checking before going to sleep won't be required, since the message will either be in the queue, or we'll have events ready and the flag set for it to be processed.

There could be an issue with over-waking in some cases, but I guess it's fine, since the user will stop the wake-ups eventually, and also, we don't guarantee anything w.r.t. delivery times IIRC.

Member Author

In case of a PostMessageW failure here, we can just set a flag indicating that we need to wake up the user even though we haven't sent the message with PROXY_WAKEUP_MSG_ID; thus, it should generally work.

Hmm, but we want to avoid filling the queue, as that'll prevent other messages from arriving; i.e. having a single PROXY_WAKEUP_MSG_ID in the queue is desirable.

Member Author

Maybe I'm misunderstanding what you're advocating for?

Member
@kchibisov kchibisov Feb 27, 2025

I mean that we should send a message and also set a bit, and if we cannot send the message but did set the bit, we still deliver the event, even though it may not be in the queue because the queue was full.

Since if the queue was full, the loop will wake up anyway, and the message we sent was not really required, since it's a user message.

So it'll fix your "_can_ technically fail" comment, since even if posting fails, you'll still deliver the event properly.

Member Author

I mean that we should send a message and also set a bit, and if we cannot send the message but did set the bit, we still deliver the event, even though it may not be in the queue because the queue was full.

Since if the queue was full, the loop will wake up anyway, and the message we sent was not really required, since it's a user message.

Do you mean to check the boolean inside thread_event_target_callback regardless of whether the event is USER_EVENT_MSG_ID?

So it'll fix your "_can_ technically fail" comment, since even if posting fails, you'll still deliver the event properly.

I mean, I'm not really worried about the "can technically fail"; there are a bunch of other places where we try to add to the queue that can currently fail.

Member

Do you mean to check the boolean inside thread_event_target_callback regardless of whether the event is USER_EVENT_MSG_ID?

Yes, though I'm not sure about the implementation details on Windows; the point is that even if sending a message errors, we can wake up the user anyway.

I mean, I'm not really worried about the "can technically fail"; there are a bunch of other places where we try to add to the queue that can currently fail.

Then they should be fixed as well; the Linux backends are not affected by that, for example.

Member Author

I implemented what you propose locally, but thinking it through, I realized that it runs into priority issues.
Imagine the following case: a thread continuously calls proxy.wake_up(), processing each wake-up takes 1 second, and the user is wiggling their cursor, generating 5 input events per second. After 10 seconds there will be ~40 pending cursor events in the queue if we check the "should wake" boolean on the window thread on every event, since we end up processing the wake-up before other events that have been sitting in the queue for longer.

To fix it while retaining priority, I think we'd have to use two booleans: one for the normal case, and one for the case where the event wasn't pushed. And that'd still run into the issue of preferring proxy wake-ups when the queue is full. I'd really rather not bother.

Member
@kchibisov kchibisov Feb 28, 2025

Can't we check, before sleeping or right after waking up, a special bool that tells us we've failed to add something to the queue, since everything is processed in some kind of batch? I.e. right after NewEvents?

That's what I suggested initially: you use one bool to control the number of messages you send, and the other to track whether you need to wake up by your own means because the queue got full. You don't need to check on every event, just after NewEvents.

It's just that with 8 kHz mice this can really break in games and e.g. break a loop, so I'd rather ensure that wake-ups actually work.

@madsmtm madsmtm force-pushed the madsmtm/coalesce-windows-wakeups branch from 03cb11e to 68f5052 on February 27, 2025 07:08
Contributor
@amrbashir amrbashir left a comment

Works on my end, it avoids filling up the queue.

Base automatically changed from madsmtm/windows-use-application-handler to master March 1, 2025 11:10