This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Use WSAEventSelect for MainLoop polling on windows
ClosedPublic

Authored by labath on Aug 4 2022, 5:35 AM.

Details

Summary

This patch switches the MainLoop class to use the WSAEventSelect
mechanism to wait for multiple sockets to become readable. The
motivation for doing that is that this allows us to wait for other kinds
of events as well (as long as they can be converted to WSAEvents). This
will allow us to avoid (abstract away) pipe-based multiplexing
mechanisms in the generic code, since pipes cannot be combined with
sockets on windows.

Since the windows implementation will now look completely different than
the posix (file descriptor-based) implementations, I have split the
MainLoop class into two (MainLoopPosix and MainLoopWindows), with the
common code going into MainLoopBase.

Diff Detail

Event Timeline

labath created this revision.Aug 4 2022, 5:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2022, 5:35 AM
labath requested review of this revision.Aug 4 2022, 5:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2022, 5:35 AM
jingham added a subscriber: jingham.Aug 4 2022, 8:55 AM

This is a WIP, presumably in the final version there won't be prominent #ifdef _WIN32 in a file in the "Host/common" directory.

mgorny accepted this revision.Aug 5 2022, 12:19 AM

Only guessing what various WSA functions do, it seems to all make sense to me. Just don't forget to clang-format ;-).

This revision is now accepted and ready to land.Aug 5 2022, 12:19 AM
labath added a comment.Aug 8 2022, 5:40 AM

This is a WIP, presumably in the final version there won't be prominent #ifdef _WIN32 in a file in the "Host/common" directory.

Yeah, that's the main reason for the WIPness. I still have to figure out how to factor this (there's a fair amount of code that can still be shared between the two implementations, but there's also an (increasing) portion that cannot.

labath updated this revision to Diff 450825.Aug 8 2022, 8:39 AM

non-wip version

labath retitled this revision from [WIP][lldb] Use WSAEventSelect for MainLoop polling to [lldb] Use WSAEventSelect for MainLoop polling on windows.Aug 8 2022, 8:41 AM
labath edited the summary of this revision. (Show Details)
labath added a reviewer: mstorsjo.
mgorny accepted this revision.Aug 8 2022, 9:25 AM

Thanks for doing this. The common and POSIX parts LGTM. Just one minor suggestion since you've touched that line.

lldb/source/Host/posix/MainLoopPosix.cpp
124

I don't really have experience with this API - is this pattern (creating and closing event objects for each poll run) the common way of using this API? Is there any potential performance risk compared with keeping event objects around (if that's possible?).

Can @alvinhochun take the patch for a spin? I think you've got more realistic usecases of lldb to try it out in. (Do you want me to make a build with the patch for you to try out?) @labath What's the best way to exercise this code in actual use of lldb?

labath added a comment.Aug 9 2022, 8:29 AM

I don't really have experience with this API - is this pattern (creating and closing event objects for each poll run) the common way of using this API? Is there any potential performance risk compared with keeping event objects around (if that's possible?).

Well.. I can't exactly say I'm familiar with the APIs either, but I suspect this is not the "common" way of using them. It is, however, the closest I could get to the purely level-triggered behavior that one gets with select/poll on unix. Normally (with event reuse), windows requires that one calls the "reset" function (e.g. recv) at least once (the call itself is sufficient, it does not have to actually read all pending data) before the event can be triggered again. I think possible that nothing depends on this behavior (or could be fixed to not depend on it) -- after all you're usually doing this because you actually want to read from the socket. However, we did have one unit test which relied on this. That was mostly because it was being lazy (testing other things), but I wanted to start out with something safe-ish.

As for performance, it's definitely slower than not creating the events every time around, but I doubt that difference would be noticeable. However, now that I think about it, I think the repeating the call to WSAEventSelect should be sufficient to get the level-triggered behavior, even if it's just reusing the same event object. Let me try how that works.

Can @alvinhochun take the patch for a spin? I think you've got more realistic usecases of lldb to try it out in. (Do you want me to make a build with the patch for you to try out?) @labath What's the best way to exercise this code in actual use of lldb?

I would appreciate that. I have tried to build it and run the test suite, but I didn't have a clean baseline, so I can't really say whether it works for everything. The main user of this code is the gdb-remote communication code, so debugging something via lldb-server (or running the lldb-server test suite) should be a good indicator of whether it works.

Can @alvinhochun take the patch for a spin? I think you've got more realistic usecases of lldb to try it out in. (Do you want me to make a build with the patch for you to try out?) @labath What's the best way to exercise this code in actual use of lldb?

I would appreciate that. I have tried to build it and run the test suite, but I didn't have a clean baseline, so I can't really say whether it works for everything. The main user of this code is the gdb-remote communication code, so debugging something via lldb-server (or running the lldb-server test suite) should be a good indicator of whether it works.

I presume this is covered by running ninja check-lldb, or should I flip the switch to prefer lldb-server on Windows before doing that? (I guess I can do both.) I've got an environment where check-lldb mostly passes, I can check that it doesn't regress it.

Can @alvinhochun take the patch for a spin? I think you've got more realistic usecases of lldb to try it out in. (Do you want me to make a build with the patch for you to try out?) @labath What's the best way to exercise this code in actual use of lldb?

I would appreciate that. I have tried to build it and run the test suite, but I didn't have a clean baseline, so I can't really say whether it works for everything. The main user of this code is the gdb-remote communication code, so debugging something via lldb-server (or running the lldb-server test suite) should be a good indicator of whether it works.

I presume this is covered by running ninja check-lldb, or should I flip the switch to prefer lldb-server on Windows before doing that? (I guess I can do both.) I've got an environment where check-lldb mostly passes, I can check that it doesn't regress it.

I tested out this patch with running ninja check-lldb. Out of the box, this patch doesn't regress anything - the tests still pass. (In my environment, there's a bunch of occasional stray failures already before, but by rerunning the tests a couple times, I can get a clean run.)

If I edit ShouldUseLLDBServer() in lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp to always return true (before applying this patch), most tests still pass, but I have 11 tests that hang so that I end up having to kill lldb-server. With this patch on top, I still get the same set of 11 hanging tests, so I think that means that this patch doesn't regress anything with respect to that.

I gave this a spin with LLDB_USE_LLDB_SERVER=1, but there looks to be some pre-existing issues with the lldb-server implementation so I can't really test much normally using builds of Krita:

  • Manually set breakpoints don't work (they stay unresolved?)
  • Debugger apparently breaks on every first chance exceptions (running ASAN builds stops too often)
  • Backtraces have no symbol names and seems to show bogus addresses

Other than that, it runs the debuggee and I can continue on exception stops. That's as much as I can tell.

emaste added a subscriber: emaste.Aug 16 2022, 12:11 PM
labath updated this revision to Diff 453686.Aug 18 2022, 9:42 AM

Switch to the implementation which uses persistent WSAEvents for
listening. I have no idea whether this makes a difference in practice,
but it /feels/ more efficient.

Thank you, Martin and Alvin for trying this out. Setting LLDB_USE_LLDB_SERVER definitely increases the number of tests that use lldb-server, but I wouldn't really say it's necessary, as I would expect that a change like this either works, or fails spectacularly.

Can I assume that you are ok with checking this in?

As for performance, it's definitely slower than not creating the events every time around, but I doubt that difference would be noticeable. However, now that I think about it, I think the repeating the call to WSAEventSelect should be sufficient to get the level-triggered behavior, even if it's just reusing the same event object. Let me try how that works.

I have implemented this now.

This revision was automatically updated to reflect the committed changes.

FYI, this patch added some new compilation warnings:

[4055/7050] Building CXX object tools/lldb/source/Host/CMakeFiles/lldbHost.dir/windows/MainLoopWindows.cpp.obj
llvm-project/lldb/source/Host/windows/MainLoopWindows.cpp:28:9: warning: unused variable 'result' [-Wunused-variable]
    int result = WSAEventSelect(fd, info.event, FD_READ | FD_ACCEPT | FD_CLOSE);
        ^
llvm-project/lldb/source/Host/windows/MainLoopWindows.cpp:38:9: warning: unused variable 'result' [-Wunused-variable]
    int result = WSAEventSelect(fd.first, WSA_INVALID_EVENT, 0);
        ^
llvm-project/lldb/source/Host/windows/MainLoopWindows.cpp:86:8: warning: unused variable 'result' [-Wunused-variable]
  BOOL result = WSACloseEvent(it->second.event);
       ^
3 warnings generated.

(They're only used in asserts.) I guess we should add void casts to mark the variables as used outside of asserts.

(They're only used in asserts.) I guess we should add void casts to mark the variables as used outside of asserts.

Actually, there's a dedicated UNUSED_IF_ASSERT_DISABLED macro for that. Please use it instead.

(They're only used in asserts.) I guess we should add void casts to mark the variables as used outside of asserts.

Actually, there's a dedicated UNUSED_IF_ASSERT_DISABLED macro for that. Please use it instead.

One of these days, I'm going to have to delete it. If void casts are good enough for the rest of llvm, I don't see why should we be any different.

(They're only used in asserts.) I guess we should add void casts to mark the variables as used outside of asserts.

Actually, there's a dedicated UNUSED_IF_ASSERT_DISABLED macro for that. Please use it instead.

One of these days, I'm going to have to delete it. If void casts are good enough for the rest of llvm, I don't see why should we be any different.

Ok, so is this discussion settled and I can go ahead and use void casts for these trivial fixups? (I usually just push such patches directly without passing via review.)

(They're only used in asserts.) I guess we should add void casts to mark the variables as used outside of asserts.

Actually, there's a dedicated UNUSED_IF_ASSERT_DISABLED macro for that. Please use it instead.

One of these days, I'm going to have to delete it. If void casts are good enough for the rest of llvm, I don't see why should we be any different.

Ok, so is this discussion settled and I can go ahead and use void casts for these trivial fixups? (I usually just push such patches directly without passing via review.)

I'm might not call it "settled", but I don't think anybody will be too upset regardless of that you do. You won't be the first or the last one to push such patches, and that's the reason why I think that the UNUSED_IF_ASSERT_DISABLED macro is doomed to fail (or at least be used inconsistently).