This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Use __builtin_expect and __builtin_assume in _LIBCPP_ASSERT
ClosedPublic

Authored by ldionne on Mar 24 2022, 6:27 AM.

Details

Summary

Since we expect the condition to be true most of the time, we might
as well tell the compiler. And when assertions are disabled, we
might as well tell the compiler that it's allowed to assume that
the condition holds.

Diff Detail

Event Timeline

ldionne created this revision.Mar 24 2022, 6:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2022, 6:27 AM
ldionne requested review of this revision.Mar 24 2022, 6:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2022, 6:27 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

Maybe it would make sense to use __builtin_assume if assertions are disabled?

Maybe it would make sense to use __builtin_assume if assertions are disabled?

My understanding is that __builtin_assume(x) does not evaluate x ever at runtime, but the documentation is a little bit light on the details. I think this is a fantastic idea, I'll try it out. If __builtin_assume does evaluate x, I added a test that should fail in my assertions handler patch.

Maybe it would make sense to use __builtin_assume if assertions are disabled?

My understanding is that __builtin_assume(x) does not evaluate x ever at runtime, but the documentation is a little bit light on the details. I think this is a fantastic idea, I'll try it out. If __builtin_assume does evaluate x, I added a test that should fail in my assertions handler patch.

I'm pretty sure it doesn't evaluate at compile-time. https://godbolt.org/z/8d7fh8nc7 clang specifically warns that side-effects will be discarded.

I'm pretty sure it doesn't evaluate at compile-time. https://godbolt.org/z/8d7fh8nc7 clang specifically warns that side-effects will be discarded.

Yup, I just noticed that while implementing your suggestion too!

ldionne updated this revision to Diff 418007.Mar 24 2022, 12:07 PM
ldionne retitled this revision from [libc++] Use __builtin_expect in _LIBCPP_ASSERT to [libc++] Use __builtin_expect and __builtin_assume in _LIBCPP_ASSERT.
ldionne edited the summary of this revision. (Show Details)

Address review comments.

Mordante accepted this revision as: Mordante.Mar 24 2022, 12:23 PM
Mordante added a subscriber: Mordante.

Maybe it would make sense to use __builtin_assume if assertions are disabled?

My understanding is that __builtin_assume(x) does not evaluate x ever at runtime, but the documentation is a little bit light on the details. I think this is a fantastic idea, I'll try it out. If __builtin_assume does evaluate x, I added a test that should fail in my assertions handler patch.

Clang's documentation is truncated. Based on my archeology (D122423) the last sentence should read
The argument itself is never evaluated, so any side effects of the expression will be discarded.

We already claim that the assertions should be fatal and returning from the assertion handler is undefined behaviour. Therefore I think using __builtin_assume isn't wrong. I only wonder whether we should add an opt-out for vendors who don't like possible undefined behaviour of __builtin_assume in case the code hits an assertion that's wrong. (Much like the false positives in the Clang assertions.)

The original patch LGTM, but I would like you to investigate @philnik's suggestion further.

Note I started writing my previous message before @philnik's second comment was posted.

philnik accepted this revision.Mar 24 2022, 12:45 PM

__builtin_assume seems to ignore most of our asserts, but it definitely doesn't hurt. (https://godbolt.org/z/W7Gb43cx5). LGTM.

This revision is now accepted and ready to land.Mar 24 2022, 12:45 PM

__builtin_assume seems to ignore most of our asserts, but it definitely doesn't hurt. (https://godbolt.org/z/W7Gb43cx5). LGTM.

It might hurt when the assertion is wrong and the user builds without assertions.
https://godbolt.org/z/hsxhr9bWY
test1 validates the pointer and test2 doesn't since we told the compiler the if is always true.

Note that I'm not against doing this. I only wonder whether or not we need an opt-out for vendors/users.

__builtin_assume seems to ignore most of our asserts, but it definitely doesn't hurt. (https://godbolt.org/z/W7Gb43cx5). LGTM.

It might hurt when the assertion is wrong and the user builds without assertions.
https://godbolt.org/z/hsxhr9bWY
test1 validates the pointer and test2 doesn't since we told the compiler the if is always true.

Note that I'm not against doing this. I only wonder whether or not we need an opt-out for vendors/users.

IMO our assertions should simply not be wrong, otherwise we have a whole other problem when assertions are enabled. Assertions already encode things that, if they aren't satisfied, then the code exhibits UB. I'm not sure there is a lot of value in dialling how messed up the UB is going to be, so I'd suggest going for simplicity and adding a escape hatch if we have a reason to later on. And to that -- if someone came and asked for a escape hatch, I'd immediately ask them why they are not turning assertions on instead.

ldionne updated this revision to Diff 418198.Mar 25 2022, 6:04 AM

Workaround test failure. Unfortunately I don't know how to fix the test without disabling it and without removing the assertion from vector::swap altogether. IMO keeping the assertion has more value than being robust against ADL in that case.

__builtin_assume seems to ignore most of our asserts, but it definitely doesn't hurt. (https://godbolt.org/z/W7Gb43cx5). LGTM.

It might hurt when the assertion is wrong and the user builds without assertions.
https://godbolt.org/z/hsxhr9bWY
test1 validates the pointer and test2 doesn't since we told the compiler the if is always true.

Note that I'm not against doing this. I only wonder whether or not we need an opt-out for vendors/users.

IMO our assertions should simply not be wrong, otherwise we have a whole other problem when assertions are enabled. Assertions already encode things that, if they aren't satisfied, then the code exhibits UB. I'm not sure there is a lot of value in dialling how messed up the UB is going to be, so I'd suggest going for simplicity and adding a escape hatch if we have a reason to later on. And to that -- if someone came and asked for a escape hatch, I'd immediately ask them why they are not turning assertions on instead.

I agree assertions should always be right. But in my experience, bugs happen, and assertions can be wrong.
However I don't feel very strongly about this, I mainly want to have it considered. So I'm not objecting
against the simple approach without an escape hatch.

ldionne updated this revision to Diff 418848.Mar 29 2022, 5:24 AM

Fix CI failure

Just a heads up, we've bisected some leaks to this change. There are some calls to operator new that aren't be optimized away because they're now the subject of various llvm.assume() calls. Still investigating.

Just a heads up, we've bisected some leaks to this change. There are some calls to operator new that aren't be optimized away because they're now the subject of various llvm.assume() calls. Still investigating.

Ugh, thanks for the heads up. Please keep me in the loop (here or ideally in Discord so I don't miss it). I'm open to removing __builtin_assume until the issue is fixed if it gets to that.

Just a heads up, we've bisected some leaks to this change. There are some calls to operator new that aren't be optimized away because they're now the subject of various llvm.assume() calls. Still investigating.

Ugh, thanks for the heads up. Please keep me in the loop (here or ideally in Discord so I don't miss it). I'm open to removing __builtin_assume until the issue is fixed if it gets to that.

I can't share the preprocessed source code but I did reduce the issue down to missed DSE (https://github.com/llvm/llvm-project/blob/3b9833597e810d4c485487d2f094a8e223af5548/llvm/lib/Analysis/CaptureTracking.cpp#L375) causing the call to operator new to not get removed

@global = external constant i8

define void @f() {
  %tmp1 = tail call noalias i8* @_Znwm(i64 noundef 32)
  %tmp2 = icmp ugt i8* %tmp1, @global
  ; this assume isn't even necessary to stop DSE, just the icmp
  ;tail call void @llvm.assume(i1 %tmp2)
  store i8 0, i8* %tmp1, align 1
  ret void
}

declare i8* @_Znwm(i64)

declare void @llvm.assume(i1)

So __builtin_assume is causing worse codegen in some cases. I think putting this under a flag would be nice while we figure out remaining regressions caused by more __builtin_assume.

In our case I believe we actually do have a leak in some sense in the original code which was previously optimized away. (although whether that's a real leak depends on your definition of a leak, see https://discourse.llvm.org/t/eliminating-global-memory-roots-or-not-to-help-leak-checkers/58066, anyway that's drifting too far from this issue)

https://reviews.llvm.org/D123175 to add a macro that turns off calling __builtin_assume

were there any benchmarks showing that this was beneficial? as discussed in https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609, sprinkling assumes everywhere is not necessarily a worthwhile tradeoff and perhaps the default should be to not add assumes everywhere