This is an archive of the discontinued LLVM Phabricator instance.

[IRBuilder] Remove positivity check from CreateAlignmentAssumption()
ClosedPublic

Authored by lebedev.ri on Nov 16 2018, 2:58 PM.

Details

Summary

An alignment should be non-zero positive power-of-two, anything and everything else is UB.
We should not have that check for all these prerequisites here, it's just UB.
Also, that was likely confusing middle-end passes.

While there, CreateIntCast() should be called with /*isSigned*/ false.
Think about it, there are two explanations: "An alignment should be positive",
therefore the sign bit is unset, so zext and sext is equivalent.
Or a second one: you have i2 0b10 - a valid alignment,
now you sext it: i2 0b110 - no longer valid alignment.

Diff Detail

Repository
rL LLVM

Event Timeline

This revision is now accepted and ready to land.Nov 16 2018, 3:14 PM
craig.topper added inline comments.Nov 16 2018, 3:49 PM
include/llvm/IR/IRBuilder.h
2328

is Alignment->getType() and IntPtrTy the same thing here? I ask because the Constant in Sub1 was created with one and the ICmp was created with the other.

hfinkel requested changes to this revision.Nov 16 2018, 8:07 PM
hfinkel added a subscriber: hfinkel.
hfinkel added inline comments.
include/llvm/IR/IRBuilder.h
57

Why not just use a Boolean?

2315

True, but I don't see why we're checking it here. It's just UB.

This revision now requires changes to proceed.Nov 16 2018, 8:07 PM
erichkeane edited the summary of this revision. (Show Details)
lebedev.ri added inline comments.Nov 19 2018, 7:12 AM
include/llvm/IR/IRBuilder.h
2316

I would personally call whatever clang codegen(!) function to produce an assumption instead, yes.

lebedev.ri added inline comments.Nov 26 2018, 12:24 AM
include/llvm/IR/IRBuilder.h
2317–2319

Would it be better to only keep the predicate change, and leave the power-of-two as UB?

erichkeane marked an inline comment as done.Nov 26 2018, 6:27 AM
erichkeane added inline comments.
include/llvm/IR/IRBuilder.h
57

CreateAlignmentAssumption has a default pointer parameter that would convert to boolean. Additionally, I'm becoming quite militantly against boolean parameters lately :)

2315

@jyknight encouraged this patch on IRC. I believe it makes @lebedev.ri s job on the sanitizer easier.

2317–2319

We already have that before this change, though it has a mis-signedness. However, there isn't really issue treating it as signed always, since values outside of the range of a signed value are illegal/UB anyway. IMO, this would be the 'don't change things' opinion.

2328

I believe they are the same (we validate it on 2195, and 'fix' it on 2196), but I can see why it casues confusion. I'll change them all to Alignment->getType() for clarity's sake.

lebedev.ri added inline comments.Nov 26 2018, 6:49 AM
include/llvm/IR/IRBuilder.h
2315

These patches do not affect the sanitizer patches in any way.
Using correct predicate will simply mean that you will not erroneously discard alignment request of 2147483648U (sign bit).

2317–2319

Replied to the wrong comment maybe?
The question was - do we really want to do @llvm.assume() if the requested alignment
is really power-of-two at the run time? That will somewhat confuse the middle-end,
and is UB in any case, as @hfinkel said.

lebedev.ri added inline comments.Nov 26 2018, 6:55 AM
include/llvm/IR/IRBuilder.h
2317–2319

err

The question was - do we really want to ONLY do @llvm.assume() if the requested alignment
is REALLY power-of-two at the run time? That will somewhat confuse the middle-end,
and is UB in any case, as @hfinkel said.

erichkeane marked an inline comment as done.Nov 26 2018, 6:59 AM
erichkeane added inline comments.
include/llvm/IR/IRBuilder.h
2317–2319

Perhaps I'm missing something (it IS monday :) ), but I would presume that the cost of a branch/more BBs to determine that (rather than a select) would be more harmful than just doing a no-op assume.

jyknight added inline comments.Nov 29 2018, 1:12 PM
include/llvm/IR/IRBuilder.h
2317–2319

The point of this attribute is to allow the optimizer to make an assumption about the alignment of the pointer returned by a call such as (but not limited to) aligned_alloc. I believe that in almost all relevant situations where this will at all help optimization, the alignment will (eventually, after inlining and simplification) be a constant. At that point, it doesn't matter how complex the expression is, it all folds away.

The attribute shouldn't cause the compiler to emit UB, because the function it's attached to may in fact have defined behavior in the face of such an argument (e.g., returning NULL, or using the next higher power-of-2 alignment).

hfinkel added inline comments.Dec 1 2018, 3:59 PM
include/llvm/IR/IRBuilder.h
2317–2319

... At that point, it doesn't matter how complex the expression is, it all folds away.

I agree that in nearly all cases where we can use this the alignment ends up being constant (although this is not strictly necessary should we know that the number is greater than some value). Nevertheless, it is also desirable to minimize the cost when we can't use the information, and having multiple basic blocks that we can't eliminate until we hit CodeGen will interfere with optimization more than just having some extra instructions in a single basic block.

jyknight added inline comments.Dec 3 2018, 1:55 PM
include/llvm/IR/IRBuilder.h
2317–2319

I agree, we shouldn't introduce multiple basic blocks here. This patch doesn't seem to, however. That is, I think what's here seems good -- not creating additional UB, and not introducing new basic blocks.

Perhaps I'm misunderstanding something?

hfinkel added inline comments.Dec 3 2018, 5:23 PM
include/llvm/IR/IRBuilder.h
2317–2319

This is still adding the IsPowerOf2 check, and I don't see why that's necessary. If it's not a power of 2, that's UB. Also, that's a separate change from allowing the caller to choose whether to treat the alignment as signed or unsigned.

To be fair and consistent, I'm not sure why we're checking IsPositive either. It looks like @erichkeane added that in r298643 (D29598). I don't see any discussion of this during the review, but it seems also somewhat unfortunate. If the formula used for the alignment assumptions are in any way non-trivial, then utilities like computeKnownBits are far less likely to say useful things based on them.

erichkeane marked an inline comment as done.Dec 4 2018, 6:00 AM
erichkeane added inline comments.
include/llvm/IR/IRBuilder.h
2317–2319

I'm not terribly sure why I added the 'IsPositive' check here in the beginning to be honest, perhaps it was simply because the fixed-version did so.

If we think that returning negative numbers to UB is acceptable, I can definitely revert all checking here and just emit the assumption instead.

I guess the gist of the question we need to ask is UB vs branches? That question is perhaps made easier by @lebedev.ri and his efforts to add these alignment assumptions to UBSan.

jyknight added inline comments.Dec 4 2018, 9:06 AM
include/llvm/IR/IRBuilder.h
2317–2319

You say that it's UB -- but based on what? The alloc_align attribute and __builtin_assume_aligned call aren't part of the standard. The former could be attached to any function the user desires. Such function may well decide to define the behavior in the face of the error condition, rather than cause UB.

Of course, we can decide that the attribute should cause the compiler to generate UB in such a case, but that's not preordained. And IMO, we should not generate UB where doing so is unnecessary.

So, that gets back to the question before -- what goes wrong (will it ruin the ability to optimize) if we check the value before making an invalid assumption? I'm still not seeing why it would -- AFAICT, either the alignment will be eventually constant (in which case everything folds away), or else we can't really determine anything at all (in which case there are extra IR instructions, which eventually get dropped).

Also, FWIW, I just checked GCC's behavior, which is as close to a spec as we can get here: it does not create UB from passing an invalid alignment. It just ignores the information (as this patch also causes to be the case).

hfinkel added inline comments.Dec 4 2018, 9:25 AM
include/llvm/IR/IRBuilder.h
2317–2319

I'm not terribly sure why I added the 'IsPositive' check here in the beginning to be honest, perhaps it was simply because the fixed-version did so.

Ah, indeed. That's a perfectly reasonable answer :-)

If we think that returning negative numbers to UB is acceptable, I can definitely revert all checking here and just emit the assumption instead.

I'd prefer we not have the extra code. It's UB it it's negative, and we should let UBSan detect that. As a user, I'd rather UBSan tell me that I'm accidentally passing -2 as the alignment (instead of it silently being changed to zero and disappearing). In the mean time, if we get into a mode where we're emitting a lot of these alignment assumptions (which we might if users start sprinkling attributes around liberally) then there's a cost to the extra IR and it's harder for the optimizer to understand.

hfinkel added inline comments.Dec 4 2018, 9:28 AM
include/llvm/IR/IRBuilder.h
2317–2319

I'm not terribly sure why I added the 'IsPositive' check here in the beginning to be honest, perhaps it was simply because the fixed-version did so.

Ah, indeed. That's a perfectly reasonable answer :-)

Also, we might change the fixed version to an assert for consistency if we do so.

hfinkel added inline comments.Dec 4 2018, 10:07 AM
include/llvm/IR/IRBuilder.h
2317–2319

You say that it's UB -- but based on what?

Fair point. I did a lot of the underlying implementation and my comments are based on my impressions of the intent of the functionality combined with our best practices in this space.

The former could be attached to any function the user desires. Such function may well decide to define the behavior in the face of the error condition, rather than cause UB.

Of course, we can decide that the attribute should cause the compiler to generate UB in such a case, but that's not preordained. And IMO, we should not generate UB where doing so is unnecessary.

I definitely disagree. Silently hiding errors, which is what defining this behavior does, is not the best option here. UB isn't bad, and is in fact desirable, when it does at least one of the following: 1) Makes it easier to create tools to catch user errors and 2) Assists with optimizations. In this case, the UB does both.

So, that gets back to the question before -- what goes wrong (will it ruin the ability to optimize) if we check the value before making an invalid assumption? I'm still not seeing why it would -- AFAICT, either the alignment will be eventually constant (in which case everything folds away), or else we can't really determine anything at all (in which case there are extra IR instructions, which eventually get dropped).

I understand what you're saying, and I agree that in most cases where you get information from the assumptions will be where the alignment ends up being constant. However: 1) This is not the only case where the optimizer can extract useful information (but it seems far less likely for this to be true if the conditions are not a simple expression) and 2) There's a cost to the extra IR. There are compile time costs, and this can be a concern should we start generating a lot of these assumptions.

Thus, to me, all factors point toward using a simpler expression that might exhibit UB, but those are more useful and allow for more error checking.

Also, FWIW, I just checked GCC's behavior, which is as close to a spec as we can get here: it does not create UB from passing an invalid alignment. It just ignores the information (as this patch also causes to be the case).

If compatibility with a particular language extension requires stronger semantics, then the frontend should add those checks. They don't belong in the underlying utility function. It would be helpful to indicate that you mean: What did you check and what did you see?

erichkeane retitled this revision from Correct CreateAlignmentAssumption to check sign and power-of-2 (LLVM Part) to Remove branch from CreateAlignmentAssumption.
erichkeane edited the summary of this revision. (Show Details)

As Hal suggested, revert the change to check for positivity at all.

Only test affected is in Clang, so I'll fix that as a part of submitting this.

Can you also update the relevant clang patch?

Can you also update the relevant clang patch?

The clang patch actually becomes unnecessary, so it'll be simply abandoned if this is deemed acceptable by @hfinkel .

lebedev.ri accepted this revision.Dec 4 2018, 10:47 AM

Ah, ok.
Since we no longer do "ispositive" check, we will no longer errneously ignore unsigned(INT_MIN) alignment, so this should still address my initial concern.
And extra UB is super awesome, and will surely be caught by the sanitizer (once the [[ D54590 | compiler-rt part ]] is reviewed, that is.) :)

jyknight added inline comments.Dec 4 2018, 1:22 PM
include/llvm/IR/IRBuilder.h
2317–2319

Regarding the alloc_align attribute, an error certainly isn't silently hidden -- it is, instead, passed on to the library function such as aligned_alloc to deal with as it pleases. That may report an error by returning a null pointer, for example.

And, actually -- while I previously thought I was only speaking of user-defined functions which may use this attribute, as it turns out, C11 aligned_alloc actually _doesn't_ allow for UB anymore. It was modified in a DR, post C11, to require that it return a null pointer.

C11 DR 460 http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_460 modifies the requirements on the function as follows:

The aligned_alloc function allocates space for an object whose
alignment is specified by 'alignment', whose size is specified by 'size',
and whose value is indeterminate.  If the value of 'alignment' is not
a valid alignment supported by the implementation the function shall
fail by returning a null pointer.

(That's also in the C2x draft).

As the function which is the primary use-case for this attribute requires that invalid alignments not generate UB, it follows that the attribute must not, for it to be useful.

I suppose the case for __builtin_assume_aligned to not generate UB is not as clear cut, but I'd still argue that it should have the same semantics, and thus also not UB, for simplicity.

Here's the code in GCC handling the attribute:
https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-ssa-ccp.c#L1718

(Note that it only has to understand values which have already folded to a constant in this code, since it has a different architecture which allows this to work even when the value is only constant post-inlining.)

hfinkel added inline comments.Dec 4 2018, 2:48 PM
include/llvm/IR/IRBuilder.h
2317–2319

Regarding the alloc_align attribute, an error certainly isn't silently hidden -- it is, instead, passed on to the library function such as aligned_alloc to ...

This is a very good point. The GCC documentation says only that, "Meaningful alignments are powers of 2 greater than one." I can read this as implying UB for other values. The aligned_alloc specification is certainly something that we'll need to support, however, I don't see a conflict here. We do need to be careful that we don't generate code that would invoke IR-level UB for unsupported values in the case where the pointer is null. That seems to be true. The code generated is essentially assume(Ptr & (Alignment - 1) == 0), and regardless of what (Alignment-1) is, the assumption will be true when Ptr is null. We would need to make sure not to emit nsw/nuw on the subtraction. We need to document all of this.

Thus, I think that the simple expression will still work even with this use case.

Thus, I think that the simple expression will still work even with this use case.

Indeed so!

I think the rule can be more clearly stated as: When the pointer is non-null, then the alignment must be a positive power of 2 (otherwise UB). An ubsan checker will need to take that into account as well.

Thus, I think that the simple expression will still work even with this use case.

Indeed so!

I think the rule can be more clearly stated as: When the pointer is non-null, then the alignment must be a positive power of 2 (otherwise UB). An ubsan checker will need to take that into account as well.

(BTW, in case it wasn't clear here -- I no longer object to Hal's suggestion.)

@erichkeane are you still going to land this?
If you don't want to bother with updating my tests, want me to take over so this doesn't get abandoned?

@erichkeane are you still going to land this?
If you don't want to bother with updating my tests, want me to take over so this doesn't get abandoned?

I didn't realize Hal's and James' discussion had ended positively toward this patch :) Feel free to take over.

lebedev.ri commandeered this revision.Jan 24 2019, 6:04 AM
lebedev.ri edited reviewers, added: erichkeane; removed: lebedev.ri.
lebedev.ri retitled this revision from Remove branch from CreateAlignmentAssumption to [IRBuilder] Remove positivity check from CreateAlignmentAssumption().
lebedev.ri edited the summary of this revision. (Show Details)
lebedev.ri added a reviewer: rjmccall.
lebedev.ri set the repository for this revision to rL LLVM.

Took over as discussed, rewrote from scratch based on previous disscussions/review feedback.

@hfinkel et al - please formally accept if this makes sense.

LGTM

Thank you for the review.

From IRC:

[22:20:37] <LebedevRI> hfinkel: shall i wait until you have time to look at updated https://reviews.llvm.org/D54653 [IRBuilder] Remove positivity check from CreateAlignmentAssumption()   or is rjmccall's review enough?
[22:23:23] <hfinkel> LebedevRI: Thanks for the ping. LGTM too.

Landing.

hfinkel accepted this revision.Jan 24 2019, 11:29 AM

Took over as discussed, rewrote from scratch based on previous disscussions/review feedback.

@hfinkel et al - please formally accept if this makes sense.

LGTM too.

This revision is now accepted and ready to land.Jan 24 2019, 11:29 AM

I'd like to see the rule we ended up with documented somewhere.

That is:

When the pointer is non-null, then the alignment must be a positive power of 2 (otherwise UB).

It's not necessarily obvious that the UBness of invalid alignments ought to be conditional upon the pointer's value.

I'd like to see the rule we ended up with documented somewhere.

Okay, makes sense.
Any better place for that other than the comment before this function?

That is:

When the pointer is non-null, then the alignment must be a positive power of 2 (otherwise UB).

It's not necessarily obvious that the UBness of invalid alignments ought to be conditional upon the pointer's value.