This is an archive of the discontinued LLVM Phabricator instance.

[InferFuncAttributes] extend 'dereferenceable' attribute based on loads
AbandonedPublic

Authored by spatel on Jul 5 2019, 11:22 AM.

Details

Summary

This is similar to a 'nonnull' patch:
D27855
...that is still off by default because of C problems.

For this patch, the motivating case is shown in PR21780:
https://bugs.llvm.org/show_bug.cgi?id=21780

We are trying to preserve the ability of SLP (D64142) and/or the backend (D64205) to create a vector load even after some other pass like InstCombine has deleted scalar instructions by using demanded elements analysis. We do that by collecting all guaranteed accesses from a given pointer argument and creating a known dereferenceable byte range from those.

There's an alternate proposal to do something similar but more involved in:
D37579
...but that seems to have stalled.

And if I'm interpreting the comments there correctly, this is an implementation of a suggestion from @reames :
"...we can prove that the loads post dominate the entry to the function and could update the argument with the existing dereferenceability attribute. This might be an alternate approach and separately worth implementation."

Diff Detail

Event Timeline

spatel created this revision.Jul 5 2019, 11:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2019, 11:22 AM
spatel updated this revision to Diff 208221.Jul 5 2019, 12:36 PM

Patch updated:
I missed diffs in some existing over-reaching clang and AMDGPU tests. These regression tests should not be testing the entire optimization pipeline, but I adjusted the assertions to make them pass.

The direction of this makes total sense and we will need it. However this shoulnd't be here (wrt. the file/pass).

Assuming we want this right right now, it should life in FunctionAttrs.cpp. Assuming we want to do it "right" it should become part of the Attributor framework.

The early prototype of the "deref-or-null" abstract attribute already had this functionality, see https://reviews.llvm.org/D59202#C1381429NL1995, and the test case https://reviews.llvm.org/D59202#change-FJbHx7N4s6ye . For the new Attributor, dereferenceable-or-null has not yet been ported and the transfer of "close by information" is not part of the new model. Both things are going to change soon.

spatel added a comment.Jul 5 2019, 2:35 PM

The direction of this makes total sense and we will need it. However this shoulnd't be here (wrt. the file/pass).

Assuming we want this right right now, it should life in FunctionAttrs.cpp. Assuming we want to do it "right" it should become part of the Attributor framework.

The early prototype of the "deref-or-null" abstract attribute already had this functionality, see https://reviews.llvm.org/D59202#C1381429NL1995, and the test case https://reviews.llvm.org/D59202#change-FJbHx7N4s6ye . For the new Attributor, dereferenceable-or-null has not yet been ported and the transfer of "close by information" is not part of the new model. Both things are going to change soon.

Thanks for taking a look! I was just about to add you as a reviewer. I know you are working on a major overhaul of this functionality, but I have not gotten a chance to look at those patches.
The reason I did not put this into FunctionAttrs.cpp is because that's currently too late to catch the motivating example from PR21780 using the default opt pass pipeline. Ie, -instcombine runs before -functionattrs and kills the loads before we have a chance to update the arguments. I would like to get this in soon to make the next clang major release, so this seemed like the patch of least resistance. :)
If there's a better way, I can certainly try to make it happen.

The direction of this makes total sense and we will need it. However this shoulnd't be here (wrt. the file/pass).

Assuming we want this right right now, it should life in FunctionAttrs.cpp. Assuming we want to do it "right" it should become part of the Attributor framework.

The early prototype of the "deref-or-null" abstract attribute already had this functionality, see https://reviews.llvm.org/D59202#C1381429NL1995, and the test case https://reviews.llvm.org/D59202#change-FJbHx7N4s6ye . For the new Attributor, dereferenceable-or-null has not yet been ported and the transfer of "close by information" is not part of the new model. Both things are going to change soon.

Thanks for taking a look! I was just about to add you as a reviewer. I know you are working on a major overhaul of this functionality, but I have not gotten a chance to look at those patches.
The reason I did not put this into FunctionAttrs.cpp is because that's currently too late to catch the motivating example from PR21780 using the default opt pass pipeline. Ie, -instcombine runs before -functionattrs and kills the loads before we have a chance to update the arguments. I would like to get this in soon to make the next clang major release, so this seemed like the patch of least resistance. :)
If there's a better way, I can certainly try to make it happen.

Thanks for thinking of me ;) And again, I think this is an important change we need!

The Attributor is in tree and, if enabled, it is run very early (as I very very strongly believe it should). I think we can get the Attributor enabled for the next release (maybe with a low iteration count and restrictions on the attributes we derive). Now there are two missing parts to get this functionality into the Attributor in a decent way:

  1. A generic way to "look around for existing information" (more on this below).
  2. The abstract attribute for dereferenceability(_or_null) that makes use of 1) and potentially performs usual deduction.

Implementing 2) is fairly easy. It should not take long to create the boilerplate if we only want to rely on the deduction through 1). Also, the logic is already in this patch (and the old prototype).
Regarding 1):
I was going to work on this once I found some free cycles but I could do it now if we decide to go this way. The idea is that you specify a program point PP (=instruction) and a callback. The callback is then automatically applied to all instruction which have to be executed when PP is also reached, either before or after. I would like this to be an abstract interface from the get-go but I am also willing to provide the interface and the initial implementation that will at least suffice for this use case. It should then be used from the AbstractAttribute::initialize and AbstractAttribute::updateImpl method of the abstract attribute for the dereferenceable attribute (and others later as well).

P.S. You should be aware of the change to dereferenceability that is going to happen very soon, see D61652 and D63243 (I'm still fixing that one).

spatel added a comment.Jul 8 2019, 7:19 AM

Thanks for thinking of me ;) And again, I think this is an important change we need!

The Attributor is in tree and, if enabled, it is run very early (as I very very strongly believe it should). I think we can get the Attributor enabled for the next release (maybe with a low iteration count and restrictions on the attributes we derive). Now there are two missing parts to get this functionality into the Attributor in a decent way:

  1. A generic way to "look around for existing information" (more on this below).
  2. The abstract attribute for dereferenceability(_or_null) that makes use of 1) and potentially performs usual deduction.

Implementing 2) is fairly easy. It should not take long to create the boilerplate if we only want to rely on the deduction through 1). Also, the logic is already in this patch (and the old prototype).
Regarding 1):
I was going to work on this once I found some free cycles but I could do it now if we decide to go this way. The idea is that you specify a program point PP (=instruction) and a callback. The callback is then automatically applied to all instruction which have to be executed when PP is also reached, either before or after. I would like this to be an abstract interface from the get-go but I am also willing to provide the interface and the initial implementation that will at least suffice for this use case. It should then be used from the AbstractAttribute::initialize and AbstractAttribute::updateImpl method of the abstract attribute for the dereferenceable attribute (and others later as well).

P.S. You should be aware of the change to dereferenceability that is going to happen very soon, see D61652 and D63243 (I'm still fixing that one).

Thanks for the links. I'm still trying to digest where we stand currently and weigh the timing/risk/effort.

Attributor is in trunk, but it is not enabled by default? Or there are no transforms implemented that run with the default opt pipeline?
The current branch date for clang 9 is July 18 (10 days from today). That seems very tight to implement this based on the patch reviews that I skimmed. Ie, there's still a lot of back-and-forth going on in those reviews.

IMO, this patch carries significant risk alone. I'm basing that on the fact that D27855 still isn't enabled by default and the related D29999 was reverted because it caused crashing (I still haven't tracked down why).
So -- unless it would mean significantly more work to have this in trunk and then ported to the new and better way -- it is less overall work/risk to proceed here as an intermediate step since I already created this patch.

If there's nothing majorly wrong here, we can get several days of bot/fuzz/real-world testing on this code before the release. We are going to deprecate InferFunctionAttrs and its existing transform as part of switching to Attributor anyway, right? I can mark this with 'FIXME' and try to help with the porting if I've assessed that correctly.

hfinkel added inline comments.Jul 8 2019, 3:09 PM
llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
38

Why not also stores (or AtomicCmpXchg or AtomicRMW)?

42

This logic seems unnecessarily limited. Why not use GetPointerBaseWithConstantOffset?

120

Does this do the right thing if the lowest GEP index is negative? You could skip the negative ones first?

As a general point, I think that you want logic here mirroring (a subset of) what's in isOverwrite in DeadStoreElimination.cpp

spatel marked 3 inline comments as done.Jul 9 2019, 8:31 AM
spatel added inline comments.
llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
38

Oversight on my part - tunnel vision based on the motivating cases.

I've never done anything with the atomic opcodes, so I didn't remember them. For stores, how would dereferenceable aid optimization?
Ok if we make this a TODO enhancement/follow-up?

42

Another oversight on my part. I wasn't thinking about cases with pointer casts, so that made the logic simpler. Will change, but hoping a partial implementation is good enough for an initial patch (see next comment).

120

I don't think there was miscompile potential there, but that was accidental. I didn't think about negative offsets. Will fix.
I'd like to make the DSE-like enhancement to support arbitrary-sized sub-ranges (via pointer casts) a follow-up, so this patch doesn't get too complicated.

spatel updated this revision to Diff 208700.Jul 9 2019, 8:38 AM

Patch updated:

  1. Used GetPointerBaseWithConstantOffset() to allow more complex pattern matching.
  2. But limited that matching to cases where the argument and access have the same size to reduce complexity.
  3. Generalized variable names and comments to allow less churn for follow-up enhancements.
  4. Added tests with multiple dereferenceable arguments, pointer casts, and negative offsets.
lebedev.ri added a subscriber: lebedev.ri.

Passing-by thought:
Does attributor already being run in the pipeline?
Is it in the state where it should be extended instead of adding more backlog for porting into it?

I do not want to block this patch but I still believe that this is the wrong way to go (middle/long term). The fact that we need to put this not in FunctionAttrs.cpp, where the other deductions live, but in InferFunctionAttrs.cpp, where we so far only annotated library functions, should be a first sign. Also, the functionality here is only one way to deduce dereferenceable, arguably, you want all the ways together such that they can benefit from each other.

Passing-by thought:
Does attributor already being run in the pipeline?
Is it in the state where it should be extended instead of adding more backlog for porting into it?

The pass is always "run" in the pipeline but by default the Attributor object will not be created. The cmd line flag attributor-disable defaults to true for now.

I think we can reasonably extend it and we are making progress getting rid of the backlog. Multiple deductions got in already and if one doesn't depend on any AbstractAttribute stuck in the pipeline (which you can choose not to in the beginning), one can easily get it in.


Going the Attributor route:
I'll upload a prototype of a very generic visitor that can be used to inspect all instructions that "must be executed with" a given one. That functionality is useful on its own and for various attributes so I will first finish the visitor, then we need an interface for AbstractAttributes, and then we can add deductions based on instructions that are executed with.

I do not want to block this patch but I still believe that this is the wrong way to go (middle/long term). The fact that we need to put this not in FunctionAttrs.cpp, where the other deductions live, but in InferFunctionAttrs.cpp, where we so far only annotated library functions, should be a first sign. Also, the functionality here is only one way to deduce dereferenceable, arguably, you want all the ways together such that they can benefit from each other.

I don't disagree with the long-term view. My thoughts about the risk/reward are in my comment from yesterday. I'd like to get this in for the practical/short-term advantage to the clang 9.0 release.

hfinkel added inline comments.Jul 9 2019, 11:58 AM
llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
38

If you store to an address, then you know that it is dereferenceable. The point is not to aid the optimization of the store, but to aid the optimization of later loads. I'd like to see stores handled here - if we're going to find problems with this, that will make it more likely that we'll find them quickly.

Also, you must omit volatile loads and stores (IIRC, our semantics mean that volatile loads/stores won't imply dereferenceability for the non-volatile accesses).

Mostly comments to improve this. Two required changes.

Maybe we could mention that this logic should, or better will, move into the Attributor framework somewhere?

llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
42

"Style": Is this the only use of the "match" function? If so, why not do the (in the middle-end) more familiar pattern of dyn_cast and getPointerOperand?

49

I don't think you can get a nullptr back.

59

Maybe a bit more general, something like:

bits = DL.getTypeSizeInBits(ArgTy->getType()->getPointerElementType()`
// Round down to the nearest multiple of 8, dereferenceable attributes uses bytes.
bits = bits - bits % 8;
if (!bits)
  continue;

(GEPOperator::accumulateConstantOffset uses DL.getTypeAllocSize which we could probably use as well.)

72

Two ideas:

  1. We could only track minimum + maximum ByteOffset values, iff we know you cannot "jump" between allocations.
  2. Regardless of 1), we could use the maximum ByteOffset value we found for a inbounds GEP as a lower bound for the dereferenceable bytes. inbounds GEPs should not allow to do any "jumping" or starting outside the object.
134

Wrt. the above changes it would probably be: MaxOffset * EltSize / 8.

140

You need to remove deref_or_null as well.

llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll
114–115

volatile should not cause deref, I think this was said:

This means the compiler may not use a volatile operation to prove a non-volatile access to that address has defined behavior.

173

Yes ;)

spatel marked 8 inline comments as done.Jul 10 2019, 9:12 AM

Mostly comments to improve this. Two required changes.

Thanks!

Maybe we could mention that this logic should, or better will, move into the Attributor framework somewhere?

Yes - I'll add a FIXME to the top of this file, so we know the whole thing should go away.

llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
42

Yes - that's the only use of match. Since we're going to support stores now, it will go away.
Note: 'match' is the more familiar pattern to me because that's used throughout instcombine.

49

Ah, I misunderstood the API - I thought if there's no constant offset, the base should be returned as nullptr.

Independent of this patch: add a line to the documentation comment to make the behavior clear as part of the cleanup in D64468?

llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll
173

I've left this as a TODO for now because it's highly unusual to see non-power-of-2 bitwidths.
We're going to rewrite the code fairly soon, and we'll have a code comment and this test in place to remind us that we can make the logic more flexible.

Fwiw, I don't need all the proposed improvements. Getting this in with one or more FIXMEs is fine with me.

llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
49

Independent of this patch: add a line to the documentation comment to make the behavior clear as part of the cleanup in D64468?

Done, take a look if that is what you wanted. (The wrappers are "not well" documented but only the base function is.)

llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll
173

Fine with me.

spatel updated this revision to Diff 208999.Jul 10 2019, 10:07 AM
spatel marked 3 inline comments as done.

Patch updated:

  1. Allow stores to have the same inferences as loads. This exposed more clang test failures, so those diffs are included.
  2. Don't infer anything from volatile (non-simple) memory accesses.
  3. There was a bug in how we dealt isGuaranteedToTransferExecutionToSuccessor(), so added an assert and a test with a function call to verify that.
  4. Added code/test for replacing DereferenceableOrNull attribute.
  5. Added FIXME comment to indicate that this pass should be subsumed by Attributor.
jfb added a comment.Jul 10 2019, 10:15 AM

Right now the control flow isn't clever, but I wonder if, as this analysis becomes more powerful, it'll have to act differently when -fno-delete-null-pointer-checks is specified? Is there a simple test that you can add to make sure null pointer checks don't cause false assumptions whenever this optimization becomes smarter?

llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
38

It does seem like you want to handle non-volatile atomic load / store, as well as cmpxchg and RMW.

In D64258#1578820, @jfb wrote:

Right now the control flow isn't clever, but I wonder if, as this analysis becomes more powerful, it'll have to act differently when -fno-delete-null-pointer-checks is specified? Is there a simple test that you can add to make sure null pointer checks don't cause false assumptions whenever this optimization becomes smarter?

I hadn't seen that flag before. In IR we have this translation of the minimal test in clang/test/CodeGen/nonnull.c:

define void @foo(i32* nocapture dereferenceable(4) %x) #0 {
  store i32 0, i32* %x, align 4, !tbaa !3
  ret void
}

attributes #0 = { ... "null-pointer-is-valid"="true" ... }

This pass/patch doesn't act on the dereferenceable attribute; it just adds it. So some other pass would be at risk if it tries a transform based on "dereferenceable" without checking the "null-pointer-is-valid" function attribute or "nonnull" argument attribute? Argument::hasNonNullAttr() seems safe.

spatel marked 2 inline comments as done.Jul 10 2019, 11:28 AM
spatel added inline comments.
llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
38

Ok - one more TODO. :)

spatel updated this revision to Diff 209033.Jul 10 2019, 11:30 AM
spatel marked an inline comment as done.

Patch updated:
Add TODO code comment about using "isSimple()" and add test with an atomic load.

jfb added a comment.Jul 10 2019, 12:35 PM

Are there ever cases where you know something is always dereferenceable based on where the memory lives? For example, globals can always be dereferenced, so can the stack (but I don't think the stack exists here). Are there other locations?

How do you deal with address spaces? Should you only infer this attribute for address space 0?

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

In D64258#1579146, @jfb wrote:

Are there ever cases where you know something is always dereferenceable based on where the memory lives? For example, globals can always be dereferenced, so can the stack (but I don't think the stack exists here). Are there other locations?

Probably, and the Attributor implementation can do deduction of that (also the stack case). However, dereferenceable_globally is not available yet (D61652). For now, we cannot distinguish between "globally dereferenceable" and "locally dereferenceable".

How do you deal with address spaces? Should you only infer this attribute for address space 0?

I do think all address spaces are fine. Must accesses should always imply dereferenceability, afaik, except if they are volatile.

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

jfb added a comment.Jul 10 2019, 1:32 PM

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

mprotect can make memory readable but not writable, or writable but not readable... or neither. What does dereferenceable mean when faced with this fact? Further, what happens to dereferenceable when mprotect is called (any opaque function could call it)? I don't think this is an orthogonal problem at all.

jdoerfert added a comment.EditedJul 10 2019, 2:09 PM
In D64258#1579214, @jfb wrote:

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

mprotect can make memory readable but not writable, or writable but not readable... or neither. What does dereferenceable mean when faced with this fact? Further, what happens to dereferenceable when mprotect is called (any opaque function could call it)? I don't think this is an orthogonal problem at all.

So, I guess what the above means is "dereferenceable" is too coarse grained. We have "global dereferenceability" that cannot be changed, and we have "local dereferenceability" that can be changed, e.g., through calls to free, realloc, or mprotect. From accesses we can only deduce "local dereferenceability". Now, that is why we need D61652, or more precisely, D63243. After those changes landed, the reasoning introduced in this patch should be fine, before, it is as broken as Clang is when it emits dereferenceable for arguments passed by reference. (The logic above, with the same problems and more, is also used in ArgumentPromotion right now...).

In D64258#1579214, @jfb wrote:

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

mprotect can make memory readable but not writable, or writable but not readable... or neither. What does dereferenceable mean when faced with this fact? Further, what happens to dereferenceable when mprotect is called (any opaque function could call it)? I don't think this is an orthogonal problem at all.

So, I guess what the above means is "dereferenceable" is too coarse grained. We have "global dereferenceability" that cannot be changed, and we have "local dereferenceability" that can be changed, e.g., through calls to free, realloc, or mprotect. From accesses we can only deduce "local dereferenceability". Now, that is why we need D61652, or more precisely, D63243. After those changes landed, the reasoning introduced in this patch should be fine, before, it is as broken as Clang is when it emits dereferenceable for arguments passed by reference. (The logic above, with the same problems and more, is also used in ArgumentPromotion right now...).

So we are saying that the current attribute is too vague/broken to be useful? Ie, this patch must be abandoned?

In D64258#1579214, @jfb wrote:

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

mprotect can make memory readable but not writable, or writable but not readable... or neither. What does dereferenceable mean when faced with this fact? Further, what happens to dereferenceable when mprotect is called (any opaque function could call it)? I don't think this is an orthogonal problem at all.

So, I guess what the above means is "dereferenceable" is too coarse grained. We have "global dereferenceability" that cannot be changed, and we have "local dereferenceability" that can be changed, e.g., through calls to free, realloc, or mprotect. From accesses we can only deduce "local dereferenceability". Now, that is why we need D61652, or more precisely, D63243. After those changes landed, the reasoning introduced in this patch should be fine, before, it is as broken as Clang is when it emits dereferenceable for arguments passed by reference. (The logic above, with the same problems and more, is also used in ArgumentPromotion right now...).

So we are saying that the current attribute is too vague/broken to be useful? Ie, this patch must be abandoned?

The current situation is broken but works "so far". Adding this will expose the broken part further, that is the reason why I started to fix the situation in the first place ;)
That being said, I think the patch is "fine", at the latest after D63243 is in.

Now for timeline, in case you want to avoid exposing the problem with this patch:
I hope to land D61652 this week, after I can test the latest update of D63243 which depends on nosync. D63243 would then land a week or so later, giving front-end people time to update from dereferenceable to dereferenceable_globally if they actually want that behavior.

uenoku added a subscriber: uenoku.Jul 13 2019, 1:18 AM

Quick update: I finally put the skeleton of what I wanted to use in the Attributor online: D64975. The bridge to the Attributor is still missing but the dereferenceable deduction and propagation part is already there: D64876

where are we with this patch vs attributor? are we close to the attributor being able to perform the equivalent of some of the improvements to the test cases?

Attributor catches some of the cases but it doesn't create the access map (see getArgToOffsetsMap). @uenoku is that correct?

Attributor catches some of the cases but it doesn't create the access map (see getArgToOffsetsMap). @uenoku is that correct?

Right. I'll try to create the access map.

I uploaded a patch(D70714). As far as I see, almost all the cases are covered.

I uploaded a patch(D70714). As far as I see, almost all the cases are covered.

Thanks! The new patch appears to do everything that this patch tried to do and much more, so I'm happy to abandon and continue discussion on the new patch.

There's 1 new test that I was planning to add based on feedback here:

; TODO: We should allow inference for atomic (but not volatile) ops.

define void @atomic_is_alright(i16* %ptr) {
; CHECK-LABEL: @atomic_is_alright(i16* %ptr)
  %arrayidx0 = getelementptr i16, i16* %ptr, i64 0
  %arrayidx1 = getelementptr i16, i16* %ptr, i64 1
  %arrayidx2 = getelementptr i16, i16* %ptr, i64 2
  %t0 = load atomic i16, i16* %arrayidx0 unordered, align 2
  %t1 = load i16, i16* %arrayidx1
  %t2 = load i16, i16* %arrayidx2
  ret void
}

...so I can commit that as-is to trunk as a baseline test, and then you can rebase D70714 as needed.

where are we with this patch vs attributor? are we close to the attributor being able to perform the equivalent of some of the improvements to the test cases?

I am wondering whether Attributor will be enabled in next release.

There was 1 more test added here that I missed in the last comment:

define void @more_bytes_and_not_null(i32* dereferenceable_or_null(8) %ptr) {

Committed that and:

define void @atomic_is_alright(i16* %ptr) {

rG2bd252ea8941

So let's abandon this and find out when Attributor will be enabled in D70714.

spatel abandoned this revision.Nov 26 2019, 6:20 AM