
Thu, Jul 18

spatel committed rGe6547859122c: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483) (authored by spatel).
[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483)
Thu, Jul 18, 5:49 AM
spatel committed rL366431: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).
[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483)
Thu, Jul 18, 5:49 AM
spatel closed D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).
Thu, Jul 18, 5:48 AM · Restricted Project
spatel accepted D64551: [X86] EltsFromConsecutiveLoads - support common source loads.

It's worth noting here in the review that this patch depends on the dereferenceable attribute (see D64205), and that attribute could change meaning as part of the larger changes related to the Attributor pass (D63243).
Based on current definitions, I think this is correct and allowable, so LGTM.

Thu, Jul 18, 5:43 AM · Restricted Project
spatel updated the diff for D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).

Patch updated:
Just the add/sub opcodes for now.

Thu, Jul 18, 5:11 AM · Restricted Project
spatel added a comment to D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).

If we focussed just on PR40483 for now - do we just need X86ISD ADD + SUB (ADC + SBB) ?

Thu, Jul 18, 5:09 AM · Restricted Project

Wed, Jul 17

spatel added inline comments to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Wed, Jul 17, 1:27 PM · Restricted Project
spatel added inline comments to D64713: [InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0.
Wed, Jul 17, 12:06 PM · Restricted Project
spatel updated the diff for D64432: [InstCombine] try to narrow a truncated load.

Patch updated:
Add a test with no 'dereferenceable' attribute on the pointer argument.

Wed, Jul 17, 8:46 AM · Restricted Project
spatel added a comment to D64432: [InstCombine] try to narrow a truncated load.

I think you are missing the negative test without dereferenceable.

Wed, Jul 17, 8:46 AM · Restricted Project
spatel added a comment to D64142: [SLP] try to create vector loads from bitcasted scalar pointers.

Ping * 2.

Wed, Jul 17, 6:39 AM · Restricted Project

Tue, Jul 16

spatel committed rGd746a210e169: [x86] use more phadd for reductions (authored by spatel).
[x86] use more phadd for reductions
Tue, Jul 16, 2:37 PM
spatel committed rL366268: [x86] use more phadd for reductions.
[x86] use more phadd for reductions
Tue, Jul 16, 2:30 PM
spatel closed D64760: [x86] use more phadd for reductions.
Tue, Jul 16, 2:30 PM · Restricted Project
spatel updated the diff for D64432: [InstCombine] try to narrow a truncated load.

Patch updated:

  1. Add limitation based on dereferenceable attribute to prevent information loss.
  2. Add/adjust tests to include dereferenceable attributes.
Tue, Jul 16, 12:43 PM · Restricted Project
spatel added a comment to D64432: [InstCombine] try to narrow a truncated load.

Could we check here if the base pointer has dereferenceable annotation and use that as a condition for this transformation? (It's more complicated to be completely lossless but this seems to be an easy to test starting point).

Tue, Jul 16, 12:36 PM · Restricted Project
spatel updated the diff for D64760: [x86] use more phadd for reductions.

Patch updated - no functional changes from the previous draft:

  1. Move local variable for NumElts closer to uses.
  2. Add TODO comment about handling bigger-than-256-bit types.
Tue, Jul 16, 9:42 AM · Restricted Project
spatel added inline comments to D64512: [InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563).
Tue, Jul 16, 9:22 AM · Restricted Project
spatel added a comment to D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).

Are the multiply test changes due to the flags being used by seto? But seto usage should never be in danger of creating the instruction duplication we're seeing in the motivating case. It does look like we're getting an improvement on those tests, but not for the reason we're selecting LEA.

Tue, Jul 16, 9:15 AM · Restricted Project
spatel updated subscribers of D64512: [InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563).
Tue, Jul 16, 8:05 AM · Restricted Project
spatel updated the diff for D64760: [x86] use more phadd for reductions.

Patch updated:
Allow 256-bit reductions by extracting and using 1 more 128-bit hop.
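
As a rough illustration of what "1 more 128-bit hop" can mean, here is a small Python model of an add reduction over eight i32 lanes. It is only a sketch under stated assumptions: `phaddd` models the lane semantics of the 128-bit x86 PHADDD instruction, `reduce_add_v8i32` is a hypothetical name, and the exact instruction sequence the patch emits may differ.

```python
MASK32 = 0xFFFFFFFF  # i32 arithmetic wraps modulo 2**32


def phaddd(a, b):
    """Model of 128-bit PHADDD: add adjacent i32 lane pairs of a, then of b."""
    assert len(a) == 4 and len(b) == 4
    return [
        (a[0] + a[1]) & MASK32,
        (a[2] + a[3]) & MASK32,
        (b[0] + b[1]) & MASK32,
        (b[2] + b[3]) & MASK32,
    ]


def reduce_add_v8i32(v):
    """Hypothetical helper: add-reduce eight i32 lanes with three horizontal adds."""
    assert len(v) == 8
    lo, hi = v[:4], v[4:]   # extract the two 128-bit halves of the 256-bit vector
    x = phaddd(lo, hi)      # the extra hop: folds 256 bits of data into 128
    x = phaddd(x, x)        # lane 0 = sum of lo, lane 1 = sum of hi
    x = phaddd(x, x)        # lane 0 = sum of all eight lanes
    return x[0]
```

In this model a 128-bit reduction needs two horizontal adds, and the 256-bit case adds one more after splitting the vector into halves.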

Tue, Jul 16, 7:23 AM · Restricted Project

Mon, Jul 15

spatel updated the diff for D64760: [x86] use more phadd for reductions.

Patch updated:
Early exit if wrong types or subtarget.

Mon, Jul 15, 7:20 PM · Restricted Project
spatel added inline comments to D64760: [x86] use more phadd for reductions.
Mon, Jul 15, 4:51 PM · Restricted Project
spatel updated the diff for D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).

Patch updated:

  1. Check flag uses to avoid unintended transform.
  2. Add TODO comment about && vs. ||.
Mon, Jul 15, 1:31 PM · Restricted Project
spatel added inline comments to D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).
Mon, Jul 15, 1:28 PM · Restricted Project
spatel committed rGeb99165b97b7: [x86] try to keep FP casted+truncated+extracted vector element out of GPRs (authored by spatel).
[x86] try to keep FP casted+truncated+extracted vector element out of GPRs
Mon, Jul 15, 11:19 AM
spatel committed rL366098: [x86] try to keep FP casted+truncated+extracted vector element out of GPRs.
[x86] try to keep FP casted+truncated+extracted vector element out of GPRs
Mon, Jul 15, 11:18 AM
spatel closed D64710: [x86] try to keep FP casted+truncated+extracted vector element out of GPRs.
Mon, Jul 15, 11:18 AM · Restricted Project
spatel created D64760: [x86] use more phadd for reductions.
Mon, Jul 15, 10:13 AM · Restricted Project
spatel committed rGa53e779edc85: [x86] add tests for reductions that might be better with more horizontal ops… (authored by spatel).
[x86] add tests for reductions that might be better with more horizontal ops…
Mon, Jul 15, 10:01 AM
spatel committed rL366082: [x86] add tests for reductions that might be better with more horizontal ops….
[x86] add tests for reductions that might be better with more horizontal ops…
Mon, Jul 15, 10:01 AM
spatel added inline comments to D64710: [x86] try to keep FP casted+truncated+extracted vector element out of GPRs.
Mon, Jul 15, 5:11 AM · Restricted Project

Sun, Jul 14

spatel created D64710: [x86] try to keep FP casted+truncated+extracted vector element out of GPRs.
Sun, Jul 14, 1:12 PM · Restricted Project
spatel added inline comments to D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).
Sun, Jul 14, 1:00 PM · Restricted Project
spatel created D64707: [x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483).
Sun, Jul 14, 7:25 AM · Restricted Project
spatel committed rG03d5e28fe943: [x86] add test for sub-with-flags opportunity (PR40483); NFC (authored by spatel).
[x86] add test for sub-with-flags opportunity (PR40483); NFC
Sun, Jul 14, 7:10 AM
spatel committed rL366019: [x86] add test for sub-with-flags opportunity (PR40483); NFC.
[x86] add test for sub-with-flags opportunity (PR40483); NFC
Sun, Jul 14, 7:10 AM

Sat, Jul 13

spatel committed rG22cc1030f6a9: Revert "[InstCombine] add tests for umin/umax via usub.sat; NFC" (authored by spatel).
Revert "[InstCombine] add tests for umin/umax via usub.sat; NFC"
Sat, Jul 13, 6:17 AM
spatel committed rL366000: Revert "[InstCombine] add tests for umin/umax via usub.sat; NFC".
Revert "[InstCombine] add tests for umin/umax via usub.sat; NFC"
Sat, Jul 13, 6:17 AM
spatel added a reverting change for rL365999: [InstCombine] add tests for umin/umax via usub.sat; NFC: rL366000: Revert "[InstCombine] add tests for umin/umax via usub.sat; NFC".
Sat, Jul 13, 6:17 AM
spatel committed rG0f6148df23ed: [InstCombine] add tests for umin/umax via usub.sat; NFC (authored by spatel).
[InstCombine] add tests for umin/umax via usub.sat; NFC
Sat, Jul 13, 5:55 AM
spatel committed rL365999: [InstCombine] add tests for umin/umax via usub.sat; NFC.
[InstCombine] add tests for umin/umax via usub.sat; NFC
Sat, Jul 13, 5:54 AM
spatel committed rG2097f75eabb9: [x86] simplify cmov with same true/false operands (authored by spatel).
[x86] simplify cmov with same true/false operands
Sat, Jul 13, 5:06 AM
spatel committed rL365998: [x86] simplify cmov with same true/false operands.
[x86] simplify cmov with same true/false operands
Sat, Jul 13, 5:05 AM

Fri, Jul 12

spatel committed rGe26bacb652ab: [x86] add test for bogus cmov (PR40483); NFC (authored by spatel).
[x86] add test for bogus cmov (PR40483); NFC
Fri, Jul 12, 11:39 AM
spatel committed rL365941: [x86] add test for bogus cmov (PR40483); NFC.
[x86] add test for bogus cmov (PR40483); NFC
Fri, Jul 12, 11:38 AM
spatel added a reviewer for D64432: [InstCombine] try to narrow a truncated load: jdoerfert.

I'm not sure that doing this at the IR level is the best idea. The problem is that when we narrow, we lose the dereferenceable fact about part of the memory access. This can in turn limit other transforms which would have been profitable. As an example:
a = load <2 x i8>* p
b = load <2 x i8>* (p+1)
sum = a[0] + a[1] + b[1]

Narrowing the b load to i8 loses the fact that the memory location corresponding to b[0] is dereferenceable, which would prevent transforms such as:
a = load <4 x i8>* p
a[2] = 0;
sum = horizontal_sum(a);

(Note: I'm not saying this alternate transform is always profitable. I'm just making a point about lost opportunity.)
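
To make the two forms in this comment concrete, the following Python sketch models both on a 4-byte buffer and shows that they compute the same i8 sum. It assumes that p+1 advances by one <2 x i8> element (two bytes), as pointer arithmetic would; `load_lanes` and both function names are hypothetical. The wide form reads the byte for b[0], which is why it is only legal when that location is known to be dereferenceable.

```python
def load_lanes(memory, offset, count):
    """Hypothetical helper: read `count` i8 lanes starting at byte `offset`."""
    return list(memory[offset:offset + count])


def sum_two_narrow_loads(memory):
    a = load_lanes(memory, 0, 2)   # a = load <2 x i8>* p
    b = load_lanes(memory, 2, 2)   # b = load <2 x i8>* (p+1)
    return (a[0] + a[1] + b[1]) & 0xFF   # i8 addition wraps modulo 256


def sum_one_wide_load(memory):
    a = load_lanes(memory, 0, 4)   # a = load <4 x i8>* p
    a[2] = 0                       # zero the lane that corresponds to b[0]
    return sum(a) & 0xFF           # horizontal_sum(a)
```

If the b load were narrowed to a single i8 at byte 3, nothing in the IR would still state that byte 2 is readable, so the wide-load form could no longer be proven safe.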

Fri, Jul 12, 10:31 AM · Restricted Project

Thu, Jul 11

spatel added inline comments to D64275: [InstCombine] Generalize InstCombiner::foldAndOrOfICmpsOfAndWithPow2()..
Thu, Jul 11, 1:12 PM · Restricted Project
spatel accepted D64572: [UpdateTestChecks] Emit warning when invalid test paths.

LGTM

Thu, Jul 11, 1:06 PM · Restricted Project
spatel added inline comments to D64275: [InstCombine] Generalize InstCombiner::foldAndOrOfICmpsOfAndWithPow2()..
Thu, Jul 11, 11:03 AM · Restricted Project
spatel added a comment to D64285: [InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)).

So to answer your question, it's about both code duplication and redundant matching. And while the death by a thousand cuts may be unavoidable, we should still try to not hasten it along unduly...

Thu, Jul 11, 10:42 AM · Restricted Project
spatel committed rG5cc7c9ab9399: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (authored by spatel).
[X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483)
Thu, Jul 11, 8:57 AM
spatel committed rL365791: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483).
[X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483)
Thu, Jul 11, 8:56 AM
spatel closed D58875: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (WIP).
Thu, Jul 11, 8:56 AM · Restricted Project
spatel added inline comments to D58875: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (WIP).
Thu, Jul 11, 8:50 AM · Restricted Project
spatel updated the diff for D58875: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (WIP).

Patch updated:
No code changes - just regenerated the test diffs.

Thu, Jul 11, 7:53 AM · Restricted Project
spatel commandeered D58875: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (WIP).

Commandeering to post the rebased patch.

Thu, Jul 11, 7:48 AM · Restricted Project
spatel accepted D63653: [DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support.

I don't have a good sense of how we make fast-isel speed vs. perf trade-offs, so if anyone else has thoughts about that case, feel free to comment.

Thu, Jul 11, 7:20 AM · Restricted Project
spatel added a comment to D58875: [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) (WIP).

If we rebase the tests after rL365711, I don't see any regressions. Not sure if we're getting all of the optimizations that were intended, but this patch seems safe to commit now.

Thu, Jul 11, 7:03 AM · Restricted Project
spatel committed rG3487791fea9f: [InstCombine] don't move FP negation out of a constant expression (authored by spatel).
[InstCombine] don't move FP negation out of a constant expression
Thu, Jul 11, 6:45 AM
spatel committed rL365774: [InstCombine] don't move FP negation out of a constant expression.
[InstCombine] don't move FP negation out of a constant expression
Thu, Jul 11, 6:44 AM

Wed, Jul 10

spatel committed rG138328e45cdf: [SDAG] commute setcc operands to match a subtract (authored by spatel).
[SDAG] commute setcc operands to match a subtract
Wed, Jul 10, 4:26 PM
spatel committed rL365711: [SDAG] commute setcc operands to match a subtract.
[SDAG] commute setcc operands to match a subtract
Wed, Jul 10, 4:26 PM
spatel closed D63958: [SDAG] commute setcc operands to match a subtract.
Wed, Jul 10, 4:25 PM · Restricted Project
spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.
In D64258#1579214, @jfb wrote:

Also, would it make sense to separate readable from writable? We currently have this bug where LLVM will promote all const static globals to rodata, and sometimes generate atomic cmpxchg to them (e.g. because we're trying to load a 128-bit value). Similarly, we might want to honor R / W memory protection in general. Right now dereferenceable just means "you can load from this", because we can't speculate most stores.

I do not understand the problem but I have the feeling this is an orthogonal issue.

mprotect can make memory readable but not writable, or writable but not readable... or neither. What does dereferenceable mean when faced with this fact? Further, what happens to dereferenceable when mprotect is called (any opaque function could call it)? I don't think this is an orthogonal problem at all.

So, I guess what the above means is "dereferenceable" is too coarse-grained. We have "global dereferenceability" that cannot be changed, and we have "local dereferenceability" that can be changed, e.g., through calls to free, realloc, or mprotect. From accesses we can only deduce "local dereferenceability". Now, that is why we need D61652, or more precisely, D63243. Once those changes have landed, the reasoning introduced in this patch should be fine; before that, it is as broken as Clang is when it emits dereferenceable for arguments passed by reference. (The logic above, with the same problems and more, is also used in ArgumentPromotion right now...).

Wed, Jul 10, 3:21 PM · Restricted Project
spatel added a comment to D64142: [SLP] try to create vector loads from bitcasted scalar pointers.

Unanswered questions:

  1. Is there a better cost query than checking if the target has a vector register ( TTI->getRegisterBitWidth(true) ) that exceeds the load size?
  2. Do we require that multiple scalar loads are subsumed by the vector load?
Wed, Jul 10, 11:41 AM · Restricted Project
spatel added a comment to D63958: [SDAG] commute setcc operands to match a subtract.

Ping @jpienaar

Wed, Jul 10, 11:35 AM · Restricted Project
spatel updated the diff for D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Patch updated:
Add TODO code comment about using "isSimple()" and add test with an atomic load.

Wed, Jul 10, 11:30 AM · Restricted Project
spatel added inline comments to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.
Wed, Jul 10, 11:30 AM · Restricted Project
spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.
In D64258#1578820, @jfb wrote:

Right now the control flow isn't clever, but I wonder if, as this analysis becomes more powerful, it'll have to act differently when -fno-delete-null-pointer-checks is specified? Is there a simple test that you can add to make sure null pointer checks don't cause false assumptions whenever this optimization becomes smarter?

Wed, Jul 10, 11:17 AM · Restricted Project
spatel added inline comments to D64468: Replace three "strip & accumulate" implementations with a single one.
Wed, Jul 10, 10:16 AM · Restricted Project
spatel updated the diff for D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Patch updated:

  1. Allow stores to have the same inferences as loads. This exposed more clang test failures, so those diffs are included.
  2. Don't infer anything from volatile (non-simple) memory accesses.
  3. There was a bug in how we dealt with isGuaranteedToTransferExecutionToSuccessor(), so added an assert and a test with a function call to verify that.
  4. Added code/test for replacing DereferenceableOrNull attribute.
  5. Added FIXME comment to indicate that this pass should be subsumed by Attributor.
Wed, Jul 10, 10:08 AM · Restricted Project
spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Mostly comments to improve this. Two required changes.

Wed, Jul 10, 9:12 AM · Restricted Project
spatel committed rG9cd82a4fbd2d: [InferFunctionAttrs] add/adjust tests for dereferenceable; NFC (authored by spatel).
[InferFunctionAttrs] add/adjust tests for dereferenceable; NFC
Wed, Jul 10, 7:42 AM
spatel committed rL365636: [InferFunctionAttrs] add/adjust tests for dereferenceable; NFC.
[InferFunctionAttrs] add/adjust tests for dereferenceable; NFC
Wed, Jul 10, 7:42 AM

Tue, Jul 9

spatel created D64432: [InstCombine] try to narrow a truncated load.
Tue, Jul 9, 11:31 AM · Restricted Project
spatel committed rG5f4d7c9d4f20: [InstCombine] add tests for trunc(load); NFC (authored by spatel).
[InstCombine] add tests for trunc(load); NFC
Tue, Jul 9, 11:08 AM
spatel committed rL365523: [InstCombine] add tests for trunc(load); NFC.
[InstCombine] add tests for trunc(load); NFC
Tue, Jul 9, 11:06 AM
spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

I do not want to block this patch but I still believe that this is the wrong way to go (middle/long term). The fact that we need to put this not in FunctionAttrs.cpp, where the other deductions live, but in InferFunctionAttrs.cpp, where we so far only annotated library functions, should be a first sign. Also, the functionality here is only one way to deduce dereferenceable, arguably, you want all the ways together such that they can benefit from each other.

Tue, Jul 9, 10:29 AM · Restricted Project
spatel added a comment to D64285: [InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)).

Are we really allowed to change the exact flag from InstSimplify?

Tue, Jul 9, 8:48 AM · Restricted Project
spatel updated the diff for D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Patch updated:

  1. Used GetPointerBaseWithConstantOffset() to allow more complex pattern matching.
  2. But limited that matching to cases where the argument and access have the same size to reduce complexity.
  3. Generalized variable names and comments to allow less churn for follow-up enhancements.
  4. Added tests with multiple dereferenceable arguments, pointer casts, and negative offsets.
Tue, Jul 9, 8:39 AM · Restricted Project
spatel added inline comments to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.
Tue, Jul 9, 8:32 AM · Restricted Project
spatel committed rGfb453353dabe: [InferFunctionAttrs] add more tests for derefenceable; NFC (authored by spatel).
[InferFunctionAttrs] add more tests for derefenceable; NFC
Tue, Jul 9, 7:46 AM
spatel committed rL365495: [InferFunctionAttrs] add more tests for derefenceable; NFC.
[InferFunctionAttrs] add more tests for derefenceable; NFC
Tue, Jul 9, 7:43 AM

Mon, Jul 8

spatel committed rG3dee113ebcb3: [InstCombine] fold insertelement into splat of same scalar (authored by spatel).
[InstCombine] fold insertelement into splat of same scalar
Mon, Jul 8, 12:50 PM
spatel committed rL365379: [InstCombine] fold insertelement into splat of same scalar.
[InstCombine] fold insertelement into splat of same scalar
Mon, Jul 8, 12:48 PM
spatel committed rG77ccc04700ca: [InstCombine] add tests for insert of same splatted scalar; NFC (authored by spatel).
[InstCombine] add tests for insert of same splatted scalar; NFC
Mon, Jul 8, 11:05 AM
spatel committed rL365362: [InstCombine] add tests for insert of same splatted scalar; NFC.
[InstCombine] add tests for insert of same splatted scalar; NFC
Mon, Jul 8, 11:05 AM
spatel accepted D64037: [IR][PatternMatch] introduce m_Unless() matcher.

LGTM

Mon, Jul 8, 9:56 AM · Restricted Project
spatel committed rG0b59103a73bf: [InstCombine] canonicalize insert+splat to/from element 0 of vector (authored by spatel).
[InstCombine] canonicalize insert+splat to/from element 0 of vector
Mon, Jul 8, 9:28 AM
spatel committed rL365342: [InstCombine] canonicalize insert+splat to/from element 0 of vector.
[InstCombine] canonicalize insert+splat to/from element 0 of vector
Mon, Jul 8, 9:26 AM
spatel committed rG320a28200f24: [InstCombine] fix typo in test; NFC (authored by spatel).
[InstCombine] fix typo in test; NFC
Mon, Jul 8, 8:41 AM
spatel committed rL365333: [InstCombine] fix typo in test; NFC.
[InstCombine] fix typo in test; NFC
Mon, Jul 8, 8:41 AM
spatel committed rG74cbaa37b663: [InstCombine] add tests for splat shuffles; NFC (authored by spatel).
[InstCombine] add tests for splat shuffles; NFC
Mon, Jul 8, 7:50 AM
spatel committed rL365325: [InstCombine] add tests for splat shuffles; NFC.
[InstCombine] add tests for splat shuffles; NFC
Mon, Jul 8, 7:49 AM
spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Thanks for thinking of me ;) And again, I think this is an important change we need!

The Attributor is in tree and, if enabled, it is run very early (as I very very strongly believe it should). I think we can get the Attributor enabled for the next release (maybe with a low iteration count and restrictions on the attributes we derive). Now there are two missing parts to get this functionality into the Attributor in a decent way:

  1. A generic way to "look around for existing information" (more on this below).
  2. The abstract attribute for dereferenceability(_or_null) that makes use of 1) and potentially performs usual deduction.

    Implementing 2) is fairly easy. It should not take long to create the boilerplate if we only want to rely on the deduction through 1). Also, the logic is already in this patch (and the old prototype). Regarding 1): I was going to work on this once I found some free cycles but I could do it now if we decide to go this way. The idea is that you specify a program point PP (=instruction) and a callback. The callback is then automatically applied to all instructions that have to be executed when PP is also reached, either before or after. I would like this to be an abstract interface from the get-go but I am also willing to provide the interface and the initial implementation that will at least suffice for this use case. It should then be used from the AbstractAttribute::initialize and AbstractAttribute::updateImpl methods of the abstract attribute for the dereferenceable attribute (and others later as well).

P.S. You should be aware of the change to dereferenceability that is going to happen very soon, see D61652 and D63243 (I'm still fixing that one).

Mon, Jul 8, 7:19 AM · Restricted Project

Sat, Jul 6

spatel added a comment to D64285: [InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)).

If this transform always returns an existing value, it can go in InstSimplify. Please pre-commit the baseline tests (in the InstSimplify directory and change the RUN line if I got that right).
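
For readers who want to convince themselves that the fold in the title returns an existing value, here is a brute-force check over all i8 inputs in Python. It is a sketch only: it models plain lshr/ashr on 8-bit values with shift amounts 0 through 7, ignores the `exact` flag and poison semantics, and every function name is made up for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret an 8-bit pattern as a two's-complement signed integer."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def lshr(x, y):
    """Logical shift right: zero-fills from the top."""
    return (x & MASK) >> y


def ashr(x, y):
    """Arithmetic shift right: replicates the sign bit."""
    return (to_signed(x & MASK) >> y) & MASK


def select_form(x, y):
    """select (icmp sgt x, -1), lshr (x, y), ashr (x, y)"""
    return lshr(x, y) if to_signed(x) > -1 else ashr(x, y)


def fold_holds_for_all_i8():
    """Check that the select always equals its existing ashr operand."""
    return all(
        select_form(x, y) == ashr(x, y)
        for x in range(1 << BITS)
        for y in range(BITS)
    )
```

The check passes because lshr and ashr agree whenever the sign bit is clear, so the select can be replaced by its ashr operand without creating a new instruction.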

Sat, Jul 6, 7:50 AM · Restricted Project

Fri, Jul 5

spatel added a comment to D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

The direction of this makes total sense and we will need it. However this shouldn't be here (wrt. the file/pass).

Assuming we want this right now, it should live in FunctionAttrs.cpp. Assuming we want to do it "right" it should become part of the Attributor framework.

The early prototype of the "deref-or-null" abstract attribute already had this functionality, see https://reviews.llvm.org/D59202#C1381429NL1995, and the test case https://reviews.llvm.org/D59202#change-FJbHx7N4s6ye . For the new Attributor, dereferenceable-or-null has not yet been ported and the transfer of "close by information" is not part of the new model. Both things are going to change soon.

Fri, Jul 5, 2:37 PM · Restricted Project
spatel updated the diff for D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.

Patch updated:
I missed diffs in some existing over-reaching clang and AMDGPU tests. These regression tests should not be testing the entire optimization pipeline, but I adjusted the assertions to make them pass.

Fri, Jul 5, 12:37 PM · Restricted Project
spatel created D64258: [InferFuncAttributes] extend 'dereferenceable' attribute based on loads.
Fri, Jul 5, 11:23 AM · Restricted Project