Page MenuHomePhabricator

spatel (Sanjay Patel)
User

Projects

User does not belong to any projects.

User Details

User Since
May 22 2014, 1:24 PM (325 w, 17 h)

Recent Activity

Today

spatel added a comment to D85969: [SCEV] Model (xor (shl x, C), (-1 << C)) as (shl (xor x, -1), C).

I don't have much experience with SCEV either, but I think your draft change for instcombine made sense:
https://bugs.llvm.org/show_bug.cgi?id=47136#c8

Fri, Aug 14, 6:00 AM · Restricted Project

Yesterday

spatel added a comment to D84547: [ConstraintElimination] Add constraint elimination pass..

We definitely stretch instcombine beyond the basics to try to reduce icmps, so if we can do that more efficiently here, that seems like a win.
Do you know if http://bugs.llvm.org/PR47091 could (eventually) be handled here?
Is there any initial compile-time data?

Thu, Aug 13, 1:01 PM · Restricted Project
spatel added a comment to D85787: [InstCombine] Aggregate reconstruction simplification (PR47060).

@ reviewers - i'm not so much interested in deep code/algo review,
but more like in the general direction disscussion, like, is this okay for instcombine? :)

Thu, Aug 13, 12:39 PM · Restricted Project, Restricted Project
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

For x = 0, x/sqrt(0) result in "nan". However when we specify -ffast-math we are setting "nnan" flag. The nnan flag says "Allow optimizations to assume the arguments and result are not NaN" so we can transform x/sqrt(x) to sqrt(x) under -ffast-math. is that the right understanding here?

Thu, Aug 13, 11:44 AM · Restricted Project
spatel accepted D85912: [VectorCombine] Fix for non-zero addrspace when creating vector load from scalar load.

Thanks! LGTM - see inline for a minor mod.

Thu, Aug 13, 8:56 AM · Restricted Project
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

After looking at the codegen, I'm not sure if we can do this transform in IR with the expected performance in codegen because the transform loses information:
https://godbolt.org/z/7b84rG

The codegen for the case of "sqrt(x)" has to account for a 0.0 input. Ie, we filter out a 0.0 (or potentially denorm) input to avoid the NAN answer that we would get from "0.0 / 0.0". But the codegen for the case of "x/sqrt(x)" does not have to do that - NAN is the correct answer for a 0.0 input, so the code has implicitly signaled to us that 0.0 is not a valid input when compiled with -ffast-math (we can ignore possible NANs).

It might help to see the motivating code that produces the x/sqrt(x) pattern to see if there's something else we should be doing there.

Current AMD "x86_64" targets don't have the reciprocal sqrt instruction for the double precision types.
so x/sqrt(x) ends up with "vsqrtsd" followed by "vdivsd". This transform is basically to improve the efficiency.

Ah, I see. I think we should handle that in generic DAGCombiner then. There, we can make the target- and CPU-specific trade-offs necessary to get the (presumably) ideal asm code. I don't know how we would recover the missing div-by-0 info that I mentioned here.
Let me know if you want to try that patch. If not, I can take a shot at it.

Sure I will work on the DAGCombiner patch .

Thu, Aug 13, 8:38 AM · Restricted Project
spatel committed rGdd1a900575ff: [AArch64][x86] add tests for x/sqrt(x); NFC (authored by spatel).
[AArch64][x86] add tests for x/sqrt(x); NFC
Thu, Aug 13, 8:36 AM
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

After looking at the codegen, I'm not sure if we can do this transform in IR with the expected performance in codegen because the transform loses information:
https://godbolt.org/z/7b84rG

The codegen for the case of "sqrt(x)" has to account for a 0.0 input. Ie, we filter out a 0.0 (or potentially denorm) input to avoid the NAN answer that we would get from "0.0 / 0.0". But the codegen for the case of "x/sqrt(x)" does not have to do that - NAN is the correct answer for a 0.0 input, so the code has implicitly signaled to us that 0.0 is not a valid input when compiled with -ffast-math (we can ignore possible NANs).

It might help to see the motivating code that produces the x/sqrt(x) pattern to see if there's something else we should be doing there.

Current AMD "x86_64" targets don't have the reciprocal sqrt instruction for the double precision types.
so x/sqrt(x) ends up with "vsqrtsd" followed by "vdivsd". This transform is basically to improve the efficiency.

Thu, Aug 13, 7:17 AM · Restricted Project
spatel added a comment to D85825: [ARM] Enabled VMLAV and Add instructions to use VMLAVA.

The part I am a little unsure about is that it is converting add(vmlav) into a new vmlava, but we are starting at the vmlav to do it. So need to set the insertion point and use replaceAllUsesWith/eraseInstFromFunction. I couldn't find anything else that worked like that in instcombine, but maybe just missed it. It seems to work just fine like this, but it might be cause potential problems with the way instcombine iterates? It hasn't caused any problems we have seen though.

Thu, Aug 13, 7:09 AM · Restricted Project

Wed, Aug 12

spatel committed rG23bd33c6acc4: [InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706) (authored by spatel).
[InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706)
Wed, Aug 12, 12:51 PM
spatel committed rG0a1514d7ca4f: [InstCombine] add test for 'not' vs 'xor'; NFC (authored by spatel).
[InstCombine] add test for 'not' vs 'xor'; NFC
Wed, Aug 12, 12:50 PM
Herald added a project to D32255: [InstCombine] prefer xor with -1 because 'not' is easier to understand: Restricted Project.
Wed, Aug 12, 12:50 PM · Restricted Project
spatel added a comment to D72396: [InstCombine] fold zext of masked bit set/clear.

What is status of this patch?

Wed, Aug 12, 12:20 PM · Restricted Project
spatel committed rGcc892fd9f4cb: [VectorCombine] early exit if target has no vector registers (authored by spatel).
[VectorCombine] early exit if target has no vector registers
Wed, Aug 12, 6:29 AM
spatel committed rG89a7f64afc79: [VectorCombine] add test for x86 target with SSE disabled; NFC (authored by spatel).
[VectorCombine] add test for x86 target with SSE disabled; NFC
Wed, Aug 12, 6:29 AM
spatel committed rG912c09e845cb: [InstCombine] eliminate a pointer cast around insertelement (authored by spatel).
[InstCombine] eliminate a pointer cast around insertelement
Wed, Aug 12, 6:10 AM
spatel closed D85647: [InstCombine] eliminate a pointer cast around insertelement.
Wed, Aug 12, 6:10 AM · Restricted Project
spatel added a comment to D81766: [VectorCombine] try to create vector loads from scalar loads.
In D81766#2211952, @srj wrote:

Can we just add another condition (!VectorSize) to this bailout?
if (!ScalarSize || VectorSize % ScalarSize != 0)

Changing it to if (!ScalarSize || !VectorSize || VectorSize % ScalarSize != 0) does indeed seem to make our failure go away, so that would be fine as a quick fix.

(I'm still surprised that getMinVectorRegisterBitWidth() should ever return 0, but I can take that up with Qualcomm folks separately; it is entirely possible I don't understand the full semantics of that method.)

rGb0b95dab1ce2
l'll see if I can find a better API and/or test case for that tomorrow.

Wed, Aug 12, 5:58 AM · Restricted Project
spatel committed rGb97e402ca5ba: [VectorCombine] add test for Hexagon that would crash; NFC (authored by spatel).
[VectorCombine] add test for Hexagon that would crash; NFC
Wed, Aug 12, 5:38 AM
spatel added inline comments to D85825: [ARM] Enabled VMLAV and Add instructions to use VMLAVA.
Wed, Aug 12, 5:16 AM · Restricted Project

Tue, Aug 11

spatel added a comment to D81766: [VectorCombine] try to create vector loads from scalar loads.
In D81766#2211952, @srj wrote:

Can we just add another condition (!VectorSize) to this bailout?
if (!ScalarSize || VectorSize % ScalarSize != 0)

Changing it to if (!ScalarSize || !VectorSize || VectorSize % ScalarSize != 0) does indeed seem to make our failure go away, so that would be fine as a quick fix.

(I'm still surprised that getMinVectorRegisterBitWidth() should ever return 0, but I can take that up with Qualcomm folks separately; it is entirely possible I don't understand the full semantics of that method.)

Tue, Aug 11, 5:33 PM · Restricted Project
spatel committed rGb0b95dab1ce2: [VectorCombine] add safety check for 0-width register (authored by spatel).
[VectorCombine] add safety check for 0-width register
Tue, Aug 11, 5:30 PM
spatel added a comment to D81766: [VectorCombine] try to create vector loads from scalar loads.
In D81766#2211936, @srj wrote:

Update: it appears that VectorSize (from TTI.getMinVectorRegisterBitWidth()) is zero in this case, which causes the assertion failure.

This appears to be the case because HexagonTTIImpl::getMinVectorRegisterBitWidth() returns 0 if useHVX() isn't true... and useHVX() returns false if the HexagonAutoHVX option isn't enabled.

By design, Halide doesn't enable the HexagonAutoHVX option; we like to do all the vectorization ourselves.

I'm not sure how to resolve this issue -- the flaw here seems to lie in HexagonTTIImpl's assumption that disabling HexagonAutoHVX should cause it to report zero-width vectors, which seems to be a dubious decision (and one that is at odds with every other implementation of getMinVectorRegisterBitWidth() that I see in trunk LLVM (none of them appear to ever return 0).

Would it make sense to consider backing out this change until this can be resolved, since it clearly appears to have bad consequences for Hexagon/HVX codegen?

Tue, Aug 11, 5:13 PM · Restricted Project
spatel updated the diff for D85647: [InstCombine] eliminate a pointer cast around insertelement.

Patch updated:
Given the transform above this one and the type checks included there and here, I think we can assert that sizes match. Let me know if that handles the possible problem patterns that we must avoid.

Tue, Aug 11, 1:59 PM · Restricted Project
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

After looking at the codegen, I'm not sure if we can do this transform in IR with the expected performance in codegen because the transform loses information:
https://godbolt.org/z/7b84rG

Tue, Aug 11, 12:34 PM · Restricted Project
spatel committed rG1470ce4a76fc: [InstSimplify] fold min/max with matching min/max operands (authored by spatel).
[InstSimplify] fold min/max with matching min/max operands
Tue, Aug 11, 8:24 AM
spatel committed rGbad205fe0c74: [InstSimplify] add tests for min/max intrinsics with common operands; NFC (authored by spatel).
[InstSimplify] add tests for min/max intrinsics with common operands; NFC
Tue, Aug 11, 8:24 AM
spatel committed rGea8c186c408d: [InstCombine] add tests for pointer casts with insertelement; NFC (authored by spatel).
[InstCombine] add tests for pointer casts with insertelement; NFC
Tue, Aug 11, 8:24 AM
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

I'm fairly sure this transform is a performance loss. For a target like Skylake Server, a SQRT(x) can take up to 20 cycles. But a RSQRT(x) is about 6 cycles and a MUL(y) is 4 cycles. We'd be better off with a X*RSQRT(X).

That is up to backends to decide. InstSimplify/InstCombine (and a few others) are canonicalization, target-independent passes.
A single sqrt(x) is more canonical IR than x/sqrt(x), because it's less instructions and x has less uses.

I agree with that. It should be canonicalized. It would also be good to make sure that the backends have lowering code in place before introducing a 2x performance hit.

Tue, Aug 11, 7:53 AM · Restricted Project
spatel added a comment to D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X)..

Just a drive-by comment : Possibly the more general form of this needs to be optimised too?

The more general form of this is

X^a * X^b -> X ^ (a+b)

X^a / X^b -> X ^ (a-b)
Tue, Aug 11, 7:24 AM · Restricted Project

Mon, Aug 10

spatel updated subscribers of D85593: [InstCombine] ~(~X + Y) -> X - Y.

I agree that this looks wrong for the example with undef substituted in...but that also means that almost every instcombine that doesn't specify hasOneUse() is wrong!
But then how are we not miscompiling more often? We must be simplifying instructions with undefs before they can escape and cause trouble as @nikic hinted at?
@efriedma @regehr @nlopes - comments?

Mon, Aug 10, 3:03 PM · Restricted Project
spatel added a comment to D54749: Saturating float to int casts: Basics [1/n].

It seems like there are quite a few people who would like to use this intrinsic, though. The only lack of interest seems to be in reviewing.
I was planning on submitting something similar for the fixed-point support, but now that I know this patch exists, I feel like helping this move forward is the better option.

Mon, Aug 10, 7:22 AM · Restricted Project
spatel requested review of D85647: [InstCombine] eliminate a pointer cast around insertelement.
Mon, Aug 10, 6:32 AM · Restricted Project
spatel committed rG3d5118b75c65: [InstCombine] auto-generate test checks; NFC (authored by spatel).
[InstCombine] auto-generate test checks; NFC
Mon, Aug 10, 5:28 AM
spatel committed rGbebca662d4ff: [InstCombine] rearrange code for readability; NFC (authored by spatel).
[InstCombine] rearrange code for readability; NFC
Mon, Aug 10, 5:08 AM

Sun, Aug 9

spatel added a comment to D85593: [InstCombine] ~(~X + Y) -> X - Y.

Please pre-commit the test file and upload diff with context.

Sun, Aug 9, 6:47 AM · Restricted Project
spatel committed rG43bdac290663: [VectorCombine] try to create vector loads from scalar loads (authored by spatel).
[VectorCombine] try to create vector loads from scalar loads
Sun, Aug 9, 6:19 AM
spatel closed D81766: [VectorCombine] try to create vector loads from scalar loads.
Sun, Aug 9, 6:19 AM · Restricted Project
spatel added inline comments to D85593: [InstCombine] ~(~X + Y) -> X - Y.
Sun, Aug 9, 6:13 AM · Restricted Project

Sat, Aug 8

spatel abandoned D64142: [SLP] try to create vector loads from bitcasted scalar pointers.

@spatel Has this been superseded by D81766?

Sat, Aug 8, 9:26 AM · Restricted Project
spatel updated the diff for D81766: [VectorCombine] try to create vector loads from scalar loads.

Patch updated:
Changed pointer check to an assert.

Sat, Aug 8, 8:30 AM · Restricted Project
spatel committed rGf22ac1d15b1b: [DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division… (authored by spatel).
[DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division…
Sat, Aug 8, 8:11 AM
spatel committed rGba4c214181d1: [x86] add tests for another reciprocal sqrt pattern; NFC (authored by spatel).
[x86] add tests for another reciprocal sqrt pattern; NFC
Sat, Aug 8, 8:11 AM

Fri, Aug 7

spatel added a comment to D85533: [InstCombine] Optimize select(freeze(icmp eq/ne x, y), x, y).

Is there a reason to do this in InstCombine rather than InstSimplify? We are not creating any new instructions.

Fri, Aug 7, 2:07 PM · Restricted Project
spatel updated the diff for D81766: [VectorCombine] try to create vector loads from scalar loads.

Patch updated:
Use TTI.getScalarizationOverhead() to model insert cost of original code. I had not used this API before and the documentation comment isn't entirely clear to me, so please see if that looks as intended.

Fri, Aug 7, 9:15 AM · Restricted Project
spatel accepted D85416: [ScalarizeMaskedMemIntrin] Scalarize constant mask expandload as shuffle(build_vector,pass_through).

I'm ok with this as-is (mod "-1" code nit) followed by cleanup, so LGTM.
Wait a bit to see if anyone else wants to comment though.

Fri, Aug 7, 8:14 AM · Restricted Project
spatel added a comment to D84250: [InstSimplify] ExpandBinOp should fold an expression only when it's safe.

Should we go to D84792's direction?
If it is, I think we should mention that it is a workaround for the miscompilations including PR33165, and it can still be problematic in theory but chose to do it for performance.

Fri, Aug 7, 8:06 AM · Restricted Project
spatel added a comment to D85504: [Reassociate] [PowerPC] stop common out mul factors if fma is preferred on target.

The reassociate pass (like instcombine) is supposed to be a target-independent canonicalization pass:

  // This pass reassociates commutative expressions in an order that is designed
  // to promote better constant propagation, GCSE, LICM, PRE, etc.

So it's intentionally avoiding using TTI/TLI and trying not to do all forms of reassociation. As Roman mentioned, see D67383 as an example of a possible TTI-aware reassociation pass.

Fri, Aug 7, 7:50 AM · Restricted Project
spatel updated the diff for D81766: [VectorCombine] try to create vector loads from scalar loads.

Patch updated:
Adjusted to match the most basic pattern that starts with an insertelement (so there's no extract created here). Hopefully, that removes any concern about interactions with other passes. Ie, the transform should almost always be profitable. (We could make an argument that this could be part of canonicalization, but we conservatively try not to create vector ops from scalar ops in passes like instcombine.)

Fri, Aug 7, 6:27 AM · Restricted Project

Thu, Aug 6

spatel added inline comments to rGe8760bb9a8a3: [InstSimplify] fold icmp with mul nsw and constant operands.
Thu, Aug 6, 1:08 PM
spatel committed rG250a167c4181: [InstSimplify] avoid crashing by trying to rem-by-zero (authored by spatel).
[InstSimplify] avoid crashing by trying to rem-by-zero
Thu, Aug 6, 1:07 PM
spatel committed rGc9bcc237a284: [VectorCombine] add tests for load+insert; NFC (authored by spatel).
[VectorCombine] add tests for load+insert; NFC
Thu, Aug 6, 12:45 PM
spatel committed rG60f2c6a94cdf: [PatternMatch] allow intrinsic form of min/max with existing matchers (authored by spatel).
[PatternMatch] allow intrinsic form of min/max with existing matchers
Thu, Aug 6, 7:53 AM
spatel closed D85230: [PatternMatch] allow intrinsic form of min/max with existing matchers.
Thu, Aug 6, 7:52 AM · Restricted Project
spatel added inline comments to D85416: [ScalarizeMaskedMemIntrin] Scalarize constant mask expandload as shuffle(build_vector,pass_through).
Thu, Aug 6, 7:18 AM · Restricted Project

Wed, Aug 5

spatel committed rGc66169136fe6: [InstCombine] fold icmp with 'mul nsw/nuw' and constant operands (authored by spatel).
[InstCombine] fold icmp with 'mul nsw/nuw' and constant operands
Wed, Aug 5, 2:30 PM
spatel committed rG0315571a19bb: [InstCombine] add tests for icmp with mul nsw/nuw; NFC (authored by spatel).
[InstCombine] add tests for icmp with mul nsw/nuw; NFC
Wed, Aug 5, 2:30 PM
spatel committed rGe8760bb9a8a3: [InstSimplify] fold icmp with mul nsw and constant operands (authored by spatel).
[InstSimplify] fold icmp with mul nsw and constant operands
Wed, Aug 5, 11:39 AM
spatel committed rGf879c9b79621: [InstSimplify] fold icmp with mul nuw and constant operands (authored by spatel).
[InstSimplify] fold icmp with mul nuw and constant operands
Wed, Aug 5, 11:32 AM
spatel committed rGa569a0af0d96: [InstSimplify] add vector tests for icmp with mul nuw; NFC (authored by spatel).
[InstSimplify] add vector tests for icmp with mul nuw; NFC
Wed, Aug 5, 11:32 AM
spatel committed rG719954eacb70: [InstSimplify] add tests for icmp with 'mul nuw' operand; NFC (authored by spatel).
[InstSimplify] add tests for icmp with 'mul nuw' operand; NFC
Wed, Aug 5, 9:47 AM
spatel committed rGbd2c88b253b0: [InstSimplify] reduce code duplication in simplifyICmpWithMinMax(); NFC (authored by spatel).
[InstSimplify] reduce code duplication in simplifyICmpWithMinMax(); NFC
Wed, Aug 5, 8:40 AM

Tue, Aug 4

spatel added a comment to D85230: [PatternMatch] allow intrinsic form of min/max with existing matchers.

Maybe make it clear at the start of InstCombinerImpl::foldSelectOpOp where we use the matchers that we can safely use the matchers for the select(icmp) pattern OR the intrinsics?

Tue, Aug 4, 2:18 PM · Restricted Project
spatel requested review of D85230: [PatternMatch] allow intrinsic form of min/max with existing matchers.
Tue, Aug 4, 11:12 AM · Restricted Project
spatel committed rG960cef75f4d2: [InstSimplify] add tests for compare of min/max; NFC (authored by spatel).
[InstSimplify] add tests for compare of min/max; NFC
Tue, Aug 4, 10:55 AM
spatel committed rGa16882047a3f: [InstSimplify] refactor min/max folds with shared operand; NFC (authored by spatel).
[InstSimplify] refactor min/max folds with shared operand; NFC
Tue, Aug 4, 9:21 AM
spatel committed rG04e45ae1c6d2: [InstSimplify] fold nested min/max intrinsics with constant operands (authored by spatel).
[InstSimplify] fold nested min/max intrinsics with constant operands
Tue, Aug 4, 5:45 AM
spatel committed rG011e15bea345: [InstSimplify] add tests for min/max with constants; NFC (authored by spatel).
[InstSimplify] add tests for min/max with constants; NFC
Tue, Aug 4, 5:19 AM
spatel committed rG20c71e55aad5: [InstSimplify] reduce code for min/max analysis; NFC (authored by spatel).
[InstSimplify] reduce code for min/max analysis; NFC
Tue, Aug 4, 5:19 AM

Mon, Aug 3

spatel committed rG9e5cf6bde596: [InstSimplify] fold variations of max-of-min with common operand (authored by spatel).
[InstSimplify] fold variations of max-of-min with common operand
Mon, Aug 3, 12:16 PM
spatel committed rG7efd9ceb588b: [InstSimplify] add tests for min-of-max variants; NFC (authored by spatel).
[InstSimplify] add tests for min-of-max variants; NFC
Mon, Aug 3, 12:16 PM
spatel committed rG23693ffc3ba6: [InstCombine] reduce xor-of-or's bitwise logic (PR46955); 2nd try (authored by spatel).
[InstCombine] reduce xor-of-or's bitwise logic (PR46955); 2nd try
Mon, Aug 3, 7:28 AM
spatel added a reverting change for rG2265d01f2a5b: [InstCombine] reduce xor-of-or's bitwise logic (PR46955): rGf19a9be385ef: Revert "[InstCombine] reduce xor-of-or's bitwise logic (PR46955)".
Mon, Aug 3, 6:00 AM
spatel committed rGf19a9be385ef: Revert "[InstCombine] reduce xor-of-or's bitwise logic (PR46955)" (authored by spatel).
Revert "[InstCombine] reduce xor-of-or's bitwise logic (PR46955)"
Mon, Aug 3, 6:00 AM
spatel committed rG2265d01f2a5b: [InstCombine] reduce xor-of-or's bitwise logic (PR46955) (authored by spatel).
[InstCombine] reduce xor-of-or's bitwise logic (PR46955)
Mon, Aug 3, 5:32 AM
spatel committed rGb57ea8ef2a8a: [InstCombine] add tests for xor-of-ors; NFC (authored by spatel).
[InstCombine] add tests for xor-of-ors; NFC
Mon, Aug 3, 5:32 AM

Sun, Aug 2

spatel committed rG4abc69c6f541: [InstSimplify] fold max (max X, Y), X --> max X, Y (authored by spatel).
[InstSimplify] fold max (max X, Y), X --> max X, Y
Sun, Aug 2, 8:51 AM
spatel committed rGe37987563ad1: [InstSimplify] add tests for max(max x,y), x) and variants; NFC (authored by spatel).
[InstSimplify] add tests for max(max x,y), x) and variants; NFC
Sun, Aug 2, 8:51 AM

Sat, Aug 1

spatel updated the diff for D81766: [VectorCombine] try to create vector loads from scalar loads.

Patch updated:
No code changes, but added tests for non-zero gep offsets. This actually works more than I was expecting given that we're only using the base stripPointerCasts(). If there are enough dereferenceable bytes, isSafeToLoadUnconditionally() can still manage to use the offset load via the existing gep.

Sat, Aug 1, 7:26 AM · Restricted Project
spatel added inline comments to D81766: [VectorCombine] try to create vector loads from scalar loads.
Sat, Aug 1, 7:20 AM · Restricted Project
spatel committed rGd620a6fe98f7: [VectorCombine] add tests for non-zero gep offsets; NFC (authored by spatel).
[VectorCombine] add tests for non-zero gep offsets; NFC
Sat, Aug 1, 7:19 AM
spatel added a comment to D85053: [InstSimplify] Fold abs(abs(x)) -> abs(x).

Will need update after:
rG04b99a4d18cf

Sat, Aug 1, 7:03 AM · Restricted Project
spatel committed rG04b99a4d18cf: [InstSimplify] simplify abs if operand is known non-negative (authored by spatel).
[InstSimplify] simplify abs if operand is known non-negative
Sat, Aug 1, 4:55 AM
spatel committed rG1aa52d67d1c1: [InstSimplify] add abs test with assume; NFC (authored by spatel).
[InstSimplify] add abs test with assume; NFC
Sat, Aug 1, 4:55 AM
spatel closed D85043: [InstSimplify] simplify abs if operand is known non-negative.
Sat, Aug 1, 4:55 AM · Restricted Project

Fri, Jul 31

spatel committed rG77a02527dc39: [InstSimplify] add tests for abs intrinsic; NFC (authored by spatel).
[InstSimplify] add tests for abs intrinsic; NFC
Fri, Jul 31, 3:50 PM
spatel updated the diff for D85043: [InstSimplify] simplify abs if operand is known non-negative.

Patch updated:
Pass in more info from the SimplifyQuery to ValueTracking.
Not sure what it takes to show the benefit, so no test diffs, but if there are suggestions, let me know.

Fri, Jul 31, 3:46 PM · Restricted Project
spatel accepted D85000: [ValueTracking] Improve llvm.abs handling in computeKnownBits..
Fri, Jul 31, 3:40 PM · Restricted Project
spatel added a comment to D85000: [ValueTracking] Improve llvm.abs handling in computeKnownBits..

LGTM

Fri, Jul 31, 3:40 PM · Restricted Project
spatel requested review of D85043: [InstSimplify] simplify abs if operand is known non-negative.
Fri, Jul 31, 1:48 PM · Restricted Project
spatel committed rGe591713bff1f: [ConstantFolding] fold abs intrinsic (authored by spatel).
[ConstantFolding] fold abs intrinsic
Fri, Jul 31, 11:09 AM
spatel closed D84942: [ConstantFolding] fold abs intrinsic.
Fri, Jul 31, 11:09 AM · Restricted Project
spatel accepted D84250: [InstSimplify] ExpandBinOp should fold an expression only when it's safe.

It's not clear to me how these all of the undef safety features will come together, but this looks correct and unlikely to cause regressions, so LGTM.

Fri, Jul 31, 9:14 AM · Restricted Project
spatel accepted D84971: [ValueTracking] Add ComputeNumSignBits support for llvm.abs intrinsic.

LGTM

Fri, Jul 31, 9:02 AM · Restricted Project
spatel added a comment to D81766: [VectorCombine] try to create vector loads from scalar loads.

Ping.

Fri, Jul 31, 9:02 AM · Restricted Project
spatel added inline comments to D85000: [ValueTracking] Improve llvm.abs handling in computeKnownBits..
Fri, Jul 31, 9:02 AM · Restricted Project
spatel added a comment to D84250: [InstSimplify] ExpandBinOp should fold an expression only when it's safe.

I tried to find other examples where this could happen, and I can't find anything else either.
So I think this just needs some minor changes.
However, Alive2 does not show the "f_dontfold" test example or my reduced candidate test as incorrect - is that a bug for Alive2?
https://alive2.llvm.org/ce/z/dbGLzN

Fri, Jul 31, 6:02 AM · Restricted Project

Thu, Jul 30

spatel added a comment to D84963: [ValueTracking] Add basic computeKnownBits support for llvm.abs intrinsic.

The SelectionDAG version does the following, so it would make sense to adapt these here:

// If the source's MSB is zero then we know the rest of the bits already.
if (Known2.isNonNegative()) {
  Known.Zero = Known2.Zero;
  Known.One = Known2.One;
  break;
}
Thu, Jul 30, 4:02 PM · Restricted Project
spatel accepted D84963: [ValueTracking] Add basic computeKnownBits support for llvm.abs intrinsic.

LGTM

Thu, Jul 30, 3:55 PM · Restricted Project
spatel committed rG7551fd5ef8fd: [InstCombine] update test checks; NFC (authored by spatel).
[InstCombine] update test checks; NFC
Thu, Jul 30, 11:17 AM