spatel (Sanjay Patel)
User

Projects

User does not belong to any projects.

User Details

User Since
May 22 2014, 1:24 PM (156 w, 4 d)

Recent Activity

Today

spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

Nothing's easy with this one...the transform that I thought could incur a regression may not even be legal...but it's not clear if anyone knows for certain. :)
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html

Mon, May 22, 2:48 PM
spatel added a comment to D33375: [InstCombine] Don't fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) when X is a PHI node..
In D33375#761185, @wdng wrote:
<badref> = icmp slt i32 %myValue.0, 254
Mon, May 22, 10:06 AM
spatel added a comment to D33375: [InstCombine] Don't fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) when X is a PHI node..

Sorry, but I still don't understand.

Mon, May 22, 9:47 AM
spatel added a comment to D33375: [InstCombine] Don't fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) when X is a PHI node..

I don't understand why this change is needed. Can you explain how the test in this patch can miscompile?

Mon, May 22, 9:09 AM
spatel updated subscribers of D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns.

This is strictly safer (does the transform less often), so LGTM. Bonus: we're not creating an unnecessary cast in the case where the result size is the same as the source op.

@davide, let me know if you plan to work on the (lshr (sext X), C) patterns. I checked in the tests with rL303504.

I do plan to work on them, but if you have some bandwidth, any help would be appreciated :)
Should we open a bug so that we don't forget?

Mon, May 22, 8:38 AM
spatel added a comment to D32725: [InstCombine] Apply deMorgan to (and/or (not cmp1), cmp2) when cmp1 has multiple uses, but cmp2 has a single use.

I forgot that I already filed this bug that will result if we lift the one-use restriction for // xor (cmp A, B), true = not (cmp A, B) = !cmp A, B :
https://bugs.llvm.org/show_bug.cgi?id=32791

Mon, May 22, 7:30 AM
spatel added inline comments to D33406: PR28129 expand vector oparation to an IR constant..
Mon, May 22, 6:42 AM

Yesterday

spatel accepted D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns.

This is strictly safer (does the transform less often), so LGTM. Bonus: we're not creating an unnecessary cast in the case where the result size is the same as the source op.

Sun, May 21, 8:22 AM
spatel committed rL303504: [InstCombine] add tests for potential (lshr(sext X), C) folds; NFC.
[InstCombine] add tests for potential (lshr(sext X), C) folds; NFC
Sun, May 21, 8:19 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

There are at least a couple of current patches that delve into the same philosophical territory, so I think it's worth following the discussions here too:
https://reviews.llvm.org/D33342
https://reviews.llvm.org/D33338

Sun, May 21, 8:09 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

Rebased onto rL303133. Sorry, I'm a new contributor, so I'm having trouble understanding how to proceed here. Is there anything specifically I should change in this diff? Should I be using the isCanonicalPredicate helper added in rL303261? Or should I be using a different approach altogether?

Sun, May 21, 7:45 AM

Fri, May 19

spatel accepted D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X).

I agree that we shouldn't gate this patch waiting for a reply to the i1024 question. If you're confident that the code in those super-wide cases is still correct, then LGTM.

Fri, May 19, 10:41 AM
spatel added a comment to D33137: [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790).

Ping.

Fri, May 19, 10:22 AM
spatel added a comment to D32143: [InstSimplify] use ConstantRange to simplify more and-of-icmps.

Here's a proposal to fix xor-of-icmps better:
D33342

Fri, May 19, 10:14 AM
spatel added a comment to D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns.

Here are some tests for the smaller patterns that we might want to match:

Fri, May 19, 8:50 AM
spatel added a comment to D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns.

So what youre suggesting here is to canonicalize and remove this Xform entirely?

Fri, May 19, 8:43 AM
spatel added a comment to D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns.

I think it's good to solve the miscompile case ASAP, but fundamentally, this block of code is error-prone and incomplete because we're missing earlier folds that would have kept us out of trouble from trying to reason about a 3 instruction sequence (trunc(lshr (sext A), C).

Fri, May 19, 7:49 AM

Thu, May 18

spatel created D33342: [InstCombine] try to canonicalize xor-of-icmps to and-of-icmps.
Thu, May 18, 4:41 PM
spatel committed rL303387: [InstCombine] add more tests for xor-of-icmps; NFC.
[InstCombine] add more tests for xor-of-icmps; NFC
Thu, May 18, 4:01 PM
spatel committed rL303381: [InstCombine] add helper to foldXorOfICmps(); NFCI.
[InstCombine] add helper to foldXorOfICmps(); NFCI
Thu, May 18, 2:06 PM
spatel accepted D33331: [InstSimplify] Make m_Not work for xor -1, X.

LGTM.

Thu, May 18, 1:02 PM
spatel committed rL303364: [InstCombine] move test and use better checks; NFC.
[InstCombine] move test and use better checks; NFC
Thu, May 18, 11:01 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

A couple of updates:

  1. Added the tests included here to show the current IR with: rL303133 / rL303185
Thu, May 18, 9:05 AM
spatel added a comment to D30686: [SLP] PR32078: convert scalar operations to vector..

For reference, the instcombine patch proposal was D32093.

Thu, May 18, 8:35 AM
spatel added a reviewer for D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X): chfast.

The test file(s) were added with D24478 / rL281403. cc'ing @chfast to see if it's necessary.

Thu, May 18, 6:46 AM

Wed, May 17

spatel committed rL303315: [InstCombine] add test for xor-of-icmps; NFC.
[InstCombine] add test for xor-of-icmps; NFC
Wed, May 17, 4:36 PM
spatel added a comment to D32143: [InstSimplify] use ConstantRange to simplify more and-of-icmps.

This should reduce the chance of finding the optimization gap:
rL303312

Wed, May 17, 4:24 PM
spatel committed rL303312: [InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern.
[InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern
Wed, May 17, 3:43 PM
spatel committed rL303310: [InstCombine] add test for missing icmp bool fold; NFC.
[InstCombine] add test for missing icmp bool fold; NFC
Wed, May 17, 3:33 PM
spatel committed rL303309: [InstCombine] move icmp bool canonicalizations to helper; NFC.
[InstCombine] move icmp bool canonicalizations to helper; NFC
Wed, May 17, 3:28 PM
spatel committed rL303295: [InstSimplify] handle all icmp i1 X, C in one place; NFCI.
[InstSimplify] handle all icmp i1 X, C in one place; NFCI
Wed, May 17, 1:41 PM
spatel added a comment to D32143: [InstSimplify] use ConstantRange to simplify more and-of-icmps.

This is a fun one. The ultimate problem is that by making InstSimplify smarter, we revealed a shortcoming in InstCombine. InstCombine is missing a fold for xor-of-icmps like this:

Wed, May 17, 9:34 AM
spatel committed rL303261: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI.
[InstCombine] add isCanonicalPredicate() helper function and use it; NFCI
Wed, May 17, 7:34 AM
spatel closed D33247: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI by committing rL303261: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI.
Wed, May 17, 7:34 AM
spatel added a comment to D32143: [InstSimplify] use ConstantRange to simplify more and-of-icmps.

I'm seeing a performance regression which I traced back to this commit. A reduced test case to show the problem is this:

define i1 @test(i64 %a) {
  %cmpge = icmp sge i64 %a, 1
  %cmpeq = icmp eq i64 %a, 1
  %cmpne = icmp eq i1 %cmpeq, false
  %cmp = and i1 %cmpge, %cmpne
  ret i1 %cmp
}

Before this commit, InstCombine would reduce the above to simply:

define i1 @test(i64 %a) {
  %1 = icmp sgt i64 %a, 1
  ret i1 %1
}

but for some reason this no longer happens now.

Note that when directly using an "icmp ne" instead of inverting the result of the "icmp eq", the optimization does still happen.

Wed, May 17, 7:17 AM
spatel committed rL303258: [x86] Update tests in psubus.ll; NFC.
[x86] Update tests in psubus.ll; NFC
Wed, May 17, 6:52 AM
spatel closed D32643: Update tests in psubus.ll by committing rL303258: [x86] Update tests in psubus.ll; NFC.
Wed, May 17, 6:52 AM

Tue, May 16

spatel committed rL303213: [InstSimplify] add folds for constant mask of value shifted by constant.
[InstSimplify] add folds for constant mask of value shifted by constant
Tue, May 16, 3:04 PM
spatel closed D33221: [InstSimplify] add folds for constant mask of value shifted by constant by committing rL303213: [InstSimplify] add folds for constant mask of value shifted by constant.
Tue, May 16, 3:04 PM
spatel updated the diff for D33247: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI.

Patch updated:
Although this patch is NFCI, we had exactly zero tests for the FCMP transforms, and the ICMP transforms were not tested completely. Given the lumpy handling of the predicates, I think it's best to check all of them to make sure we're doing exactly what we think we're doing.

Tue, May 16, 2:09 PM
spatel committed rL303203: [InstCombine] auto-generate better checks; NFC.
[InstCombine] auto-generate better checks; NFC
Tue, May 16, 1:23 PM
spatel added inline comments to D33247: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI.
Tue, May 16, 10:58 AM
spatel created D33247: [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI.
Tue, May 16, 10:37 AM
spatel committed rL303185: [InstCombine] add motivational comment for tests; NFC.
[InstCombine] add motivational comment for tests; NFC
Tue, May 16, 9:44 AM
spatel closed D33242: [InstCombine] add motivational comment for tests; NFC by committing rL303185: [InstCombine] add motivational comment for tests; NFC.
Tue, May 16, 9:44 AM
spatel added a comment to D32643: Update tests in psubus.ll.

Do you need someone to commit this for you? I just hit a case where another transform would affect codegen in this file, so I'd like to make sure these tests are minimal.

Tue, May 16, 9:40 AM
spatel added a comment to D33242: [InstCombine] add motivational comment for tests; NFC.

The wording is OK. What kind of infloop are you talking about here? Can you please elaborate?

Tue, May 16, 9:27 AM
spatel updated the diff for D33221: [InstSimplify] add folds for constant mask of value shifted by constant.

Patch updated:

  1. Inverted the mask constant to simplify the code and make the 2 cases symmetrical.
  2. Modified the code comments to explain that logic.
Tue, May 16, 8:52 AM
spatel added inline comments to D33221: [InstSimplify] add folds for constant mask of value shifted by constant.
Tue, May 16, 8:45 AM
spatel created D33242: [InstCombine] add motivational comment for tests; NFC.
Tue, May 16, 8:19 AM

Mon, May 15

spatel committed rL303133: [InstCombine] add tests for PR32791; NFC.
[InstCombine] add tests for PR32791; NFC
Mon, May 15, 5:12 PM
spatel created D33221: [InstSimplify] add folds for constant mask of value shifted by constant.
Mon, May 15, 4:25 PM
spatel committed rL303127: [InstSimplify] add tests for unnecessary mask of shifted values; NFC.
[InstSimplify] add tests for unnecessary mask of shifted values; NFC
Mon, May 15, 4:08 PM
spatel committed rL303105: [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949).
[InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949)
Mon, May 15, 12:41 PM
spatel closed D32970: [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949) by committing rL303105: [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949).
Mon, May 15, 12:41 PM
spatel committed rL303104: [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949).
[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)
Mon, May 15, 12:30 PM
spatel closed D32954: [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949) by committing rL303104: [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949).
Mon, May 15, 12:30 PM
spatel committed rL303090: [InstCombine] use m_OneUse to reduce code; NFCI.
[InstCombine] use m_OneUse to reduce code; NFCI
Mon, May 15, 11:21 AM
spatel added a comment to D32970: [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949).

This is the same as the other no? If so, LGTM.

Mon, May 15, 10:51 AM
spatel added reviewers for D32970: [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949): davide, efriedma.

Ping.

Mon, May 15, 10:31 AM
spatel added a comment to D32954: [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949).

Ping.

Mon, May 15, 10:28 AM
spatel added reviewers for D32954: [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949): efriedma, davide.
Mon, May 15, 10:28 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

The problem may be more general than that. It's really about recognizing inverted compares, so let's take selects out of the equation (using div ops to thwart hoisting/sinking here, but that's just to produce a minimal IR example):

define i32 @inverted_compares(i32 %a, i32 %b, i32 %c) {
entry:
  %cmp = icmp slt i32 %a, %b
  br i1 %cmp, label %t1, label %f1
Mon, May 15, 10:01 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

Swapping select operands to eliminate an inverted compare seems like a generally good fold for some other pass too. Would that be GVN?

Not entirely sure I follow this last one.
NewGVN finds whether two expressions e1 and e2 are equivalent (actually, not general equivalence as checking equivalence of program expressions is undecidable, so we approximate with Herbrand equivalence). If you have an example (IR), that will help (and I can take a look).

Mon, May 15, 9:21 AM
spatel added a comment to D33172: [InstCombine] Simpify inverted predicates in 'or'.

I agree that we're missing some kind of more general transform. Eg:

Mon, May 15, 8:11 AM
spatel added inline comments to D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X).
Mon, May 15, 6:12 AM

Sun, May 14

spatel added inline comments to D33172: [InstCombine] Simpify inverted predicates in 'or'.
Sun, May 14, 4:48 PM

Fri, May 12

spatel committed rL302950: [Doc] Document "Splat" in the lexicon.
[Doc] Document "Splat" in the lexicon
Fri, May 12, 2:43 PM
spatel closed D32964: [Doc] Document "Splat" in the lexicon by committing rL302950: [Doc] Document "Splat" in the lexicon.
Fri, May 12, 2:43 PM
spatel added a comment to D31724: [SelectionDAG] Remove special call to LHS computeKnownBits for ANDs with constant RHS..

Yeah I think we should. That failure only showed up when I was rebasing. Is that something I should look into?

Fri, May 12, 2:22 PM
spatel added a comment to D31724: [SelectionDAG] Remove special call to LHS computeKnownBits for ANDs with constant RHS..

Correction, the combine-and problem seems to be because SimplifyDemandedBits only considers scalar constants for shift amount, not constant splats from a build_vector.

Fri, May 12, 2:09 PM
spatel committed rL302949: [x86] add vector tests for demanded bits; NFC.
[x86] add vector tests for demanded bits; NFC
Fri, May 12, 2:07 PM
spatel added inline comments to D32416: [x86, SSE] AVX1 PR28129 .
Fri, May 12, 12:55 PM
spatel added inline comments to D33137: [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790).
Fri, May 12, 12:38 PM
spatel created D33137: [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790).
Fri, May 12, 9:56 AM
spatel committed rL302910: [x86] add tests for potential vector narrowing optimization (PR32790).
[x86] add tests for potential vector narrowing optimization (PR32790)
Fri, May 12, 9:10 AM
spatel added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

I agree, that it looks better as canonical, then my example, when trunc is in between the pattern's instructions. What do you think about defining trunc(sub(select())) canonical for this pattern? Then we can canonicalize my initial pattern using the InstCombine patch I attached(https://reviews.llvm.org/D33118) - it looks like this way the .ll code generated for both of them would be the same. After this transformation, the initial pattern can be transformed into 32bit sub(max). I haven't yet investigated how to transform it into psubus, but at least max is more profitable then currently generated instructions.

What do you think about this idea?

Fri, May 12, 7:30 AM

Wed, May 10

spatel committed rL302733: [InstCombine] remove fold that swaps xor/or with constants; NFCI.
[InstCombine] remove fold that swaps xor/or with constants; NFCI
Wed, May 10, 2:47 PM
spatel closed D33050: [InstCombine] remove fold that swaps xor/or with constants by committing rL302733: [InstCombine] remove fold that swaps xor/or with constants; NFCI.
Wed, May 10, 2:47 PM
spatel added a comment to D33050: [InstCombine] remove fold that swaps xor/or with constants.

It's a canonicalization of sorts; could help pick up more complicated patterns like ((a ^ c1) | c2) ^ c3. Please make sure we have a regression test like this for instcombine.

https://reviews.llvm.org/rL7264 for reference.

Wed, May 10, 1:33 PM
spatel created D33050: [InstCombine] remove fold that swaps xor/or with constants.
Wed, May 10, 9:38 AM
spatel committed rL302684: [InstSimplify, InstCombine] move 'or' simplification tests; NFC.
[InstSimplify, InstCombine] move 'or' simplification tests; NFC
Wed, May 10, 9:11 AM
spatel committed rL302676: [InstCombine] remove redundant tests.
[InstCombine] remove redundant tests
Wed, May 10, 8:08 AM
spatel committed rL302674: [InstCombine] fix auto-generated FileCheck-captured variable refs.
[InstCombine] fix auto-generated FileCheck-captured variable refs
Wed, May 10, 7:53 AM
spatel committed rL302669: [InstCombine] fix typo in test comment; NFC.
[InstCombine] fix typo in test comment; NFC
Wed, May 10, 7:38 AM
spatel added a comment to D32964: [Doc] Document "Splat" in the lexicon.

How about we add another hardware example and group those together:

Wed, May 10, 7:24 AM
spatel committed rL302659: [InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1.
[InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1
Wed, May 10, 7:10 AM

Tue, May 9

spatel committed rL302605: [InstCombine] add helper function for add X, C folds; NFCI.
[InstCombine] add helper function for add X, C folds; NFCI
Tue, May 9, 5:20 PM
spatel committed rL302599: [InstCombine] add tests for andn; NFC.
[InstCombine] add tests for andn; NFC
Tue, May 9, 4:53 PM
spatel committed rL302585: [InstCombine] update test file to use FileCheck; NFC.
[InstCombine] update test file to use FileCheck; NFC
Tue, May 9, 1:59 PM
spatel committed rL302581: [InstCombine] clean up matchDeMorgansLaws(); NFCI.
[InstCombine] clean up matchDeMorgansLaws(); NFCI
Tue, May 9, 1:18 PM
spatel accepted D31961: DAGCombine: Combine shuffles of splat-shuffles.

LGTM. Thanks for adding the examples in the code comments.

Tue, May 9, 10:25 AM
spatel committed rL302548: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..
[InstCombineCasts] Fix checks in sext->lshr->trunc pattern.
Tue, May 9, 9:38 AM
spatel closed D32285: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern. by committing rL302548: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..
Tue, May 9, 9:38 AM
spatel added a comment to D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X).

See inline for a few nits, but I think this makes sense now. If I'm seeing the diffs correctly, there was no codegen difference for x86 after adding the 'and' mask, so we must be recognizing and optimizing that pattern. @RKSimon - do you see any other problems?

Tue, May 9, 8:53 AM
spatel accepted D32285: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..

I don't have commit access so if this is good to go could you commit it for me? Meanwhile, I'll work on actually cleaning up these two transforms.

Tue, May 9, 8:26 AM
spatel added inline comments to D31961: DAGCombine: Combine shuffles of splat-shuffles.
Tue, May 9, 8:18 AM
spatel added a comment to D32285: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..

Ok, your explanation makes this clearer. Since we're fixing a miscompile, I think it's ok to keep this patch mostly as-is. But please add some FIXME notes and rebase this patch after rL302475:

  1. This code is trying to do too much in one place IMO. It (and the related zext transform above it) should be split off into a helper function so that the (common?) case where "ASize" is the same as the final size (CI.getType()) are handled separately. When those types match, we do not need to check hasOneUse as strictly. We're also wastefully calling CastInst::CreateIntegerCast() in that case.
  2. We've artificially excluded vector types by using m_ConstantInt().
  3. As I showed in the Alive example, we can handle the case where (ShiftAmt > SExtSize - ASize) by adding an 'and' mask. This transform makes sense with appropriate one-use checking.
  4. We can use m_OneUse() with the match() calls to make the code cleaner.
Tue, May 9, 7:46 AM

Mon, May 8

spatel added a comment to D32285: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..

Note - Alive is very helpful to check that we're not either miscompiling or missing optimizations:
http://rise4fun.com/Alive/2I

Mon, May 8, 4:04 PM
spatel added a comment to D32285: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern..

I couldn't tell what was happening in all these cases from the code or tests, so I added the tests with their current output here:
https://reviews.llvm.org/rL302475

Mon, May 8, 3:51 PM