spatel (Sanjay Patel)
User

Projects

User does not belong to any projects.

User Details

User Since
May 22 2014, 1:24 PM (178 w, 3 d)

Recent Activity

Today

spatel accepted D38498: [CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads, part 1..

LGTM.

Mon, Oct 23, 6:34 AM

Yesterday

spatel added a comment to D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).

Delay switch condition forwarding to -latesimplifycfg:
rL316298

Sun, Oct 22, 12:17 PM
spatel committed rL316298: [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg.
[SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
Sun, Oct 22, 12:10 PM
spatel added a comment to D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).

So, in the original test and code it's hard to tell, but yes, if it's trying to undo the propagation of constants into phi nodes, it's already going to be fighting, because everything else is trying to do that transform specifically :)

It's probably worth taking a look at codegen and seeing how hard it is.
So i'd say "delay it all to latesimplifycfg for now, see how hard codegen is".
I'm hopeful it's not that tricky to fix codegen. But if it is, yeah let's see how far we want to go here, i don't think you should be on the hook for cleaning all of this up if it's hard.

(One could also argue about the value of constant propagation in this case. It's mostly useful for range analysis, jump threading, transformation into select, etc. I'm indifferent if we want to canonicalize late to some other form if we can show it's better)

Sun, Oct 22, 11:18 AM
spatel planned changes to D38631: [SimplifyCFG] use pass options and remove the latesimplifycfg pass.

My description of the new pass manager behavior was wrong and inverted. It doesn't always create "-earlysimplifycfg" equivalent invocations. It always creates "-latesimplifycfg" equivalent invocations. I need to correct this patch to preserve that behavior (and add tests so it isn't accidentally broken).

Sun, Oct 22, 11:18 AM
spatel added a comment to D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).

Hey Sanjay,
GVN/NewGVN/et al will undo this transform, and propagate constants. In fact, i expect pretty much any pass that can, will.

This definitely belongs in the late simplification category at best.

More than that, i'm concerned this is the wrong fix.

Why is the right answer to undo the transform (which will cause fights between passes) in simplifycfg instead of improving the codegen to recognize the issue?

It seems the same code you are using to detect and undo the transforrm in simplification could be used to codegen it better.

Sun, Oct 22, 10:46 AM
spatel added a reviewer for D38313: [InstCombine] Introducing Pattern Instruction Combine plug-in into InstCombine pass: majnemer.

If I'm seeing this correctly, it's an independent pass within InstCombine. It sits outside InstCombine's iteration loop, so it doesn't interact with the rest of the pass. What is the advantage of this approach vs. making a standalone pass?

Sun, Oct 22, 9:59 AM
spatel accepted D39164: [utils] Support -mtriple=powerpc64.

LGTM.

Sun, Oct 22, 9:52 AM
spatel committed rL316293: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).
[SimplifyCFG] try harder to forward switch condition to phi (PR34471)
Sun, Oct 22, 9:52 AM
spatel closed D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471) by committing rL316293: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).
Sun, Oct 22, 9:52 AM

Sat, Oct 21

spatel updated subscribers of D39160: [CodeGen] __builtin_sqrt should map to the compiler's intrinsic sqrt function.

Oops - I wrongly made llvm-commits a subscriber rather than cfe-commits. Let me know if I should reopen under a new thread to get the patch to hit the right mailing list.

Sat, Oct 21, 8:22 AM
spatel created D39160: [CodeGen] __builtin_sqrt should map to the compiler's intrinsic sqrt function.
Sat, Oct 21, 8:12 AM

Fri, Oct 20

spatel committed rL316250: [CodeGen] add tests for __builtin_sqrt*; NFC.
[CodeGen] add tests for __builtin_sqrt*; NFC
Fri, Oct 20, 4:35 PM
spatel committed rL316242: [utils, x86] add regex for retl/retq to reduce duplicated FileChecking (PR35003).
[utils, x86] add regex for retl/retq to reduce duplicated FileChecking (PR35003)
Fri, Oct 20, 2:55 PM
spatel committed rL316223: [x86] avoid FileCheck assert duplication with retl/retq regex; NFC.
[x86] avoid FileCheck assert duplication with retl/retq regex; NFC
Fri, Oct 20, 11:35 AM
spatel added a comment to D38498: [CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads, part 1..

Sorry for the delay. I've made a few inline suggestions. I'm still going through the diffs to try to understand, but if you can reply/update on those, I think it will make it easier.

Fri, Oct 20, 9:53 AM
spatel added a comment to D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).

What's the interaction between this code and constant propagation?

Fri, Oct 20, 8:27 AM

Tue, Oct 17

spatel created D39011: [SimplifyCFG] try harder to forward switch condition to phi (PR34471).
Tue, Oct 17, 10:07 AM
spatel committed rL316012: [ARM, AArch64] adjust tests trying to maintain their objective; NFC.
[ARM, AArch64] adjust tests trying to maintain their objective; NFC
Tue, Oct 17, 9:55 AM
spatel committed rL316008: [SimplifyCFG] add test for part of PR34471 (switch squashing); NFC.
[SimplifyCFG] add test for part of PR34471 (switch squashing); NFC
Tue, Oct 17, 8:57 AM
spatel committed rL316006: [SimplifyCFG] update test to use auto-generated FileCheck asserts; NFC.
[SimplifyCFG] update test to use auto-generated FileCheck asserts; NFC
Tue, Oct 17, 8:51 AM
spatel accepted D38536: Improve lookThroughCast function..

LGTM - see inline notes for a couple of changes to the code comments.

Tue, Oct 17, 8:35 AM

Mon, Oct 16

spatel added a comment to D38536: Improve lookThroughCast function..

For using constant from cmp instruction (instead of recreation) we will need to check that truncated constant is equal to lower bits of cmp constant.
That is why I decided just to recreate constant which looks like more general solution.

Mon, Oct 16, 4:14 PM
spatel accepted D38531: Improve clamp recognition in ValueTracking..

If you want to change that first equality predicate check (the one that was removed from the top of the function) into an assert, I think that would be good since it's not obvious that we've filtered those cases out before we call this function.

Mon, Oct 16, 2:54 PM
spatel added a comment to D38536: Improve lookThroughCast function..

I have read more carefully your comment. As far as I understood in D26556 you want to change the process of decision making for function that folds cast operation into select.
According to suggestion from Eli you should try to match the size of a select to a cmp of its condition operand as a final result.

Mon, Oct 16, 1:09 PM
spatel requested changes to D38531: Improve clamp recognition in ValueTracking..

This patch can miscompile because it's not checking predicates correctly. I'm not sure how to expose this in an IR test because we don't canonicalize enough yet, so I added x86 tests at rL315913.

Mon, Oct 16, 8:24 AM
spatel committed rL315913: [x86] add minmax tests with more predicate coverage; NFC.
[x86] add minmax tests with more predicate coverage; NFC
Mon, Oct 16, 8:20 AM
spatel committed rL315910: [InstCombine] don't unnecessarily generate a constant; NFCI.
[InstCombine] don't unnecessarily generate a constant; NFCI
Mon, Oct 16, 7:47 AM
spatel committed rL315909: [ValueTracking] fix typos, formatting; NFC.
[ValueTracking] fix typos, formatting; NFC
Mon, Oct 16, 7:46 AM

Sun, Oct 15

spatel committed rL315857: revert r314984: revert r314698 - [InstCombine] remove one-use restriction for….
revert r314984: revert r314698 - [InstCombine] remove one-use restriction for…
Sun, Oct 15, 8:39 AM
spatel committed rL315855: [SimplifyCFG] use range-for-loops, tidy; NFCI.
[SimplifyCFG] use range-for-loops, tidy; NFCI
Sun, Oct 15, 7:43 AM
spatel accepted D38934: Move folding of icmp with zero after checking for min/max idioms..

Do you have commit access?

Sun, Oct 15, 7:04 AM

Sat, Oct 14

spatel updated the diff for D38756: [x86] use an insert op to put one variable element into a constant of vectors.

Patch updated:
IIUC, it's not safe to assume that we can always expand the insertelement node properly (even though I haven't found a case where that's a problem). So I've added an additional check before we try this:

(isOperationLegalOrCustom(ISD::INSERT_VECTOR_ELT, VT) ||
 isOperationLegalOrCustom(ISD::VECTOR_SHUFFLE, VT))) {
Sat, Oct 14, 8:07 AM

Fri, Oct 13

spatel committed rL315762: [InstCombine] use m_Neg() to reduce code; NFCI.
[InstCombine] use m_Neg() to reduce code; NFCI
Fri, Oct 13, 2:29 PM
spatel committed rL315752: [Reassociate] auto-generate better checks; NFC.
[Reassociate] auto-generate better checks; NFC
Fri, Oct 13, 1:56 PM
spatel committed rL315745: [InstCombine] move code to remove repeated constant check; NFCI.
[InstCombine] move code to remove repeated constant check; NFCI
Fri, Oct 13, 1:29 PM
spatel committed rL315743: [InstCombine] recycle adds for better efficiency.
[InstCombine] recycle adds for better efficiency
Fri, Oct 13, 1:12 PM
spatel committed rL315728: [InstCombine] use local var to reduce code duplication; NFCI.
[InstCombine] use local var to reduce code duplication; NFCI
Fri, Oct 13, 11:33 AM
spatel committed rL315726: [LLVMCore] fix description for OverflowingBinaryOperator; NFC.
[LLVMCore] fix description for OverflowingBinaryOperator; NFC
Fri, Oct 13, 11:25 AM
spatel committed rL315718: [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing….
[InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing…
Fri, Oct 13, 10:47 AM
spatel committed rL315717: [InstCombine] add tests for add (zext (add nuw X, C2)), C --> zext (add nuw X….
[InstCombine] add tests for add (zext (add nuw X, C2)), C --> zext (add nuw X…
Fri, Oct 13, 10:42 AM
spatel committed rL315709: [InstCombine] use AddOne helper to reduce code; NFC.
[InstCombine] use AddOne helper to reduce code; NFC
Fri, Oct 13, 10:01 AM
spatel committed rL315703: [InstCombine] rearrange code to remove repeated constant check; NFCI.
[InstCombine] rearrange code to remove repeated constant check; NFCI
Fri, Oct 13, 9:44 AM
spatel committed rL315701: [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types.
[InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types
Fri, Oct 13, 9:29 AM
spatel added a comment to D38631: [SimplifyCFG] use pass options and remove the latesimplifycfg pass.

Ping.

Fri, Oct 13, 9:06 AM
spatel abandoned D38316: [InstCombine] replace bitcast to scalar + insertelement with widening shuffle + vector bitcast.

Abandoning - we solved this more generally in one way, but more restricted in another in the DAG with D38388. If we find other gaps in this kind of code, we could resurrect this.

Fri, Oct 13, 8:59 AM
spatel committed rL315681: [InstCombine] add tests for boolean extend + add; NFC.
[InstCombine] add tests for boolean extend + add; NFC
Fri, Oct 13, 7:09 AM

Thu, Oct 12

spatel closed D38637: [InstSimplify] don't let poison inhibit an easy fold.

Closed with rL315595

Thu, Oct 12, 10:36 AM
spatel committed rL315595: [ValueTracking] return zero when there's conflict in known bits of a shift….
[ValueTracking] return zero when there's conflict in known bits of a shift…
Thu, Oct 12, 10:32 AM
spatel added inline comments to D38696: [DAGCombine] Permit combining of shuffle of equivalent splat BUILD_VECTORs.
Thu, Oct 12, 9:48 AM
spatel added a comment to D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].

Thanks, fixed your comment in additional patch below.

Thu, Oct 12, 9:23 AM
spatel committed rL315589: [x86] replace isEqualTo with == for efficiency.
[x86] replace isEqualTo with == for efficiency
Thu, Oct 12, 9:16 AM

Wed, Oct 11

spatel updated the diff for D38756: [x86] use an insert op to put one variable element into a constant of vectors.

Patch updated:
Instead of creating a potentially illegal build vector of constants, create a legal vector of constants and explicitly load it for use in the insertelement.

Wed, Oct 11, 2:15 PM
spatel committed rL315485: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
[x86] avoid infinite loop from SoftenFloatOperand (PR34866)
Wed, Oct 11, 11:24 AM
spatel closed D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866) by committing rL315485: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
Wed, Oct 11, 11:24 AM
spatel added a comment to D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
In D38771#894855, @chh wrote:

This change LGTM.
I added more comments in https://bugs.llvm.org/show_bug.cgi?id=34866.

Note that fp128-cast.ll tests only cover the cases with +mmx.
If the original change in D31156 could affect fp128 and no-mmx cases,
it would be better to add test cases for clang -mno-mmx and/or -mno-sse.

Wed, Oct 11, 11:07 AM
spatel added inline comments to D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].
Wed, Oct 11, 10:36 AM
spatel added a reviewer for D38536: Improve lookThroughCast function.: efriedma.

This reminds me that I haven't revisited D26556...in ~6 months.

Wed, Oct 11, 9:41 AM
spatel added reviewers for D38531: Improve clamp recognition in ValueTracking.: majnemer, efriedma, reames.

I think this makes sense, but adding reviewers for more potential opinions.

Wed, Oct 11, 9:26 AM
spatel committed rL315461: [InstCombine] add baseline tests for D38531; NFC.
[InstCombine] add baseline tests for D38531; NFC
Wed, Oct 11, 7:29 AM
spatel committed rL315460: [DAGCombiner] convert insertelement of bitcasted vector into shuffle.
[DAGCombiner] convert insertelement of bitcasted vector into shuffle
Wed, Oct 11, 7:12 AM
spatel closed D38388: [DAGCombiner, x86] convert insertelement of bitcasted vector into shuffle by committing rL315460: [DAGCombiner] convert insertelement of bitcasted vector into shuffle.
Wed, Oct 11, 7:12 AM
spatel updated the diff for D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).

Patch updated:
Use peekThroughBitcasts() to make the code shorter and more robust.

Wed, Oct 11, 6:21 AM
spatel added inline comments to D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
Wed, Oct 11, 6:10 AM

Tue, Oct 10

spatel created D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
Tue, Oct 10, 3:38 PM
spatel added a comment to D38756: [x86] use an insert op to put one variable element into a constant of vectors.

I tried to work-around the FP issue by creating a constant pool load, but that got a bit ugly - I had to call LowerConstantPool() to create a legal vector of constants.

Yes, it's a little ugly, but it's self-contained. I don't have a better suggestion.

Tue, Oct 10, 2:50 PM
spatel added a comment to D38756: [x86] use an insert op to put one variable element into a constant of vectors.

This doesn't happen with int constants...because int constants are legal but FP constants are not?

Yes, exactly.

We call LowerBUILD_VECTOR before we legalize the operands of a BUILD_VECTOR, so that shouldn't block this optimization. But we don't recompute the topological ordering when you create new nodes, so legalization can happen in an unexpected order if you create illegal nodes (by, for example, calling getBuildVector()).

Tue, Oct 10, 2:37 PM
spatel updated the diff for D38756: [x86] use an insert op to put one variable element into a constant of vectors.

Patch updated - no code changes, but tried to fix the bogus test assertions:

  1. Recommitted the baseline checks with the check-prefix fixes for X64 (rL315368).
  2. Rebased this patch.
Tue, Oct 10, 2:17 PM
spatel committed rL315368: [x86] fix prefix typos for CHECK lines; NFC.
[x86] fix prefix typos for CHECK lines; NFC
Tue, Oct 10, 2:12 PM
spatel created D38756: [x86] use an insert op to put one variable element into a constant of vectors.
Tue, Oct 10, 12:29 PM
spatel committed rL315312: [x86] add tests for insertelement; NFC.
[x86] add tests for insertelement; NFC
Tue, Oct 10, 6:45 AM

Mon, Oct 9

spatel committed rL315223: [InstCombine] fix formatting; NFC.
[InstCombine] fix formatting; NFC
Mon, Oct 9, 10:56 AM
spatel added a comment to D38388: [DAGCombiner, x86] convert insertelement of bitcasted vector into shuffle.

Ping.

Mon, Oct 9, 10:00 AM
spatel committed rL315206: [DAG] combine assertsexts around a trunc.
[DAG] combine assertsexts around a trunc
Mon, Oct 9, 8:24 AM
spatel committed rL315204: [x86] regenerate test checks; NFC.
[x86] regenerate test checks; NFC
Mon, Oct 9, 8:05 AM

Sun, Oct 8

spatel committed rL315203: [AArch64] fix typos in test assertions.
[AArch64] fix typos in test assertions
Sun, Oct 8, 6:31 PM
spatel added a comment to D37289: [X86] Speculatively load operands of select instruction.

I am deliberately using fuzzy wording like "may have" and "seems like". I would be happy to have the C language experts in the community weigh in on whether this is valid optimization.

Sun, Oct 8, 6:54 AM

Sat, Oct 7

spatel abandoned D38591: [InstCombine] don't assert that InstSimplify has removed a known true/false cmp (PR34838).

Abandoning - we can fix this ahead of InstCombine and retain our asserts.

Sat, Oct 7, 9:02 AM
spatel updated the diff for D38637: [InstSimplify] don't let poison inhibit an easy fold.

Patch updated:
Fix bogus comment about undef and add a TODO for a potential follow-up patch.

Sat, Oct 7, 9:00 AM
spatel added inline comments to D38637: [InstSimplify] don't let poison inhibit an easy fold.
Sat, Oct 7, 8:55 AM
spatel updated the diff for D38637: [InstSimplify] don't let poison inhibit an easy fold.

Patch updated:
Have computeKnownBitsFromShiftOperator() return a zero constant when we discover a conflict in known bits. This allows InstSimplify to fold compares.

Sat, Oct 7, 8:48 AM
spatel committed rL315152: [InstSimplify] add tests to show we can do better at folding poison; NFC.
[InstSimplify] add tests to show we can do better at folding poison; NFC
Sat, Oct 7, 8:41 AM

Fri, Oct 6

spatel committed rL315130: [InstCombine] use correct type when propagating constant condition in….
[InstCombine] use correct type when propagating constant condition in…
Fri, Oct 6, 4:44 PM
spatel committed rL315127: [InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments….
[InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments…
Fri, Oct 6, 4:22 PM
spatel added a comment to D38637: [InstSimplify] don't let poison inhibit an easy fold.

The exact definition of poison is still getting refined, but it's different from undef. undef is a bit-wise property, which is why ComputeKnownBits has to be careful around it. poison works differently; essentially, any arithmetic or logical operation which has poison as an input produces poison, no matter what the other input is. So it doesn't matter what ComputeKnownBits returns for a known poison value.

Fri, Oct 6, 4:08 PM
spatel added a comment to D38637: [InstSimplify] don't let poison inhibit an easy fold.

Also, this is in the header comment for computeKnownBits():
/ NOTE: we cannot consider 'undef' to be "IsZero" here. The problem is that
/ we cannot optimize based on the assumption that it is zero without changing
/ it to be an explicit zero. If we don't change it to zero, other code could
/ optimized based on the contradictory assumption that it is non-zero.

Fri, Oct 6, 1:09 PM
spatel added a comment to D38637: [InstSimplify] don't let poison inhibit an easy fold.

If I understand correctly, the reason computeKnownBits can't handle this is that it doesn't know what to do with a known poison value? We could just solve the issue in computeKnownBits: currently, it says there are no known bits when it detects a shift overflow, but it could just say, for example, that all the bits are known zero (since the result of computeKnownBits is only meaningful if the value isn't poison).

Fri, Oct 6, 12:54 PM
spatel added a comment to D38591: [InstCombine] don't assert that InstSimplify has removed a known true/false cmp (PR34838).

I've posted an alternate way to solve this in D38637. If that looks better, I'll abandon this patch.

Fri, Oct 6, 10:37 AM
spatel created D38637: [InstSimplify] don't let poison inhibit an easy fold.
Fri, Oct 6, 10:34 AM
spatel created D38631: [SimplifyCFG] use pass options and remove the latesimplifycfg pass.
Fri, Oct 6, 9:24 AM

Thu, Oct 5

spatel committed rL315021: [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2.
[InstCombine] improve folds for icmp gt/lt (shr X, C1), C2
Thu, Oct 5, 2:13 PM
spatel closed D38514: [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2 by committing rL315021: [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2.
Thu, Oct 5, 2:13 PM
spatel added a comment to D38514: [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2.

Given the possibility raised in PR34838 ( D38591 ), I'm going to change the asserts to checks here.

Thu, Oct 5, 1:31 PM
spatel added a comment to D38566: [SimplifyCFG] don't sink common insts too soon (PR34603).

Thanks for the explanation. So, to be clear, we're only delaying the "sinking through PHIs" type of sinking until late?

Thu, Oct 5, 1:11 PM
spatel created D38591: [InstCombine] don't assert that InstSimplify has removed a known true/false cmp (PR34838).
Thu, Oct 5, 12:16 PM
spatel committed rL314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X….
revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X…
Thu, Oct 5, 7:28 AM
spatel added a comment to D38566: [SimplifyCFG] don't sink common insts too soon (PR34603).

This is a minimal patch to solve PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

Can you please summarize here what's going on and how you arrived at this solution? Last I recall, in PR34603, we were discussing select formation, and how it might not be desirable to form selects early. By "aggressive flattening" do you mean select formation? What you're doing here seems like more than just disabling select formation, but sinking? I also think that we might not want to sink early, at least some things (because we can't later prove it safe to re-hoist), but it's not immediately clear how all of these things are related.

Thu, Oct 5, 7:04 AM

Wed, Oct 4

spatel created D38566: [SimplifyCFG] don't sink common insts too soon (PR34603).
Wed, Oct 4, 3:41 PM
spatel committed rL314930: [SimplifyCFG] put the optional assumption cache pointer in the options struct….
[SimplifyCFG] put the optional assumption cache pointer in the options struct…
Wed, Oct 4, 1:28 PM
spatel accepted D38521: [InstCombine] Improve support for ashr in foldICmpAndShift.

LGTM.

Wed, Oct 4, 9:56 AM