Page MenuHomePhabricator

spatel (Sanjay Patel)
User

Projects

User does not belong to any projects.

User Details

User Since
May 22 2014, 1:24 PM (256 w, 6 d)

Recent Activity

Today

spatel committed rG6f41bf948b5f: [DAGCombiner] scale repeated FP divisor by splat factor (authored by spatel).
[DAGCombiner] scale repeated FP divisor by splat factor
Wed, Apr 24, 3:28 PM
spatel committed rL359147: [DAGCombiner] scale repeated FP divisor by splat factor.
[DAGCombiner] scale repeated FP divisor by splat factor
Wed, Apr 24, 3:28 PM
spatel closed D61028: [DAGCombiner] scale repeated FP divisor by splat factor.
Wed, Apr 24, 3:28 PM · Restricted Project
spatel added inline comments to D61047: [X86] Delay creating index register negations during address matching until after we know for sure the match will succeed.
Wed, Apr 24, 11:32 AM · Restricted Project
spatel updated the diff for D61028: [DAGCombiner] scale repeated FP divisor by splat factor.

Patch updated:
Add a TODO comment about optimizing for size. That's an existing concern even for scalar code. But there may not be a clear answer for optsize because, for example, we may be able to hoist fdiv out of a loop by doing this transform.

Wed, Apr 24, 10:09 AM · Restricted Project
spatel added inline comments to D61068: [X86][SSE] Disable shouldFoldConstantShiftPairToMask for btver1/btver2 targets (PR40758).
Wed, Apr 24, 9:45 AM · Restricted Project
spatel updated the diff for D61028: [DAGCombiner] scale repeated FP divisor by splat factor.

Patch updated:
Remove reordering of reciprocal estimate and repeated divisor transforms. We still see a diff for the SSE target with an illegal type because we don't try to generate the reciprocal estimate sequence until the types are legal.

Wed, Apr 24, 9:25 AM · Restricted Project
spatel created D61075: [CodeGenPrepare] delay instruction deletion for efficiency.
Wed, Apr 24, 9:01 AM · Restricted Project
spatel added inline comments to D61028: [DAGCombiner] scale repeated FP divisor by splat factor.
Wed, Apr 24, 8:36 AM · Restricted Project
spatel committed rGb1b336890761: [x86] make sure horizontal op and broadcast types match to simplify (PR41414) (authored by spatel).
[x86] make sure horizontal op and broadcast types match to simplify (PR41414)
Wed, Apr 24, 7:04 AM
spatel committed rL359095: [x86] make sure horizontal op and broadcast types match to simplify (PR41414).
[x86] make sure horizontal op and broadcast types match to simplify (PR41414)
Wed, Apr 24, 7:03 AM

Yesterday

spatel accepted D60837: [CGP] Look through bitcasts when duplicating returns for tail calls.

Note that there is a dedicated folder for CGP IR tests at 'llvm/test/Transforms/CodeGenPrepare' with target subfolders under there, but there's also precedence for doing it this way, so we see IR and asm in the same file.

Tue, Apr 23, 2:45 PM · Restricted Project
spatel added inline comments to D60837: [CGP] Look through bitcasts when duplicating returns for tail calls.
Tue, Apr 23, 12:52 PM · Restricted Project
spatel updated subscribers of D60975: Convert a masked.gather of at most one element to a masked.load.

In the generic case, we're creating a masked load of a scalar (although as noted inline, there's apparently no support or test for that pattern). But I don't know what the codegen for a <1 x X> masked load would be (do we cmp/br around the load since it's not speculatable?). Someone else with masked load lowering knowledge (@craig.topper @efriedma ?) should also review this.

Tue, Apr 23, 12:35 PM · Restricted Project
spatel created D61028: [DAGCombiner] scale repeated FP divisor by splat factor.
Tue, Apr 23, 10:14 AM · Restricted Project
spatel committed rG7c0bd5a27c5b: [x86] fix test checks for fdiv combine; NFC (authored by spatel).
[x86] fix test checks for fdiv combine; NFC
Tue, Apr 23, 9:30 AM
spatel committed rL359008: [x86] fix test checks for fdiv combine; NFC.
[x86] fix test checks for fdiv combine; NFC
Tue, Apr 23, 9:29 AM
spatel committed rG171b74e31c72: [x86] add tests for vector fdiv with splat divisor; NFC (authored by spatel).
[x86] add tests for vector fdiv with splat divisor; NFC
Tue, Apr 23, 9:16 AM
spatel committed rL359006: [x86] add tests for vector fdiv with splat divisor; NFC.
[x86] add tests for vector fdiv with splat divisor; NFC
Tue, Apr 23, 9:16 AM
spatel committed rG12a561fa1b79: [x86] use psubus for more vsetcc lowering (PR39859) (authored by spatel).
[x86] use psubus for more vsetcc lowering (PR39859)
Tue, Apr 23, 8:21 AM
spatel committed rL358999: [x86] use psubus for more vsetcc lowering (PR39859).
[x86] use psubus for more vsetcc lowering (PR39859)
Tue, Apr 23, 8:20 AM
spatel closed D60838: [x86] use psubus for more vsetcc lowering (PR39859).
Tue, Apr 23, 8:20 AM · Restricted Project
spatel committed rG06ff5eae5b45: [DAGCombiner] generalize binop-of-splats scalarization (authored by spatel).
[DAGCombiner] generalize binop-of-splats scalarization
Tue, Apr 23, 6:16 AM
spatel committed rL358984: [DAGCombiner] generalize binop-of-splats scalarization.
[DAGCombiner] generalize binop-of-splats scalarization
Tue, Apr 23, 6:15 AM

Mon, Apr 22

spatel committed rGbf8aacb7151c: [SelectionDAG] move splat util functions up from x86 lowering (authored by spatel).
[SelectionDAG] move splat util functions up from x86 lowering
Mon, Apr 22, 3:42 PM
spatel committed rL358930: [SelectionDAG] move splat util functions up from x86 lowering.
[SelectionDAG] move splat util functions up from x86 lowering
Mon, Apr 22, 3:42 PM
spatel accepted D59703: Convert a masked.load of a dereferenceable address to an unconditional load.

LGTM. This canonicalization is reverse of the typical (shorter code is preferred), but the reasoning in the summary makes sense: we have lots of IR optimizer logic for selects, and this should be reversible in the backend.

Mon, Apr 22, 2:14 PM · Restricted Project
spatel accepted D60932: [NFC] Add baseline tests for int isKnownNonZero.

Sorry, I didn't notice the patch stack before commenting in D60846.

Mon, Apr 22, 9:02 AM · Restricted Project
spatel added a comment to D60846: [ValueTracking] Improve isKnowNonZero for Ints.

Can you push the baseline tests to trunk/head, so this diff is against an updated public baseline?
There's also a TODO comment/test about this enhancement in test/Transforms/LICM/hoist-mustexec.ll -- that should be updated as part of this patch.

Mon, Apr 22, 7:25 AM · Restricted Project
spatel committed rG9bc6c77220fb: [DAGCombiner] make variable name less ambiguous; NFC (authored by spatel).
[DAGCombiner] make variable name less ambiguous; NFC
Mon, Apr 22, 6:41 AM
spatel committed rL358886: [DAGCombiner] make variable name less ambiguous; NFC.
[DAGCombiner] make variable name less ambiguous; NFC
Mon, Apr 22, 6:40 AM
spatel committed rGd6989daae915: [DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC (authored by spatel).
[DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC
Mon, Apr 22, 6:37 AM
spatel committed rL358884: [DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC.
[DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC
Mon, Apr 22, 6:36 AM

Sun, Apr 21

spatel committed rG7fa3a0eec979: [AArch64] add tests with multiple binop+splat vals; NFC (authored by spatel).
[AArch64] add tests with multiple binop+splat vals; NFC
Sun, Apr 21, 8:02 AM
spatel committed rL358853: [AArch64] add tests with multiple binop+splat vals; NFC.
[AArch64] add tests with multiple binop+splat vals; NFC
Sun, Apr 21, 8:02 AM
spatel accepted D60656: [LVI][CVP] Calculate with.overflow result range.

This looks like a straightforward extension of the existing logic, so LGTM. Up to you if you want to wait for another review.

Sun, Apr 21, 7:54 AM · Restricted Project

Fri, Apr 19

spatel added a comment to D59703: Convert a masked.load of a dereferenceable address to an unconditional load.

These are 2 independent transforms, right? It would be better to split that into 2 smaller reviews/commits, so we are not missing anything (and easier to diagnose if there's trouble post-commit). The tests should show minimal IR patterns to exercise those 2 independent transforms.

Fri, Apr 19, 1:54 PM · Restricted Project
spatel committed rGe197c617a64c: [SelectionDAG] soften splat mask assert/unreachable (PR41535) (authored by spatel).
[SelectionDAG] soften splat mask assert/unreachable (PR41535)
Fri, Apr 19, 8:30 AM
spatel committed rL358761: [SelectionDAG] soften splat mask assert/unreachable (PR41535).
[SelectionDAG] soften splat mask assert/unreachable (PR41535)
Fri, Apr 19, 8:30 AM

Thu, Apr 18

spatel created D60890: [AArch64] splat before (f)mul to allow mul-by-element isel.
Thu, Apr 18, 3:07 PM · Restricted Project
spatel committed rG5d281ac9cede: [AArch64] add tests for mul-by-element; NFC (authored by spatel).
[AArch64] add tests for mul-by-element; NFC
Thu, Apr 18, 2:49 PM
spatel committed rL358718: [AArch64] add tests for mul-by-element; NFC.
[AArch64] add tests for mul-by-element; NFC
Thu, Apr 18, 2:49 PM
spatel added a comment to D60852: Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors.

This looks like the expected improvement to me, but given the 'blend' comment/test, I'd like someone else to confirm that I'm not overlooking some weird corner case.

Thu, Apr 18, 12:11 PM · Restricted Project
spatel added a comment to D60852: Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors.

I added some baseline tests here:
rL358687

Thu, Apr 18, 10:11 AM · Restricted Project
spatel committed rG51fa60bcbb91: [x86] add tests for improved insertelement to index 0 (PR41512); NFC (authored by spatel).
[x86] add tests for improved insertelement to index 0 (PR41512); NFC
Thu, Apr 18, 9:58 AM
spatel committed rL358687: [x86] add tests for improved insertelement to index 0 (PR41512); NFC.
[x86] add tests for improved insertelement to index 0 (PR41512); NFC
Thu, Apr 18, 9:57 AM
spatel accepted D59758: [DAGCombiner] Combine OR as ADD when no common bits are set.

LGTM (see inline for a couple of nits) - but I'd prefer that someone with AMDGPU knowledge (@arsenm @nhaehnle @rampitec ?) confirm those diffs too.

Thu, Apr 18, 9:06 AM · Restricted Project
spatel accepted D60375: [X86] combineVectorTruncationWithPACKUS - remove split/concatenation of mask.

LGTM

Thu, Apr 18, 8:52 AM · Restricted Project
spatel added a comment to D60852: Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors.

If we are getting this right sometimes, then we might already have the transform that we want, but it is limited in some way that prevents getting the larger case.
I doubt that the loop itself is needed to demonstrate the problem because I see 'movd' codegen even with a loop as long as it is not unrolled.

After more experimentation I tend to agree. Also the most basic case produces pinsrd even in a small kernel (https://gcc.godbolt.org/z/HAmNha), so will just create test out of it.

Thu, Apr 18, 7:45 AM · Restricted Project
spatel added a comment to D60852: Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors.

Your test cases need to be a lot simpler - I'd recommend looking at buildvec-insertvec.ll and possibly adding your tests to that file instead of adding this new file.

The problem is that in tiny kernel llvm behaves as I expect it to, while in a loop it underperforms: https://gcc.godbolt.org/z/PljujX -- compare Sum2 and Loop() code generation.
I will do my best to minimize the case.

Thu, Apr 18, 6:58 AM · Restricted Project
spatel updated the diff for D60838: [x86] use psubus for more vsetcc lowering (PR39859).

Patch updated:
Add inc or dec param to helper function to reduce code duplication.

Thu, Apr 18, 6:37 AM · Restricted Project
spatel added inline comments to D60838: [x86] use psubus for more vsetcc lowering (PR39859).
Thu, Apr 18, 6:29 AM · Restricted Project

Wed, Apr 17

spatel committed rGfb363a778fd7: [x86] try to widen 'shl' as part of LEA formation (authored by spatel).
[x86] try to widen 'shl' as part of LEA formation
Wed, Apr 17, 3:38 PM
spatel committed rL358622: [x86] try to widen 'shl' as part of LEA formation.
[x86] try to widen 'shl' as part of LEA formation
Wed, Apr 17, 3:37 PM
spatel closed D60789: [x86] try to widen 'shl' as part of LEA formation.
Wed, Apr 17, 3:37 PM · Restricted Project
spatel created D60838: [x86] use psubus for more vsetcc lowering (PR39859).
Wed, Apr 17, 2:52 PM · Restricted Project
spatel updated the diff for D60214: [DAGCombiner] move splat-shuffle after binop with splat constant.

Patch updated:
Made ARM test checks more exact, so we can see there actually is an extra instruction here (vext.32). That seems like a regression, but I'm not sure if it's important and/or another known shortcoming for folding shuffles.

Wed, Apr 17, 10:04 AM · Restricted Project
spatel committed rG1964962b496e: [ARM] tighten test checks; NFC (authored by spatel).
[ARM] tighten test checks; NFC
Wed, Apr 17, 9:54 AM
spatel committed rL358594: [ARM] tighten test checks; NFC.
[ARM] tighten test checks; NFC
Wed, Apr 17, 9:53 AM
spatel planned changes to D60214: [DAGCombiner] move splat-shuffle after binop with splat constant.

I haven't hand-written FileCheck lines in a long time, and I messed that up...another update coming up.

Wed, Apr 17, 9:53 AM · Restricted Project
spatel updated the diff for D60214: [DAGCombiner] move splat-shuffle after binop with splat constant.

Patch updated:

  1. Added TODO comment about adding a more effificient constant query.
  2. Updated ARM test assertions to show more complete diff.
Wed, Apr 17, 9:34 AM · Restricted Project
spatel committed rG1f2c81af72bd: [ARM] make test checks more thorough; NFC (authored by spatel).
[ARM] make test checks more thorough; NFC
Wed, Apr 17, 9:01 AM
spatel committed rL358587: [ARM] make test checks more thorough; NFC.
[ARM] make test checks more thorough; NFC
Wed, Apr 17, 9:01 AM
spatel accepted D60650: [LVI][CVP] Constrain values in with.overflow branches.

I haven't looked at LVI much, but judging from the git log, nobody else has done that recently either.

Wed, Apr 17, 8:18 AM · Restricted Project
spatel added a comment to D60214: [DAGCombiner] move splat-shuffle after binop with splat constant.

Ping * 2.

Wed, Apr 17, 7:55 AM · Restricted Project
spatel added a comment to D60600: [InstCombine] Fix a vector-of-pointers instcombine undef bug..

We clearly need to fix the bug (crash). @reames or someone else with more GEP experience should have a look though. I've just pointed out some minor bits to clean up.

Wed, Apr 17, 7:01 AM · Restricted Project
spatel updated the diff for D60789: [x86] try to widen 'shl' as part of LEA formation.

Patch updated:

  1. Added check to make sure that IndexReg has not already been set.
  2. Fixed node insertion/deletion/rauw to match existing code patterns in foldMask*() helpers. Dead node removal should be handled by PostprocessISelDAG(), so there's no need to do that explicitly.
Wed, Apr 17, 6:20 AM · Restricted Project

Tue, Apr 16

spatel updated the diff for D60789: [x86] try to widen 'shl' as part of LEA formation.

Patch updated:

  1. Try to make code comment clearer.
  2. Set scale/index parts of the AddressMode directly rather than recursing.
  3. Remove dead node and explicitly RAUW.
  4. Added some shift constant variation to the tests and a negative test.
Tue, Apr 16, 4:21 PM · Restricted Project
spatel committed rGd5bc5ca3e4f6: [x86] adjust LEA tests for better coverage; NFC (authored by spatel).
[x86] adjust LEA tests for better coverage; NFC
Tue, Apr 16, 4:10 PM
spatel committed rL358539: [x86] adjust LEA tests for better coverage; NFC.
[x86] adjust LEA tests for better coverage; NFC
Tue, Apr 16, 4:10 PM
spatel accepted D60403: [CostModel][X86] Add bool anyof/allof reduction costs.

LGTM

Tue, Apr 16, 2:54 PM · Restricted Project
spatel added inline comments to D60403: [CostModel][X86] Add bool anyof/allof reduction costs.
Tue, Apr 16, 2:33 PM · Restricted Project
spatel added inline comments to D60789: [x86] try to widen 'shl' as part of LEA formation.
Tue, Apr 16, 2:05 PM · Restricted Project
spatel committed rGe08783e2f54b: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted… (authored by spatel).
[EarlyCSE] detect equivalence of selects with inverse conditions and commuted…
Tue, Apr 16, 1:42 PM
spatel committed rL358523: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted….
[EarlyCSE] detect equivalence of selects with inverse conditions and commuted…
Tue, Apr 16, 1:41 PM
spatel closed D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).
Tue, Apr 16, 1:41 PM · Restricted Project
spatel added inline comments to D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).
Tue, Apr 16, 1:35 PM · Restricted Project
spatel created D60789: [x86] try to widen 'shl' as part of LEA formation.
Tue, Apr 16, 12:14 PM · Restricted Project
spatel committed rGf136c46bd64c: [x86] add more tests for LEA formation; NFC (authored by spatel).
[x86] add more tests for LEA formation; NFC
Tue, Apr 16, 11:59 AM
spatel committed rL358513: [x86] add more tests for LEA formation; NFC.
[x86] add more tests for LEA formation; NFC
Tue, Apr 16, 11:59 AM
spatel updated subscribers of D58633: [InstCombine] remove one-use restriction for icmp+add constant fold.

There's a similar one use check for the non-equality case in https://github.com/llvm-mirror/llvm/blob/fb1e04ff3c51a617c4944b2e05e8aa1b8c543e22/lib/Transforms/InstCombine/InstCombineCompares.cpp#L2389. This prevents the trivially true condition in https://rust.godbolt.org/z/qg0RjC from folding. That check was added (or rather moved) in rL341831.

Just wanted to add this as a datapoint for a case where dropping the restriction would be nice.

Tue, Apr 16, 9:30 AM · Restricted Project
spatel updated the diff for D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).

Patch updated:
Redo equivalency matching to reduce code.

Tue, Apr 16, 9:02 AM · Restricted Project

Mon, Apr 15

spatel updated the diff for D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).

Patch updated:

  1. Improve hashing as suggested (I don't know how to expose that in a test, so no extra tests for that).
  2. Add a "double-negation" matcher to recognize selects with same true/false ops but different conditions (positive and negative tests added).
Mon, Apr 15, 3:09 PM · Restricted Project
spatel added inline comments to D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).
Mon, Apr 15, 3:05 PM · Restricted Project
spatel committed rG800a0c3e4b06: [EarlyCSE] add more tests for double-negated select condition; NFC (authored by spatel).
[EarlyCSE] add more tests for double-negated select condition; NFC
Mon, Apr 15, 2:51 PM
spatel committed rL358454: [EarlyCSE] add more tests for double-negated select condition; NFC.
[EarlyCSE] add more tests for double-negated select condition; NFC
Mon, Apr 15, 2:51 PM
spatel committed rG5ae05d810c8d: [EarlyCSE] add test for select condition double-negation; NFC (authored by spatel).
[EarlyCSE] add test for select condition double-negation; NFC
Mon, Apr 15, 1:24 PM
spatel committed rL358444: [EarlyCSE] add test for select condition double-negation; NFC.
[EarlyCSE] add test for select condition double-negation; NFC
Mon, Apr 15, 1:24 PM
spatel added inline comments to D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).
Mon, Apr 15, 11:14 AM · Restricted Project
spatel committed rG8ae68f26489a: [x86] update test checks; NFC (authored by spatel).
[x86] update test checks; NFC
Mon, Apr 15, 10:40 AM
spatel committed rL358432: [x86] update test checks; NFC.
[x86] update test checks; NFC
Mon, Apr 15, 10:40 AM
spatel added a comment to D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).

Name: Inverted predicate

%cond = icmp sgt i8 %x, 42
%invcond = icmp sle i8 %x, 42
%m1 = select i1 %cond, i32 %t, i32 %f
%m2 = select i1 %invcond, i32 %f, i32 %t
%r = xor i32 %m1, %m2
=>
%r = i32 0
Mon, Apr 15, 10:11 AM · Restricted Project
spatel created D60723: [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101).
Mon, Apr 15, 10:11 AM · Restricted Project
spatel committed rG0e0bb0e24a0c: [EarlyCSE] add tests for selects with commuted operands (PR41101); NFC (authored by spatel).
[EarlyCSE] add tests for selects with commuted operands (PR41101); NFC
Mon, Apr 15, 9:00 AM
spatel committed rL358420: [EarlyCSE] add tests for selects with commuted operands (PR41101); NFC.
[EarlyCSE] add tests for selects with commuted operands (PR41101); NFC
Mon, Apr 15, 9:00 AM
spatel committed rGc71433335ad3: [EarlyCSE] regenerate test checks; NFC (authored by spatel).
[EarlyCSE] regenerate test checks; NFC
Mon, Apr 15, 7:03 AM
spatel committed rL358407: [EarlyCSE] regenerate test checks; NFC.
[EarlyCSE] regenerate test checks; NFC
Mon, Apr 15, 7:01 AM
spatel committed rG5e13cd2e61cb: [InstCombine] canonicalize fdiv after fmul if reassociation is allowed (authored by spatel).
[InstCombine] canonicalize fdiv after fmul if reassociation is allowed
Mon, Apr 15, 6:23 AM
spatel committed rL358404: [InstCombine] canonicalize fdiv after fmul if reassociation is allowed.
[InstCombine] canonicalize fdiv after fmul if reassociation is allowed
Mon, Apr 15, 6:23 AM

Fri, Apr 12

spatel committed rG5e4ad39af7c2: [DAGCombiner] narrow shuffle of concatenated vectors (authored by spatel).
[DAGCombiner] narrow shuffle of concatenated vectors
Fri, Apr 12, 9:31 AM