[DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE (WIP)
Changes Planned · Public

Authored by RKSimon on Jan 7 2019, 4:38 AM.

Event Timeline

RKSimon created this revision. Jan 7 2019, 4:38 AM
RKSimon added inline comments. Jan 7 2019, 5:18 AM
test/CodeGen/ARM/lowerMUL-newload.ll
28 ↗(On Diff #180459)

This just looks like we're missing something for the ARMISD::VMULL lowering

test/CodeGen/X86/avx512-any_extend_load.ll
53 ↗(On Diff #180459)

Simplifying to ANY_EXTEND prevents PACKSS/PACKUS from working

test/CodeGen/X86/combine-sra.ll
252 ↗(On Diff #180459)

We'd been relying on the v4i64 ashr expansion

test/CodeGen/X86/vector-blend.ll
956 ↗(On Diff #180459)

Haven't worked out the problem here yet

test/CodeGen/X86/vector-trunc-widen.ll
77 ↗(On Diff #180459)

We'd been relying on the v8i64 ashr expansion

craig.topper added inline comments. Jan 7 2019, 4:33 PM
test/CodeGen/X86/vector-blend.ll
956 ↗(On Diff #180459)

I think we need to call SimplifyDemandedBits on Conditions of SHRUNKBLEND. We only do it when we convert from VSELECT to SHRUNKBLEND.

craig.topper added inline comments.
test/CodeGen/X86/vector-blend.ll
956 ↗(On Diff #180459)

Patch here D56421

RKSimon updated this revision to Diff 181284. Jan 11 2019, 8:28 AM

rebase after D56421

huihuiz added inline comments.
test/CodeGen/ARM/lowerMUL-newload.ll
28 ↗(On Diff #180459)

Using "CHECK-NEXT" and matching the exact register names will make this test case very sensitive to scheduling and register allocation changes.
Using pattern matching would be a better approach.
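To make the trade-off between exact register names and pattern matching concrete, here is a minimal sketch in Python. It is only an analogy: FileCheck itself expresses such patterns with its `{{regex}}` syntax, and the two assembly lines below are hypothetical, not taken from lowerMUL-newload.ll.

```python
import re

# Hypothetical llc output before and after a register-allocation change.
ASM_BEFORE = "vmull.s16 q8, d17, d16"
ASM_AFTER = "vmull.s16 q9, d18, d19"

# Exact check: pins the opcode and every register name.
EXACT = re.compile(r"vmull\.s16\s+q8, d17, d16")

# Pattern check: pins the opcode and operand shape, not the register numbers.
PATTERN = re.compile(r"vmull\.s16\s+q\d+, d\d+, d\d+")


def matches_exact(line: str) -> bool:
    """True if the line matches the exact-register check."""
    return EXACT.fullmatch(line) is not None


def matches_pattern(line: str) -> bool:
    """True if the line matches the register-agnostic check."""
    return PATTERN.fullmatch(line) is not None
```

The exact check fails on `ASM_AFTER` while the pattern check still passes, which is the sensitivity described above. The same property means the exact check flags every register allocation change, whether or not it matters.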

RKSimon added inline comments. Jan 23 2019, 1:20 PM
test/CodeGen/ARM/lowerMUL-newload.ll
28 ↗(On Diff #180459)

But it stops people missing/hiding codegen changes that need to be kept an eye on, including register allocation changes.

This argument has been going on for years now, and we've tended to see that the benefits of update_llc_test_checks.py outweigh any difficulties.

More importantly, do you have any insights as to how to improve ARMISD::VMULL lowering?

RKSimon updated this revision to Diff 196070. Apr 22 2019, 7:29 AM

rebase - still showing a number of regressions that are proving tricky to fix

Herald added a project: Restricted Project. Apr 22 2019, 7:29 AM
RKSimon updated this revision to Diff 211912. Jul 26 2019, 4:56 AM

rebase + vector support for truncate(srl(x,c)) case
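As a rough illustration of what demanded bits means for the truncate(srl(x,c)) case, the sketch below models a single vector lane with plain Python integers. It is not the DAGCombiner code; the function names and the chosen widths (64-bit source, 32-bit result, shift of 16) are assumptions made for the example.

```python
def demanded_src_bits(trunc_bits: int, shift: int, src_bits: int) -> int:
    """Mask of source bits that can affect trunc(srl(x, shift))."""
    low_mask = (1 << trunc_bits) - 1
    return (low_mask << shift) & ((1 << src_bits) - 1)


def trunc_srl(x: int, shift: int, trunc_bits: int, src_bits: int) -> int:
    """Logical shift right of a src_bits-wide value, then truncate."""
    x &= (1 << src_bits) - 1
    return (x >> shift) & ((1 << trunc_bits) - 1)


def undemanded_bits_are_irrelevant(lanes, shift, trunc_bits, src_bits) -> bool:
    """Flip every undemanded source bit and check the result is unchanged."""
    demanded = demanded_src_bits(trunc_bits, shift, src_bits)
    undemanded = ~demanded & ((1 << src_bits) - 1)
    return all(
        trunc_srl(x, shift, trunc_bits, src_bits)
        == trunc_srl(x ^ undemanded, shift, trunc_bits, src_bits)
        for x in lanes
    )


LANES = [0x0123456789ABCDEF, 0xFFFFFFFFFFFFFFFF, 0x8000000000000001]
```

With these widths only source bits 16 to 47 are demanded, so whatever produces the other bits of `x` is free to be simplified.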

RKSimon updated this revision to Diff 214598. Aug 12 2019, 4:18 AM

rebase - most of the remaining x86 issues should be fixed by D66004

RKSimon updated this revision to Diff 262099. May 5 2020, 6:58 AM
RKSimon retitled this revision from [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE (WIP) to [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE.
RKSimon edited the summary of this revision.

Add support for ANY_EXTEND ops to ARM's LowerMUL.

This fixes the main MULL regression, but I'm not sure how to fix the ADDW regression, which seems to be purely an isel pattern issue. @t.p.northover @efriedma @huihuiz any thoughts?

RKSimon planned changes to this revision. Jul 4 2020, 1:23 AM

The pattern in question comes out of https://github.com/llvm/llvm-project/blob/0fa0cf8638b0777a1a44feebf78a63865e48ecf6/llvm/lib/Target/ARM/ARMInstrNEON.td#L3100 , and it traces out to https://github.com/llvm/llvm-project/blob/0fa0cf8638b0777a1a44feebf78a63865e48ecf6/llvm/lib/Target/ARM/ARMInstrNEON.td#L4216 .

Probably we want to do what the Hexagon backend does: def asext: PatFrags<(ops node:$Rs), [(sext node:$Rs), (anyext node:$Rs)]>;.
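A small model of why a pattern fragment that accepts either sext or anyext can be sound: the high bits of an any-extend are unspecified, so a sign-extending instruction is one valid way to implement it. The sketch below uses plain Python integers; the helper names are hypothetical and this is not how LLVM represents these nodes.

```python
def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value to to_bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value >> (from_bits - 1):  # sign bit set: fill the high bits with ones
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


def zext(value: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend the low from_bits of value (to_bits kept for symmetry)."""
    return value & ((1 << from_bits) - 1)


def is_valid_anyext(result: int, value: int, from_bits: int, to_bits: int) -> bool:
    """An any-extend only promises the low bits; the high bits may be anything."""
    low_mask = (1 << from_bits) - 1
    return 0 <= result < (1 << to_bits) and (result & low_mask) == (value & low_mask)


def sext_always_satisfies_anyext(from_bits: int, to_bits: int) -> bool:
    """Check exhaustively that sext is a legal implementation of anyext."""
    return all(
        is_valid_anyext(sext(v, from_bits, to_bits), v, from_bits, to_bits)
        for v in range(1 << from_bits)
    )
```

The reverse does not hold: an instruction pattern written for sext needs the sign bits, so an arbitrary anyext result cannot stand in for it unless the pattern is widened to accept both.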

RKSimon updated this revision to Diff 291969. Tue, Sep 15, 10:41 AM

rebase - avg.ll regressions now fixed

RKSimon planned changes to this revision. Tue, Sep 15, 10:41 AM
lebedev.ri added inline comments.
llvm/test/CodeGen/X86/combine-sra.ll
248–250

Appears to be a regression

llvm/test/CodeGen/X86/vector-trunc.ll
69–73

Appears to be a regression

392–403

I'm not very sure it's an improvement

note: this is still a wip

RKSimon retitled this revision from [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE to [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE (WIP). Tue, Sep 15, 11:00 AM