This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] allow hoisting vector bitwise logic ahead of truncates
Closed Public

Authored by spatel on Dec 7 2018, 11:28 AM.

Details

Summary

The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type:
logic_op (truncate x), (truncate y) --> truncate (logic_op x, y)
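
For readers who want to convince themselves that the fold is value-preserving, here is a minimal sketch that checks it exhaustively on small integers. It is not the DAGCombiner code. The helper names (`truncate`, `logic_of_truncates`, `truncate_of_logic`, `fold_is_sound`) and the 8-bit to 4-bit widths are assumptions made for illustration, and a vector is treated as the same check applied independently to each lane.

```python
import itertools
import operator


def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, like an integer truncate."""
    return value & ((1 << bits) - 1)


def logic_of_truncates(op, x: int, y: int, narrow_bits: int) -> int:
    """Left-hand side of the fold: logic_op (truncate x), (truncate y)."""
    return op(truncate(x, narrow_bits), truncate(y, narrow_bits))


def truncate_of_logic(op, x: int, y: int, narrow_bits: int) -> int:
    """Right-hand side of the fold: truncate (logic_op x, y)."""
    return truncate(op(x, y), narrow_bits)


def fold_is_sound(wide_bits: int = 8, narrow_bits: int = 4) -> bool:
    """Compare both sides for and/or/xor over every pair of wide values."""
    ops = (operator.and_, operator.or_, operator.xor)
    values = range(1 << wide_bits)
    return all(
        logic_of_truncates(op, x, y, narrow_bits)
        == truncate_of_logic(op, x, y, narrow_bits)
        for op in ops
        for x, y in itertools.product(values, repeat=2)
    )
```

The check passes because and/or/xor compute each result bit from the same bit position of the inputs only, so dropping the high bits before or after the operation gives the same low bits. It says nothing about profitability, which is what the other checks in the combine are for.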

Several other existing checks should prevent the transform when it might be harmful.

We already do this for scalars. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch.
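
The extended-operand case is not spelled out here; assuming it has the analogous shape, logic_op (extend x), (extend y) --> extend (logic_op x, y), the following sketch checks that shape for zero and sign extension. The function names and the 4-bit to 8-bit widths are hypothetical, chosen only to keep the exhaustive check small.

```python
import itertools
import operator


def zero_extend(value: int, narrow_bits: int, wide_bits: int) -> int:
    """Widen by filling the new high bits with zeros."""
    return value & ((1 << narrow_bits) - 1)


def sign_extend(value: int, narrow_bits: int, wide_bits: int) -> int:
    """Widen by copying the narrow sign bit into the new high bits."""
    value &= (1 << narrow_bits) - 1
    sign = 1 << (narrow_bits - 1)
    signed = (value ^ sign) - sign  # reinterpret as a signed narrow value
    return signed & ((1 << wide_bits) - 1)


def extend_fold_is_sound(extend, narrow_bits: int = 4, wide_bits: int = 8) -> bool:
    """Compare op(ext x, ext y) with ext(op(x, y)) for and/or/xor."""
    ops = (operator.and_, operator.or_, operator.xor)
    values = range(1 << narrow_bits)
    return all(
        op(extend(x, narrow_bits, wide_bits), extend(y, narrow_bits, wide_bits))
        == extend(op(x, y), narrow_bits, wide_bits)
        for op in ops
        for x, y in itertools.product(values, repeat=2)
    )
```

Both extension kinds pass for the same per-bit reason as the truncate case; as with truncates, this only shows that the rewrite preserves values, not that it is a codegen win.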

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision. Dec 7 2018, 11:28 AM

There's been some churn on many of these test files recently - rebase?

RKSimon added inline comments. Dec 13 2018, 1:58 PM
test/CodeGen/ARM/setcc-logic.ll
Line 28 (On Diff #177254)

Is this an NFC (no functional change) regeneration?

We already do this for scalars.

Hm, are you sure?
I think I saw the opposite happen.
https://bugs.llvm.org/show_bug.cgi?id=36419#c4

spatel marked 2 inline comments as done. Dec 13 2018, 4:00 PM

We already do this for scalars.

Hm, are you sure?
I think I saw the opposite happen.
https://bugs.llvm.org/show_bug.cgi?id=36419#c4

Well, it's SDAG, so anything can happen. :)
My comment was specifically referring to the check that I'm hoping to change in this patch - it has a scalar-only restriction currently, and I think as the diffs here show, it's unnecessary.

test/CodeGen/ARM/setcc-logic.ll
Line 28 (On Diff #177254)

Yes, that's only adding the "-NEXT"; I can update it separately.

spatel updated this revision to Diff 178232. Dec 14 2018, 7:42 AM
spatel marked an inline comment as done.

Patch updated:
No code changes, but rebased to remove cosmetic diffs in ARM test and updated codegen for x86 vector rotates.

This revision is now accepted and ready to land. Dec 14 2018, 11:10 AM

The Hexagon tests were meant to check operations on vector predicate registers, so for them to work the inputs need to be vectors of i1 whose lengths correspond to the lengths of vectors of 8+ bit integers. The only way to generate such values is to do either a compare or a truncate. Since compare instructions can be fused with logical operations, this leaves truncate as the only option. With the changes from this patch, the truncate is also eliminated, so it appears that these instructions can no longer be emitted from non-intrinsic code. With this in mind, I am OK with these changes.

This revision was automatically updated to reflect the committed changes.