This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][CodeGen] Remove redundant vector negations before concat
ClosedPublic

Authored by benmxwl-arm on Nov 4 2022, 8:29 AM.

Details

Summary

This adds a new canonicalization rule to replace concats of truncated
negations with a negation of the concatenated truncates, e.g.

 (concat_vectors (v4i16 (truncate (not (v4i32)))),
                 (v4i16 (truncate (not (v4i32)))))
->
 (not (concat_vectors (v4i16 (truncate (v4i32))),
                      (v4i16 (truncate (v4i32)))))

Doing this allows avoiding redundant negations being emitted in
certain cases.

Diff Detail

Event Timeline

benmxwl-arm created this revision.Nov 4 2022, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2022, 8:29 AM
benmxwl-arm requested review of this revision.Nov 4 2022, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2022, 8:29 AM
peterwaller-arm accepted this revision.Nov 7 2022, 5:46 AM

LGTM, but adding other reviewers for visibility. The predictable ask is to make it a target-agnostic combine: reviewers, thoughts?

This revision is now accepted and ready to land.Nov 7 2022, 5:46 AM
benmxwl-arm updated this revision to Diff 474229.EditedNov 9 2022, 4:57 AM

Moved to general DAG combine rather than AArch64 specific

This does not appear to regress any tests and may fix similar cases on other platforms (though as yet there are no tests to prove that).

benmxwl-arm retitled this revision from [AArch64][CodeGen] Remove redundant vector negations before concat to [DAG] Add canonicalization to avoid redundant nots in concat vectors.Nov 9 2022, 5:00 AM
benmxwl-arm edited the summary of this revision. (Show Details)
RKSimon added inline comments.Nov 9 2022, 5:56 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21649

hasOneUse (or N->isOnlyUserOf()) checks?

spatel added a comment.Nov 9 2022, 6:26 AM

This seems overly specialized - do we really care that the pattern ends in 'not' rather than any 'xor' rather than any bitwise-logic rather than any binop?

Note that we do have more general folds that could affect patterns like this:
https://github.com/llvm/llvm-project/blob/3dbda5ff88518912bbb72f03d95805634507ac17/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L13774

It's been a few years since that was added (D55126), so I wonder if we still show harm on regression tests by easing the legality and type constraints?

Either way, this patch could add more direct tests of the transforms.

define <8 x i16> @not_not_trunc_concat(<4 x i32> %x, <4 x i32> %y) {
  %notx = xor <4 x i32> %x, <i32 -1, i32 -1, i32 -1, i32 -1>
  %noty = xor <4 x i32> %y, <i32 -1, i32 -1, i32 -1, i32 -1>
  %trnx = trunc <4 x i32> %notx to <4 x i16>
  %trny = trunc <4 x i32> %noty to <4 x i16>
  %r = shufflevector <4 x i16> %trnx, <4 x i16> %trny, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ret <8 x i16> %r
}

I suspect that's going to be worse on some x86 subtargets. But that's not canonical IR (we would at least narrow the 'nots')...so we might not gate this patch on what may be target-specific improvements to x86 shuffling.

benmxwl-arm added a comment.EditedNov 10 2022, 9:22 AM

I think this could work for at least a few more binops if adjusted some something like:

 (concat ((truncate (binop x, const_a))),
         ((truncate (binop y, const_b))))
->
 (binop (concat (truncate x), (truncate y)), 
        (concat (truncate const_a), (truncate const_b)))

Though that admittedly looks a bit ugly.

However, I think this could be simplified to:

(concat (binop x const_a) (binop y const_b))
->
(binop (concat x y) (concat const_a const_b))

Though for that to work here the truncate narrowings from D55126 would need to apply post-legalization,
as the initial problem this patch is trying to solve is entirely an artifact of legalizing vector floating point compares.

I am however, a little less confident making these more general changes.

TBH, I think I'd much prefer for this to start in AArch64 and we review making it a generic DAG fold later on.

benmxwl-arm updated this revision to Diff 474739.EditedNov 11 2022, 6:20 AM
  • Reverted back to an AArch64 combine
  • Added some cautious isOnlyUserOf() checks
  • Added a few more direct tests based on spatel's comment
benmxwl-arm retitled this revision from [DAG] Add canonicalization to avoid redundant nots in concat vectors to [AArch64][CodeGen] Remove redundant vector negations before concat.Nov 11 2022, 6:23 AM
peterwaller-arm accepted this revision.Nov 14 2022, 1:52 AM

LGTM, Thanks for trying the path of making it generic and thanks for the input from the other reviewers!