This is an archive of the discontinued LLVM Phabricator instance.

InstCombine: Scalarize single use icmp/fcmp
ClosedPublic

Authored by arsenm on Dec 2 2018, 2:24 PM.

Details

Reviewers
spatel

Diff Detail

Event Timeline

arsenm created this revision.Dec 2 2018, 2:24 PM
spatel added a comment.Dec 3 2018, 7:12 AM

I think this is the right change for IR (I'm trying to do something similar with D50992 but that's been held up while we try to get the backend in shape). .

  1. I see that the new code is copying the existing handling of binops around line 315, but can we update this to use 'match()' and reduce the duplication?
  2. There should be a test that shows the benefit of using 'isCheapToScalarize()'. IIUC, that would involve a sequence with multiple binops and/or compares between an insertelement and the trailing extract.
  3. Please commit the tests with baseline checks as a preliminary commit, so we only show the diffs in this patch.

I think this is the right change for IR (I'm trying to do something similar with D50992 but that's been held up while we try to get the backend in shape). .

  1. I see that the new code is copying the existing handling of binops around line 315, but can we update this to use 'match()' and reduce the duplication?
  2. There should be a test that shows the benefit of using 'isCheapToScalarize()'. IIUC, that would involve a sequence with multiple binops and/or compares between an insertelement and the trailing extract.
  3. Please commit the tests with baseline checks as a preliminary commit, so we only show the diffs in this patch.

I don't think there's much of a useful way to combine this with the BinOp handling. At most it's going to share the pair of extracts, and would require more ugly code when creating the new operation so I think it would be overall worse

arsenm updated this revision to Diff 176679.Dec 4 2018, 11:08 AM

Rearrange code, apply on top of existing tests

spatel added a comment.Dec 5 2018, 3:26 PM
  1. There should be a test that shows the benefit of using 'isCheapToScalarize()'. IIUC, that would involve a sequence with multiple binops and/or compares between an insertelement and the trailing extract.

I added a binop test like this here:
rL348417
...and it shows a missed opportunity. AFAICT, there was no test coverage for anything beyond the simplest cases for isCheapToScalarize(), so please do add something like that for cmp.

Also, I used 'match' to reduce the binop code:
rL348418

cmp should be similar:

Value *X, *Y;
CmpInst::Predicate Pred;
if (match(SrcVec, m_Cmp(Pred, m_Value(X), m_Value(Y))) &&
    cheapToScalarize(SrcVec, IndexC)) {
  // extelt (cmp X, Y), Index --> cmp (extelt X, Index), (extelt Y, Index)
  Value *E0 = Builder.CreateExtractElement(X, Index);
  Value *E1 = Builder.CreateExtractElement(Y, Index);
  return CmpInst::Create(cast<CmpInst>(SrcVec)->getOpcode(), Pred, E0, E1);
}
arsenm updated this revision to Diff 176918.Dec 5 2018, 7:37 PM

Added more tests in r348456, use match

spatel accepted this revision.Dec 10 2018, 11:39 AM

LGTM - but I think there's a bug in cheapToScalarize() that I didn't notice before. Please put a FIXME note in there.

test/Transforms/InstCombine/scalarization.ll
86–87

This doesn't look right to start with. I don't think we can justify leaving more instructions than we started with as a target-independent combine.

This revision is now accepted and ready to land.Dec 10 2018, 11:39 AM
arsenm closed this revision.Dec 10 2018, 1:54 PM
arsenm marked an inline comment as done.

r348801