This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Canonicalize select immediates
ClosedPublic

Authored by dmgreen on Dec 14 2019, 12:31 PM.

Details

Summary

I found a case where after a specific awkward set of inlining and combining, we would end up with code that looked like:

%1 = icmp slt i32 %shr, -128
%2 = select i1 %1, i32 128, i32 %shr
%.inv = icmp sgt i32 %shr, 127
%spec.select.i = select i1 %.inv, i32 127, i32 %2
%conv7 = trunc i32 %spec.select.i to i8

This should be turned into a min/max pattern, but somewhere along the line the "-128" in the first select was instead transformed into "128", as only the bottom byte was ever demanded.

To fix this, I've put in further canonicalisation for the immediates of select, preferring to use the same value as the icmp if available.

Diff Detail

Event Timeline

dmgreen created this revision.Dec 14 2019, 12:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2019, 12:31 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
nikic added a subscriber: nikic.Dec 14 2019, 1:38 PM
spatel accepted this revision.Dec 17 2019, 9:07 AM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
352

nit: try and keep -> try to keep

361

nit: add period at end of sentence.

371–376

This comment/code didn't read clearly to me the first time through. It might just be me, but I'd rather see the positive comparison, followed by the default, so invert the order:

if ((*CmpC & DemandedMask) == (*SelC & DemandedMask)) {
 I->setOperand(OpNo, ConstantInt::get(I->getType(), *CmpC));
 return true;
}
return ShrinkDemandedConstant(I, OpNo, DemandedMask);
This revision is now accepted and ready to land.Dec 17 2019, 9:07 AM

Can you elaborate more on why this canonicalization should be performed? (show godbolt output with/without this
Is it an instruction selection issue?

Code that triggers this is actually used a lot in one library I was looking at. It's quite a big gain for both scalar and vector code to make sure this is matched correctly. (That's CMSISDSP, the same one that I was quoting differences from in the "bump select threshold to 3" patch. Interestingly it would only manifest some of the time, if inlining would happen before instead of after some simplification I think).

Does the "original" testcase shows the issue in enough detail? That's roughly what is happening. Or the example in the summary? They are not min/max patterns like they should be, and nothing will recognize them as such. Including ISel but also any of the code we have to attempt not split apart min/max patterns.

There's almost certainly a dozen different ways to fix this, but I think this one makes sense; canonicalising the select immediates towards the icmp values.

dmgreen updated this revision to Diff 234394.Dec 17 2019, 2:49 PM

Address comments. Thanks for taking a look!

lebedev.ri accepted this revision.Dec 17 2019, 2:55 PM

Code that triggers this is actually used a lot in one library I was looking at.
It's quite a big gain for both scalar and vector code to make sure this is matched correctly.
(That's CMSISDSP, the same one that I was quoting differences from in the
"bump select threshold to 3" patch. Interestingly it would only manifest some of the time,
if inlining would happen before instead of after some simplification I think).

Does the "original" testcase shows the issue in enough detail?
That's roughly what is happening. Or the example in the summary?

They are not min/max patterns like they should be, and nothing will recognize them as such.
Including ISel but also any of the code we have to attempt not split apart min/max patterns.

There's almost certainly a dozen different ways to fix this,
but I think this one makes sense; canonicalising the select immediates towards the icmp values.

Okay, that answered the question.
Sounds good, thank you.

This revision was automatically updated to reflect the committed changes.