Page MenuHomePhabricator

[X86] Freeze vXi8 shl(x,1) -> add(x,x) vector fold (PR50468)

Authored by RKSimon on Aug 16 2021, 8:49 AM.



We don't have any vXi8 shift instructions (other than on XOP which is handled separately), so replace the shl(x,1) ->add(x,x) fold with shl(x,1) ->add(freeze(x),freeze(x)) to avoid the undef issues identified in PR50468.

Split off from D106675 as I'm still looking at whether we can fix the vXi16/i32/i64 issues with the D106679 alternative.

Diff Detail

Event Timeline

RKSimon created this revision.Aug 16 2021, 8:49 AM
RKSimon requested review of this revision.Aug 16 2021, 8:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 16 2021, 8:49 AM

Any objections to this?

RKSimon updated this revision to Diff 367983.Aug 22 2021, 7:58 AM

Add comment explaining the need for freeze

pengfei added inline comments.Aug 22 2021, 5:38 PM

Sorry, I don't quite understand the background. My question is why don't we check the oprand is a undef?
The affected tests seem all reasonable values, though they seem not have regressions.

craig.topper added inline comments.Aug 22 2021, 6:17 PM

It’s not enough to check for undef at the time of the fold. You need to know that no future DAGCombine can make the input undef.

(shl X, 1) must produce an even number even if X is undef. computeKnownBits will look at the shift amount and tell downstream code that the lsb is 0 without looking at the left hand side. And the value may not be undef at the time computeKnownBits is called but could be simplified to undef later.

(add undef, undef) does not produce an even number. Register allocation is free to pick different registers for undef.

The freeze forces register allocation to use the same register.

pengfei added inline comments.Aug 22 2021, 6:49 PM

Nice answer! I'm clear now. Thanks Craig.

Thanks for the explanation Craig! I always feel I confuse both myself and other people when I start talking about undef and poison......

Is everyone happy for this to go in?

spatel accepted this revision.Aug 24 2021, 6:21 AM



Undef reasoning is complicated, and the explanation here is great. How about enshrining it as a code comment? :)

// R may be undef at run-time, but (shl R, 1) must be an even number (LSB must be 0).
// (add undef, undef) however can be any value. To make this safe, we must freeze R
// to ensure that register allocation uses the same register for an undefined value.
// This ensures that the result will still be even and preserves the original semantics.
This revision is now accepted and ready to land.Aug 24 2021, 6:21 AM

Thanks - I'll add the explanation comment in the commit