This is an archive of the discontinued LLVM Phabricator instance.

Fix use of DemandedMask to NewMask for SIGN_EXTEND_INREG
Needs ReviewPublic

Authored by gilr on Nov 19 2014, 5:47 AM.

Download Raw Diff

Details

Reviewers

Summary

SimplifyDemandedBits takes into account multiple uses by using an all-ones mask instead of the caller's DemandedMask input. However for SIGN_EXTEND_INREG the original DemandedMask is used instead of the safe NewMask, which makes the SHL simplication unsafe for SIGN_EXTEND_INREG with multiple uses.

The following LIT reproduces this bug in LLVM 3.2 (the mask of the second selects ends up zeroed after being SHL 31 bits and then again SHL 63 bits).

; RUN: llc < %s -march=x86-64 -mcpu=generic -mattr=sse42 | FileCheck %s
; CHECK-LABEL: sext_inreg_multiple_uses:
; CHECK: pslld $31, [[REG:%[a-z0-9]+]]
; CHECK-NEXT: psrad $31, [[REG]]
define void @sext_inreg_multiple_uses(<4 x double> %a, <4 x double> %b, <4 x i32>* %out1, <4 x double>* %out2) {
entry:

%mask = fcmp olt <4 x double> %a, %b
%res1 = select <4 x i1> %mask, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> <i32 -2, i32 -2, i32 -2, i32 -2>
%res2 = select <4 x i1> %mask, <4 x double> <double 3.0, double 3.0, double 3.0, double 3.0>, <4 x double> <double 7.0, double 7.0, double 7.0, double 7.0>
store <4 x i32> %res1, <4 x i32>* %out1, align 4
store <4 x double> %res2, <4 x double>* %out2, align 8
ret void

}

On the current trunk this code (and my attempts to tweak it) is not lowered to a double-use SIGN_EXTEND_INREG so the problem is not reproduced. Any suggestion on how to tweak the code to generate a double-use SIGN_EXTEND_INREG would be greatly appreciated.

Diff Detail

Event Timeline

gilr updated this revision to Diff 16373.Nov 19 2014, 5:47 AM

gilr retitled this revision from to Fix use of DemandedMask to NewMask for SIGN_EXTEND_INREG.

gilr updated this object.

gilr edited the test plan for this revision. (Show Details)

gilr added reviewers: eli.friedman, nadav.

gilr added a subscriber: Unknown Object (MLST).

This looks like an obvious bug that should be fixed, unless the code is never executed. Can you try to delete this optimization altogether and see if any tests break? You can also try to use llvm-stress to generate lots of code and see if it ever reaches this optimization.

Removing this optimization breaks CodeGen/X86/vector-blend.ll & CodeGen/X86/vselect-avx.ll.

Okay, so if the code is executed then we should try to come up with a testcase that exposes the bug that you found. Can you modify the test cases that broke and try to expose the bug?

We've tried to modify the test-cases, but we can't actually get it to fail.
We could try to track down why (I guess some earlier transformation changes the code for the problematic cases, where there are multiple uses, into something nicer) but I'm not sure it's worth the effort, since, as you say, this is an obvious bug.

Would you be ok with committing it without a test?

mkuper removed a subscriber: mkuper.Jun 6 2016, 5:33 PM

eli.friedman resigned from this revision.Jun 6 2016, 5:44 PM

eli.friedman removed a reviewer: eli.friedman.

Revision Contents

Path

Size

lib/

CodeGen/

SelectionDAG/

TargetLowering.cpp

2 lines

Diff 16373

lib/CodeGen/SelectionDAG/TargetLowering.cpp

Show First 20 Lines • Show All 787 Lines • ▼ Show 20 Lines	if (ConstantSDNode *SA = dyn_cast<ConstantSDNode>(Op.getOperand(1))) {
KnownOne \|= HighBits;		KnownOne \|= HighBits;
}		}
break;		break;
case ISD::SIGN_EXTEND_INREG: {		case ISD::SIGN_EXTEND_INREG: {
EVT ExVT = cast<VTSDNode>(Op.getOperand(1))->getVT();		EVT ExVT = cast<VTSDNode>(Op.getOperand(1))->getVT();

APInt MsbMask = APInt::getHighBitsSet(BitWidth, 1);		APInt MsbMask = APInt::getHighBitsSet(BitWidth, 1);
// If we only care about the highest bit, don't bother shifting right.		// If we only care about the highest bit, don't bother shifting right.
if (MsbMask == DemandedMask) {		if (MsbMask == NewMask) {
unsigned ShAmt = ExVT.getScalarType().getSizeInBits();		unsigned ShAmt = ExVT.getScalarType().getSizeInBits();
SDValue InOp = Op.getOperand(0);		SDValue InOp = Op.getOperand(0);

// Compute the correct shift amount type, which must be getShiftAmountTy		// Compute the correct shift amount type, which must be getShiftAmountTy
// for scalar types after legalization.		// for scalar types after legalization.
EVT ShiftAmtTy = Op.getValueType();		EVT ShiftAmtTy = Op.getValueType();
if (TLO.LegalTypes() && !ShiftAmtTy.isVector())		if (TLO.LegalTypes() && !ShiftAmtTy.isVector())
ShiftAmtTy = getShiftAmountTy(ShiftAmtTy);		ShiftAmtTy = getShiftAmountTy(ShiftAmtTy);
▲ Show 20 Lines • Show All 2,139 Lines • Show Last 20 Lines