SimplifyDemandedBits takes multiple uses into account by using an all-ones mask instead of the caller's DemandedMask input. However, for SIGN_EXTEND_INREG the original DemandedMask is used instead of the safe NewMask, which makes the SHL simplification unsafe for a SIGN_EXTEND_INREG with multiple uses.
The following LIT test reproduces this bug in LLVM 3.2 (the mask of the second select ends up zeroed after being shifted left (SHL) by 31 bits and then again by 63 bits).
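To illustrate why the caller's mask is unsafe here, the following is a toy model in Python, not LLVM code. It assumes the rewrite in question replaces a sign-extension from i1 with a plain left shift when only the top bit is demanded; the names simplify, safe_mask and sext_inreg_i1 are hypothetical and exist only for this sketch.

```python
BITS = 32
ALL_ONES = (1 << BITS) - 1
MSB = 1 << (BITS - 1)


def sext_inreg_i1(x):
    """Model of SIGN_EXTEND_INREG from i1: copy bit 0 into all 32 bits."""
    return ALL_ONES if x & 1 else 0


def shl(x, amt):
    """32-bit logical shift left."""
    return (x << amt) & ALL_ONES


def simplify(x, demanded):
    """Hypothetical simplification: if only the MSB is demanded,
    a shift that moves bit 0 into the MSB is enough."""
    if demanded == MSB:
        return shl(x, BITS - 1)
    return sext_inreg_i1(x)


def safe_mask(demanded, num_uses):
    """With more than one user, other users may read other bits,
    so fall back to demanding every bit."""
    return ALL_ONES if num_uses > 1 else demanded


x = 1
expected = sext_inreg_i1(x)

# User A only reads the MSB; user B reads every bit of the same node.
unsafe = simplify(x, MSB)                # caller's mask used directly
safe = simplify(x, safe_mask(MSB, 2))    # mask widened for two uses

print(hex(expected), hex(unsafe), hex(safe))
print("user A ok with unsafe:", (unsafe & MSB) == (expected & MSB))
print("user B ok with unsafe:", unsafe == expected)
print("user B ok with safe:  ", safe == expected)
```

In this model the user that supplied the narrow mask still sees a correct top bit, but the second user receives a value whose low 31 bits are wrong. Widening the mask whenever the node has more than one use avoids the rewrite entirely.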
; RUN: llc < %s -march=x86-64 -mcpu=generic -mattr=sse42 | FileCheck %s
; CHECK-LABEL: sext_inreg_multiple_uses:
; CHECK: pslld $31, [[REG:%[a-z0-9]+]]
; CHECK-NEXT: psrad $31, [[REG]]
define void @sext_inreg_multiple_uses(<4 x double> %a, <4 x double> %b, <4 x i32>* %out1, <4 x double>* %out2) {
entry:
  %mask = fcmp olt <4 x double> %a, %b
  %res1 = select <4 x i1> %mask, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> <i32 -2, i32 -2, i32 -2, i32 -2>
  %res2 = select <4 x i1> %mask, <4 x double> <double 3.0, double 3.0, double 3.0, double 3.0>, <4 x double> <double 7.0, double 7.0, double 7.0, double 7.0>
  store <4 x i32> %res1, <4 x i32>* %out1, align 4
  store <4 x double> %res2, <4 x double>* %out2, align 8
  ret void
}
On the current trunk, this code (and my attempts to tweak it) is not lowered to a double-use SIGN_EXTEND_INREG, so the problem does not reproduce. Any suggestions on how to tweak the code to generate a double-use SIGN_EXTEND_INREG would be greatly appreciated.