When we match a MulZExtIdiom, after r307554, and we try to replace the original mul with a narrower one, we look at the uses of the mul, and if one is a BinaryOperator we create a trunc based on it.
From the diff in r307554.
> - ConstantInt *CI = cast<ConstantInt>(BO->getOperand(1)); > - APInt ShortMask = CI->getValue().trunc(MulWidth); > + Value *ShortMask = > + Builder.CreateTrunc(BO->getOperand(1), Builder.getIntNTy(MulWidth));
As pointed out in the ML thread, I don't think this is quite right.
Bo->getOperand(1) could live in another block, consider e.g.
@glob = external global i16 define void @patatino(i8 %beth) { %conv = zext i8 %beth to i32 %mul = mul nuw nsw i32 %conv, %conv %conv3 = and i32 %mul, 255 %tobool8 = icmp ne i32 %mul, %conv3 br i1 %tobool8, label %if.then9, label %if.then9 if.then9: ; preds = %0, %0 %tinky = load i16, i16* @glob %conv13 = sext i16 %tinky to i32 %and = and i32 %mul, %conv13 %conv14 = trunc i32 %and to i16 store i16 %conv14, i16* @glob ret void }
after this transformation would become:
define void @patatino(i8 %beth) { %conv = zext i8 %beth to i32 %umul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 %beth, i8 %beth) %umul.value = extractvalue { i8, i1 } %umul, 0 %1 = trunc i32 %conv13 to i8 %2 = and i8 %umul.value, %1 %3 = zext i8 %2 to i32 %mul = mul nuw nsw i32 %conv, %conv %conv3 = and i32 %mul, 255 %tobool8 = extractvalue { i8, i1 } %umul, 1 br i1 %tobool8, label %if.then9, label %if.then9 if.then9: ; preds = %0, %0 %tinky = load i16, i16* @glob %conv13 = sext i16 %tinky to i32 %and = and i32 %mul, %conv13 %conv14 = trunc i32 %3 to i16 store i16 %conv14, i16* @glob ret void }
Note how you're inserting %1 which has an operand an use of %conv13 which is defined in %if.then9, i.e. you end up with a def that doesn't dominate all the uses -> violates SSA.
As I'm not sure how common this case is, I propose fixing restoring the old behaviour, and in order to fix the original crash bail out in case the first operand of the binary op is not a constant, as the code after actually assumes this.
Ok, i understand and agree with the rational behind this fix. Would it be reasonable to check if the operand is either a constant or defined in the same block, previous to current instruction, in order to catch more candidates?