This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] Run constant folding if no other simplifications were possible
AbandonedPublic

Authored by igor-laevsky on Dec 1 2017, 6:35 AM.

Details

Summary

This came in as an unexpected requirement for the D40650.

Before this change we were doing constant folding only for the unrecognised instructions. For the recognised ones we were calling specific constant fold functions from their respective simplification functions. However this specific constant folding functions are slightly weaker that the general ConstantFoldInstruction because they don't fold operands. Turns out that for the insertelement we need full power of the ConstantFoldInstruction or we are going to fail some of the lit tests.

It felt like running constant folding after all InstSimplify transformations is more direct solution rather then trying to include constant folding into each of the simplification functions. Although this mostly affects insertelement case but we can improve later simply by removing constant folding from the simplification functions. This will give us better folding with less effort.

Diff Detail

Event Timeline

igor-laevsky created this revision.Dec 1 2017, 6:35 AM

Have you checked for compile time impact?

igor-laevsky added a comment.EditedDec 1 2017, 9:59 AM

No, I haven't. I wouldn't expect big compile time impact, but I will run CTMark to be sure.

spatel edited edge metadata.Dec 1 2017, 10:45 AM

This seems like it would call ConstantFoldXXX a 2nd time for most instructions?

I'd prefer to just add ConstantFoldInsertElementInstruction() to the top of SimplifyInsertElementInst() to keep things symmetrical for now. If this patch makes sense independent of that, then it could be a follow-up?

igor-laevsky added a comment.EditedDec 4 2017, 4:16 AM

Hi Sanjay,

Adding ConstantFoldInsertElementInstruction doesn't solve the problem I encountered after D40650. For the test Transforms/InstSimplify/pr28725.ll we need to run full ConstantFoldInstruction because it additionally folds instruction operands.

This change indeed causes constant folding to be called twice in some cases. But it's not a total waste - second time we will fold instruction operands and potentially simplify more. In the end I would expect that we will be able to remove all specific constant fold invocations in favour of this single call (providing it will not impact compile time).

I had run CTMarks and I see total compile time slowdown of about 5% with various benchmarks regressed up to 10%. This does seem a lot especially considering unclear benefit from this change. I need to figure out better approach here.

igor-laevsky abandoned this revision.Dec 4 2017, 8:04 AM
spatel added a comment.Dec 4 2017, 9:08 AM

If you add the usual constant folding check to the top of SimplifyInsertElementInst(), the test in pr28725 will simplify to a constant expression:

define <2 x i16> @test1() {
entry:

ret <2 x i16> <i16 extractvalue (%S select (i1 icmp eq (i16 extractelement (<2 x i16> bitcast (<1 x i32> <i32 1> to <2 x i16>), i32 0), i16 0), %S zeroinitializer, %S { i16 0, i32 1 }), 0), i16 0>

}

This should be fine. InstCombine will turn that into a '0' later.