Page MenuHomePhabricator

[InstCombine] Propagate null values from conditions to other basic blocks
AbandonedPublic

Authored by xbolva00 on Apr 6 2018, 9:49 AM.

Details

Summary

Clang currectly generates longer and slower code for:
if (!ptr)

return ptr;

than for:
if (!ptr)

return NULL;

Code example: https://godbolt.org/g/wtimXj

This patch fixes it.

Diff Detail

Event Timeline

xbolva00 created this revision.Apr 6 2018, 9:49 AM
lebedev.ri resigned from this revision.Apr 6 2018, 10:58 AM

High-level question: what is this trying to do?
Can't you just use ReplaceUsesOfWith(), or something like that?

lib/Transforms/InstCombine/InstructionCombining.cpp
2399

Just cast<>

2400

I thought it is written somewhere, but i can't find it in https://llvm.org/docs/ProgrammersManual.html
TLDR: don't use auto unless the type is already obvious (or it is an iterator/etc).

This should be something like

auto *Cmp = dyn_cast<CmpInst>(BI.getOperand(0));
???? *NullOp = Cmp->getOperand(0);
2403

I'd expect this to be something closer to

for(??? *B = TrueDest; B != nullptr; B = B->getNextNode()) {
2404

Similarly, i have no clue what this is,

for (Instr &I : *B) {

?

2405

I'd try to use range-based loop, if there is a function to go from the pointer to an index, or setOperand() would work with index.

test/Transforms/InstCombine/cond-return-null.ll
18 ↗(On Diff #141373)

Do you edit the check-lines after llvm/utils/update_test_checks.py?
Don't. (the first line included)

High-level question: what is this trying to do?
Can't you just use ReplaceUsesOfWith(), or something like that?

if (!ptr)

return NULL;

replaces e.g. $null with $ptr in blocks where we know $ptr == $null.

I have to rework patch, since it should jump to "NullPtrBlock" (after cmp) and then check br instr a jump to next possible block. Now, it just iterates over blocks starting with "NullPtrBlock".

Note that such a transform (null propagation) is already being done somewhere. https://godbolt.org/g/ow79c8
I'd recommend to first look where that is happening, and extend that, not introduce a duplicate-but-better fold elsewhere.

spatel added a comment.Apr 6 2018, 1:35 PM

This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation).

Note that such a transform (null propagation) is already being done somewhere. https://godbolt.org/g/ow79c8
I'd recommend to first look where that is happening, and extend that, not introduce a duplicate-but-better fold elsewhere.

But it does not work in basic case.. https://godbolt.org/g/wtimXj

This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation).

Maybe, but here it also fits, every info required for this transformation is available here.

Implementation of this optimization would probably be quite massive in the CorrelatedValuePropagation.

xbolva00 updated this revision to Diff 141426.Apr 6 2018, 2:26 PM
xbolva00 marked 3 inline comments as done.

Note that such a transform (null propagation) is already being done somewhere. https://godbolt.org/g/ow79c8
I'd recommend to first look where that is happening, and extend that, not introduce a duplicate-but-better fold elsewhere.

But it does not work in basic case.. https://godbolt.org/g/wtimXj

In *another* basic case. It is clearly working in the case i linked, no?

This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation).

Take working example, save (slight manual cleaning required) it as test.ll

,
and run $ opt -O2 -S test.ll -print-after-all (D44244 isn't there still),
and look for when line tail call void @bar(i8* %0) is replaced with tail call void @bar(i8* null)
That will tell you which pass does it currently. (spoiler: Global Value Numbering, hmm...)

Maybe, but here it also fits, every info required for this transformation is available here.

Implementation of this optimization would probably be quite massive in the CorrelatedValuePropagation.

But then you will have two similar folds doing essentially the same thing, but in different passes, and one is more broken than the other one.
So while this will may be easy to do right now, it is the wrong solution long-term, it will increase technical debt,
and someone later on will stumble into that and will have to either fix one of them, or deduplicate them...

xbolva00 updated this revision to Diff 141427.Apr 6 2018, 2:29 PM
xbolva00 updated this revision to Diff 141428.
vivekvpandya resigned from this revision.Apr 7 2018, 9:50 AM

I am not an expert for InstCombine. But I think @spatel has worked on it for long time so I think he is appropriate person to review this.

This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation).

Maybe, but here it also fits, every info required for this transformation is available here.

I think it's been shown that every other pass could be subsumed by instcombine. That doesn't mean we should do that.

Implementation of this optimization would probably be quite massive in the CorrelatedValuePropagation.

I was curious why that might be, so I wrote a patch. It's about the same amount of code as this patch, but more general. Please see if D45448 / rL329755 solved your motivating examples. If yes, I think you can abandon this patch.

xbolva00 abandoned this revision.EditedApr 11 2018, 10:24 AM

This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation).

Maybe, but here it also fits, every info required for this transformation is available here.

I think it's been shown that every other pass could be subsumed by instcombine. That doesn't mean we should do that.

Implementation of this optimization would probably be quite massive in the CorrelatedValuePropagation.

I was curious why that might be, so I wrote a patch. It's about the same amount of code as this patch, but more general. Please see if D45448 / rL329755 solved your motivating examples. If yes, I think you can abandon this patch.

I checked clang trunk via godbolt and for https://godbolt.org/g/wtimXj (I didnt forget to switch to trunk) it is still same.

Edit: I checked godbolt again and now code looks well. Thanks!

lib/Transforms/InstCombine/InstructionCombining.cpp
2405

I think the current code is ok.