Clang currectly generates longer and slower code for:
if (!ptr)
return ptr;
than for:
if (!ptr)
return NULL;
Code example: https://godbolt.org/g/wtimXj
This patch fixes it.
Differential D45378
[InstCombine] Propagate null values from conditions to other basic blocks xbolva00 on Apr 6 2018, 9:49 AM. Authored by
Details
Clang currectly generates longer and slower code for: return ptr; than for: return NULL; Code example: https://godbolt.org/g/wtimXj This patch fixes it.
Diff Detail
Event TimelineComment Actions High-level question: what is this trying to do?
Comment Actions if (!ptr) return NULL; replaces e.g. $null with $ptr in blocks where we know $ptr == $null. I have to rework patch, since it should jump to "NullPtrBlock" (after cmp) and then check br instr a jump to next possible block. Now, it just iterates over blocks starting with "NullPtrBlock". Comment Actions Note that such a transform (null propagation) is already being done somewhere. https://godbolt.org/g/ow79c8 Comment Actions This is pushing instcombine beyond local simplifications. Propagating values across blocks should probably be handled in CVP (-correlated-propagation). Comment Actions Maybe, but here it also fits, every info required for this transformation is available here. Implementation of this optimization would probably be quite massive in the CorrelatedValuePropagation. Comment Actions In *another* basic case. It is clearly working in the case i linked, no? Take working example, save (slight manual cleaning required) it as test.ll ,and run $ opt -O2 -S test.ll -print-after-all (D44244 isn't there still), and look for when line tail call void @bar(i8* %0) is replaced with tail call void @bar(i8* null) That will tell you which pass does it currently. (spoiler: Global Value Numbering, hmm...)
But then you will have two similar folds doing essentially the same thing, but in different passes, and one is more broken than the other one. Comment Actions I am not an expert for InstCombine. But I think @spatel has worked on it for long time so I think he is appropriate person to review this. Comment Actions I think it's been shown that every other pass could be subsumed by instcombine. That doesn't mean we should do that.
I was curious why that might be, so I wrote a patch. It's about the same amount of code as this patch, but more general. Please see if D45448 / rL329755 solved your motivating examples. If yes, I think you can abandon this patch. Comment Actions I checked clang trunk via godbolt and for https://godbolt.org/g/wtimXj (I didnt forget to switch to trunk) it is still same. Edit: I checked godbolt again and now code looks well. Thanks!
|
Just cast<>