Clang currently generates longer and slower code for:
Code example: https://godbolt.org/g/wtimXj
This patch fixes it.
Authored by xbolva00 on Apr 6 2018, 9:49 AM.
High-level question: what is this trying to do?
It replaces, e.g., $ptr with $null in blocks where we know $ptr == $null.
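To make the intended fold concrete, here is a minimal C++ sketch. It is an illustration under stated assumptions, not the code behind the godbolt link: bar mirrors the @bar callee named in the IR quoted in this thread, while foo_before, foo_after and the bookkeeping helpers are hypothetical names made up for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for the external @bar: it records how it was called
// so the two versions below can be compared.
static int calls = 0;
static const char *last_arg = "unset";
void bar(const char *p) { ++calls; last_arg = p; }

// As written in the source: p is passed through even though the branch
// condition already proves it is null inside this block.
void foo_before(const char *p) {
    if (p == nullptr)
        bar(p);
}

// What the fold is expected to produce: inside the guarded block the
// pointer is replaced by the constant it is known to equal.
void foo_after(const char *p) {
    if (p == nullptr)
        bar(nullptr);
}

// Runs one version and reports whether bar was reached exactly once with a
// null argument.
bool reaches_bar_with_null(void (*fn)(const char *), const char *p) {
    calls = 0;
    last_arg = "unset";
    fn(p);
    return calls == 1 && last_arg == nullptr;
}

// Both versions must behave identically for null and non-null inputs.
void check_equivalent() {
    assert(reaches_bar_with_null(foo_before, nullptr) ==
           reaches_bar_with_null(foo_after, nullptr));
    assert(reaches_bar_with_null(foo_before, "x") ==
           reaches_bar_with_null(foo_after, "x"));
}
```

The two functions are observably equivalent; the difference is only that the second one hands the backend a constant argument instead of a live register value.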
I have to rework the patch: it should jump to "NullPtrBlock" (after the cmp) and then follow the br instruction to each possible successor block. Right now it just iterates over the blocks starting with "NullPtrBlock".
Maybe, but it also fits here: all the information required for this transformation is available here.
Implementing this optimization in CorrelatedValuePropagation would probably be quite massive.
In *another* basic case. It is clearly working in the case I linked, no?
Take the working example, save it as test.ll (slight manual cleanup required),
run $ opt -O2 -S test.ll -print-after-all (D44244 still isn't there),
and look for the point where the line tail call void @bar(i8* %0) is replaced with tail call void @bar(i8* null).
That will tell you which pass does it currently. (spoiler: Global Value Numbering, hmm...)
But then you will have two similar folds doing essentially the same thing, but in different passes, and one is more broken than the other one.
I think it's been shown that every other pass could be subsumed by instcombine. That doesn't mean we should do that.
I was curious why that might be, so I wrote a patch. It's about the same amount of code as this patch, but more general. Please see if D45448 / rL329755 solved your motivating examples. If yes, I think you can abandon this patch.