Fixes PR20425.
During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.
Differential D4659
[SROA] Fold a PHI node if all its incoming values are the same jingyue on Jul 24 2014, 11:07 AM. Authored by
Details Fixes PR20425. During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.
Diff Detail Event Timeline
Comment Actions Just a couple suggestions which might simplify the code.
Comment Actions Chandler, thanks for your suggestions! I think moving the logic to SliceBuilder makes perfect sense. After modifying SliceBuilder::visitPHINode, I found it almost the same as visitSelectInst. Therefore, I also refactored the code to merge visitPHINode and visitSelectInst to one function. Let me know if it looks good to you. Comment Actions Yea, this looks nice. I have a bunch of nit-picky comments below... Nothing really significant.
Comment Actions Hi Duncan, I tried SimplifyPHINode and it worked pretty well. Thanks! That makes me consider using SimplifySelectInst on select instructions too. However, I found one regression test PR16651.2 would fail after this potential modification. We would transform define void @PR16651.2() { ; This test case caused a crash due to failing to promote given a select that ; can't be speculated. It shouldn't be promoted, but we missed that fact when ; analyzing whether we could form a vector promotion because that code didn't ; bail on select instructions. ; ; CHECK-LABEL: @PR16651.2( ; CHECK: alloca <2 x float> ; CHECK: ret void entry: %tv1 = alloca { <2 x float>, <2 x float> }, align 8 %0 = getelementptr { <2 x float>, <2 x float> }* %tv1, i64 0, i32 1 store <2 x float> undef, <2 x float>* %0, align 8 %1 = getelementptr inbounds { <2 x float>, <2 x float> }* %tv1, i64 0, i32 1, i64 0 %cond105.in.i.i = select i1 undef, float* null, float* %1 %cond105.i.i = load float* %cond105.in.i.i, align 8 ret void } to define void @PR16651.2() { entry: %cond105.in.i.i = select i1 undef, float* null, float* undef %cond105.i.i = load float* %cond105.in.i.i, align 8 ret void } Is this transformation on PR16651.2 valid? If no, can somebody help me understand why it isn't? Thanks, Comment Actions
It looks like this transformation is valid only if loading from an undef address always produces an undef value (and never traps). In the original code, loading from %1 always produced an undef value (because undef was explicitly stored into that location, but would have been undef anyway because it is an otherwise-fresh alloca), but did not trap. So if we always replaced undef addresses with some equivalently-sized constant-pool load (or something like that), then this might work. I don't know that we do that, however. -Hal
Comment Actions To avoid redundancy, use SimplifyInstruction to fold PHI nodes and select instructions. One tricky case is when clobbering one operand in select undef, a, b we need to have the select always return the other operand by setting the condition "undef" to 0 or 1; otherwise, the transformation of clobbering dead operands may introduce new undefined behavior. Added test @simplify_undef in phi-and-select.ll to cover this case. Updated basictest.ll per this change. Comment Actions Can anyone take a look at the latest diff? In the latest diff, I use SimplifyInstruction to fold both PHI nodes and select instructions. Doing so requires a minor fix in the logic of clobbering dead uses in select instructions: to clobber x/y in "select undef, x, y", we should change the condition from "undef" to 0/1. Otherwise, "select undef, undef, y" or "select undef, x, undef" can introduce more undefined behavior. This wasn't an issue in the old code, because the old code does not fold a select with undef condition. Comment Actions Hi Chandler, I hope your bug chasing is going well. Wondering if you can get a chance to take a look at this patch. Thanks a lot, Comment Actions Sorry. I tried to get back to this a couple of times but it required a lot of thought. And then your pings kept hitting me at bad times. =/
Comment Actions Thanks for the review, Chandler. I see what you are saying. I agree the current dead-operand-trackign mechanism is a little cumbersome for the use of SimplifyInstruction. I reverted the use of SimplifyInstruction and left the architectural changes as TODO. PTAL. |
Just use nullptr as the initial value?