This patch resolves the issue mentioned in PR34565
Details
Diff Detail
Event Timeline
Thanks Mathew.
@chandlerc, you know this pass better than anybody else, can you review this minor change please?
Is there any way that the debug instructions can become invalid due to this transform? I'm not seeing anything obvious, but wanted to make sure you've thought about this too.
lib/Target/X86/X86CmovConversion.cpp | ||
---|---|---|
651–660 | I think the use of MI to refer to the front of the group makes this and other code in this function harder to read. I'd suggest renaming that variable to something less confusing (or completely removing it). But you also have both It and DbgIt here which also seems confusing. Really, all of this is driving me toward the conclusion that this is too late to do this fix. Consider that this is after the scan to swap the CC above which seems like it would not naively expect a non-cmov machine intsr. I could well imagine an assert being added to it later that would fire. I think you want to do this before you start to reason about the group of machine instructions as definitely consecutive cmov instructions. I'd also suggest pulling it into its own function. Then you can write the code in a more clear way along the lines of: for (auto I = Group.front().getIterator(), E = Group.back().getIterator(); I != E;) { MachineInstr &MI = *I++; if (!I.isDebugValue()) continue; // Splice the debug instruction after the cmov group. MBB->insertAfter(E, I.removeFromParent()); } Anyways, my goal would be to use auto and for loop syntax to make the code a bit more clear, but I don't think it'll work well until this gets hoisted out of this location and into another location. Maybe you can make it work over an MI range as well, unsure. |
Thanks Chandler for reviewing the code.
Is there any way that the debug instructions can become invalid due to this transform? I'm not seeing anything obvious, but wanted to make sure you've thought about this too.
This is how the code I am trying to fix looks like:
%vreg2<def,tied1> = CMOVB32rr %vreg6<tied0>, %vreg1, %EFLAGS<imp-use> DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7 DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7 %vreg3<def,tied1> = CMOVB32rr %vreg0<tied0>, %vreg6, %EFLAGS<imp-use>
I assume 3 facts at this point:
- DBG_VALUE has only virtual register and cannot have physical register, that is true because we are still working with in SSA machine code.
- DBG_VALUE instruction generates a void value, i.e., no instruction can use it (or refer to it).
- When we replace all uses of the CMOV instruction with the new PHINode instruction, that also changes the virtual register in the DBG_VALUE.
Saying that, we can always move all DBG_VALUE instructions forward up to the end of Machine Basic Block, right?
This is the transformation I am doing:
%vreg2<def,tied1> = CMOVB32rr %vreg6<tied0>, %vreg1, %EFLAGS<imp-use> %vreg3<def,tied1> = CMOVB32rr %vreg0<tied0>, %vreg6, %EFLAGS<imp-use> DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7 DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7
The rest, i.e., creating the branches and moving these DBG instructions to different basic block, is not related to this change and should have been working regardless of this case.
lib/Target/X86/X86CmovConversion.cpp | ||
---|---|---|
651–660 |
It is not really an issue, because the scan to swap the CC runs over the instructions in the "Group" list , and we add only CMOV instructions to the "Group" list. There is a difference between CMOV "Group" list container and the instructions between Group->front() and Group->back(). Anyway, I agree with you that we better move this into its own function, so I will add a "packCmovGroup" function that will assure the CMOV group is consecutive. |
Well, that's what I'm trying to make sure of...
I'm not an MI expert, but in the past I have found code motion at MI-time to be somewhat tricky.
Do we have kill flags set? If so, we could move a use of a virtual register across its kill... But maybe there is some reason to assure that this is not the case. Looping in Reid as he has been looking at DBG_VALUE a lot more closely than I have recently and may be able to quickly reassure both of us that this is safe. =]
lib/Target/X86/X86CmovConversion.cpp | ||
---|---|---|
580–584 | Any particular reason you moved to a vector rather than doing the increment first, and then splicing? |
Do we have kill flags set? If so, we could move a use of a virtual register across its kill... But maybe there is some reason to assure that this is not the case.
I would like to think that at this point of the optimizations, we do not preserve liveness of virtual registers. But I might be wrong!
Looping in Reid as he has been looking at DBG_VALUE a lot more closely than I have recently and may be able to quickly reassure both of us that this is safe. =]
Sure, let is wait for Reid's answer.
lib/Target/X86/X86CmovConversion.cpp | ||
---|---|---|
580–584 | No, thought it might be more clear and much more safer. |
Any particular reason you moved to a vector rather than doing the increment first, and then splicing?