NB: I've written up a sort of broad-ish summary of what's wrong with the SelectionDAG scheduling of DBG_VALUEs in PR41583, which IMO is the cause of the unfortunate variable-location droppage described above. I've got an additional patch that should go up tomorrow.
Mon, Apr 15
Thanks for all the reviews; still outstanding is the dbg.declares-get-stashed-in-MF-objects matter, which I'll generate a follow-up for.
Fri, Apr 12
Tue, Apr 9
Many thanks for the reviews, I've juggled the text in the first third in response to comments,
Wed, Mar 27
Maybe ISel should have put the COPY to %0 after the last DBG_VALUE (now we got a dbg-use of %15 after the last non-dbg-use of %15). Or maybe it should have used %0 instead of %15 in that DBG_VALUE. Or maybe there should be one DBG_VALUE before the COPY using %15 and one after using %0.
Tue, Mar 26
LGTM, many thanks for the find and fix!
Mar 25 2019
Ping on this -- I think we all agree that this is part of a larger issue (DBG_VALUEs of non-live variables), but this patch is a step in the right direction.
Switching it to a MapVector makes it deterministic. I was just struggling late last night to try to reduce the .ll file down to something worth checking in! If you want to take over to minimize the test case, that would be much appreciated. Let me know so I can continue doing that if needed. Thanks.
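For anyone following along, here is a minimal standalone sketch of why an insertion-ordered map gives deterministic output. It is not llvm::MapVector itself, only a toy built on the same idea (a vector of entries plus an index map). `TinyMapVector` and `emissionOrder` are names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy insertion-ordered map: NOT llvm::MapVector, just the same idea.
// Iteration walks the vector, so the order depends only on insertion order,
// never on hash values or pointer addresses.
template <typename K, typename V>
class TinyMapVector {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // kept in insertion order

public:
  using const_iterator = typename std::vector<std::pair<K, V>>::const_iterator;

  // Returns the value for Key, value-initialising it on first use.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It != Index.end())
      return Entries[It->second].second;
    Index.emplace(Key, Entries.size());
    Entries.emplace_back(Key, V());
    return Entries.back().second;
  }

  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
  std::size_t size() const { return Entries.size(); }
};

// Hypothetical use: record each variable as it is encountered, then report
// the variables in the order they were first seen.
std::vector<std::string> emissionOrder(const std::vector<std::string> &Seen) {
  TinyMapVector<std::string, int> Counts;
  for (const std::string &Var : Seen)
    ++Counts[Var];
  std::vector<std::string> Order;
  for (const auto &Entry : Counts)
    Order.push_back(Entry.first);
  return Order;
}
```

Iterating a plain hash map keyed on pointers can change order from run to run, which would be consistent with the swapped undef DEBUG_VALUEs reported above; walking the vector avoids that.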
Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place? I guess that it should either be picked from an adjacent non-meta instruction, or be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?
Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just to avoid some not-yet-detected problems.
This patch appears to generate non-reproducible builds in some cases. I can craft a more minimal test case, but the following link (https://godbolt.org/z/sWucUZ) is what I have been using. If I run Clang multiple times, the output eventually swaps the order of some undef DEBUG_VALUEs. I am just passing "-O1 -g" with that .ii file. It isn't obvious to me yet what is causing this to be unordered/non-deterministic.
What is the status regarding this patch? Are we waiting for something?
Mar 22 2019
Mar 21 2019
So I guess the answer to my question whether this is indicative of a bug in an earlier pass is: not necessarily.
Mar 19 2019
Isn't that a symptom of previous passes failing to insert a dbg.value(undef) when they deleted a dbg.value? I'm trying to figure out what a legitimate example for this would look like.
Mar 15 2019
Mar 14 2019
Update to use unknown-locations for everything but promoted store instructions,
Mar 13 2019
Many thanks for the reviews!
Mar 12 2019
Ping -- AFAIUI we're happy with the meaning in this patch, but wording and presentation still need review?
Mar 11 2019
Mar 8 2019
Yes, I was just thinking out loud about sinking/scheduling in general and not so much the specific situation with MachineSink. But I also wondered if the sinking done by MachineSink could be seen as a special case of sinking in general, where we sink one instruction at a time. For MachineSink we eventually reach the end of the BB, and then continue into a successor. So how do we determine when it is time to insert "undef" when sinking one instruction at a time? What kind of reorderings should/shouldn't trigger that we insert an "undef"? I guess this is one thing we should try to describe in the documentation (also for DBG_VALUE), to make sure new developers understand the basic logic behind how we implement these things.
Mar 7 2019
I produced another patch (D59027, work-in-progress) that only creates undef DBG_VALUEs when the order of assignments would change... however I then realised I'd misunderstood Bjorn's question here:
Mar 6 2019
Mar 4 2019
Re-think opening of paragraph on the risks involved when moving/altering code.
Incorporate further wording and structure feedback
Mar 1 2019
Incorporate recommendations on using auto from Andrea, delivered offline,
LGTM, although I'll wait a bit for more opinions
Incorporate feedback: use skipDebugInstructionsForward for better readability, iterate forwards when sinking DBG_VALUEs.
Rename new test case's file name, move it to the 'DebugInfo' directory too as that seems more appropriate.
Avoid relying on placeDbgValues for this change. We already walk (forwards) through all instructions in a block looking to optimise them, add a dbg.value visitor that rewrites the dbg.value operand if it refers to a sunk address computation.
Sprinkle some possessive apostrophes, speling, incorrect variable names
Apply review feedback, re-write the dbg.value undef example to better demonstrate the problem at hand.
Feb 27 2019
I've generated a short summary in https://reviews.llvm.org/D58726 which illustrates how I believe dbg.values are to be interpreted -- I'll chuck this at llvm-dev tomorrow if it's not obviously broken. As mentioned in the summary, it's likely that more could be said, but getting agreement on the location of a dbg.value being the location that an assignment ``happens'' would be good.
Feb 26 2019
Feb 25 2019
Feb 22 2019
So then we should check if there is any other dbg.value that we sink past? Otherwise there won't be any reordering and no need for an undef location?
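To make the question concrete, here is a rough sketch of the check being asked about: does sinking one dbg.value carry it past another dbg.value of the same variable? It models a block as a flat vector and is not how MachineSink or any other LLVM pass represents things. `Instr` and `sinksPastOtherDbgValue` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of one instruction in a basic block.
struct Instr {
  bool IsDbgValue;      // true for a dbg.value-like record
  std::string Variable; // source variable described (empty otherwise)
};

// Would moving the dbg.value at index From down to just after index To
// carry it past another dbg.value of the same variable? If so, the
// assignments to that variable would appear reordered, which is the case
// where an undef location at the original position is under discussion.
bool sinksPastOtherDbgValue(const std::vector<Instr> &Block,
                            std::size_t From, std::size_t To) {
  assert(From < To && To < Block.size() && "expected a downward move");
  assert(Block[From].IsDbgValue && "From must point at a dbg.value");
  for (std::size_t I = From + 1; I <= To; ++I)
    if (Block[I].IsDbgValue && Block[I].Variable == Block[From].Variable)
      return true;
  return false;
}
```

Under this model, a dbg.value that only moves past ordinary instructions or past dbg.values of other variables reports no reordering. Whether that is enough to skip the undef is exactly the open question.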
In my view, the real problem as stated in the PR appears to be that the value later gets *lost*, sadly before the instructions directly related to the switch-case where the variable is used; the real problem is *not* that the value's description starts too soon.
So given all that, how does this patch help address the real problem?
Update comments in placeDbgValues to reflect its new purpose; remove code calculating the previous non-debug-instr, as we now use a dominator tree check instead.
Feb 21 2019
Use isCopyInstr to detect copy instructions, which will catch more opportunities than just isCopy().
This problem had PR40427 hiding at the bottom of it, so this WIP isn't required.
Feb 20 2019
It turns out the validity of this change relies on placeDbgValues reordering (curses); a small amount of extra juggling will be required.
Feb 19 2019
Feb 15 2019
I have some doubt about this. Mainly the part about inserting undefs. Although, some of my doubts are partially based on the big-black-hole regarding how debug info is supposed to work for heavily optimized code (and more specifically how this is supposed to work in LLVM based on what we got today).
Explicitly test for whether we're pre or post regalloc when deciding whether a copy can be propagated, refactor test logic.
Feb 14 2019
Sort dbg-values to be sunk to ensure determinism, sprinkle -NEXT on some CHECKS to strengthen test.