User Details
- User Since
- Mar 19 2018, 2:57 AM (262 w, 5 d)
Yesterday
LGTM; could you expand the comment where you set the kill location a little more, to explicitly explain why we can't use the DIArgList. As I understand it, it's because we'd have to compute the value, then start fragmenting it up, which is a large amount of effort for a rare scenario.
Thu, Mar 30
LGTM -- when landing could you add a short comment to the test indicating which function used to crash, that'll ease the life of anyone reading the test in the future and let them know exactly what's being tested.
I've changed my mind: actually there are scenarios where being inlined into a different context should enable inlining that was previously skipped, for example where an unlikely branch can statically found to be always true after inlining, justifying inlining whatever's down that branch. This has exposed the deeper issue and I've now got an updated test that does not rely on profile data.
Wed, Mar 29
LGTM -- this is after all just a switch-throw to enable everything that's in tree already (which has all been reviewed). The expected compile-time regression on ReleaseLTO-g is roughly 1.2% right, in exchange for greater variable location coverage?
(Oh looks, Phab posts when you hit control+enter; I've edited more data into the earlier comment)
Interesting -- I've dug into the cost modelling, and I think you're correct that there's an issue, but it's even less enjoyable. It looks like the effect of the branch-weights successfully suppresses longname being inlined into rec_call_to_longname because it's a cold callsite, in the first instance. However, once rec_call_to_longname is inlined into a, the same computations round all the frequencies to zero, and inlining starts to happen repeatedly through both rec_call_to_longname and longname, giving the exponential behaviour. Without the branch weights, inlining longname into rec_call_to_longname creates a directly recursive function that never gets inlined again.
LGTM
A more generalised solution I believe would be to skip debug instructions while examining the basic block -- that would mean
- Using an iterator-based for loop rather than a range based loop, and
- skipDebugInstructionsForward to increment past any debug instructions at the start of each iteration
Tue, Mar 28
Note that I'm seeing this too with Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29915 for x64.
I agree with @jryans analysis, otherwise LGTM, you might want to wait a bit for input from @probinson who enjoys looking at clang-switch tests.
Good saving; in the future we might find different ways of representing variable fragments and communicating them to LiveDebugValues, or make LDV smarter, but it's good to get this benefit now. I think the more structured fragment model @scott.linder proposed would deliver the sort of guarantees we'd need.
LGTM with one question -- there's a semantic change whereby a debug intrinsic with no fragment is considered as covering the whole variable, right? (I think this is uncontroversial).
LGTM
LGTM
Thu, Mar 23
LGTM with a question; this does introduce knarlyness, so we should revisit this if we manage to make the pass faster.
Wed, Mar 15
/me squints -- so if no part of the variable ends up on the stack, this is effectively early-exiting this portion of analysis?
Mon, Mar 13
LGTM, assuming that the instructions affected always have the same signature / format. (I know very little aarch64).
I agree with Stephen, it's good but flipping the FromValue flag doesn't seem right.
LGTM with inline rider
This looks good, but in the test outputs shouldn't the cost be zero? Auto-regenerated without rebuilding opt perhaps? (See inline comment)
Could you add the context for review; also, are there any clear winners in CTMark that justify this change? 0.05% is close to noise, and the function complexity increases substantially.
Fri, Mar 10
LGTM; you might consider making it a MIR test that just runs the machine-combiner pass, but I don't think that'll have any material advantage over an .ll test, the only difference is making the exact input to machine-combiner explicitly visible.
Wed, Mar 8
Tue, Mar 7
This is a good improvement, many thanks for working in this area. The overall approach is good, I've got a couple of high level comments:
- I believe you'll need to erase elements from LocalVarMap once a DBG_VALUE is sunk -- if a later, unrelated DBG_VALUE for the same variable is encountered in the same block, then there's no need terminate the earlier sunk DBG_VALUE,
- You should use the DebugVariable class to identify variables, as that combines inlining and fragment information into a single object. Using just DILocalVariable risks duplicate instances of the same variable from different inlining contexts being merged.
Feb 10 2023
If I understand correctly, this is moving the tool-global string pool from a header-function into something with one-definition -- definitely the right direction to take. I think Orlando might have said that a context-object-with-string-pool is probably the best overall solution, but this is a good improvement for now.
Feb 7 2023
Cool, sounds like this is just a case of mistaken-function-identity, thanks for catching this @uabelho. I'll drop in a follow-up patch shortly to use GetValueAtEndOfBlock
/me squints -- I think the idea here was that if SSAUpdater has a value at the end of the block we can use that and it won't cause new codegen; but then we're using GetValueInMiddleOfBlock, which will generate code in certain circumstances.
LGTM from a debug-info perspective; I know next to nothing about ARM though so can't venture an opinion there. A smaller test would be appreciated if that's possible.
Jan 31 2023
LGTM -- and for my understanding, this is because the assignments have been replaced, leaving the dbg.assign hanging around in the case of DSE would be the correct behaviour, yes?
Jan 30 2023
LGTM3
From a debug-info perspective this LGTM, with the test improvement to reduce the size -- I'm not familiar with all of GVN though. The objective of ensuring that debug-info can't affect the code generated is an important one.
Jan 27 2023
I've run out of time to read the code this week, but it sounds like a good approach to me, I'll give my debugger colleagues a prod about it too.
LGTM, but I'll leave to @djtodoro to approve. Support in the other flavour of LiveDebugValues would indeed be appreciated.
Jan 23 2023
[...]
error: YAML:99:1: unknown key 'debugInstrRef'
debugInstrRef: true
^~~~~~~~~~~~~
[...]
subprocess.CalledProcessError: Command 'llc -run-pass=prologepilog -o - -x mir' returned non-zero exit status 1.
Jan 20 2023
Jan 18 2023
Apply review comments; I've also had an attack of conscience, and added some more check lines:
- llvm/test/DebugInfo/x86/instr-ref-flag.ll -- check that the MIR function key gets produced with true or false values, according to the command line flags,
- llvm/test/DebugInfo/MIR/InstrRef/instr-ref-roundtrip.mir -- check that the flag makes its way from the input to the output.
Use a type alias instead of SmallSet directly,
(Clearing queue of old revisions) -- things have moved on a lot since this was written, I think we're close to just killing off the IsIndirect field completely and putting everything into DIExpressions.
(Clearing out old revisions) -- it's been so long that this is unlikely to land. This was also written at the point where I was realising that to handle DBG_VALUE movement in machinesink, we basically need to re-compute SSA form each time.
Jan 17 2023
Jan 16 2023
Drive-by comment after spotting this question:
Jan 10 2023
LGTM
This all LGTM; if you're happy with the comments for struct WasmLoc then that's fine too.
Jan 9 2023
Jan 6 2023
Rather than store instructions, "instructions that store", seeing how you're marking some memory intrinsics with DIAssignID too?
Dec 27 2022
Dec 24 2022
Thanks for assurance about the stack-popping being addressed elsewhere -- it might be worth putting a comment in D138943 next to the definition of struct WasmLoc about that being a deliberate non-objective, to enlighten future readers.
Thanks @vitalybuka , looks like we'll get back to this after the holidays.
Dec 21 2022
Dec 20 2022
Slightly twitchy about there being a zero LocIdx -- I think I tried to suppress there being any default-constructed LocIdxes out there. On the other hand, it comes with a sentinel value attached, so it's no big deal.
I removed all register-based (= dangling) DBG_VALUEs for Wasm in D139675; WebAssemblyDebugFixup runs right before this analysis. So now we don't have any virtual registers participating here, and I re-enabled the physical register assertion, so we can maintain LocIndex only deals with physical registers.
LGTM
Looks good (leaning largely on my prior review).
Looks good, but with what feels like low test coverage: remind me, the plumbing required to track variadic locations in LiveDebugValues already landed, and we have tests to cover those new code paths, yes?
Dec 14 2022
LGTM -- let's leave it a while to give more opportunity for input.
According to the style guide, "Variable names should [...] be camel case, and start with an upper case letter (e.g. Leader or Boats).", you'll need to update the variable names you've added. Otherwise all looks good.
Dec 9 2022
Latest two revisions look good -- except that you've not merged them together when uploading them, so the latter overwrite the former!
I had another chat with Orlando about how we're testing these patches -- I'm pretty happy that we're now testing all the important paths through the assignment-tracking analysis pass, and the different ways that CFGs can compose and have different lattice input/output values. Nothing really springs to mind about what else we can be testing -- and significantly, this work is not going to be enabled by default, we'll be doing a decent amount of validating on larger projects first.
LGTM, with a request for some more documentation comments in the earlier comment.
I think I've clocked that this is a two-value lattice of \top and \bottom / true and false, indicating "this bit is in memory" or not in memory, and the intersection on meet ensures that it monotonically goes downwards, so we gradually remove bits from being in memory. Which sounds good and correct.
Dec 2 2022
The new two tests look good, and the documentation is highly appreciated -- is there are need to check for new "loc=none" locations being fed in, or is that something that's in the domain of LiveDebugValues? (I keep on switching back to the mindset of "every piece of information needs a dbg.value", which is not the situation in this pass).
Hi Scott,
Nov 30 2022
I see from the inline comments that virtual registers in WASM at this stage may or may not be meaningful -- if we can avoid LiveDebugValues having to encounter virtual registers, that's highly preferred, as it avoids considering a whole load of scenarios that LiveDebugValues isn't designed for. The number space for LocIndex is designed only for physregs and spill slots (physregs from 1 to 2^30, spill slots from 2^30+1 to 2^31-1). Adding the number range for virtual registers would require more testing.
Before digging into the code, I'm not sure I understand the failing IR completely -- for the segment:
Nov 25 2022
I suppose another way of thinking about that is ensuring every lattice transition gets explored across every CFG join pattern that can exist.
Just thinking out loud about coverage -- it seems to me that there are several dimensions over which the assignment-tracking lattice analysis should run, which are:
- The domain of dbg.assign "state"/value, i.e. whether it has complete information in the intrinsic: {`both, value-only, link-only, neither},
- Whether there's a linked store or not,
- Whether the store comes before or after the intrinsic,
- CFG patterns between the store, intrinsic, and intrinsics with a different state/value, i.e. whether there's straight-line-code, diamonds, loops, in the way.
LGTM, note you've added a 'tmp.ll' that you probably didn't intend to? Best to remove that.
Nov 24 2022
I can't really work out what untagged-store-frag.ll is testing as I don't get the notation, alas.
Looks good with some nits fixed, a spurious return removed, and the question on line 825 addressed.
LGTM, thanks for keeping on this.
Nov 23 2022
Worth putting implicit-check-not on all the tests?
Looks good -- the change to alloca-bitcast.ll is suspicious (an undef dbg.assign becoming valued again) but that probably signals an assignment tracking issue. CC @Orlando, what should happen here?
Nov 22 2022
This is well motivated and useful -- I'm having some trouble building a mental model of what the lattice looks like though. IIUC we're seeking to identify the ranges of variable location that are well defined by the fragment ranges, but, because we later drop overlapped ranges at a later date, we want to explicitly state the not-overlapped memory location so that it isn't dropped. I don't quite see how meetFragments achieves that, as it only selects the bits that do overlap. Is the lattice collecting all the ranges where there's genuine overlap, so that later dbg.values split their ranges across the intersection? And if so, why does addDef delete "fully contained" overlaps, shouldn't it break all definitions up?
This is all fine and well; are redundant variable locations a common case and which should we optimise for? IIUC every instruction with debug-info results in a SmallVector copy right now, which isn't necessary if that specific position hasn't had debug-info change. IMO, adding a "ChangedPosition" flag that's used to skip setWedge if it isn't true would be beneficial.
Nov 21 2022
(Ticking "request changes" to take it out of the review queue)
(Ticking "request changes" to take it out of the review queue)
Nov 20 2022
I'm in as far as line 900, hitting submit to clear the queue of comments just in case I don't get back to this. Random remarks:
Nov 17 2022
LGTM
LGTM with a couple of suggestions. The bigger comment is me asking "have you considered what happens when the inliner deletes random elements of assignment tracking due to pruning" -- I don't think there's anything that would be broken, just checking that we have a common understanding.
LGTM, with one or two questions, and the middle test needs -simplifycfg adding
LGTM
Reverse-ping -- would this be landable for the next release? The test revisions needed are minor.
Nov 16 2022
Looks good, although I think some of the logic at one of the inline comments can be simplified.
LGTM with some cosmetic nits.
Hi Teresa,