This is an archive of the discontinued LLVM Phabricator instance.

Consider the frame base isn't moving in calculateDbgValueHistory.
Needs ReviewPublic

Authored by friss on Sep 16 2014, 5:29 AM.

Details

Summary

This is a pragmatic hack to allow calculateDbgValueHistory to do its job
in the presence of call instructions clobbering the stack pointer register.
FrameIndex references have been removed when we arrive in
calculateDbgValueHistory, but the MachineFrameInfo is still there and contains
the information that helped build the frame. The only information that interests
us is the register used as frame base for the current function (which might
vary from function to function btw). We can consider that the frame base won't
ever be really clobbered during a function's lifetime.
We thus interrogate the MFI about the location of a (random) valid stack object
and remove the base register used to address it from the set of changing
registers for the function.
This patch makes debugging ASAN instrumented code without optimizations
possible (the added testcase uses ASAN instumented code).

Fixes rdar://17904418

Diff Detail

Event Timeline

friss updated this revision to Diff 13743.Sep 16 2014, 5:29 AM
friss retitled this revision from to Consider the frame base isn't moving in calculateDbgValueHistory..
friss added reviewers: echristo, dblaikie, samsonov, aprantl.
friss updated this object.
friss added a subscriber: Unknown Object (MLST).
dblaikie edited edge metadata.Sep 16 2014, 10:36 AM

This seems like it sort of devolves to something like what I suggested during Alexey's last round of improvements (that we just special case the frame pointer rather than trying to detect all the non-changing registers)

Alexey - is Fred's assessment here correct? That is, your previous changes/improvements are still being stopped by any call instruction?
If that's the case, are we gaining much/anything by having the complexity of detecting non-clobbered registers? Or should we just do a single special case for the frame pointer and throw the rest of the logic out?

friss added a comment.Sep 16 2014, 1:44 PM

For the sake of completeness, it’s not the whole logic that is stopped by the call instructions, but only the liveness range of stack based variables that are unable to be propagated. This is especially an issue for ASAN instrumented code at O0 (and it’s really a pain not to be able to debug code compiled at O0). I think the code in there is still improves the situation with register variables, but I haven’t extracted any statistics to prove me right.

friss added a comment.Sep 16 2014, 1:49 PM

Oh, and while I'm at it, we could throw a real data-flow liveness analysis in there to improve the situation even more (and stop truncating the liveness of every variable at the end of each BB). But this would be a bit more costly.

echristo edited edge metadata.Sep 16 2014, 1:52 PM

FWIW that's what I was wanting when we were talking about it in the first place. Just, you know, time.

In D5366#6, @friss wrote:

Oh, and while I'm at it, we could throw a real data-flow liveness analysis in there to improve the situation even more (and stop truncating the liveness of every variable at the end of each BB). But this would be a bit more costly.

Yeah, this would be nice to have (as Eric said - just a matter of someone finding the care/time to do it). I'd assume it would possibly be powered by the register allocator rather than trying to recompute the liveness from scratch, but I don't know for sure. We've bandied about the idea now & then for a while now.

In D5366#5, @friss wrote:

For the sake of completeness, it’s not the whole logic that is stopped by the call instructions, but only the liveness range of stack based variables that are unable to be propagated. This is especially an issue for ASAN instrumented code at O0 (and it’s really a pain not to be able to debug code compiled at O0). I think the code in there is still improves the situation with register variables, but I haven’t extracted any statistics to prove me right.

Right, but so far as I understand it this liveness range propagation you're describing was, I thought, the precise thing Alexey was trying to fix with a bunch of work/refactoring he did a few weeks ago. So I'm a bit surprised to learn that it didn't actually do what I thought it did - or that it had a substantial limitation.

Perhaps I misunderstood the prior work.

samsonov edited edge metadata.Sep 17 2014, 4:29 PM

Why is RSP clobbered by the calls in the first place? If $rsp is used to address local variables, then does it mean the generated code actually has to spill $rsp before calling a function and restore $rsp after that? That looks weird.

I considered adding a liveness analysis here, but it seems that DbgValueHistoryCalculator is a wrong place for this: it's pain to match the propagation of debug variables' locations in the control flow graph with the order the actual basic blocks are emitted in the object file (as this is the order we care about when we generate .debug_loc section). It is doable, but painful and can be pretty slow. Instead, it would be better to somehow reuse/fix LiveDebugVariables pass (which is currently used only by optimized register allocator), and track locations by inserting DBG_VALUEs where necessary.

friss added a comment.Sep 18 2014, 2:02 AM
In D5366#10, @samsonov wrote:

Why is RSP clobbered by the calls in the first place? If $rsp is used to address local variables, then does it mean the generated code actually has to spill $rsp before calling a function and restore $rsp after that? That looks weird.

I should have mentioned this is the initial message, but I actually investigated that out a few weeks ago: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/230791.html
The consensus seemed to be that we need to deal with it. From a purely theoretical POV, it is true that a call will write to the stack pointer, thus the annotation isn't wrong. I'm proposing a way to just deal with it.

I considered adding a liveness analysis here, but it seems that DbgValueHistoryCalculator is a wrong place for this: it's pain to match the propagation of debug variables' locations in the control flow graph with the order the actual basic blocks are emitted in the object file (as this is the order we care about when we generate .debug_loc section). It is doable, but painful and can be pretty slow. Instead, it would be better to somehow reuse/fix LiveDebugVariables pass (which is currently used only by optimized register allocator), and track locations by inserting DBG_VALUEs where necessary.

And if we decided to do it on DbgValueHistoryCalculator, the RSP issue would still be there. Every liveness analysis relies on def/kill information.