This is an archive of the discontinued LLVM Phabricator instance.

BreakpointResolverFileLine: Correct treatment of move-to-nearest-code for multiple modules
ClosedPublic

Authored by labath on Mar 10 2017, 4:00 AM.

Details

Summary

move-to-nearest-code needs special treatment to avoid creating
superfluous breakpoints in case multiple compilation units. It already
had code to handle the case when the compilation units are in the same
module, but it still did not properly handle the case when the
compilation units are in different modules.

This fixes the issue by manually iterating over the modules (instead of
just CUs) to make sure we aggregate the matches properly.

Diff Detail

Repository
rL LLVM

Event Timeline

labath created this revision.Mar 10 2017, 4:00 AM
jingham requested changes to this revision.Mar 10 2017, 10:38 AM

Can you say more about the problem you are trying to solve. As far as breakpoints are concerned, if you find a match in each of several modules it seems to me you would always want to set a locations in each because modules can come and go independently.

This also introduces an N^2 dependency on number of modules since the searcher is already looping over the modules.

This revision now requires changes to proceed.Mar 10 2017, 10:38 AM

Can you say more about the problem you are trying to solve.

Yes.

Consider foo.h in the test case. Since the functions foo1 and foo2 are inline, they will only show up in the module if they are used. This means that the line table for module 1 will only contain entry for line 1 and module 2 will only contain line 2. If we treat each module separately, and we have move-to-nearest=true, we will set breakpoints on both lines, which are in two different functions and certainly not what the user had intended.

This is basically an extension of the existing code, which treated the same case, but only when the two compilation units were in the same module.

As far as breakpoints are concerned, if you find a match in each of several modules it seems to me you would always want to set a locations in each because modules can come and go independently.

Hm.. I had not considered that. I guess it's true that this would mean that adding or removing a module can affect how a breakpoint is resolved elsewhere, which does not seem ideal. I would still argue that this increases correctness, because then we will be misplacing the breakpoint only in some cases, where as now we do it always. I am open to suggestions though...

This also introduces an N^2 dependency on number of modules since the searcher is already looping over the modules.

I was under the impression that setting depth to target disables the iteration in the searcher.

jingham added a comment.EditedMar 10 2017, 11:40 AM

I missed that you had set this to target. I'd rather avoid doing that unless there's good reason since it means we duplicate all the module iteration logic.

To my mind, within reason it's always better to be conservative and set too many locations rather than too few. There's nothing the user can do to construct a missing breakpoint - especially after they've missed it..., but we have "move-to-nearest-code false" as the way for users to tell us to be more radical in rejecting matches, and it is also simple to disable a location that was errantly set.

And linking breakpoint location logic cross-module seems like a bad idea to me. If I decide that Module A's version of the breakpoint wins over module B's in the move, when A goes away there's nothing that will tell me to go revise that decision for a module that hasn't changed (B). So this seems the wrong way to solve this problem.

BTW, the formally correct way to solve this problem when "move-to-nearest-code" is true is to reject moves that cross function boundaries. IRL, you can't tell when a breakpoint leaves a function that didn't get emitted since you have no way of knowing on what line it would have ended without doing syntax analysis on the sources, and we can't assume sources are available. But you should be able to tell when a moving a breakpoint by line crosses INTO a new function because you have the decl_file, decl_line for the function you would be moving it into. Last time I looked, the debug info from clang wasn't quite right, however. If you have:

int
foo()
{

}

Then clang put the decl file & line on the "foo" line. So by those lights moving from a breakpoint set on the "int" line would be disallowed as moving across a function boundary. This is actually something I've seen people do quite often - particularly when using a GUI. But I think it would actually be good enough to establish a window, so that moving from "decl line - fudge-factor" was still allowed, where fudge factor is 2 or 3 lines. That would make most reasonable cases work as expected. It won't help for people who write:

int short_func() {}
int other_short_func () {}

with no spaces. But that is the sort of case the "move-to-nearest-code false" is for.

This is actually something that's been on my plate to do for a while, but it is kind of low on the list at present. If you really want to solve this problem, the above suggestion would I think be a much better approach.

I like the idea of using the function declaration line, as it will solve a couple of other corner cases also (we've had one user try to set a breakpoint on a macro definition and expect that to work). I'll try to implement that instead.

labath updated this revision to Diff 91693.Mar 14 2017, 3:22 AM
labath edited edge metadata.

I've updated the code as suggested. For the "fudge factor" I chose one, so that
this at least works in the fairly common case where you put an empty line
between two tiny functions. The "breakpoint on return type" case should still
work as long as the retyrn type does not span multiple lines, which I think is a
good compromise.

This required some tweaks to existing tests, as two of them were actually
relying on the move-the-breakpoint-into-a-function behavior. I have removed the
corresponding check from TestBreakpointOptions, as the new test supersedes that,
and I have tweaked TestMiBreak to test the move-nearest functionality
differently.

Let me know what you think.

jingham accepted this revision.Mar 14 2017, 11:13 AM

This seems good to me. Thanks for doing this.

I made a few inline comments, but none are serious.

Maybe I'm a little over-cautious about this sort of thing, but GetStartLineSourceInfo will return the first line in the line table if there is no decl_file & decl_line. I prefer to only do this if we had decl_file and decl_line, since we know that the fudge factor will be too small for the case where the function beginning is really the first line table entry.

That's a judgement call, however.

The more we're making fancy choices filtering breakpoints for people, the more we need some way to tell them what we're doing. It is great that this change will for example mean we no longer set breakpoints in some random function if you accidentally set one in an #ifdef'ed out function. But the reason why you ended up with no locations is still left entirely mysterious. Maybe we need to keep "rejected locations" with some explanation, and then "break list -v" would show them?

But that's definitely out of the scope of this change.

packages/Python/lldbsuite/test/functionalities/breakpoint/move_nearest/foo.h
1–4 ↗(On Diff #91693)

At other places in the testsuite we use something like:

#define INLINE inline __attribute__((always_inline))

to force inlining if we intend it. IIRC inline is just advisory.

source/Breakpoint/BreakpointResolverFileLine.cpp
164 ↗(On Diff #91693)

Can you make this a

const size_t decl_line_is_too_late_fudge = 1;

or something like that. That will make it obvious what's going on with the "line - 1".

This revision is now accepted and ready to land.Mar 14 2017, 11:13 AM
labath marked an inline comment as done.Mar 15 2017, 2:58 AM

This seems good to me. Thanks for doing this.

I made a few inline comments, but none are serious.

Maybe I'm a little over-cautious about this sort of thing, but GetStartLineSourceInfo will return the first line in the line table if there is no decl_file & decl_line. I prefer to only do this if we had decl_file and decl_line, since we know that the fudge factor will be too small for the case where the function beginning is really the first line table entry.

I tried to to this as conservatively as possible. E.g. if I compile the binary with -gline-tables-only then sc.function will be null and we will not do any filtering. I guess we could arrive in this situation if the compiler chooses to not emit DW_AT_decl_line attributes, even though the rest of the DWARF is present, but I think at that point the user will have bigger problems than his breakpoints not being flexible enough.

packages/Python/lldbsuite/test/functionalities/breakpoint/move_nearest/foo.h
1–4 ↗(On Diff #91693)

I don't need or want to force inlining for this test. What I need is make sure the compiler does not emit code for the function if it is not used in the compilation unit. I suppose inline doesn't guarantee that, but neither does always_inline. I guess if some days compilers start emitting code for all functions, we'll need to update the test, but I doubt that's going to happen, as that would make intermediate files huge.

This revision was automatically updated to reflect the committed changes.