Page MenuHomePhabricator

[LLDB] WIP: Follow DW_AT_decl_file when setting breakpoint
Needs ReviewPublic

Authored by kwk on Feb 6 2020, 8:44 AM.

Details

Reviewers
jdoerfert
Summary

To set a breakpoint on a function called YOURFUNCTION you do this:

(lldb) breakpoint set -n YOURFUNCTION

The output then could look like this:

Breakpoint 1: where = YOURTARGET`YOURFUNCTION() + 4 at YOURFILE:1:20, address = 0x01234

LLDB does a good job in finding YOURFUNCTION because, indeed,
YOURFILE contains the definition of YOURFUNCTION.

Let's say, you not only want to find YOURFUNCTION anywhere in
YOURTARGET. Say, you want to find YOURFUNCTION in the file that LLDB
found it in.

This is how you'd do that:

(lldb) breakpoint set -n YOURFUNCTION -f YOURFILE

This is where things get a bit tricky. If YOURFILE points to a header
file in which YOURFUNCTION is not only declared but also defined, then
chances are that it might be inlined into YOURTARGET and LLDB cannot
find it. *Why?* You might ask and the answer is that LLDB uses only the
DW_TAG_compile_unit's name that it finds under the DW_AT_name
attribute.

Suppose this is a snippet of llvm-darfdump YOURTARGET

0x0000000b: DW_TAG_compile_unit
              DW_AT_name ("main.c")

0x0000002a: DW_TAG_subprogram
              DW_AT_name ("YOURFUNCTION")
              DW_AT_decl_file ("YOURFILE")

Then LLDB only looks at the file called main.c and ignores the
DW_AT_decl_file.

With this change LLDB respects the DW_AT_decl_file.

If you're not convinced that this change has any purpose, think about
LTO and how it might inline your functions with logic that is hard to
reason about when you just have the source code and a binary.

For reference, here's the definition for DW_AT_decl_file from the DWARF
spec in chapter 2.14:

It is sometimes useful in a debugger to be able to associate a
declaration with its occurrence in the program source.
[...]
The value of the DW_AT_decl_file attribute corresponds to a file
number from the line number information table for the compilation unit
containing the debugging information entry and represents the source
file in which the declaration appeared (see Section 6.2 on page 148).
The value 0 indicates that no source file has been specified.

Reference to lldb-dev mailing list

I once brought this topic up on the lldb-dev mailing list
(http://lists.llvm.org/pipermail/lldb-dev/2019-November/015707.html) and
since then worked on it.

Event Timeline

kwk created this revision.Feb 6 2020, 8:44 AM
kwk planned changes to this revision.Feb 6 2020, 8:45 AM

I plan to refactor some of this patch.

labath added a subscriber: labath.Feb 6 2020, 9:36 AM

I think that having this behavior key off of a flag is really confusing. I think the most sensible behavior here would be to introduce a new option like --compile-unit and have that behave like the current --file+--function mode. This would free up the --file argument for the behavior that you want to implement here.

I.e.:

  • --compile-unit+--function: search for the occurence of the given function in the given compile unit (ignoring DW_AT_decl_file)
  • --file+--function: search for the given function defined in the given file (no matter which compile unit it comes from). This would be consistent with the --file+--line scenario. For maximal consistence this should also respect the target.inline-breakpoint-strategy setting and only do an exhaustive file search when the value of that setting is "always". If the value is "headers" then we should assume that the name of the .cpp file refers to a compile unit and only search the unit with that name.

The change in behavior is unfortunate, but I don't think this is something that a lot of people will depend on because the current behavior is quite confusing. And I think it's better to change it now instead of painting ourselves into an even smaller corner. To reduce surprises we can issue a warning about the behavior change when the user uses this flag combination. And making these things separate paves the way for introducing new combinations of flags (--compile-unit+--file+--name, or --compile-unit+--file+--line) with a reasonably predictable behavior...

jingham added a subscriber: jingham.EditedFeb 6 2020, 11:09 AM

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

Well, conceptually these are two different things. --compile-unit would restrict the search to the given compile unit no matter what is the file where the function was defined in. And --file would consider the defining file, and not care which compile unit hosts that. In theory, both can be useful, but I'm not sure if people would really use that flexibility.

My proposal was based on the fact that we want to preserve the ability to filter according to the defining cu. If we don't need to do that, then simply just redefining the meaning of the file+function combo is fine by me..

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

That would be a bigger change than what I proposed because it would affect every breakpoint command, instead of just the (rarely used, I think) file+function combo, though I'm not necessarily against that. However, if we redefine the meaning of the file+function combo, then I think the need for this will be diminished because the option will become less context sensitive and thus more predictable and more combinable.

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

Well, conceptually these are two different things. --compile-unit would restrict the search to the given compile unit no matter what is the file where the function was defined in. And --file would consider the defining file, and not care which compile unit hosts that. In theory, both can be useful, but I'm not sure if people would really use that flexibility.

My proposal was based on the fact that we want to preserve the ability to filter according to the defining cu. If we don't need to do that, then simply just redefining the meaning of the file+function combo is fine by me..

Yeah, I'm not convinced the flexibility is worth adding another slot, especially since now --file needs to explain how it is different from --compile-unit, but ONLY when --name is specified. When --file is the primary (for source breakpoints) that won't be true. And why is there a --compile-unit that DOESN'T set file & line breakpoints, etc... The fact that --file has different meanings depending on what the primary specification is is one of the reasons why it's a shame that they all share one common pool of documentable options.

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

That would be a bigger change than what I proposed because it would affect every breakpoint command, instead of just the (rarely used, I think) file+function combo, though I'm not necessarily against that. However, if we redefine the meaning of the file+function combo, then I think the need for this will be diminished because the option will become less context sensitive and thus more predictable and more combinable.

Yes, I wasn't suggesting as blocking this change. I just feel like at some point we're going to tip over into too much complexity as things are currently constituted, so someday we'll probably go crazy if we don't rework this.

kwk added a comment.Feb 11 2020, 7:26 AM

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

kwk requested review of this revision.Feb 11 2020, 7:26 AM
In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

    If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

    If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

I think we're on the same page, but I'll just try to rephrase this in my own words.

Basically, I'm hoping that we could get the --file+--function combo to behave the same way --file+--line. I.e., that the --file argument should specify the file that the function is defined in (i.e. the DW_AT_decl_file attribute of the function), and not the compile unit (DW_AT_name of the compile unit containing the function). The way the inline-breakpoint-strategy setting comes into play is that if the setting value is "headers" and the user specifies a .cpp file, then we will assume that the .cpp file also matches the name of one of the compile units, and we will not attempt to search compile units with different names (which is the same thing we do for line breakpoints, I believe).

In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

    If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

I think we're on the same page, but I'll just try to rephrase this in my own words.

Basically, I'm hoping that we could get the --file+--function combo to behave the same way --file+--line. I.e., that the --file argument should specify the file that the function is defined in (i.e. the DW_AT_decl_file attribute of the function), and not the compile unit (DW_AT_name of the compile unit containing the function). The way the inline-breakpoint-strategy setting comes into play is that if the setting value is "headers" and the user specifies a .cpp file, then we will assume that the .cpp file also matches the name of one of the compile units, and we will not attempt to search compile units with different names (which is the same thing we do for line breakpoints, I believe).

That is my understanding as well.

This way the only bit of the matrix of "restricting function name breakpoints by containing source file" that we don't support is "set a breakpoint on this function, but ONLY when it is inlined into another CU, NOT when it is in its defining CU." That seems the least useful of the options, and I'm happy to trade off not adding another option to the already overstuffed "break set" list of options. And remember, if you actually needed this it would be straightforward to write a scripted breakpoint resolver to do that job.

kwk updated this revision to Diff 246160.Feb 24 2020, 2:04 AM

Updated tests and code to remove the --search-source-files flag and make it the default behavior to also search source files.

TODO: rename class SearchFilterByModuleListAndCU to something more meaningful when an agreement on the behavior was made.

kwk updated this revision to Diff 246162.Feb 24 2020, 2:11 AM
  • Clear formatting
  • Make list private again
  • Remove open from CommandObjectBreakpoint.cpp
  • Remove change in unrelated file
kwk edited the summary of this revision. (Show Details)Feb 24 2020, 2:13 AM
kwk retitled this revision from [LLDB] WIP: Optionally follow DW_AT_decl_file when setting breakpoint to [LLDB] WIP: Follow DW_AT_decl_file when setting breakpoint.
kwk added a comment.Feb 24 2020, 2:16 AM

@labath @jingham Can you please have a look at the lldb/test/Shell/Breakpoint/search-support-files.test to see if the test reflects the desired behavior?

kwk added a comment.Feb 24 2020, 2:18 AM

I figured it might be the easiest to re-use SearchFilterByModuleListAndCU but it needs to be renamed still. If you have a good suggestion, please let me know.

Yes, I believe this matches the behavior we were talking about.

I could make a bunch of comments on the implementation and the test, but I'm not sure if we're at that stage yet...

lldb/source/Core/SearchFilter.cpp
712

I'm not sure -- it'd be good to check that we still honor the module (aka --shlib) restriction.