This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] WIP: Follow DW_AT_decl_file when setting breakpoint
Changes PlannedPublic

Authored by kwk on Feb 6 2020, 8:44 AM.

Details

Reviewers
jdoerfert
Summary

Purpose

To set a breakpoint on a function called YOURFUNCTION you do this:

(lldb) breakpoint set -n YOURFUNCTION

The output then could look like this:

Breakpoint 1: where = YOURTARGET`YOURFUNCTION() + 4 at YOURFILE:1:20, address = 0x01234

LLDB does a good job in finding YOURFUNCTION because, indeed,
YOURFILE contains the definition of YOURFUNCTION.

Let's say, you not only want to find YOURFUNCTION anywhere in
YOURTARGET. Say, you want to find YOURFUNCTION in the file that LLDB
found it in.

This is how you'd do that:

(lldb) breakpoint set -n YOURFUNCTION -f YOURFILE

This is where things get a bit tricky.

Current situation

WARNING: Without this change, if YOURFILE points to a header file in which YOURFUNCTION is declared and also defined, then LLDB possibly cannot find it.

LLDB uses only the DW_TAG_compile_unit's name that it finds under the
DW_AT_name attribute.

Suppose this is a snippet of llvm-darfdump YOURTARGET

DWARF-Dump-Snippet
0x0000000b: DW_TAG_compile_unit
              DW_AT_name ("main.c")

0x0000002a: DW_TAG_subprogram
              DW_AT_name ("YOURFUNCTION")
              DW_AT_decl_file ("YOURFILE")

Then LLDB only looks at the file called main.c and ignores the
DW_AT_decl_file.

With this change LLDB respects the DW_AT_decl_file.

Settings awareness

This change is aware of the target.inline-breakpoint-strategy setting and
respects it when filtering for breakpoints.

IMPORTANT: The behavior of target.inline-breakpoint-strategy when set to headers is still subject to change!

Reference for DW_AT_decl_file

For reference, here's the definition for DW_AT_decl_file from the DWARF
spec in chapter 2.14:

It is sometimes useful in a debugger to be able to associate a
declaration with its occurrence in the program source.
[...]
The value of the DW_AT_decl_file attribute corresponds to a file
number from the line number information table for the compilation unit
containing the debugging information entry and represents the source
file in which the declaration appeared (see Section 6.2 on page 148).
The value 0 indicates that no source file has been specified.

Reference to lldb-dev mailing list

I once brought this topic up on the lldb-dev mailing list
(http://lists.llvm.org/pipermail/lldb-dev/2019-November/015707.html) and
since then worked on it.

Diff Detail

Event Timeline

kwk created this revision.Feb 6 2020, 8:44 AM
kwk planned changes to this revision.Feb 6 2020, 8:45 AM

I plan to refactor some of this patch.

labath added a subscriber: labath.Feb 6 2020, 9:36 AM

I think that having this behavior key off of a flag is really confusing. I think the most sensible behavior here would be to introduce a new option like --compile-unit and have that behave like the current --file+--function mode. This would free up the --file argument for the behavior that you want to implement here.

I.e.:

  • --compile-unit+--function: search for the occurence of the given function in the given compile unit (ignoring DW_AT_decl_file)
  • --file+--function: search for the given function defined in the given file (no matter which compile unit it comes from). This would be consistent with the --file+--line scenario. For maximal consistence this should also respect the target.inline-breakpoint-strategy setting and only do an exhaustive file search when the value of that setting is "always". If the value is "headers" then we should assume that the name of the .cpp file refers to a compile unit and only search the unit with that name.

The change in behavior is unfortunate, but I don't think this is something that a lot of people will depend on because the current behavior is quite confusing. And I think it's better to change it now instead of painting ourselves into an even smaller corner. To reduce surprises we can issue a warning about the behavior change when the user uses this flag combination. And making these things separate paves the way for introducing new combinations of flags (--compile-unit+--file+--name, or --compile-unit+--file+--line) with a reasonably predictable behavior...

jingham added a subscriber: jingham.EditedFeb 6 2020, 11:09 AM

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

Well, conceptually these are two different things. --compile-unit would restrict the search to the given compile unit no matter what is the file where the function was defined in. And --file would consider the defining file, and not care which compile unit hosts that. In theory, both can be useful, but I'm not sure if people would really use that flexibility.

My proposal was based on the fact that we want to preserve the ability to filter according to the defining cu. If we don't need to do that, then simply just redefining the meaning of the file+function combo is fine by me..

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

That would be a bigger change than what I proposed because it would affect every breakpoint command, instead of just the (rarely used, I think) file+function combo, though I'm not necessarily against that. However, if we redefine the meaning of the file+function combo, then I think the need for this will be diminished because the option will become less context sensitive and thus more predictable and more combinable.

Is there ever a reason other than performance why you would want NOT to consult both the Compile Unit name and also look for DW_AT_decl_file? That doesn't seem clear to me.

If the only worry is performance, and except for that you would really always want the full search, then I don't see what we gain by adding --compile-unit as distinct from --file. We should use target.inline-breakpoint-strategy to allow people to turn on or off the searching for DW_AT_decl_file for folks whose projects are complicated enough AND who aren't doing a lot of inlining. "break set" is already over-stuffed, I don't want to add to it unless we get a positive benefit.

Well, conceptually these are two different things. --compile-unit would restrict the search to the given compile unit no matter what is the file where the function was defined in. And --file would consider the defining file, and not care which compile unit hosts that. In theory, both can be useful, but I'm not sure if people would really use that flexibility.

My proposal was based on the fact that we want to preserve the ability to filter according to the defining cu. If we don't need to do that, then simply just redefining the meaning of the file+function combo is fine by me..

Yeah, I'm not convinced the flexibility is worth adding another slot, especially since now --file needs to explain how it is different from --compile-unit, but ONLY when --name is specified. When --file is the primary (for source breakpoints) that won't be true. And why is there a --compile-unit that DOESN'T set file & line breakpoints, etc... The fact that --file has different meanings depending on what the primary specification is is one of the reasons why it's a shame that they all share one common pool of documentable options.

On a side note, I really wish I had made "break set" be a multi-word command, so you could specify the main search criteria, and then the qualifiers could be used more flexibly and be better documented and and you wouldn't see options that weren't relevant. Like:

break set name
break set source
break set address

That wouldn't make the command any harder to type, it would even be easier since:

break set name foo

would shorten to:

br s n foo

instead of

br s -n foo

I wonder if it is too late to make this change?

That would be a bigger change than what I proposed because it would affect every breakpoint command, instead of just the (rarely used, I think) file+function combo, though I'm not necessarily against that. However, if we redefine the meaning of the file+function combo, then I think the need for this will be diminished because the option will become less context sensitive and thus more predictable and more combinable.

Yes, I wasn't suggesting as blocking this change. I just feel like at some point we're going to tip over into too much complexity as things are currently constituted, so someday we'll probably go crazy if we don't rework this.

kwk added a comment.Feb 11 2020, 7:26 AM

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

kwk requested review of this revision.Feb 11 2020, 7:26 AM
In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

I think we're on the same page, but I'll just try to rephrase this in my own words.

Basically, I'm hoping that we could get the --file+--function combo to behave the same way --file+--line. I.e., that the --file argument should specify the file that the function is defined in (i.e. the DW_AT_decl_file attribute of the function), and not the compile unit (DW_AT_name of the compile unit containing the function). The way the inline-breakpoint-strategy setting comes into play is that if the setting value is "headers" and the user specifies a .cpp file, then we will assume that the .cpp file also matches the name of one of the compile units, and we will not attempt to search compile units with different names (which is the same thing we do for line breakpoints, I believe).

In D74136#1869622, @kwk wrote:

@labath @jingham to summarize from what I read here and what I chatted about with @labath , the following is a possible way to go for now, right?

  1. We're not going to introduce my flag.
  2. You're both not perfectly happy with the way things are documented at the moment and dislike some of the implementation as in in LLDB but chaning it should not be part of this patch.
  3. @jingham wouldn't want to introduce --compile-unit as a flag that @labath proposed.
  4. You want breakpoint set --file to search everything, that is to say compilation units and files referenced in DW_AT_decl_file.

If you can confirm that this is correct, then I can refactor this patch to remove the switch and change the default behavior for breakpoint set --file. Especially point 4. is important I guess.

There's a point (5) which is that the search in 4 should be conditioned on the setting of the "target.inline-breakpoint-strategy". That way if people have big projects but don't ever #include source files, and don't build with LTO, they can turn off these extra searches, which might end up being costly and to no purpose for them.

I think we're on the same page, but I'll just try to rephrase this in my own words.

Basically, I'm hoping that we could get the --file+--function combo to behave the same way --file+--line. I.e., that the --file argument should specify the file that the function is defined in (i.e. the DW_AT_decl_file attribute of the function), and not the compile unit (DW_AT_name of the compile unit containing the function). The way the inline-breakpoint-strategy setting comes into play is that if the setting value is "headers" and the user specifies a .cpp file, then we will assume that the .cpp file also matches the name of one of the compile units, and we will not attempt to search compile units with different names (which is the same thing we do for line breakpoints, I believe).

That is my understanding as well.

This way the only bit of the matrix of "restricting function name breakpoints by containing source file" that we don't support is "set a breakpoint on this function, but ONLY when it is inlined into another CU, NOT when it is in its defining CU." That seems the least useful of the options, and I'm happy to trade off not adding another option to the already overstuffed "break set" list of options. And remember, if you actually needed this it would be straightforward to write a scripted breakpoint resolver to do that job.

kwk updated this revision to Diff 246160.Feb 24 2020, 2:04 AM

Updated tests and code to remove the --search-source-files flag and make it the default behavior to also search source files.

TODO: rename class SearchFilterByModuleListAndCU to something more meaningful when an agreement on the behavior was made.

kwk updated this revision to Diff 246162.Feb 24 2020, 2:11 AM
  • Clear formatting
  • Make list private again
  • Remove open from CommandObjectBreakpoint.cpp
  • Remove change in unrelated file
kwk edited the summary of this revision. (Show Details)Feb 24 2020, 2:13 AM
kwk retitled this revision from [LLDB] WIP: Optionally follow DW_AT_decl_file when setting breakpoint to [LLDB] WIP: Follow DW_AT_decl_file when setting breakpoint.
kwk added a comment.Feb 24 2020, 2:16 AM

@labath @jingham Can you please have a look at the lldb/test/Shell/Breakpoint/search-support-files.test to see if the test reflects the desired behavior?

kwk added a comment.Feb 24 2020, 2:18 AM

I figured it might be the easiest to re-use SearchFilterByModuleListAndCU but it needs to be renamed still. If you have a good suggestion, please let me know.

Yes, I believe this matches the behavior we were talking about.

I could make a bunch of comments on the implementation and the test, but I'm not sure if we're at that stage yet...

lldb/source/Core/SearchFilter.cpp
734

I'm not sure -- it'd be good to check that we still honor the module (aka --shlib) restriction.

kwk updated this revision to Diff 255339.Apr 6 2020, 7:43 AM
  • rebased onto master to get rid of NFC change
kwk updated this revision to Diff 255648.Apr 7 2020, 5:11 AM
  • Simplified test
kwk updated this revision to Diff 257313.Apr 14 2020, 6:58 AM

Ran git clang-format

kwk updated this revision to Diff 257682.Apr 15 2020, 5:35 AM
  • Modify SearchFilterByModuleListAndCU
  • format
kwk updated this revision to Diff 257688.Apr 15 2020, 6:09 AM
  • Honor the module
kwk planned changes to this revision.Apr 15 2020, 6:10 AM
kwk updated this revision to Diff 257691.Apr 15 2020, 6:10 AM
  • Revert "Honor the module"
kwk marked an inline comment as done.Apr 15 2020, 6:15 AM

Yes, I believe this matches the behavior we were talking about.

I could make a bunch of comments on the implementation and the test, but I'm not sure if we're at that stage yet...

I'd be happy to hear about your comments @labath because I'm kind of stuck in what I can think of. I will obviously rename SearchFilterByModuleListAndCU but that can come in a child revision.

lldb/source/Core/SearchFilter.cpp
734

@labath How do you want to honor it? I guess ModulePasses(fileSpec) as a check is wrong, isn't it?

Harbormaster failed remote builds in B53343: Diff 257688!
Harbormaster failed remote builds in B53346: Diff 257691!
kwk updated this revision to Diff 258004.Apr 16 2020, 3:04 AM
  • Remove not needed include
labath added a comment.May 5 2020, 6:29 AM
In D74136#1983467, @kwk wrote:

I'd be happy to hear about your comments @labath because I'm kind of stuck in what I can think of. I will obviously rename SearchFilterByModuleListAndCU but that can come in a child revision.

Sorry, I'm kinda busy these days, so and I did not see the embedded question while glossing over the WIP changes.

I think this looks pretty clean (surprisingly clean even) now. I don't have any major comments, but we should have Jim take a look at this again, as he's more familiar with this stuff.

Given that the name SearchFilterByModuleListAndCU is not very widespread (20 hits in three files) and that this patch actually makes that name wrong, I think it'd be best to do the rename as a part of this patch.

lldb/source/Breakpoint/BreakpointResolverName.cpp
320

This mention of a "function call" is confusing. Either a function is in a file or it isn't. Why do we care about who calls who?

lldb/test/Shell/Breakpoint/Inputs/search-support-files.h
2

Calling this inlined is misleading. The function won't get inlined anywhere at -O0, and in fact your test would not work if it got inlined. Maybe just call it function_in_header ?

lldb/test/Shell/Breakpoint/search-support-files.test
9

All of the tests execute in the same directory, so it's a good practice to embed %t into the files you create. As you don't care about the actual file name in the CHECK lines, you can just replace dummy.out with {{.*}} everywhere.

10

It's not standard/polite to hard-code --color like this. Using --dump-input is also not very common though there are definitely precedents for that, and I can see how it can be useful (though one should be careful not to overwhelm the terminal if he's using it).

labath added inline comments.May 5 2020, 6:29 AM
lldb/source/Breakpoint/BreakpointResolverName.cpp
326

This CompUnitPasses call smells, as source_file does not really correspond to any compilation unit (in the scenario that you care about). It seems like this is actually the first usage of this function, so let's just rename it (and make it take a const FileSpec & while we're at it).

lldb/test/Shell/Breakpoint/search-support-files.test
16

These check lines hardcode too much stuff. The +4 thingy can easily change due to unrelated codegen changes, and even the :1:20 seems unnecessarily strict.
Maybe something like this would be enough:

CHECK-NEXT: Breakpoint 1: where = {{.8}}`inlined_42{{.*}} at search-support-files.h
33

compilation

jdoerfert resigned from this revision.May 6 2020, 9:23 AM
kwk updated this revision to Diff 266463.May 27 2020, 2:46 AM
kwk marked 3 inline comments as done.
  • rebase
  • Rename function in test from inlined_42 to function_in_header
  • Typo: compulation -> compilation
kwk updated this revision to Diff 266465.May 27 2020, 2:58 AM
kwk marked 3 inline comments as done.
  • use %t in files created in tests
kwk updated this revision to Diff 266472.May 27 2020, 3:17 AM
  • make test CHECKs less strict
kwk marked 2 inline comments as done.May 27 2020, 3:24 AM
kwk added inline comments.
lldb/test/Shell/Breakpoint/Inputs/search-support-files.h
2

Thank you for finding this. My understanding needs a lot of sharpening I guess.

lldb/test/Shell/Breakpoint/search-support-files.test
16

Yes, you're right. Although I don't know what {{.8}} does in this case.

kwk updated this revision to Diff 266493.May 27 2020, 5:12 AM
kwk marked an inline comment as done.
  • don't hard-code --color and --dump-input on FileCheck invocation
kwk marked an inline comment as done.May 28 2020, 7:16 AM

@labath please see my inline comment.

lldb/source/Breakpoint/BreakpointResolverName.cpp
320

@labath we have this filter_by_function as an indirection to let BreakpointResolverName::SearchCallback know that the SearchFilter needs a special handling. But while reviewing the code now I noticed that only the implementations of SearchFilter::GetFilterRequiredItems currently either return

  • eSymbolContextModule (for SearchFilterByModule and SearchFilterByModuleList) or
  • eSymbolContextModule | eSymbolContextCompUnit | eSymbolContextFunction (for SearchFilterByModuleListAndCU).

I thought this was clever at a time when the old SearchFilterByModuleListAndCU which then only returned eSymbolContextModule | eSymbolContextCompUnit . With my new search filter I wanted a special handling which is why I wrote a new search filter class and had it return eSymbolContextModule | eSymbolContextCompUnit | eSymbolContextFunction. But when you confirmed my question to change the default behavior (https://reviews.llvm.org/D74136#1869622), the new class became the default implementation and there no longer is a need to distinguish the required items for this search filter.

Makes sense?

labath added inline comments.May 28 2020, 8:28 AM
lldb/source/Breakpoint/BreakpointResolverName.cpp
320

I'm afraid I got a bit lost here. Jim is more familiar with his stuff, so maybe he can answer this question. Jim?

Looking at this again, I get the impression that a lot of this confusion is coming from the fact that this patch assigns a special meaning of eSymbolContextCompUnit | eSymbolContextFunction combination to mean "match the file the function is defined in". I think it would be more consistent with the spirit of these functions if this code just did something like:

if (filter_by_function && !filter.FunctionPasses(function))
  remove_it = true;

and then the code which extracts the file from the function definition lived inside the FunctionPasses method. That may require adding a new filter class or it could be that an existing filter could be adapted to do that (SearchFilterByModuleListAndCU, I guess).

Take this with a grain of salt though, as I don't frequent this part of code very often.

lldb/test/Shell/Breakpoint/search-support-files.test
16

I meant {{.*}}.

jingham added inline comments.May 28 2020, 3:13 PM
lldb/source/Breakpoint/BreakpointResolverName.cpp
320

I think Pavel is right here.

The point of the built-in checks: ModulePasses, CompUnitPasses, FunctionPasses is that they mirror the search space that the Searcher iterates over when calling the Resolvers. It allows a Resolver to have the Searcher bypass calling the resolver at all. Adding in the DeclFile to the CompUnit check isn't right for that use, since that isn't one of the spaces the Searcher iterates over.

The way I thought about this is that all the higher level checks are for the searcher, but ultimately you always have to call AddressPasses because your filter might be below the chunks that the Searcher searches for.

So it seems to me better not to try to glom the CompUnit check and the DeclFile check together, but rather to have the CompUnit check do what it says, eliminate comp units. And then add the DeclFile check to the AddressPasses for your filter.

It also seems a shame that if I really only want to look only in a CompUnit, I end up having to call into the Resolver for every comp unit, and then filter there.

kwk updated this revision to Diff 269781.Jun 10 2020, 3:35 AM
kwk marked an inline comment as done.
  • Outsource parts into SearchFilterByModulesAndSupportFiles::FunctionPasses
  • Tests with alternating setting target.inline-breakpoint-strategy between "always" and "headers"
  • Respecting target.inline-breakpoint-strategy setting in SearchFilterByModulesAndSupportFiles and maded adjustments in BreakpointResolverName::SearchCallback
  • Renamed SearchFilterByModuleListAndCUList to SearchFilterByModulesAndSupportFiles
kwk updated this revision to Diff 269782.Jun 10 2020, 3:43 AM
  • remove commented out code
kwk updated this revision to Diff 269784.Jun 10 2020, 3:44 AM
  • Fix comment
kwk updated this revision to Diff 269787.Jun 10 2020, 3:48 AM
kwk marked an inline comment as done.
  • remove debug output from test
kwk added a comment.Jun 10 2020, 3:51 AM

@labath I've applied all the ideas we ping-ponged yesterday and I decided to go with alternating the target.inline-breakpoint-strategy from always (the default) to headers. This way you can exactly see in the test file how things are behaving. So before going into the actual code review I'd like to ask you and @jingham to take a look at the test file. Is it the behavior described there the desired outcome? Then we can discuss the implementation.

lldb/source/Breakpoint/BreakpointResolverName.cpp
323

This is done on purpose to reverse the decision to remove a context for not passing a CU above.

kwk planned changes to this revision.Jun 10 2020, 7:32 AM
kwk edited the summary of this revision. (Show Details)
IMPORTANT: The behavior of target.inline-breakpoint-strategy when set to headers is still subject to change!

I think the setting is not respected correctly...

labath marked an inline comment as done.Jun 11 2020, 5:56 AM
In D74136#2085076, @kwk wrote:
IMPORTANT: The behavior of target.inline-breakpoint-strategy when set to headers is still subject to change!

I think the setting is not respected correctly...

Yes. I'm guessing you also need a call to FileSpec::IsSourceImplementationFile somewhere. The logic should be different only if target.inline-breakpoint-strategy is`headers` AND FileSpec::IsSourceImplementationFile is true.

To Jim: I'm still unable to decide whether we really should be respecting the inline-breakpoint-strategy setting or not in this case. I mean, this is an optimization for setting file+line breakpoints, but in this case it does not buy us anything, since we search for the functions by name. OTOH, it is nice to have a consistent treatment of the --file argument to breakpoint set...

lldb/test/Shell/Breakpoint/search-support-files.test
78–81

The function in a .h file should be found regardless of the value of the setting.

89–92

This is right.

I need to read this in more detail at this point but I'm caught up in something else.

One of the not so great design decisions I made was to try to have "break set" be a single command governed by flags. The way the command turned out, there are some flags that tell you what kind of breakpoint you are setting, and then other flags that qualify that kind of search. But that means --file can mean "I'm specifying the file to use to set a file & line breakpoint" or it can be a narrowing specifier for a source regex or function name based breakpoint. Doing it this way makes it hard to have consistent meanings for the flags in all contexts.

If I had it to do over, I would make:

(lldb) break set source
(lldb) break set symbol
(lldb) break set source-pattern
(lldb) break set address

etc... Then the flags could have the meanings appropriate to the breakpoint type, and furthermore be documented appropriately.

I don't think we should strain the meaning of a particular flag just to keep it consistent with other orthogonal uses. We should probably start making the flag documentation specify "when used for a -n breakpoint, -f means...". I think that would be sufficient.

I need to read this in more detail at this point but I'm caught up in something else.

One of the not so great design decisions I made was to try to have "break set" be a single command governed by flags. The way the command turned out, there are some flags that tell you what kind of breakpoint you are setting, and then other flags that qualify that kind of search. But that means --file can mean "I'm specifying the file to use to set a file & line breakpoint" or it can be a narrowing specifier for a source regex or function name based breakpoint. Doing it this way makes it hard to have consistent meanings for the flags in all contexts.

If I had it to do over, I would make:

(lldb) break set source
(lldb) break set symbol
(lldb) break set source-pattern
(lldb) break set address

Of course, if I were going to do this, I'd choose names that had different initial letters, so it really wouldn't have been more onerous to type than the current setup, and it would be easy to adjust the "__regex_break" command to accommodate...

etc... Then the flags could have the meanings appropriate to the breakpoint type, and furthermore be documented appropriately.

I don't think we should strain the meaning of a particular flag just to keep it consistent with other orthogonal uses. We should probably start making the flag documentation specify "when used for a -n breakpoint, -f means...". I think that would be sufficient.

I would separate/remove all [nfc] changes from this patch as they complicate it a bit. Such cleaned up patch I have prepared for myself: https://people.redhat.com/jkratoch/D74136-cleanup.patch

Then also did you notice there is a regression for (I do not know why):

FAIL: LLDB (/home/jkratoch/redhat/llvm-monorepo-clangassert/bin/clang-11-x86_64) :: test_scripted_resolver (TestScriptedResolver.TestScriptedResolver)
  File "/home/jkratoch/redhat/llvm-monorepo/lldb/test/API/functionalities/breakpoint/scripted_bkpt/TestScriptedResolver.py", line 23, in test_scripted_resolver
    self.do_test()
  File "/home/jkratoch/redhat/llvm-monorepo/lldb/test/API/functionalities/breakpoint/scripted_bkpt/TestScriptedResolver.py", line 152, in do_test
    self.assertEqual(wrong[i].GetNumLocations(), 0, "Breakpoint %d has locations."%(i))
AssertionError: 1 != 0 : Breakpoint 1 has locations.
lldb/source/Breakpoint/BreakpointResolverName.cpp
292

[nfc] It could use include_symbols here.

lldb/source/Core/SearchFilter.cpp
713

If we cannot determine which file the function is from then rather ignore it (the current call returns true):

return false;
731

This value should be cached.
But besides that there should be GetInlineStrategy() == eInlineBreakpointsNever || (GetInlineStrategy() == eInlineBreakpointsHeaders && cu_spec.IsSourceImplementationFile()modulosomeReverseRemapPath?.
As for eInlineBreakpointsAlways the check is being delegated to FunctionPasses().

737

This condition is always true as there is already above:

if (!sym_ctx.comp_unit)
828

Here should be eInlineBreakpointsNever || eInlineBreakpointsHeaders (and probably cache the GetInlineStrategy() value).
For eInlineBreakpointsHeaders we do not know for sure as we cannot call IsSourceImplementationFile() for cu_spec which is not known yet here.

kwk marked an inline comment as done.Jun 15 2020, 7:27 AM
kwk added inline comments.
lldb/source/Core/SearchFilter.cpp
737

That' incorrect. Only if the nested if (m_support_file_list.GetSize() != 0) is false, then sym_ctx.comp_unit is potentially true.

jankratochvil marked an inline comment as done.Jun 15 2020, 7:36 AM
jankratochvil added inline comments.
lldb/source/Core/SearchFilter.cpp
737

OK, I agree, my comment was a mistake.

kwk updated this revision to Diff 271583.Jun 17 2020, 11:16 PM
  • Align tests with reviewer expectations
kwk added a comment.Jun 17 2020, 11:30 PM

@labath @jingham @jankratochvil The test change suggested by @labath is now in place and it works. @jankratochvil, I've removed the logic that checks seomthing with the CU from AddressPasses. That logic now lives in FunctionPasses where it logically makes more sense to me. The whole filtering by CU logic from AddressPasses is gone now and I wonder if @jingham has some thoughts on this.

kwk marked an inline comment as done.Jun 17 2020, 11:31 PM
kwk marked 6 inline comments as done.Jun 17 2020, 11:34 PM
kwk added inline comments.
lldb/source/Core/SearchFilter.cpp
713

@jankratochvil so far I haven't addressed this on purpose to give @labath and @jingham time to say what they think about this.

kwk updated this revision to Diff 271631.Jun 18 2020, 2:58 AM
  • Remove old logic that was no longer needed since my search filter now adaptively adds eSymbolContextCompUnit and not always returns it.

Just some notes of mine to find out which functions get used in which breakpoint case:

-o 'breakpoint set -f main.c -l 1' -o q
-o 'b main.c:1'     
lldb/source/Commands/CommandObjectBreakpoint.cpp:578 = eSetTypeFileAndLine
lldb/source/Target/Target.cpp:330 = GetInlineStrategy()    
BreakpointSP Target::CreateBreakpoint(const FileSpecList *containingModules,
                                      const FileSpec &file, uint32_t line_no,
  lldb/source/Target/Target.cpp:622

-o 'b main'
-o 'breakpoint set -n main -f noFileOfThisName.xxx' containingSourceFiles="noFileOfThisName.xxx"
lldb/source/Commands/CommandObjectBreakpoint.cpp:614 = eSetTypeFunctionName
lldb/source/Target/Target.cpp:445 
Target::CreateBreakpoint(const FileSpecList *containingModules,
                         const FileSpecList *containingSourceFiles,
                         const std::vector<std::string> &func_names,
  lldb/source/Target/Target.cpp:622
lldb/source/Core/SearchFilter.cpp
722

This IsSourceImplementationFile() should be checking the filename requested by user, not filename found in DWARF. It should be in SearchFilterByModuleListAndCU::GetFilterRequiredItems() where it should check the filename in m_cu_spec_list. The whole GetInlineStrategy() is there only for performance, if it was fast enough it would be always best to use eInlineBreakpointsAlways. So here when you already have fully parsed DIE for the function+type it no longer makes sense to check for any performance optimization possibility. If eInlineBreakpointsNever or eInlineBreakpointsHeaders shows more breakpoint locations than expected (and still not more than eInlineBreakpointsAlways) then it is not a bug but just a missed performance optimization.
And after you fully move this GetInlineStrategy()+IsSourceImplementationFile() into SearchFilterByModuleListAndCU::GetFilterRequiredItems() then you can drop it from Target::CreateBreakpoint at lldb/source/Target/Target.cpp:330 as it no longer has any effect there.
IMO one should implement the DW_AT_decl_file check also into SymbolFileDWARF::ResolveSymbolContext at lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp:1977 and other non-DWARF formats. As otherwise properly found candidates can be discarded at the end.

jankratochvil added inline comments.Jun 25 2020, 12:08 AM
lldb/source/Core/SearchFilter.cpp
722

That last SymbolFileDWARF::ResolveSymbolContext needs no change, it is already checking GetSupportFiles(), discard my last paragraph.

kwk updated this revision to Diff 273307.Jun 25 2020, 4:52 AM
  • bring back logic to keep a symbol context when a function passes and add a comment as Jan suggested
  • remove test from scripted resolver that calls SearchFilterByModulesAndSupportFiles::AddressPasses
    • before the test checked that a non existing file doesn't cause setting a breakpoint location
    • the logic in AddressPasses before was to identify the CU of a symbol context and check if it's file is in the list of files that are allowed to pass. We agreed to change this logic so that not only CU's are checked but also files in which a function is declared. SearchFilterByModulesAndSupportFiles::FunctionPasses now does exactly that but it is not called from the BreakPointResolverScripted class, only from BreakpointResolverName.
    • It is an open question how to deal with this and I hope Jim can help here.
    • Shall we maybe take another file from the SymbolContext to see if we can filter by that in AddressPasses? Here's a dump of the filter context for the removed test:
0x00007ffda14d94a0: SymbolContext
    Module       = 0x0000563169c60780 /opt/notnfs/kkleine/llvm/build/lldb-test-build.noindex/functionalities/breakpoint/scripted_bkpt/TestScriptedResolver.test_scripted_resolver/a.out
    CompileUnit  = 0x000056316a04c310 {0000000000000000} /opt/notnfs/kkleine/llvm/lldb/test/API/functionalities/breakpoint/scripted_bkpt/main.c
    Function     = 0x000056316a048c40 {7fffffff00000043} break_on_me, address-range = a.out[0x0000000000401150-0x0000000000401167)
            Type = 0x56316a0220d0:     Type{0x7fffffff00000043} , name = "break_on_me", decl = main.c:10, compiler_type = 0x000056316a0cbe80 void (void)

    Block        = 0x000056316a048c78 {7fffffff00000043}
    LineEntry    = a.out[0x0000000000401150-0x0000000000401154), file = /opt/notnfs/kkleine/llvm/lldb/test/API/functionalities/breakpoint/scripted_bkpt/main.c, line = 11, is_start_of_statement = TRUE
    Symbol       = 0x000056316a037248
Variable     = 0x0000000000000000
  • rename var
kwk marked an inline comment as done.Jun 25 2020, 4:58 AM
kwk added inline comments.
lldb/test/API/functionalities/breakpoint/scripted_bkpt/TestScriptedResolver.py
124

@jingham can you please let me know if this test serves any purpose with my patch? I mentioned in my last change log (https://reviews.llvm.org/D74136#2113899) that it calls AddressPasses and checked for the CU before. But since we're no longer doing this, this test no longer can pass.

kwk updated this revision to Diff 273309.Jun 25 2020, 4:58 AM
  • Add newlines
jankratochvil added inline comments.Jun 26 2020, 2:52 AM
lldb/source/Breakpoint/BreakpointResolverName.cpp
326

Now if (filter_by_function) {} always overrides any result from if (filter_by_cu). So it would be faster + more clear to read as:

if (filter_by_function) {
  if (!sc.function || !filter.FunctionPasses(*sc.function))
    remove_it = true;
} else if (filter_by_cu) {
  if (!sc.comp_unit || !filter.CompUnitPasses(*sc.comp_unit))
    remove_it = true;
}
kwk updated this revision to Diff 273637.Jun 26 2020, 3:00 AM
  • Simplify logic
kwk marked 2 inline comments as done.Jun 26 2020, 3:01 AM
kwk added inline comments.
lldb/source/Breakpoint/BreakpointResolverName.cpp
326

Thanks @jankratochvil ! You were right about the logic. I must have thought to keep as much from the old code and only add to it. But in this case it makes much more sense to structure the logic the way that you've proposed.

jankratochvil added inline comments.Jun 26 2020, 3:12 AM
lldb/source/Core/SearchFilter.cpp
829

filter_by_function now fully overrides filter_by_cu and so when you set eSymbolContextFunction unconditionally here it makes no sense to set eSymbolContextCompUnit.
The real truth is that eSymbolContextFunction should be set conditionally according to GetInlineStrategy() like lldb/source/Target/Target.cpp line 335 does now.

kwk planned changes to this revision.Jun 26 2020, 5:06 AM
kwk marked an inline comment as done.

The filename should not be checked from SymbolContext::function but rather from SymbolContext::line_entry. As that is cheaper. And when one asks for breakpoint at 1a.h:1 then it is enough to check .debug_line (which needs to be checked anyway) and why to look into .debug_info for the function name etc. at all?
I think there is even a bug due to the Function being involved and not just line_entry:

tail -n99 1b.c 1b.h;clang -o 1b 1b.c -Wall -g;~/redhat/llvm-monorepo2-clangassert/bin/lldb -batch ./1a -o 'breakpoint set -f 1b.h -l 1' 
==> 1b.c <==
static void func(void) {}
int main(void) {
#include "1b.h"
  return 0;
}
==> 1b.h <==
  func();
(lldb) target create "./1a"
Current executable set to '/home/jkratoch/t/1a' (x86_64).
(lldb) breakpoint set -f 1b.h -l 1
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.

For a practical use case the 1b.h should be rather called 1b.def, there are such files found in many GNU projects - although usually the *.def files are included into data structs and not into function code. But still. This reproducer should be even a part of the testcase for your patch.

The filename should not be checked from SymbolContext::function but rather from SymbolContext::line_entry. As that is cheaper. And when one asks for breakpoint at 1a.h:1 then it is enough to check .debug_line (which needs to be checked anyway) and why to look into .debug_info for the function name etc. at all?
I think there is even a bug due to the Function being involved and not just line_entry:

tail -n99 1b.c 1b.h;clang -o 1b 1b.c -Wall -g;~/redhat/llvm-monorepo2-clangassert/bin/lldb -batch ./1a -o 'breakpoint set -f 1b.h -l 1' 
==> 1b.c <==
static void func(void) {}
int main(void) {
#include "1b.h"
  return 0;
}
==> 1b.h <==
  func();
(lldb) target create "./1a"
Current executable set to '/home/jkratoch/t/1a' (x86_64).
(lldb) breakpoint set -f 1b.h -l 1
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.

When setting a breakpoint by file+line I would definitely expect this to work as you describe.

However, when setting a breakpoint by name+file I think that a more natural behavior would be to check the file that the function was defined in. I.e. in your example "--name main --file 1b.h" should fail, but "--name main --file 1b.c" should succeed.

I guess this all goes back to what Jim said earlier about the "--file" argument in set-by-name and set-by-line behaving differently even though that's not obvious from how the command is structured.

At this point I'm getting very lost in this patch, but here's some of my thoughs (the ones I could actually formulate in a coherent manner)...

lldb/source/Core/SearchFilter.cpp
713

Yes, ignoring the function if we cannot determine which file it is in sounds reasonable. (though I'm not sure if this type argument can ever be null in practice.)

lldb/test/API/functionalities/breakpoint/scripted_bkpt/TestScriptedResolver.py
124

After reading more about it, I'm pretty sure this is not the right solution. The scripted breakpoints are documented (here) as accepting filters based on compile units. If we don't want to change that, then I guess we need to create a separate filter, so that the scripted breakpoints can filter based on compile unit names and the name+file breakpoints can filter on definitions file names.

If we want to change that, then we'd also need to update the documentation. But, the test should stay anyway, because noFileOfThisName.xxx is not a valid file nor compile unit name, so the location should not be added regardless of what exactly are we choosing to filter on.

At this point I'm getting very lost in this patch,

Me too, as you stated:

In D74136#1983467, @kwk wrote:

I will obviously rename SearchFilterByModuleListAndCU but that can come in a child revision.

...

I think it'd be best to do the rename as a part of this patch.

So I have to always undo the renaming before understanding what the patch does now. It would be nice if you could take back the merge of renaming.