Page MenuHomePhabricator

[LLDB][NFC] Decouple dwarf location table from DWARFExpression.
Needs ReviewPublic

Authored by zequanwu on May 12 2022, 6:05 PM.

Details

Summary

This decouples DWARFExpression such that DWARFExpression only contains the
data of a single dwarf expression, no range info. DWARFExpressionList holds
a RangeDataVector that maps file address ranges into dwarf expressions.

Currently a variable in NativePDB and PDB plugins can only have one dwarf
location to describe its location, because they don't have dwarf location table
that stores the mapping relationship. So, they need to construct a dwarf
location table from code view format which is complicated to have multiple
locations for one variable. This patch makes it easier to add multiple dwarf
expressions without constructing a dwarf location table.

Diff Detail

Unit TestsFailed

TimeTest
60,080 msx64 debian > Clang.CodeGenCXX::dllimport-members.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -no-enable-noundef-analysis -disable-llvm-passes -triple i686-windows-msvc -fms-compatibility -emit-llvm -std=c++1y -O0 -o - /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCXX/dllimport-members.cpp -DMSABI | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=MSC --check-prefix=M32 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCXX/dllimport-members.cpp
60,940 msx64 debian > Clang.Driver::fsanitize.c
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -target x86_64-linux-gnu -fsanitize=undefined -fsanitize-trap=undefined /var/lib/buildkite-agent/builds/llvm-project/clang/test/Driver/fsanitize.c -### 2>&1 | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/Driver/fsanitize.c --check-prefix=CHECK-UNDEFINED-TRAP
60,280 msx64 debian > Clang.OpenMP::target_parallel_for_codegen_registration.cpp
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -verify -fopenmp -fopenmp-version=45 -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_codegen_registration.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_codegen_registration.cpp
60,150 msx64 debian > Clang.OpenMP::target_parallel_for_simd_codegen_registration.cpp
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -verify -fopenmp -fopenmp-version=45 -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_simd_codegen_registration.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_simd_codegen_registration.cpp
60,150 msx64 debian > Clang.OpenMP::target_teams_distribute_parallel_for_simd_codegen_registration.cpp
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -verify -fopenmp -fopenmp-version=45 -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
View Full Test Results (7 Failed)

Event Timeline

zequanwu created this revision.May 12 2022, 6:05 PM
Herald added a project: Restricted Project. · View Herald Transcript
zequanwu requested review of this revision.May 12 2022, 6:05 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 12 2022, 6:05 PM
zequanwu retitled this revision from [WIP][LLDB][NFC] Decouple dwarf location table from DWARFExpression. to [LLDB][NFC] Decouple dwarf location table from DWARFExpression..May 12 2022, 7:19 PM
zequanwu edited the summary of this revision. (Show Details)
zequanwu added reviewers: labath, JDevlieghere.
zequanwu removed a subscriber: JDevlieghere.
zequanwu updated this revision to Diff 430502.May 18 2022, 2:13 PM

Add missing sort.

Separating the notion of an expression and an expression list seems worthwhile. I did a very high level pass but I'll take a more detailed look tomorrow.

lldb/include/lldb/Expression/DWARFExpressionList.h
2–3

Formatting

23

This should be a Doxygen comment. But more importantly, this doesn't really tell me what the class does. DWARFExpressionList implies it holds a list of expressions so that seems useful to include there.

25

I think we generally put public first, followed by protected and private last.

lldb/test/Shell/SymbolFile/DWARF/x86/dwp.s
22–24

I guess the patch is not NFC is the output changes? Would it be possible to split the functional and non-functional part of this patch into separate patches?

zequanwu updated this revision to Diff 431022.May 20 2022, 12:07 PM
zequanwu marked 3 inline comments as done.

Address comments.

lldb/test/Shell/SymbolFile/DWARF/x86/dwp.s
22–24

Originally, lldb-test uses DWARFLocationTable::dumpLocationList to dump the raw range info from dwarf expression, which are not interpreted to be readable range info.
In this case, the raw range info is DW_LLE_startx_length 0x1 0x1 where the first value is address index and second value is range length. Now, it's dumping the readable range info that is interpreted from the raw range info. So, the variable range is within the block range.

Another test case change is debug_loc.s, because we don't accept invalid range info now.

rnk added a subscriber: rnk.Tue, Jun 7, 10:30 AM

@labath, ping, I see you just got back from vacation and this is a lot of code, but any high level design feedback would be helpful to know if this is the right direction.

llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
180 ↗(On Diff #431022)

Did this change cause the presubmit test failure? DebugInfoDWARFTests/DWARFDie::getLocations looks related.

zequanwu added inline comments.Thu, Jun 9, 11:21 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
180 ↗(On Diff #431022)

Yes. If I don't make this change, several existing tests crashed at here due to Expected<T> must be checked before access or destruction. I'm not sure how to fix this.

zequanwu updated this revision to Diff 435625.Thu, Jun 9, 11:56 AM

Add missing check for Expected.

zequanwu marked an inline comment as done.Thu, Jun 9, 11:57 AM
zequanwu added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
180 ↗(On Diff #431022)

NVM, I missed a checking for expected above. Reverted this part.

Overall, I like this. The main thing I am wondering is if it wouldn't be better to (instead of essentially duplicating the DWARFExpression interface in the DWARFExpressionList class) somehow split the process of finding the appropriate expression (for the given PC value or whatever), and the actual evaluation. This would be particularly nice if the arguments can be split in such a way that those that are used for locating the expression are not repeated in the "evaluate" call. Have you considered this?

Also, if I understand correctly, one of the side effects of this patch that the DWARF location list is now parsed more eagerly (when the containing object (e.g. variable) is constructed rather than during evaluation time). I think that's probably fine, but it's definitely worth calling out.

lldb/source/Expression/DWARFExpression.cpp
2567–2570

Could this go into the list class?

lldb/source/Expression/DWARFExpressionList.cpp
112

What would it take to make this class iterable? (i.e. have some begin/end methods instead of this foreach callback.

lldb/test/Shell/SymbolFile/DWARF/x86/debug_loc.s
43

This is probably fine, but it would be nice to at least log the errors somewhere (if we don't do it already).

zequanwu updated this revision to Diff 437316.Wed, Jun 15, 1:26 PM
zequanwu marked an inline comment as done.

Address comments.

The main thing I am wondering is if it wouldn't be better to (instead of essentially duplicating the DWARFExpression interface in the DWARFExpressionList class) somehow split the process of finding the appropriate expression (for the given PC value or whatever), and the actual evaluation. This would be particularly nice if the arguments can be split in such a way that those that are used for locating the expression are not repeated in the "evaluate" call. Have you considered this?

Then we need to duplicate the code that gets expression data and register kind before every call to DWARFExpression::Evaluate, making it less clean.

Also, if I understand correctly, one of the side effects of this patch that the DWARF location list is now parsed more eagerly (when the containing object (e.g. variable) is constructed rather than during evaluation time). I think that's probably fine, but it's definitely worth calling out.

Yes, the dwarf location list now is parsed at the time when parsing variable DIE. It used to be that a dwarf location list will be parsed every time when we try to access a dwarf expression inside.

lldb/source/Expression/DWARFExpression.cpp
2567–2570

This function calls ReadAddressFromDebugAddrSection that is called multiple places in DWARFExpression. If we move it to DWARFExpressionList, we need to duplicate the ReadAddressFromDebugAddrSection.

The main thing I am wondering is if it wouldn't be better to (instead of essentially duplicating the DWARFExpression interface in the DWARFExpressionList class) somehow split the process of finding the appropriate expression (for the given PC value or whatever), and the actual evaluation. This would be particularly nice if the arguments can be split in such a way that those that are used for locating the expression are not repeated in the "evaluate" call. Have you considered this?

Then we need to duplicate the code that gets expression data and register kind before every call to DWARFExpression::Evaluate, making it less clean.

Ok, fair enough.

I noticed one issue (described inline) but other than that, I think this should be ok.

lldb/include/lldb/Expression/DWARFExpressionList.h
30–32

I believe this is equivalent.

104

false would be more correct here ((a<b) == false && (b<a) == false implies a==b, but a<b && b<a is nonsense)

lldb/source/Expression/DWARFExpression.cpp
2567–2570

Ok. If that's the only problem, then I think we can fix that by moving ReadAddressFromDebugAddrSection to the DWARFUnit class (if it does not contain something like that already). IIUC, this is equivalent to getAddrOffsetSectionItem in llvm DWARFUnit, so I'd give it a similar name.

lldb/source/Expression/DWARFExpressionList.cpp
110

Could we make MatchesOperand const (and ditch the Mutable calls above)?

180

I don't think this is correct.

We should be able to distinguish between a location list that is always valid (i.e., one which is not a location list at all), and a location list which happens to contain a single valid range.

That could probably be achieved by making the range of the always-valid entry be (0, UINT_MAX), but maybe it would be better to have an explicit fallback/default entry for this (one that is used when no ranged entry applies). This is how the default location entries in DWARF5 are supposed to work, although I am not sure if any compiler actually makes use of them.

zequanwu updated this revision to Diff 438817.Tue, Jun 21, 1:36 PM
zequanwu marked 5 inline comments as done.

Address comments.

zequanwu added inline comments.Tue, Jun 21, 1:45 PM
lldb/source/Expression/DWARFExpressionList.cpp
180

That could probably be achieved by making the range of the always-valid entry be (0, UINT_MAX)

This looks better to me as it saves one default entry and there is no difference in terms of using DWARFExpressionList APIs.
Added IsAlwaysValidSingleExpr to check if it is an always-valid entry and uses it to distinguish between the two kinds.

This isn't turning out as clean as I had hoped it would, but I think it's a step in the right direction. Jonas, do you want to say anything about this?

lldb/include/lldb/Expression/DWARFExpressionList.h
55

calling GetExpressionAtAddress without an address seems weird. Maybe just have a separate GetAlwaysValidExpr getter or something?

zequanwu updated this revision to Diff 439814.Fri, Jun 24, 10:16 AM
zequanwu marked an inline comment as done.

address comment.