This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] support for DW_AT_data_location in llvm
ClosedPublic

Authored by alok on May 7 2020, 10:56 AM.

Details

Summary

This patch adds support for DWARF attribute DW_AT_data_location.

Summary:

Dynamic arrays in fortran are described by array descriptor and
data allocation address. Former is mapped to DW_AT_location and
later is mapped to DW_AT_data_location.

Testing:
  • unit test cases added (hand-written)
  • check llvm
  • check debug-info

Diff Detail

Event Timeline

alok created this revision.May 7 2020, 10:56 AM
vsk added inline comments.May 7 2020, 11:38 AM
llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1395

Is this the right thing to do in the ThinLTO case discussed above?

llvm/test/DebugInfo/dwarfdump-dataLocationExp.ll
2

Could you expand the comment to describe what aspect of DW_AT_data_location this is testing?

4

Why do optimizations need to run in order to validate dwarfdump output? I'd expect it to be possible to write a test like this:

define void @f() {
  dbg.declare(%arrayVal, !arrayMD, ...)
  ret void
}
!arrayMD = !DILocalVariable(..., !typeMD)
!typeMD = !DICompositeType(array_type, ...)

Everything else makes the test harder to understand and maintain.

llvm/test/DebugInfo/dwarfdump-dataLocationVar.ll
2

ditto

Why is the data location a part of the type and not something dynamically bound via dbg.value or something similar?

What's an example for what a typical data location expression?

alok marked 8 inline comments as done.May 7 2020, 11:08 PM
alok added inline comments.
llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1395

Thanks for pointing this out, I completely overlooked it. It will be corrected in next version.

llvm/test/DebugInfo/dwarfdump-dataLocationExp.ll
2

Thanks for your comment. I shall update this in next version.

4

Thanks for your suggestion. test will be trimmed in next version.

llvm/test/DebugInfo/dwarfdump-dataLocationVar.ll
2

I shall update the test case for your comments.

alok updated this revision to Diff 262814.May 7 2020, 11:11 PM
alok marked 4 inline comments as done.

Incorporated comments from @vsk

alok added a comment.May 7 2020, 11:41 PM

Why is the data location a part of the type and not something dynamically bound via dbg.value or something similar?

What's an example for what a typical data location expression?

Dynamic arrays are represented by two things

  • allocated space
  • descriptor which describes the array bounds (always) and allocated space (not always)

DW_AT_location attached to DW_TAG_variable denotes descriptor, which is used by bounds (using DW_OP_push_object_address). Since DW_AT_location denotes descriptor, DW_AT_data_location is used to denote allocated space. DWARF has attached that to DW_TAG_array_type.
In cases when descriptor also has a pointer to allocated space (as in case of gfortran), DW_AT_data_location can also be described by DW_OP_push_object_address and offset where the pointer to allocated space is. In cases when descriptor does not have the pointer to allocated space (as in case of flang), DW_AT_data_location can be represented by DIE reference which denotes an artificial variable which has the data pointer (dbg.declare in IR).
Please let me know if you need more information.

Why is the data location a part of the type and not something dynamically bound via dbg.value or something similar?

What's an example for what a typical data location expression?

Dynamic arrays are represented by two things

  • allocated space
  • descriptor which describes the array bounds (always) and allocated space (not always)

DW_AT_location attached to DW_TAG_variable denotes descriptor, which is used by bounds (using DW_OP_push_object_address). Since DW_AT_location denotes descriptor, DW_AT_data_location is used to denote allocated space. DWARF has attached that to DW_TAG_array_type.
In cases when descriptor also has a pointer to allocated space (as in case of gfortran), DW_AT_data_location can also be described by DW_OP_push_object_address and offset where the pointer to allocated space is. In cases when descriptor does not have the pointer to allocated space (as in case of flang), DW_AT_data_location can be represented by DIE reference which denotes an artificial variable which has the data pointer (dbg.declare in IR).
Please let me know if you need more information.

Thanks. I read through the DWARF v5 spec and given that arrays are represented as DICompositeTypes this makes sense to me. Personally I find the DWARF nomenclature weird (it would make more sense if we had a DW_AT_type_descriptor and a DW_AT_location), but it's probably better to stick with the DWARF naming scheme here, or else this raises even more confusion.

Can you please add a check to Verifier.cpp that checks that only arrays get a data location?
Can you please add a round-trip test to test/Assembler/ (see dicompositetype-members.ll for an example)?
Can you please extend the metadata unit test llvm/unittests/IR/MetadataTest.cpp to exercise the changes in LLVMContextImpl.h?
Should this be added to DIBuilder?

Can you add some form of documentation that explains what the new field is used for? Perhaps in the doxygen of DIBuilder, and/or in SourceLevelDebugging.rst or LangRef.rst?

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1391

This check should be removed here. Can you move it into Verifier.cpp?

llvm/lib/IR/LLVMContextImpl.h
562

I don't think it makes sense performance-wise to hash DataLocation.

alok marked 4 inline comments as done.May 11 2020, 1:15 PM

Why is the data location a part of the type and not something dynamically bound via dbg.value or something similar?

What's an example for what a typical data location expression?

Dynamic arrays are represented by two things

  • allocated space
  • descriptor which describes the array bounds (always) and allocated space (not always)

DW_AT_location attached to DW_TAG_variable denotes descriptor, which is used by bounds (using DW_OP_push_object_address). Since DW_AT_location denotes descriptor, DW_AT_data_location is used to denote allocated space. DWARF has attached that to DW_TAG_array_type.
In cases when descriptor also has a pointer to allocated space (as in case of gfortran), DW_AT_data_location can also be described by DW_OP_push_object_address and offset where the pointer to allocated space is. In cases when descriptor does not have the pointer to allocated space (as in case of flang), DW_AT_data_location can be represented by DIE reference which denotes an artificial variable which has the data pointer (dbg.declare in IR).
Please let me know if you need more information.

Thanks. I read through the DWARF v5 spec and given that arrays are represented as DICompositeTypes this makes sense to me. Personally I find the DWARF nomenclature weird (it would make more sense if we had a DW_AT_type_descriptor and a DW_AT_location), but it's probably better to stick with the DWARF naming scheme here, or else this raises even more confusion.

Thanks for your attention and comments.

Can you please add a check to Verifier.cpp that checks that only arrays get a data location?

I shall update the patch with this.

Can you please add a round-trip test to test/Assembler/ (see dicompositetype-members.ll for an example)?

I have added the round trip test named llvm/test/Bitcode/dataLocation.ll, should I move this to Assembler directory?

Can you please extend the metadata unit test llvm/unittests/IR/MetadataTest.cpp to exercise the changes in LLVMContextImpl.h?

I shall update the patch to incorporate this.

Should this be added to DIBuilder?

As of now, flang front-end doesnt utilize these routines, we can defer it or update it now itself provided f18 is now part of LLVM and later we may need.

Can you add some form of documentation that explains what the new field is used for? Perhaps in the doxygen of DIBuilder, and/or in SourceLevelDebugging.rst or LangRef.rst?

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1391

Thanks for your comment, I shall move that in next version of patch.

llvm/lib/IR/LLVMContextImpl.h
562

Thanks for your comment. I shall incorporate this in next version of patch.

alok updated this revision to Diff 263260.May 11 2020, 1:21 PM
alok marked 2 inline comments as done.

Incorporated comments form @aprantl. In addition to that also updated dependencies function in file DwarfCompileUnit.cpp.

Thanks, we're almost there!

llvm/docs/LangRef.rst
4813–4818

the -> The

LLVM IR doesn't know the term location description. Let's call it what it is:

The ``dataLocation`` is a DIExpression that describes how to get from an object's address to the actual raw data, if they aren't equivalent. This is only supported for array types, particularly to describe Fortran arrays, which have an array descriptor in addition to the array data.
llvm/lib/IR/Verifier.cpp
1018

Thanks! Can you add a tiny testcase to tests/Verifier that covers this check?

llvm/test/Bitcode/dataLocation.ll
8

Out of curiosity: This is a placeholder, right? I would have expected something like !DIExpression(DW_OP_constu, 32, DW_OP_plus) here.

aprantl added inline comments.May 12 2020, 1:09 PM
llvm/test/Bitcode/dataLocation.ll
8

Ah.. presumably there will be a DW_OP_push_object_address, too, right?

alok marked 6 inline comments as done.May 13 2020, 10:52 AM
alok added inline comments.
llvm/docs/LangRef.rst
4813–4818

Thanks a lot for this suggestion. It looks far better now. I shall include DIVariable along with DIExpression. It can be found in next version.

llvm/lib/IR/Verifier.cpp
1018

Thanks for your comment. It shall be included in next version of patch.

llvm/test/Bitcode/dataLocation.ll
8

Yes correct. Since that is a separate patch and this tests objective is just to test if an expression can appear for dataLocation field and not, and not to check what is valid expression. To avoid dependency on other patch i just used a place holder here.

alok marked 3 inline comments as done.May 13 2020, 1:01 PM
alok updated this revision to Diff 263833.May 13 2020, 1:05 PM

Re-based and addressed comments from @aprantl

aprantl accepted this revision.May 14 2020, 10:40 AM

Small wording fix inline, otherwise this LGTM.

llvm/docs/LangRef.rst
4814
The optional ``dataLocation`` is a DIExpression that describes ...
This revision is now accepted and ready to land.May 14 2020, 10:40 AM
This revision was automatically updated to reflect the committed changes.
alok marked 2 inline comments as done.May 14 2020, 11:34 PM
alok added inline comments.
llvm/docs/LangRef.rst
4814

Thanks for comment. It is incorporated.

broadwaylamb added inline comments.
llvm/test/DebugInfo/dwarfdump-dataLocationExp.ll
4

This also breaks our buildbot, see https://reviews.llvm.org/D79306#2038748

llvm/test/DebugInfo/dwarfdump-dataLocationVar.ll
4

And this too