This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][BPF] Add 'annotations' field for DIBasicType & DISubroutineType
AcceptedPublic

Authored by eddyz87 on Feb 13 2023, 4:52 PM.

Details

Summary

Currently btf_type_tag attributes are represented in DWARF as child DIEs with DW_TAG_LLVM_annotation tag. Such attributes are supported for derived types of type DW_TAG_pointer_type and designate that the pointee type should have the attribute. (These DWARF entries are used to generate BTF definitions by tools like pahole, BTF is used by Linux kernel to verify some properties of BPF programs).

For example, consider the following C code:

struct st {
  // field "a" of type (pointer (int :btf_type_tag "__a"))
  int __attribute__((btf_type_tag("__a"))) *a;
} g;

And corresponding DWARF:

0x29:   DW_TAG_structure_type
          DW_AT_name      ("st")

0x2e:     DW_TAG_member
            DW_AT_name    ("a")
            DW_AT_type    (0x38 "int *")

0x38:   DW_TAG_pointer_type
          DW_AT_type      (0x41 "int")

0x3d:     DW_TAG_LLVM_annotation
            DW_AT_name    ("btf_type_tag")
            DW_AT_const_value     ("__a")

0x41:   DW_TAG_base_type
          DW_AT_name      ("int")

At the IR level this corresponds to annotations field for DIDerivedType class instance with DW_TAG_pointer_type tag:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "st", file: !3, line: 1, size: 64, elements: !6)
!6 = !{!7}
!7 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !5, file: !3, line: 2, baseType: !8, size: 64)
!8 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !9, size: 64, annotations: !10)
                                                                      ^^^^^^^^^^^^^^^^
!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!10 = !{!11}
!11 = !{!"btf_type_tag", !"__a"}

The annotations field is an array of string/string tuples, these tuples are emitted as child DIEs with tag DW_TAG_LLVM_annotation by DwarfUnit::addAnnotation(), see llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp.

Recent discussion in Kernel BPF mailing list came to conclusion, that such annotations should apply to the annotated type itself (multiple approaches are discussed in the linked thread, "Solution 2" is the one accepted). For example, new DWARF encoding for the code above should look as follows:

0x29:   DW_TAG_structure_type
          DW_AT_name      ("st")

0x2e:     DW_TAG_member
            DW_AT_name    ("a")
            DW_AT_type    (0x38 "int *")

0x38:   DW_TAG_pointer_type
          DW_AT_type      (0x41 "int")

0x41:   DW_TAG_base_type
          DW_AT_name      ("int")

0x45:     DW_TAG_LLVM_annotation
            DW_AT_name    ("btf:type_tag")
            DW_AT_const_value     ("__a")

Which means that DW_TAG_LLVM_annotation children DIEs should be possible for anything that could be pointed to:

  • basic types
  • derived types
  • composite types
  • subroutine types

Which in turn means that annotations fields are necessary for DI classes corresponding to the above entities. LLVM debug information classes support annotations field for DIDerivedType (used for pointee btf_type_tags) and DICompositeType (used for btf_decl_tags). This commit extends DIBasicType and DISubroutineType classes to support such field. For the example above the IR looks as follows:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "st", file: !3, line: 1, size: 64, elements: !6)
!6 = !{!7}
!7 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !5, file: !3, line: 2, baseType: !8, size: 64)
!8 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !9, size: 64)
!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed, annotations: !10)
     ^^^^^^^^^^^^                                                 ^^^^^^^^^^^^^^^^
!10 = !{!11}
!11 = !{!"btf:type_tag", !"__a"}

The commit comprises of the following changes:

  • modifications for DIBasicType and DISubroutineType to add field annotations;
  • modifications for DIBasicType, DISubroutineType, DIDerivedType and DICompositeType to support method replaceAnnotations(), it is used by the next commit in a stack.
  • bitcode read and write support for the new fields;
  • IR parsing support for the new fields;
  • DWARF generation support for the new fields;
  • test cases.

Diff Detail

Event Timeline

eddyz87 created this revision.Feb 13 2023, 4:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2023, 4:52 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
eddyz87 updated this revision to Diff 497834.Feb 15 2023, 3:55 PM
  • Added tests for 'annotations' DWARF field generation for DIBasicType and DISubroutineType.
  • Fix for DIBasicType::getImpl() to correctly handle Annotations field.
  • Fix for DwarfUnit::constructTypeDIE(... DIBasicType *BTy) to generate annotations for dwarf::DW_TAG_unspecified_type instances.
eddyz87 retitled this revision from [DebugInfo][BPF] 'annotations' field for DIBasicType & DISubroutineType to [DebugInfo][BPF] Add 'annotations' field for DIBasicType & DISubroutineType.Feb 15 2023, 4:14 PM
eddyz87 edited the summary of this revision. (Show Details)
eddyz87 updated this revision to Diff 499279.Feb 21 2023, 1:33 PM
eddyz87 edited the summary of this revision. (Show Details)

Rebase

eddyz87 updated this revision to Diff 502032.Mar 2 2023, 5:58 PM

Rebase, change for tag name from btf_type_tag:v2 to btf:type_tag.

eddyz87 updated this revision to Diff 502149.Mar 3 2023, 8:57 AM

Fix for LLVMIR compilation error.

eddyz87 updated this revision to Diff 502320.Mar 3 2023, 5:17 PM

Another rebase, hope to fix windows build issue.

eddyz87 published this revision for review.Mar 13 2023, 3:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 13 2023, 3:51 PM
eddyz87 edited the summary of this revision. (Show Details)Mar 15 2023, 5:52 AM

@dblaikie Could you help take a look at this patch? Previously we have implemented btf_type_tag support in clang. After some discussion with gcc folks (https://lore.kernel.org/bpf/87r0w9jjoq.fsf@oracle.com/), we decided to change the dwarf format a little bit to make btf_type_tag more extensible. Currently, dwarf is used by pahole which supports the old format. @eddyz87 has implemented pahole change to support new format. So for the
time being, pahole will support both old and new format. And llvm dwarf output should also support both old and new format for a while so old pahole and new pahole should both work. Once minimum llvm requirement is lifted in linux kernel, we can remove support for old format.

Please see the above link for details. Feel free to ask any questions about rationale.

Seems generally OK.

As for the design - could this maybe have been a decorator on a type? (or are there other uses for annotations) more like how DW_TAG_const_type works, for instance?

And presumably the back-compat can be removed from LLVM once linux has updated pahole/whatever your support timeline is for LLVM (like how many old kernel versions are supported by recent LLVM) - rather than once Linux has dropped support for an LLVM that's old enough not to have the new feature (that milestone is when Linux can remove the backcompat support from pahole, rather than when LLVM can remove support for the old format).

llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
700–714

perhaps emit the annotations before the early-exit for unspecified type? (same way name is added before that) so it only has to be done once

eddyz87 updated this revision to Diff 533744.Jun 22 2023, 1:03 PM

Rebase, change in DwarfUnit::constructTypeDIE(..., DIBasicType)

eddyz87 marked an inline comment as done.Jun 22 2023, 1:23 PM

Seems generally OK.

Thank you for taking a look.

As for the design - could this maybe have been a decorator on a type? (or are there other uses for annotations) more like how DW_TAG_const_type works, for instance?

You mean why not model this thing as DIDerivedType, right?
The idea is that existing tools would ignore the unknown attribute (annotations), however the same tools won't know how to handle a new derived type in the chain. E.g. gdb would require update to handle some of the types in kernels built using new encoding etc. So, after some discussion we decided to use attributes instead of new derived type.

And presumably the back-compat can be removed from LLVM once linux has updated pahole/whatever your support timeline is for LLVM (like how many old kernel versions are supported by recent LLVM) - rather than once Linux has dropped support for an LLVM that's old enough not to have the new feature (that milestone is when Linux can remove the backcompat support from pahole, rather than when LLVM can remove support for the old format).

This is my assumption as well. As you correctly note, the dependency here is the minimal version of pahole required for kernel compilation. Kernel documentation lists 1.16 as a current minimal requirement, it was released 3 years ago. I can initiate a discussion on this topic if you want to have specific timeline.

dblaikie accepted this revision.Jun 26 2023, 3:07 PM

Fair enough. Seems OK.

This revision is now accepted and ready to land.Jun 26 2023, 3:07 PM