Currently, btf_type_tag attributes are represented in DWARF as child DIEs with the DW_TAG_LLVM_annotation tag. Such attributes are supported on derived types with the DW_TAG_pointer_type tag and indicate that the pointee type carries the attribute. (Tools such as pahole use these DWARF entries to generate BTF definitions; the Linux kernel uses BTF to verify some properties of BPF programs.)
For example, consider the following C code:
  struct st {
    // field "a" of type (pointer (int :btf_type_tag "__a"))
    int __attribute__((btf_type_tag("__a"))) *a;
  } g;
And the corresponding DWARF:
  0x29:   DW_TAG_structure_type
            DW_AT_name ("st")

  0x2e:     DW_TAG_member
              DW_AT_name ("a")
              DW_AT_type (0x38 "int *")

  0x38:   DW_TAG_pointer_type
            DW_AT_type (0x41 "int")

  0x3d:     DW_TAG_LLVM_annotation
              DW_AT_name ("btf_type_tag")
              DW_AT_const_value ("__a")

  0x41:   DW_TAG_base_type
            DW_AT_name ("int")
At the IR level, this corresponds to the annotations field of the DIDerivedType instance with the DW_TAG_pointer_type tag:
  !5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "st",
                                 file: !3, line: 1, size: 64, elements: !6)
  !6 = !{!7}
  !7 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !5, file: !3,
                      line: 2, baseType: !8, size: 64)
  !8 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !9, size: 64,
                      annotations: !10)
                      ^^^^^^^^^^^^^^^^
  !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
  !10 = !{!11}
  !11 = !{!"btf_type_tag", !"__a"}
The annotations field is an array of string/string tuples. DwarfUnit::addAnnotation() emits each tuple as a child DIE with the DW_TAG_LLVM_annotation tag; see llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp.
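The tuple-to-child-DIE step can be illustrated with a small standalone model. This is only a sketch: ToyDie, Annotation and addAnnotations are hypothetical names invented for illustration, and the code does not use or reproduce LLVM's DIE or DwarfUnit classes.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DWARF DIE (hypothetical, not LLVM's DIE class).
struct ToyDie {
  std::string tag;                                         // e.g. "DW_TAG_pointer_type"
  std::vector<std::pair<std::string, std::string>> attrs;  // (attribute, value) pairs
  std::vector<std::unique_ptr<ToyDie>> children;           // nested child DIEs
};

// One annotation as stored in the IR: a (name, value) string tuple.
using Annotation = std::pair<std::string, std::string>;

// Hypothetical analogue of the emission step: append one
// DW_TAG_LLVM_annotation child DIE to `parent` per tuple.
void addAnnotations(ToyDie &parent, const std::vector<Annotation> &annotations) {
  for (const Annotation &a : annotations) {
    std::unique_ptr<ToyDie> child(new ToyDie());
    child->tag = "DW_TAG_LLVM_annotation";
    child->attrs.push_back({"DW_AT_name", a.first});
    child->attrs.push_back({"DW_AT_const_value", a.second});
    parent.children.push_back(std::move(child));
  }
}
```

Calling addAnnotations on a ToyDie tagged DW_TAG_pointer_type with the single tuple ("btf_type_tag", "__a") yields the same parent/child shape as the 0x38/0x3d pair in the DWARF listing above.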
A recent discussion on the kernel BPF mailing list concluded that such annotations should apply to the annotated type itself (the linked thread discusses multiple approaches; "Solution 2" is the one that was accepted). For example, the new DWARF encoding for the code above should look as follows:
  0x29:   DW_TAG_structure_type
            DW_AT_name ("st")

  0x2e:     DW_TAG_member
              DW_AT_name ("a")
              DW_AT_type (0x38 "int *")

  0x38:   DW_TAG_pointer_type
            DW_AT_type (0x41 "int")

  0x41:   DW_TAG_base_type
            DW_AT_name ("int")

  0x45:     DW_TAG_LLVM_annotation
              DW_AT_name ("btf:type_tag")
              DW_AT_const_value ("__a")
This means that DW_TAG_LLVM_annotation child DIEs must be allowed on anything that can be pointed to:
- basic types
- derived types
- composite types
- subroutine types
This in turn means that an annotations field is needed on the DI classes corresponding to the entities above. LLVM's debug information classes already support an annotations field on DIDerivedType (used for pointee btf_type_tags) and DICompositeType (used for btf_decl_tags). This commit extends the DIBasicType and DISubroutineType classes to support the field as well. For the example above, the IR looks as follows:
  !5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "st",
                                 file: !3, line: 1, size: 64, elements: !6)
  !6 = !{!7}
  !7 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !5, file: !3,
                      line: 2, baseType: !8, size: 64)
  !8 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !9, size: 64)
  !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed,
                    annotations: !10)
       ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
  !10 = !{!11}
  !11 = !{!"btf:type_tag", !"__a"}
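To make the difference between the two encodings concrete, here is a rough sketch of how a consumer might collect the tags that apply to a pointee. It is a toy model under stated assumptions: ToyType and pointeeTypeTags are hypothetical names, and accepting both encodings side by side, distinguished by the annotation names "btf_type_tag" and "btf:type_tag" from the listings above, is an assumption made for illustration rather than a description of pahole or LLVM.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy type node (hypothetical, not an LLVM or pahole data structure).
struct ToyType {
  std::string tag;                   // e.g. "DW_TAG_pointer_type"
  const ToyType *pointee = nullptr;  // DW_AT_type target, if any
  std::vector<std::pair<std::string, std::string>> annotations;  // (name, value)
};

// Collect the type tags that apply to the pointee of `ptr`.
// Old encoding: "btf_type_tag" annotations attached to the pointer itself.
// New encoding: "btf:type_tag" annotations attached to the pointee.
std::vector<std::string> pointeeTypeTags(const ToyType &ptr) {
  std::vector<std::string> tags;
  for (const auto &a : ptr.annotations) {
    if (a.first == "btf_type_tag")
      tags.push_back(a.second);
  }
  if (ptr.pointee) {
    for (const auto &a : ptr.pointee->annotations) {
      if (a.first == "btf:type_tag")
        tags.push_back(a.second);
    }
  }
  return tags;
}
```

In this model, a pointer carrying ("btf_type_tag", "__a") and a pointer whose pointee carries ("btf:type_tag", "__a") both resolve to the single tag "__a", which is the equivalence the two DWARF listings are meant to express.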
The commit comprises the following changes:
- modifications to DIBasicType and DISubroutineType to add an annotations field;
- modifications to DIBasicType, DISubroutineType, DIDerivedType and DICompositeType to support a replaceAnnotations() method, which is used by the next commit in the stack;
- bitcode read and write support for the new fields;
- IR parsing support for the new fields;
- DWARF generation support for the new fields;
- test cases.
Perhaps emit the annotations before the early exit for the unspecified type (the same way the name is added before it), so that it only has to be done once?