This is an archive of the discontinued LLVM Phabricator instance.

[ObjectYAML][ELF] Add support for emitting the .debug_abbrev section.
ClosedPublic

Authored by Higuoxing on Jun 14 2020, 8:49 PM.

Details

Summary

This patch enables yaml2elf emit the .debug_abbrev section.

The generated .debug_abbrev is verified using llvm-dwarfdump.

Known issues that will be addressed later:

  • Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section.

Diff Detail

Event Timeline

Higuoxing created this revision.Jun 14 2020, 8:49 PM

Should it have all 32/64 LE/BE set of tests?

llvm/test/tools/yaml2obj/ELF/DWARF/debug-abbrev.yaml
206
# -> ##?

Should it have all 32/64 LE/BE set of tests?

I think we don't need it. Values in abbreviations tables is encoded in LEB128.

From your description: "This patch enables yaml2elf emit the .debug_line section.": s/.debug_line/.debug_abbrev

llvm/test/tools/yaml2obj/ELF/DWARF/debug-abbrev.yaml
93

It would also be good to test a Form value that requires more than one byte, e.g. 0x81.

Higuoxing updated this revision to Diff 270708.Jun 15 2020, 3:40 AM
Higuoxing marked 2 inline comments as done.

Address comments.

Higuoxing edited the summary of this revision. (Show Details)Jun 15 2020, 3:40 AM
jhenderson accepted this revision.Jun 15 2020, 3:59 AM

LGTM, but please wait for others to confirm.

llvm/test/tools/yaml2obj/ELF/DWARF/debug-abbrev.yaml
94–95

Fine to leave this here, but you do have that with the 0x2020 value below :)

This revision is now accepted and ready to land.Jun 15 2020, 3:59 AM
grimar accepted this revision.Jun 15 2020, 4:30 AM

LGTM too.

MaskRay requested changes to this revision.EditedJun 15 2020, 10:38 AM

I asked around how we should comment individual bytes.

lichray gave an alternative:

# CONTENT-NEXT:                0000:  01 11 01 25
                                      ~~ ~~ ~~ ~~
                                       ^  ^  ^  ^
                                       |  |  |  |
      ULEB128  abbreviation code ______|  |  |  |
      ULEB128  DW_TAG_compile_unit _______|  |  |
       1-byte  DW_CHILDREN_yes ______________|  |
      ULEB128  DW_AT_producer __________________|

                                      0E 13 05 03
                                      ~~ ~~ ~~ ~~
                                       ^  ^  ^  ^
                                       |  |  |  |
      ULEB128  DW_FORM_strp ___________|  |  |  |
      ULEB128  DW_AT_language ____________|  |  |
      ULEB128  DW_FORM_data2 ________________|  |
      ULEB128  DW_AT_name ______________________|

wanders: Not radically different, but if you put the comment above instead it will read from top instead of from bottom.

(Request changes just to make sure these suggestions are considered and the comments should be improved if you think any suggestion is favorable.)

This revision now requires changes to proceed.Jun 15 2020, 10:38 AM
MaskRay added inline comments.Jun 15 2020, 10:43 AM
llvm/lib/ObjectYAML/DWARFYAML.cpp
35

Probably use an alphabetical order.

Higuoxing marked an inline comment as done.Jun 15 2020, 7:41 PM

I asked around how we should comment individual bytes.

lichway gave an alternative:

# CONTENT-NEXT:                0000:  01 11 01 25
                                      ~~ ~~ ~~ ~~
                                       ^  ^  ^  ^
                                       |  |  |  |
      ULEB128  abbreviation code ______|  |  |  |
      ULEB128  DW_TAG_compile_unit _______|  |  |
       1-byte  DW_CHILDREN_yes ______________|  |
      ULEB128  DW_AT_producer __________________|

                                      0E 13 05 03
                                      ~~ ~~ ~~ ~~
                                       ^  ^  ^  ^
                                       |  |  |  |
      ULEB128  DW_FORM_strp ___________|  |  |  |
      ULEB128  DW_AT_language ____________|  |  |
      ULEB128  DW_FORM_data2 ________________|  |
      ULEB128  DW_AT_name ______________________|

wanders: Not radically different, but if you put the comment above instead it will read from top instead of from bottom.

(Request changes just to make sure these suggestions are considered and the comments should be improved if you think any suggestion is favorable.)

Hi Fangrui, thanks for the suggestion.

It seems that FileCheck cannot match bytes one by one.

## Cannot match
# CONTENT-NEXT: 0000: 01 11 01 25
## Can match
# CONTENT-NEXT: 0000: 01110125

I have also checked the manual and there's no such an option that makes FileCheck happy with this match pattern. Did I miss something?

What do you think of having something like this?

# CONTENT-NEXT: 0000: 01110125
                      ^~^~^~^~
                ______| | | |
                ________| | |
                __________| |
                ____________|

OR

# CONTENT-NEXT: 0000: 01110125
                      ^~       ULEB28 abbreviation code
                        ^~     ULEB128 DW_TAG_compile_unit
                          ^~   1-byte DW_CHILDREN_yes
                            ^~ ULEB128 DW_AT_producer
llvm/lib/ObjectYAML/DWARFYAML.cpp
35

Oh, thanks for spotting it. Can I re-arrange them in a follow-up patch?

Higuoxing updated this revision to Diff 271306.Jun 17 2020, 2:25 AM

Address @MaskRay 's comments.

  • Put comments above check line.

OR

# CONTENT-NEXT: 0000: 01110125
                      ^~       ULEB28 abbreviation code
                        ^~     ULEB128 DW_TAG_compile_unit
                          ^~   1-byte DW_CHILDREN_yes
                            ^~ ULEB128 DW_AT_producer

This style looks good to me. This is similar to lld/test/ELF/mips-gp-ext.s, which uses

# ABS:      Contents of section .text:
# ABS-NEXT:  00e0 3c080000 21080120 8f82ff08
#                 ^-- %hi(_gp_disp)
#                          ^-- %lo(_gp_disp)
#                                   ^-- 8 - (0x200 - 0x100)
#                                       G - (GP - .got)

OR

# CONTENT-NEXT: 0000: 01110125
                      ^~       ULEB28 abbreviation code
                        ^~     ULEB128 DW_TAG_compile_unit
                          ^~   1-byte DW_CHILDREN_yes
                            ^~ ULEB128 DW_AT_producer

It may be more readable if the two columns are aligned separately:

# CONTENT-NEXT: 0000: 01110125
                      ^~       abbreviation code    ULEB128
                        ^~     DW_TAG_compile_unit  ULEB128
                          ^~   DW_CHILDREN_yes       1-byte
                            ^~ DW_AT_producer       ULEB128
Higuoxing updated this revision to Diff 271549.Jun 17 2020, 6:56 PM

Address comments.

MaskRay accepted this revision.EditedJun 17 2020, 7:04 PM

Looks great! And thanks a lot to @lichray's suggestions

This revision is now accepted and ready to land.Jun 17 2020, 7:04 PM

Thanks for giving suggestions and reviewing!

This revision was automatically updated to reflect the committed changes.