This is an archive of the discontinued LLVM Phabricator instance.

[ObjectYAML][DWARF] Add DWARF entry in ELFYAML.
ClosedPublic

Authored by Higuoxing on May 19 2020, 4:42 AM.

Details

Summary

This patch adds a new DWARF entry in ELF YAML file. Sections emitting will be added in next series of patches.


The basic idea is that we treat these debug sections as implicit sections like .strtab/.symtab. In future, reportError() will be replaced by initDWARFSectionHeader() for emitting the content.

Diff Detail

Event Timeline

Higuoxing created this revision.May 19 2020, 4:42 AM

I think I'd like to see at least one DWARF section implemented to demonstrate that the concept works.

llvm/lib/ObjectYAML/DWARFYAML.cpp
26

It can return the vector of StringRefs. Perhaps it can be called just "getSectionNames" for simplicity.

28

DebugStrings.size() != 0 -> !DebugStrings.empty() here and below.

36

Why this is !DWARF.PubNames.Entries and not !DWARF.PubNames?
(The same question for things around).

llvm/lib/ObjectYAML/ELFEmitter.cpp
268

Do not use auto when a type is not obvious.

269

Doesn't this create a temporary std::string that is destroyed after the push_back
(and so ImplicitSections holds the dead string as a result)?

405

Why do you need && !YAMLSec?

llvm/test/tools/yaml2obj/ELF/unimplemented-debug-sections.test
1 ↗(On Diff #264853)

unimplemented-debug-sections.test is a name that assumes this file will be renamed/removed later.
I do not think it is common to do such things. It could be just "debug-sections.test".

Perhaps, we should create a "ELF/DWARF" subfolder and create a separate test for each section. I am not sure, lets hear from others.

Higuoxing updated this revision to Diff 265126.May 19 2020, 9:27 PM

Addressed comments:

  • Use std::vector<>::empty().
  • Use StringRef.
  • Support emitting .debug_str section.
  • Add a new directory ELF/DWARF for test cases.
Higuoxing marked an inline comment as done.May 19 2020, 9:31 PM
Higuoxing added inline comments.
llvm/lib/ObjectYAML/ELFEmitter.cpp
405

We have some DebugInfo test cases using yaml2obj to generate debug sections via raw content.

Higuoxing marked an inline comment as done.May 19 2020, 9:54 PM
Higuoxing added inline comments.
llvm/lib/ObjectYAML/DWARFYAML.cpp
36

Because Data possesses PubNames directly, rather than Optional<PubNames>.

struct PubNames {
  ...
  std::vector<PubEntry> Entries;
};

struct Data {
  PubSection PubNames;
};

Can I fix it in the next patch?

Higuoxing planned changes to this revision.May 19 2020, 10:36 PM

Seems that this patch breaks test, I will look into it.

jhenderson added a comment.EditedMay 21 2020, 1:15 AM

Thanks for the patch. In general, it looks like the right approach.

llvm/lib/ObjectYAML/DWARFYAML.cpp
26

DWARFYAML is already used in Mach-O, so we might need to be careful with code like this that is ELF-specific. If we can add a property to the class that says what file format it is, that'll allow us to change the function name to getSectionNames as suggested, and provide a way to make this sort of function generic in the future, should the need arise.

Actually, I might suggest getUsedSectionNames, since it doesn't return all section names all the time.

llvm/lib/ObjectYAML/ELFEmitter.cpp
762

I don't know if it's needed in this patch, but we might want an error if a user tries to specify both Content/Size and DWARF data for a DWARF section.

An alternative, but I don't think it is necessary without a use-case, would be to do something like append the two.

776

Maybe this can just result in an empty section rather than an error?

llvm/test/tools/yaml2obj/ELF/DWARF/debug-info.yaml
3 ↗(On Diff #265128)

You can get rid of the "a) ". On its own, it doesn't really make sense.

16–21 ↗(On Diff #265128)

Maybe worth just omitting the body here, since it isn't actually used, if I'm right.

22 ↗(On Diff #265128)

Here and in all other cases, you don't need the ....

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
2

In addition to the test cases in this file already, you need the following, I think:

a) Size is explicitly specified.
b) Both Size and DWARF->debug_str are specified
c) Both Content and DWARF->debug_str are specified
d) All properties that can be overridden by the section header (e.g. EntSize, Flags etc) when DWARF->debug_str is used.
e) The default values of Flags, EntSize etc when DWARF->debug_str is not used.
f) All properties that can be overridden by the section header (e.g. EntSize, Flags etc) when DWARF->debug_str is not used.
g) Something show that assignSectionAddresses is called.

6

I'd like to hear other people's opinions on this:

Presumably one of the reasons we are adding this support is because in the future, we'd like to write llvm-dwarfdump tests by using yaml2obj, but if we do that, we end up in a bit of a circular dependency, as we are using llvm-dwarfdump to test yaml2obj. That could result in a bug in the common library that results in both appearing to produce the right data, but actually it not being correct.

For .debug_str, it would be trivial to use other options, like llvm-readobj's --section-data or --string-dump options. I don't think this would work as well for other sections though, so maybe the circular dependency is okay? Thoughts?

jhenderson added inline comments.May 21 2020, 1:17 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
776

In fact, looking at the test cases that are still failing, I think that's exactly what we need to do. Since you zero the section header, you probably can just delete this else.

grimar added inline comments.May 21 2020, 2:02 AM
llvm/lib/ObjectYAML/DWARFYAML.cpp
36

OK. I wonder if we could start from having 2 just sections here for start: ".debug_str" and ".debug_info"
since seems other lines here arenot needed for this patch?

llvm/lib/ObjectYAML/ELFEmitter.cpp
756

I'd move it below, it is a bit too far from the place where RawSec is used first time.

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
6

I don't think this would work as well for other sections though, so maybe the circular dependency is okay? Thoughts?

As far I understand, we have no much options. lib/DebugInfo is underlying layer used by llvm-dwarfdump, llvm-objdump to
dump the debug data (I believe). I'd try to use llvm-readobj's --section-data or other options where possible, like you suggest,
but it seems OK to use llvm-dwarfdump for other cases to me.

Partially I guess this problem will be solved by obj2yaml when it will be able to dump debug sections.
I.e. I expect that it might be able to catch some things after dumping objects produced by yaml2obj as probably will use some
independent code for dumping.

grimar added inline comments.May 21 2020, 2:07 AM
llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
6

For .debug_str, it would be trivial to use other options, like llvm-readobj's --section-data or --string-dump options.

I think we should do it. And anyways llvm-readelf is already used below for --section-headers

Higuoxing updated this revision to Diff 265664.May 21 2020, 9:00 PM

Thanks for reviewing and giving me useful suggestions.


Addressed comments.

  • Get rid of unimplemented debug sections in getUsedSectionNames(), so that we don't need to process them in initImplicitSections().
  • Use generateContentsFromDWARFEntry() to generate section contents from "DWARF" entry. If the contents are successfully generated for "DWARF" entry, it returns true, otherwise, it returns false.
  • Use llvm-readelf in tests.
  • Remove test for .debug_info section. ("debug_info" entry can be parsed, but will not be emitted.)
  • Add tests
    • c) Generate the .debug_str section when the "Size" is specified.
    • d) Test that yaml2obj emits an error message when both the "Size" and the "debug_str" entry at the same time.
    • e) Test that yaml2obj emits an error message when both the "Content" and the "debug_str" entry at the same time.
    • f) Test that all the properties can be overriden by the section header when the "debug_str" entry is used.
    • g) Test that all the properties can be overriden by the section header when the "debug_str" entry doesn't exist.
    • h) Check the default values of sh_entsize, sh_flags, sh_info and sh_addralign when the "debug_str" entry doesn't exist.
    • i) Set the address for the .debug_str section when the "debug_str" entry exists.
    • j) Set the address for the .debug_str section when the "debug_str" doesn't exists.
Higuoxing marked an inline comment as done.May 21 2020, 9:03 PM
Higuoxing added inline comments.
llvm/lib/ObjectYAML/ELFEmitter.cpp
226

This allocator is from D79984, I will rebase this patch after that.

Thanks. I've mostly got test comments for you.

llvm/lib/ObjectYAML/ELFEmitter.cpp
269

If I'm not mistaken, StringRef + const char * results in a Twine, so you probably can just do ("." + DebugSecName).str() without the explicit Twine.

787

Would this be better if it were more explicit? Something like: "initDWARFSectionHeader() should only be called if something describes the section".

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
17

Let's be more preceise with our prefix names, e.g. "SHDRS" here, "DWARF-DEFAULT" above.

37

I think you should get rid of the Flags property below and show the section header details here. That way, you'll show what the default flag properties are if the DWARF tag is not specified.

54

You probably want to show that the content of the section is all zero too. You can use the --hex-dump switch if needed for that, or the od tool.

110

overriden -> overridden

Same elsewhere.

133–134

I think another set of interesting tests would be when the DWARF tag exists, but no debug_str entry is specified. I also think it would be interesting to show that no .debug_str section is generated if the DWARF tag is used, but no debug_str entry is specified within it or the Sections tag.

160–161

I think you can merge this test case with case b) above (if you drop the flags bit).

179

Fold this in with the similar test cases for sh_entsize etc.

201

Ditto.

I am looking. Going to suggest a few little changes in the code a bit later today.

grimar added inline comments.May 22 2020, 4:26 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
773

FTR: You should probably use YAMLSec->Offset here instad of "None" to allow setting any offset using the "Offset" section key. (And this needs a test case). I think for this patch you can leave it as is though.

787

It should not be llvm_unreachable, because it is possible to reach it using the following input:

--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_EXEC
  Machine: EM_X86_64
Sections:
  - Name:    .debug_str
    Type:    SHT_DYNAMIC
    Flags:   [SHF_MERGE, SHF_STRINGS]
    Content: "610062006300"
DWARF:

(I guess it could be a error instead for start).

803

I'd suggest the following code change:

static bool shouldEmitDWARF(Optional<DWARFYAML::Data> DWARF,
                            StringRef SecName) {
  return StringSwitch<bool>(SecName)
      .Case(".debug_str", !DWARF->DebugStrings.empty())
      .Default(false);
}

template <class ELFT>
static uint64_t emitDWARF(typename ELFT::Shdr &SHeader, StringRef Name,
                          const DWARFYAML::Data &DWARF, raw_ostream &OS) {
  uint64_t BeginOffset = OS.tell();
  if (Name == ".debug_str")
    DWARFYAML::EmitDebugStr(OS, DWARF);
  else
    llvm_unreachable("unexpected emitDWARF() call");
  return OS.tell() - BeginOffset;
}

template <class ELFT>
void ELFState<ELFT>::initDWARFSectionHeader(Elf_Shdr &SHeader, StringRef Name,
                                            ContiguousBlobAccumulator &CBA,
                                            ELFYAML::Section *YAMLSec) {
  zero(SHeader);
  SHeader.sh_name = DotShStrtab.getOffset(ELFYAML::dropUniqueSuffix(Name));
  SHeader.sh_type = YAMLSec ? YAMLSec->Type : ELF::SHT_PROGBITS;
  SHeader.sh_addralign = YAMLSec ? (uint64_t)YAMLSec->AddressAlign : 1;
  SHeader.sh_offset = alignToOffset(CBA, SHeader.sh_addralign, /*Offset=*/None);

  ELFYAML::RawContentSection *RawSec =
      dyn_cast_or_null<ELFYAML::RawContentSection>(YAMLSec);
  if (Doc.DWARF && shouldEmitDWARF(*Doc.DWARF, Name)) {
    if (RawSec && (RawSec->Content || RawSec->Size))
      reportError("cannot specify the '" + Name +
                  "' section contents in the 'DWARF' entry and the 'Content' "
                  "or 'Size' in the 'Sections' entry at the same time");
    else
      SHeader.sh_size = emitDWARF<ELFT>(SHeader, Name, *Doc.DWARF, CBA.getOS());
  } else if (RawSec) {
    SHeader.sh_size = writeContent(CBA.getOS(), RawSec->Content, RawSec->Size);
  } else {
    reportError("not supported"); // ??
  }

....

Benefits are:

  1. sh_size is set in a single place.
  2. "cannot specify the ...Content/SIze ..." error is reported before emitting the DWARF data and not after.
  3. no need to have generateContentsFromDWARFEntry member.
grimar added inline comments.May 22 2020, 4:40 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
803

BTW, you even can avoid having the

static bool shouldEmitDWARF(Optional<DWARFYAML::Data> DWARF,
                            StringRef SecName) {
  return StringSwitch<bool>(SecName)
      .Case(".debug_str", !DWARF->DebugStrings.empty())
      .Default(false);
}

Because getUsedSectionNames() provides you all information needed
to avoid checking !DWARF->DebugStrings.empty() here again (it seems).
Perhaps `getUsedSectionNames() could return MapVector<StringRef> and you could
do something like:

static bool shouldEmitDWARF(Optional<DWARFYAML::Data> DWARF,
                            StringRef SecName) {
  MapVector<StringRef> M = DWARF.getUsedSectionNames();
  if (!SecName.empty() && SecName[0] == '.' && M.count(SecName.drop_front())
    return true;
  return false;
}
Higuoxing updated this revision to Diff 265844.May 23 2020, 2:57 AM

Addressed comments:

[ObjectYAML]

  • Support adjusting section offset via YAMLSec->Offset
  • Replace llvm_unreachable by reportError() and add one testcase (i) for it.
  • Thanks for @grimar 's suggestion
    • Add two helper functions: shouldEmitDWARF(), emitDWARF().
    • Change the return type of getUsedSectionNames() to SetVector<>

[Test]

  • Use precise check tags.
  • Merge some testcases
  • Add test for 'offset' property in (g) and (f).
  • Add one test (h).
    • .debug_str is defined in the "Sections" entry, but with SHT_DYNAMIC tag.
  • Add one test (i).
    • Only the "DWARF" entry exists. yaml2obj will not emit the .debug_str section.
  • Use llvm-readelf --hex-dump to check the content of zeros.
grimar accepted this revision.EditedMay 25 2020, 2:43 AM

LGTM with 2 last comments addressed.

Please wait for @jhenderson approvement too.

llvm/lib/ObjectYAML/ELFEmitter.cpp
158

You do not need to make emitDWARF to be a member. It can be just a static helper function.

751

Just

return (!Name.empty() && Name[0] == '.' && DebugSecNames.count(Name.drop_front()));

Perhaps also worth adding a comment. I guess for people who are unfamilar with what is going on here
it might be not that easy to understand the logic.

This revision is now accepted and ready to land.May 25 2020, 2:43 AM
MaskRay added inline comments.May 25 2020, 10:55 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
749

return Name.consume_front(".") && DebugSecnames.count(Name);

782

Use "section '" + Name + "' " and omit the

ditto below

790

We usually use ("section '" + Name + "' " + ...

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
17

--section-headers is very common and people tend to recognize -S

193

FileCheck /dev/null --implicit-check-not=.debug_str

Higuoxing updated this revision to Diff 266098.May 25 2020, 7:12 PM

Address comments.

  • Make emitDWARF() be a static helper function.
  • Simplify the logic inside shouldEmitDWARF()
  • Use llvm-readelf -S rather than llvm-readelf --section-headers
  • Make error messages be consistent with others.
jhenderson added inline comments.May 26 2020, 2:17 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
788–789

Probably I'd say "can only be initialized via the 'DWARF' entry or a section's 'Content' or 'Size' fields"

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
2

g) Something show that assignSectionAddresses is called.

I'm not sure I see which test case demonstrates this?

139

We probably want a "Link:" reference too (e.g. "Link: .symtab").

149

It might make slightly more sense for f) and g) to be switched around (have the "base" case before the ones which include more data).

186–187

My instinct says that this should only be an error if we start using something like "Entries:" (for dynamic sections) etc. Otherwise, we cannot test the case where the type of a debug section is one of those kinds of sections (dynamic, relocation etc). Thus the cases below should not be errors:

- Name: .debug_str
  Type: SHT_DYNAMIC
- Name: .debug_str
  Type: SHT_DYNAMIC
  Size: 42
- Name: .debug_str
  Type: SHT_DYNAMIC
  Content: '006f00'

whereas the following would be:

- Name: .debug_str
  Type: SHT_DYNAMIC
  Entries:
   - Tag: DT_NULL
     Value: 0
Higuoxing marked an inline comment as done.May 26 2020, 2:49 AM
Higuoxing added inline comments.
llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
2

The default sh_addr is 0x0000 for .debug_str. Test (f) shows that it's overridden to be 0x2020.

Higuoxing updated this revision to Diff 266415.May 26 2020, 9:18 PM

Thanks for reviewing!


Addressed comments:

  • sh_link can be overridden as well.
  • Support generating debug sections from RawContentSection if their sh_type are the preserved one, e.g., SHT_DYNAMIC.
  • Reject the input at the lexical level, if the debug section is initialized from non-RawContentSection.
Higuoxing requested review of this revision.May 26 2020, 9:20 PM

I think this revision needs another round of reviewing, since I change the logic in DWARFYAML.cpp

FYI, I'm not working tomorrow, so it'll be Friday before I am able to give further comments. If @grimar approves this before Friday, I'm happy for you to commit and I'll review it post-commit after I'm back.

llvm/lib/ObjectYAML/ELFYAML.cpp
1254

I'd add the following sentence to this comment: "This means they can have other SHT_* type values, but that the special parsing of those types is not supported."

This seems like a reasonable solution to me, but I'm interested to here @grimar's thoughts.

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
145

There's a typo: OVERRIEDDEN -> OVERRIDDEN

192

I'm not sure "preserved" makes sense here. Probably you mean "is a type which usually uses a different syntax".

196

I'd avoid using -EMPTY in your suffix name, since -EMPTY is a valid FileCheck suffix. Probably it's okay just to call these cases DYNAMIC1, DYNAMIC2 etc.

198

Did you deliberately not include the DYNAMIC-EMPTY tag here? If so, you need '##'. Same applies in all the other test cases too.

grimar added inline comments.May 27 2020, 3:03 AM
llvm/lib/ObjectYAML/ELFYAML.cpp
1254

If I understand it right, that change means:

  1. We are unable to create, for example, SHT_DYNAMIC section named .debug_ anymore.
  2. It was done to allow creating .debug_ sections that has sh_type overriden to something, like to SHT_DYNAMIC.

Currently, yaml2obj uses section type to make a specific section mapping. It looks generally more right than checking sections by name (in the ideal world). Unfortunately we have to check section names still. But it is just because they have no dedicated SHT_* types first of all I believe.

And my concern that (2) might be confusing, because the current way we have to override things is to use Sh* fields. E.g:

// This can be used to override the offset stored in the sh_name field.
// It does not affect the name stored in the string table.
Optional<llvm::yaml::Hex64> ShName;

// This can be used to override the sh_offset field. It does not place the
// section data at the offset specified.
Optional<llvm::yaml::Hex64> ShOffset;

// This can be used to override the sh_size field. It does not affect the
// content written.
Optional<llvm::yaml::Hex64> ShSize;

// This can be used to override the sh_flags field.
Optional<llvm::yaml::Hex64> ShFlags;

Also, what if we want to create a SHT_RELR section with the type SHT_DYNAMIC?

I think the better way would be to introduce ShType field that would allow to override the sh_type to anything.
(And I believe it can be done in a follow-up patch, no need to support such specific things from the begining).

How does it sound?

grimar added inline comments.May 27 2020, 3:23 AM
llvm/lib/ObjectYAML/ELFYAML.cpp
1254

To clarify.

This will create a dynamic section with the name .debug_str.

- Name: .debug_str
  Type: SHT_DYNAMIC

And this then can be used to create a debug section with the type SHT_DYNAMIC:

- Name: .debug_str
  Type: SHT_PROGBITS
  ShType: SHT_DYNAMIC
Higuoxing marked an inline comment as done.May 27 2020, 4:04 AM
Higuoxing added inline comments.
llvm/lib/ObjectYAML/ELFYAML.cpp
1254

Thank you @grimar, It sounds reasonable to me since we've already had shName/shSize... fields to override section header. Besides, it's more flexaible to create sections with uncommon properties.

Thanks for @grimar 's suggestion, I revert this revision to the previous one with some small changes.

Changes are listed below.

  • The error message is changed to "can only be initialized via the 'DWARF' entry or a section's 'Content' or 'Size' fields"
  • Support the modification of the sh_link value.

It looks good, have a last suggestion.
I've tested that all tests in debug-str.yaml pass with it, except --docnum=9, which is fixed to produce the desired (I believe) output.

llvm/lib/ObjectYAML/ELFEmitter.cpp
787

I think you can change this to be unreachable.

If you do the following:

template <class ELFT>
bool ELFState<ELFT>::initImplicitHeader(ContiguousBlobAccumulator &CBA,
                                        Elf_Shdr &Header, StringRef SecName,
                                        ELFYAML::Section *YAMLSec) {
  // Check if the header was already initialized.
  if (Header.sh_offset)
    return false;

  if (SecName.startswith(".debug_")) {
    // If we have, for example, a SHT_DYNAMIC section with the name ".debug_*",
    // we treat it as dynamic section.
    if (YAMLSec && !isa<ELFYAML::RawContentSection>(YAMLSec))
      return false;
    initDWARFSectionHeader(Header, SecName, CBA, YAMLSec);
    return true;
  }
...

(the comment could be better, I think, but shows the idea).

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
198

This should produce the SHT_DYNAMIC section with the name .debug_str, I think. See my suggestion above.

Higuoxing updated this revision to Diff 266812.May 28 2020, 4:27 AM

Thanks for reviewing!

Addressed comments

  • If we have a ".debug_*" section whose "Type" is a preserved one, e.g. SHT_DYNAMIC, we will treat it as a dynamic section.
  • Update test (i).
grimar accepted this revision.May 28 2020, 6:20 AM

LGTM with a nit. Thanks!

llvm/lib/ObjectYAML/ELFEmitter.cpp
409

I think instead of "will treat it as a dynamic section" it should say "will not treat it as a debug section",
otherwise it probably sounds like we treat all preserved sections as dynamic.

This revision is now accepted and ready to land.May 28 2020, 6:20 AM
This revision was automatically updated to reflect the committed changes.

It looks like my last batch of test comments weren't addressed. Please could you post a new patch to fix them all.

llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
145

It looks like this comment wasn't addressed?

192

This comment wasn't addressed either. "preserved" is not the right word for this sentence ("preserve" means "to keep something from going bad" or "to protect" and similar meanings).

"with a preserved "Type"" -> "with a "Type" with non-standard syntax"

198

This comment wasn't addressed either.

Higuoxing marked 3 inline comments as done.May 29 2020, 11:55 PM
Higuoxing added inline comments.
llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
145

Should be fixed in D80802. Sorry I didn't notice it here.

196

The check-prefix is changed to DYN-SHDR in my commit. Sorry, I didn't submit a updated-diff before my commit.

jhenderson added inline comments.Jun 1 2020, 2:52 AM
llvm/test/tools/yaml2obj/ELF/DWARF/debug-str.yaml
145

For posterity:
D80802 -> D80862