This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Fix DIE value emitters to be compatible with DWARF64 (2/19).
ClosedPublic

Authored by ikudrin on Sep 2 2020, 6:22 AM.

Details

Summary

DW_FORM_sec_offset and DW_FORM_strp imply values of different sizes with DWARF32 and DWARF64. The patch fixes DIE value classes to use correct sizes when emitting their values. For DIELocList it ensures that the requested DWARF form matches the current DWARF format because that class uses a method that selects the size automatically.

Diff Detail

Event Timeline

ikudrin created this revision.Sep 2 2020, 6:22 AM
ikudrin requested review of this revision.Sep 2 2020, 6:22 AM

Might be too fine-grained and better off rolled into whatever patch exercises this functionality when emitting assembly.

llvm/lib/CodeGen/AsmPrinter/DIE.cpp
479

did one of your other patches added a function to AsmPrinter to get the size directly, rather than having to query the version then compute the size from that? (or maybe you added that to MCStreamer) Would be nice to avoid rewriting this code to compute size from DWARF64-ness.

ikudrin updated this revision to Diff 289639.Sep 2 2020, 10:44 PM
ikudrin marked an inline comment as done.
  • Changed AP->isDwarf64() ? 8 : 4 to AP->getDwarfOffsetByteSize().

Might be too fine-grained and better off rolled into whatever patch exercises this functionality when emitting assembly.

That would require to update a particular method several times and potentially would leave it in a partially updated state because some code paths are hard to reach. I believe that the way of this patch is simple and straightforward, as it changes easily observable code places and ensures that they are now accurate and can be reliably used from the other code parts.

Might be too fine-grained and better off rolled into whatever patch exercises this functionality when emitting assembly.

That would require to update a particular method several times and potentially would leave it in a partially updated state because some code paths are hard to reach.

Could you explain this in more detail?

I'm not really comfortable with updating the size calculation while not updating the emission calculation - that seems like this code is then currently unusable as-committed, so not a great granularity to commit at. I really appreciate all the work you've done in splitting this patch series up, but it might be a bit too fine-grained overall.

That would require to update a particular method several times and potentially would leave it in a partially updated state because some code paths are hard to reach.

Could you explain this in more detail?

Let's take, for example, DIEExpr. As far as I can find, with DW_FORM_sec_offset it is used only in the DebugInfo/DWARF unit tests, in dwarfgen::DIE::addStrOffsetsBaseAttribute(). If we took the approach with end-to-end functional testing, this class would not be updated. And the path is not simple to track, by the way.

Other classes in this patch are used to emit compilation units, so moving the changes there would bloat that patch up, making it less focused and harder to track how each individual change is tested because, when looking to the test source, it is not evident which attributes use a particular class to emit their values.

dblaikie accepted this revision.Sep 10 2020, 1:33 PM

That would require to update a particular method several times and potentially would leave it in a partially updated state because some code paths are hard to reach.

Could you explain this in more detail?

Let's take, for example, DIEExpr. As far as I can find, with DW_FORM_sec_offset it is used only in the DebugInfo/DWARF unit tests, in dwarfgen::DIE::addStrOffsetsBaseAttribute(). If we took the approach with end-to-end functional testing, this class would not be updated. And the path is not simple to track, by the way.

API only intended for unit tests would be tested/updated with unit tests.

Other classes in this patch are used to emit compilation units, so moving the changes there would bloat that patch up, making it less focused and harder to track how each individual change is tested because, when looking to the test source, it is not evident which attributes use a particular class to emit their values.

I think what might've been tidier could've been dumping sections alone - guess it's always going to be broken in some way in this intermediate state of some sections with DWARF64 and some without, until the whole patch series is in. But perhaps adding support for one section (while potentially mangling another section/causing it to be partly DWARF32/DWARF64, if it used some common API but had bits of its own implementation - like the loclist/rnglist situation) - adding tests for that section (llvm-dwarfdumping only that section - perhaps that deosn't work for some DWARFv4 sections where it's not self-describing), then adding another, etc. (so, for instance, doing str_offsets first, then doing debug_info that depends on str_offsets, etc)

Also would've probably be good to do more of this incrementally as you went - as much as I appreciate figuring out the end picture and then trying to piece together an ideal patch series, it makes it harder to change things (having to adjust all the subsequent patches) than if we could've discussed them earlier, etc. (& just a lot to review at one time, compared to a small patch here and there)

Anyway - in the interests of practicality, I'll try to get through these mostly on the "what does it look like overall" rather than the nitpicking nuance of what the perfect patch series might look like from my perspective.

This revision is now accepted and ready to land.Sep 10 2020, 1:33 PM

Let's take, for example, DIEExpr. As far as I can find, with DW_FORM_sec_offset it is used only in the DebugInfo/DWARF unit tests, in dwarfgen::DIE::addStrOffsetsBaseAttribute(). If we took the approach with end-to-end functional testing, this class would not be updated. And the path is not simple to track, by the way.

API only intended for unit tests would be tested/updated with unit tests.

The DIEExpr class is not intended to be used only in unit tests; the CodeGen library uses it, but with different DWARF forms. The generator in DebugInfo/DWARF unit tests does not test the class, it just uses it to create some bits of DWARF data. Moreover, these unit tests are for a different library and as such are not the best place to be updated if we want to fix the issue in a class of the CodeGen library. It is much better to test the implementation in the library's own unit tests, so this patch does exactly that. I do not want to leave DIEExpr in the not updated state, because the new code may use it without noticing that the implementation is broken for DWARF64.

Other classes in this patch are used to emit compilation units, so moving the changes there would bloat that patch up, making it less focused and harder to track how each individual change is tested because, when looking to the test source, it is not evident which attributes use a particular class to emit their values.

I think what might've been tidier could've been dumping sections alone - guess it's always going to be broken in some way in this intermediate state of some sections with DWARF64 and some without, until the whole patch series is in. But perhaps adding support for one section (while potentially mangling another section/causing it to be partly DWARF32/DWARF64, if it used some common API but had bits of its own implementation - like the loclist/rnglist situation) - adding tests for that section (llvm-dwarfdumping only that section - perhaps that deosn't work for some DWARFv4 sections where it's not self-describing), then adding another, etc. (so, for instance, doing str_offsets first, then doing debug_info that depends on str_offsets, etc)

I believe that my set of patches does exactly that. After a few preparation steps, which are tested separately with unit tests, there is a series of patches for individual sections, where interferences between patches are minimal. And a patch for each section is tested with llvm-dwarfdump which extracts only the required section(s), ignoring others, which might still be broken.

Also would've probably be good to do more of this incrementally as you went - as much as I appreciate figuring out the end picture and then trying to piece together an ideal patch series, it makes it harder to change things (having to adjust all the subsequent patches) than if we could've discussed them earlier, etc. (& just a lot to review at one time, compared to a small patch here and there)

Well, the current state of the series is definitely not the way I worked on the patches. It is maybe the third iteration, cleaned, polished, and rearranged so that each step is easily observable and thus more convenient for review. I guess if I had been posting the patches in progress, that would be a real mess.

Anyway - in the interests of practicality, I'll try to get through these mostly on the "what does it look like overall" rather than the nitpicking nuance of what the perfect patch series might look like from my perspective.

Thank you for reviewing the patches!

Let's take, for example, DIEExpr. As far as I can find, with DW_FORM_sec_offset it is used only in the DebugInfo/DWARF unit tests, in dwarfgen::DIE::addStrOffsetsBaseAttribute(). If we took the approach with end-to-end functional testing, this class would not be updated. And the path is not simple to track, by the way.

API only intended for unit tests would be tested/updated with unit tests.

The DIEExpr class is not intended to be used only in unit tests; the CodeGen library uses it, but with different DWARF forms. The generator in DebugInfo/DWARF unit tests does not test the class, it just uses it to create some bits of DWARF data. Moreover, these unit tests are for a different library and as such are not the best place to be updated if we want to fix the issue in a class of the CodeGen library. It is much better to test the implementation in the library's own unit tests, so this patch does exactly that. I do not want to leave DIEExpr in the not updated state, because the new code may use it without noticing that the implementation is broken for DWARF64.

Agreed - I don't mean to suggest leaving pieces un-updated (unless they're like a whole distinct thing, like the Apple names accelerator tables - and in that case including assertions to ensure anyone attempting to use DWARF64 there would find out these bits aren't implemented). Just haggling over the particular implementation/testing strategy. But I know it's a complicated set of tradeoffs.

Other classes in this patch are used to emit compilation units, so moving the changes there would bloat that patch up, making it less focused and harder to track how each individual change is tested because, when looking to the test source, it is not evident which attributes use a particular class to emit their values.

I think what might've been tidier could've been dumping sections alone - guess it's always going to be broken in some way in this intermediate state of some sections with DWARF64 and some without, until the whole patch series is in. But perhaps adding support for one section (while potentially mangling another section/causing it to be partly DWARF32/DWARF64, if it used some common API but had bits of its own implementation - like the loclist/rnglist situation) - adding tests for that section (llvm-dwarfdumping only that section - perhaps that deosn't work for some DWARFv4 sections where it's not self-describing), then adding another, etc. (so, for instance, doing str_offsets first, then doing debug_info that depends on str_offsets, etc)

I believe that my set of patches does exactly that. After a few preparation steps, which are tested separately with unit tests, there is a series of patches for individual sections, where interferences between patches are minimal. And a patch for each section is tested with llvm-dwarfdump which extracts only the required section(s), ignoring others, which might still be broken.

*nod* Certainly not far off between your approach and what I'm suggesting. The differences were that I think str_offsets and debug_info ended up being bundled together somewhat? and the preparatory patches I probably would've rolled into some functionality-changing patches instead. No worries.

Also would've probably be good to do more of this incrementally as you went - as much as I appreciate figuring out the end picture and then trying to piece together an ideal patch series, it makes it harder to change things (having to adjust all the subsequent patches) than if we could've discussed them earlier, etc. (& just a lot to review at one time, compared to a small patch here and there)

Well, the current state of the series is definitely not the way I worked on the patches. It is maybe the third iteration, cleaned, polished, and rearranged so that each step is easily observable and thus more convenient for review. I guess if I had been posting the patches in progress, that would be a real mess.

Yeah, I really appreciate the work that went into it, for sure - I know it's not easy to wrangle and rearrange all that code, and getting patches as they go means accepting less polish/more redundancy/changes in direction, etc. No super clear right answer/best direction by any means.

Anyway - in the interests of practicality, I'll try to get through these mostly on the "what does it look like overall" rather than the nitpicking nuance of what the perfect patch series might look like from my perspective.

Thank you for reviewing the patches!

Sure thing - thanks for all the work on them, and in presenting them in this way!