This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwp] Add support for rnglists and loclists
ClosedPublic

Authored by kimanh on May 5 2021, 4:39 AM.

Details

Summary

This patch updates llvm-dwp to include rnglists and loclists
when parsing debug sections.

Diff Detail

Event Timeline

kimanh requested review of this revision.May 5 2021, 4:39 AM
kimanh created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2021, 4:39 AM
kimanh updated this revision to Diff 343000.May 5 2021, 4:42 AM

Minor change: edit comment in test

kimanh updated this revision to Diff 343002.May 5 2021, 4:52 AM

Minor: use wildcard in test.

dblaikie accepted this revision.May 5 2021, 1:14 PM

Sounds OK - test might be able to be simplified a bit.

llvm/test/tools/llvm-dwp/X86/loclists.s
26–31

No need for actual .text in these files - these files are only dwo files, so they can have arbitrary hardcoded constants for the starting/ending offsets.

llvm/test/tools/llvm-dwp/X86/rnglists.s
63–77

Strange to have a complicated abbreviation that's not actually used - maybe if this abbrev was simplified to compile_unit, CHILDREN_no, with no attributes - then the TU/CU could be a bit more valid by only including that one DIE without any attributes/children - rather than missing their unit DIE entirely.

(I guess this was tested this way because something in llvm-dwp ended up requiring that the CU had a valid DIE? Maybe we could fix that so there's no need for there to be that empty/degenerate DIE in the CU? (so it can 'work' the same as the TU - without any DIEs at all))

This revision is now accepted and ready to land.May 5 2021, 1:14 PM
kimanh updated this revision to Diff 343328.May 6 2021, 2:02 AM
kimanh marked an inline comment as done.

Simplify tests:

  • use hardocoded constants as offsets
  • simplify abbrev section and remove str section
kimanh added inline comments.May 6 2021, 2:14 AM
llvm/test/tools/llvm-dwp/X86/rnglists.s
63–77

I have simplified the abbrev section now, as you suggested: With CHILDREN_no, and no attributes, and removed the debug_str section that it was referring to too.

Yes exactly, for CU units there are "CUIdentifiers" (Signature, name, dwo id) that are recorded for error purposes, if a duplicate CU is detected. Before extracting these attributes there is a check whether this unit is actually is a CU unit or not, so we need the abbrev section to at least have info on the unit type. This check does currently not exist for TU units (for DWARF5 this will be checked using the unit type field, for DWARF4 type units this is however not checked at the moment). I think the check if it's a compile unit should stay where it is (given that we need to access the attributes for error purposes afterwards), so I'm not sure how we should change the code to allow for no DIEs at all.

Yes exactly, for CU units there are "CUIdentifiers" (Signature, name, dwo id) that are recorded for error purposes, if a duplicate CU is detected. Before extracting these attributes there is a check whether this unit is actually is a CU unit or not, so we need the abbrev section to at least have info on the unit type. This check does currently not exist for TU units (for DWARF5 this will be checked using the unit type field, for DWARF4 type units this is however not checked at the moment). I think the check if it's a compile unit should stay where it is (given that we need to access the attributes for error purposes afterwards), so I'm not sure how we should change the code to allow for no DIEs at all.

Ah, right right, I remember adding that - potentially that code could be refactored to not try to parse the name until the dwo ID collision occurs - then these tests wouldn't need any DIEs, for instance (which isn't, in and of itself, a valuable thing - but does show that usually the content of the DIE tree is not examined/used/relevant to the DWP tool, only a "nice to have" for error messages).

Yes exactly, for CU units there are "CUIdentifiers" (Signature, name, dwo id) that are recorded for error purposes, if a duplicate CU is detected. Before extracting these attributes there is a check whether this unit is actually is a CU unit or not, so we need the abbrev section to at least have info on the unit type. This check does currently not exist for TU units (for DWARF5 this will be checked using the unit type field, for DWARF4 type units this is however not checked at the moment). I think the check if it's a compile unit should stay where it is (given that we need to access the attributes for error purposes afterwards), so I'm not sure how we should change the code to allow for no DIEs at all.

Ah, right right, I remember adding that - potentially that code could be refactored to not try to parse the name until the dwo ID collision occurs - then these tests wouldn't need any DIEs, for instance (which isn't, in and of itself, a valuable thing - but does show that usually the content of the DIE tree is not examined/used/relevant to the DWP tool, only a "nice to have" for error messages).

I had another look at the code: not all of the information that is used in the CUIdentifiers is actually needed at that point in time (although it is set, but it is set to null). The only piece that definitely is required is the signature of compile units if parsing DWARFv4, since the signature is used to build the index. The signature is taken from the .debug_abbrev section, so the check if it's actually a compile unit or not should stay (at least for v4), so I tend towards keeping it this way. WDYT?

Yes exactly, for CU units there are "CUIdentifiers" (Signature, name, dwo id) that are recorded for error purposes, if a duplicate CU is detected. Before extracting these attributes there is a check whether this unit is actually is a CU unit or not, so we need the abbrev section to at least have info on the unit type. This check does currently not exist for TU units (for DWARF5 this will be checked using the unit type field, for DWARF4 type units this is however not checked at the moment). I think the check if it's a compile unit should stay where it is (given that we need to access the attributes for error purposes afterwards), so I'm not sure how we should change the code to allow for no DIEs at all.

Ah, right right, I remember adding that - potentially that code could be refactored to not try to parse the name until the dwo ID collision occurs - then these tests wouldn't need any DIEs, for instance (which isn't, in and of itself, a valuable thing - but does show that usually the content of the DIE tree is not examined/used/relevant to the DWP tool, only a "nice to have" for error messages).

I had another look at the code: not all of the information that is used in the CUIdentifiers is actually needed at that point in time (although it is set, but it is set to null). The only piece that definitely is required is the signature of compile units if parsing DWARFv4, since the signature is used to build the index. The signature is taken from the .debug_abbrev section, so the check if it's actually a compile unit or not should stay (at least for v4), so I tend towards keeping it this way. WDYT?

I don't think there's anything you /have/ to do here - my suggestion was more a "could be nice", sorry for the tone/confusion.

I think in principle it would be nice if we didn't parse stuff we didn't need to - yeah, that means in DWARFv4 we still need to parse the CU's main DIE to get the dwo_id. But in DWARFv5 we can get away with only reading the header - then if there's a duplicate dwo_id /then/ we could try to parse the DIE to get the dwo name, etc.

Just a nice to have, maybe, at some point. Though there are far bigger performance problems to deal with with llvm-dwp anyway, so not to worry. Sorry for the diversion.

Yes exactly, for CU units there are "CUIdentifiers" (Signature, name, dwo id) that are recorded for error purposes, if a duplicate CU is detected. Before extracting these attributes there is a check whether this unit is actually is a CU unit or not, so we need the abbrev section to at least have info on the unit type. This check does currently not exist for TU units (for DWARF5 this will be checked using the unit type field, for DWARF4 type units this is however not checked at the moment). I think the check if it's a compile unit should stay where it is (given that we need to access the attributes for error purposes afterwards), so I'm not sure how we should change the code to allow for no DIEs at all.

Ah, right right, I remember adding that - potentially that code could be refactored to not try to parse the name until the dwo ID collision occurs - then these tests wouldn't need any DIEs, for instance (which isn't, in and of itself, a valuable thing - but does show that usually the content of the DIE tree is not examined/used/relevant to the DWP tool, only a "nice to have" for error messages).

I had another look at the code: not all of the information that is used in the CUIdentifiers is actually needed at that point in time (although it is set, but it is set to null). The only piece that definitely is required is the signature of compile units if parsing DWARFv4, since the signature is used to build the index. The signature is taken from the .debug_abbrev section, so the check if it's actually a compile unit or not should stay (at least for v4), so I tend towards keeping it this way. WDYT?

I don't think there's anything you /have/ to do here - my suggestion was more a "could be nice", sorry for the tone/confusion.

I think in principle it would be nice if we didn't parse stuff we didn't need to - yeah, that means in DWARFv4 we still need to parse the CU's main DIE to get the dwo_id. But in DWARFv5 we can get away with only reading the header - then if there's a duplicate dwo_id /then/ we could try to parse the DIE to get the dwo name, etc.

Just a nice to have, maybe, at some point. Though there are far bigger performance problems to deal with with llvm-dwp anyway, so not to worry. Sorry for the diversion.

Oh, no worries at all, and thanks for the clarification! :) Ok, I'll keep it this way then. Since this patch is dependent on the others I'll ping this thread again as soon as the parent patches are ready.

kimanh updated this revision to Diff 349249.Jun 2 2021, 6:05 AM

Rebase and reupload to trigger premerge checks.

This revision was automatically updated to reflect the committed changes.