This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Check for format mismatch between CU and Range List Table.
AcceptedPublic

Authored by ikudrin on Sep 3 2019, 7:11 AM.

Details

Summary

As stated in section 7.4, p. 198 of DWARF Standard Version 5, "The 32-bit and 64-bit DWARF format conventions must not be intermixed within a single compilation unit."

Diff Detail

Event Timeline

Orlando added a subscriber: Orlando.Sep 3 2019, 7:30 AM
Orlando added inline comments.
lib/DebugInfo/DWARF/DWARFUnit.cpp
311 ↗(On Diff #218449)

Is there a reason for OffsetIn to exist as opposed to just using Offset here?

ikudrin marked an inline comment as done.Sep 3 2019, 7:53 AM
ikudrin added inline comments.
lib/DebugInfo/DWARF/DWARFUnit.cpp
311 ↗(On Diff #218449)

Offset is changed in Table.extractHeaderAndOffsets() (line 323), but we have to show the original value in the error message (line 326).

aprantl accepted this revision.Sep 3 2019, 9:28 AM
aprantl added inline comments.
lib/DebugInfo/DWARF/DWARFUnit.cpp
311 ↗(On Diff #218449)

Perhaps: StartOffset?

This revision is now accepted and ready to land.Sep 3 2019, 9:28 AM

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Honestly the fact that DWARFv5 specifies that the "rnglist base address" (& loclist bast address, and str_offsets base) point to the beginning of the list rather than the start of the header is super quirky and I don't think it's helpful to users to highlight that. (though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Maybe it will be better to use the adjusted offset which points to the start of the header?

ikudrin updated this revision to Diff 218630.Sep 4 2019, 3:13 AM
  • Print the adjusted offset which points to the start of the header.

@dblaikie, what about this variant?

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Honestly the fact that DWARFv5 specifies that the "rnglist base address" (& loclist bast address, and str_offsets base) point to the beginning of the list rather than the start of the header is super quirky and I don't think it's helpful to users to highlight that. (though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)

IIRC this is one of Cary's, and argued that it could be directly used to index into section, instead of you having to step past the header to use it. Anyway we let him get away with it, for better or worse. It would have to be essentially unusable or horribly inefficient, for the committee to be willing to change it now.

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Honestly the fact that DWARFv5 specifies that the "rnglist base address" (& loclist bast address, and str_offsets base) point to the beginning of the list rather than the start of the header is super quirky and I don't think it's helpful to users to highlight that. (though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)

IIRC this is one of Cary's, and argued that it could be directly used to index into section, instead of you having to step past the header to use it. Anyway we let him get away with it, for better or worse.

Meh, I can't imagine the (Start+Header)[N] would be too expensive (& Start+Header could be computed once/ahead of time easily enough).

It would have to be essentially unusable or horribly inefficient, for the committee to be willing to change it now.

Not sure what would be unusable or inefficient about it - could you explain more?

I'd figure if DWARFvNext changed the semantics of the various *_base attributes, then an implementation would look at the CU version to decide whether it should walk backwards to find the header, or forwards to find the list.

  • Print the adjusted offset which points to the start of the header.

@dblaikie, what about this variant?

This update didn't seem to touch a test case - is this untested?

This update didn't seem to touch a test case - is this untested?

The test is for a DWO case, where we do not adjust the offset. I would not expect the check to trigger for regular files because we adjust the offset depending on the assumed header size.

ikudrin updated this revision to Diff 218849.Sep 5 2019, 12:11 AM
  • Rebase on the current tip.

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Honestly the fact that DWARFv5 specifies that the "rnglist base address" (& loclist bast address, and str_offsets base) point to the beginning of the list rather than the start of the header is super quirky and I don't think it's helpful to users to highlight that. (though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)

IIRC this is one of Cary's, and argued that it could be directly used to index into section, instead of you having to step past the header to use it. Anyway we let him get away with it, for better or worse.

Meh, I can't imagine the (Start+Header)[N] would be too expensive (& Start+Header could be computed once/ahead of time easily enough).

By that argument, computing (Start-Header) is not expensive and so why is it worth changing? For aesthetics? If it bugs you that much, you can write the proposal and post it to dwarfstd.org; it doesn't have to include the edit to the wording, just "these attributes should point here instead of there."

It would have to be essentially unusable or horribly inefficient, for the committee to be willing to change it now.

Not sure what would be unusable or inefficient about it - could you explain more?

(rewrite) There isn't anything terribly unusable or inefficient about it, so unless somebody really complains, the committee is unlikely to change it now. This isn't the C++ committee, after all.

I'm not sure if using "OffsetIn" in the diagnostic is the right thing to do.

Honestly the fact that DWARFv5 specifies that the "rnglist base address" (& loclist bast address, and str_offsets base) point to the beginning of the list rather than the start of the header is super quirky and I don't think it's helpful to users to highlight that. (though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)

IIRC this is one of Cary's, and argued that it could be directly used to index into section, instead of you having to step past the header to use it. Anyway we let him get away with it, for better or worse.

Meh, I can't imagine the (Start+Header)[N] would be too expensive (& Start+Header could be computed once/ahead of time easily enough).

By that argument, computing (Start-Header) is not expensive and so why is it worth changing? For aesthetics? If it bugs you that much, you can write the proposal and post it to dwarfstd.org; it doesn't have to include the edit to the wording, just "these attributes should point here instead of there."

Yep, that's roughly what I was getting at when I said "(though I doubt DWARFvNext would bother fixing this - it'sjust quirky, not super problematic for any practical reasons I can think of)".

It would have to be essentially unusable or horribly inefficient, for the committee to be willing to change it now.

Not sure what would be unusable or inefficient about it - could you explain more?

(rewrite) There isn't anything terribly unusable or inefficient about it, so unless somebody really complains, the committee is unlikely to change it now. This isn't the C++ committee, after all.

Not sure I understand the snark at the C++ committee, but anyway.

Not sure I understand the snark at the C++ committee, but anyway.

Oh, because of their perennial willingness to change a released standard, with no way to identify revisions. The C++11 of today is not the same as the C++11 of 2011, for example. This is basic revision control, everybody (else) understands this. I mean, if we found a bug in Clang 8.0.0, we wouldn't overwrite the archived 8.0.0 to fix it. But that's effectively what the C++ committee does with its "defect repair."
The DWARF committee has the decency to provide a new version number when the interpretation of some attribute changes.

Not sure I understand the snark at the C++ committee, but anyway.

Oh, because of their perennial willingness to change a released standard, with no way to identify revisions. The C++11 of today is not the same as the C++11 of 2011, for example. This is basic revision control, everybody (else) understands this. I mean, if we found a bug in Clang 8.0.0, we wouldn't overwrite the archived 8.0.0 to fix it. But that's effectively what the C++ committee does with its "defect repair."
The DWARF committee has the decency to provide a new version number when the interpretation of some attribute changes.

Ah, you're referring to the C++ standard defect report process. Fair enough. Though it seems a bit out of place/unrelated to this discussion. I didn't intend to suggest that DWARFv4 should be retroactively changed - sorry if it came off that way.

Happy to have a separate discussion about the C++ committee process some time.

Sorry to intervene, but what about the patch itself? Is it OK to land with the latest changes or should I improve it somehow?

Ah right, the actual patch. I think it would be better to also report the base of the unit that referenced the range list, if that's not too inconvenient.
What you have here is fine with me, apart from that question.

FTR, I have also been thinking that perhaps the dumper should be more tolerant of this kind of thing, attempting to report more of what is present rather than throwing a hissy fit when things aren't perfect. But I'm just noting that for the record. What this patch does is consistent with how the rest of lib/DebugInfo/DWARF behaves, and revisiting that behavior is well beyond the scope of this patch.

As I said above, the patch so far is fine with me, with the one open question about reporting the referencing unit.

ikudrin updated this revision to Diff 234874.Dec 20 2019, 5:56 AM
  • Rebase to the tip.
  • Add reporting the offset of a compilation unit which format is taken as a reference.
dblaikie accepted this revision.Dec 26 2019, 8:37 AM

yeah, as @probinson said, at some point someone could add a callback error filter here to decide what's a fatal error and what isn't - but this seems reasonable for now

Actually, I've discovered that we parse range list table headers for every unit, including type units, even if they do not reference them. Thus, this patch creates more problems than solves; it should be postponed until we make extracting range list tables lazy and do it only for compilation units which actually reference the tables.