Page MenuHomePhabricator

[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries
ClosedPublic

Authored by MaskRay on Feb 8 2019, 12:02 AM.

Details

Summary

getRelocatedValue may compute incorrect value for SHT_RELA-typed relocation entries.

// DWARFDataExtractor.cpp
uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t *Off,
...

// This formula is correct for REL, but may be incorrect for RELA if the value
// stored in the location (getUnsigned(Off, Size)) is not zero.
return getUnsigned(Off, Size) + Rel->Value;

In this patch, we

  • refactor these visit* functions to include a new parameter uint64_t A. Since these visit* functions are no longer used as visitors, rename them to resolve*. + REL: A is used as the addend. A is the value stored in the location where the relocation applies: getUnsigned(Off, Size) + RELA: The addend encoded in RelocationRef is used, e.g. getELFAddend(R)
  • and add another set of supports* functions to check if a given relocation type is handled. DWARFObjInMemory uses them to fail early.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Feb 8 2019, 12:02 AM

This is a bit hard to follow in Phab due to the file rename (it looks like all the code is "new" when it isn't) - would it be reasonable to do this without the file rename first, then rename the file in a follow-up commit?

RelocVisitor.h -> RelocationResolver.cpp there are so many differences that git doesn't consider it a rename.

git diff 'HEAD^':include/llvm/Object/RelocVisitor.h lib/Object/RelocationResolver.cpp

The member functions visit* have been made static functions. The changed indentation, changed signature probably make git think they are different chunks.

Anyway, we can see from below the structure of the architecture dispatch is kept.

patch
+      switch (Obj.getArch()) {
       case Triple::x86_64:
-        return visitX86_64(Rel, R, Value);
+        return resolveX86_64;
       case Triple::aarch64:
       case Triple::aarch64_be:
-        return visitAarch64(Rel, R, Value);
+        return resolveAarch64;
       case Triple::bpfel:
       case Triple::bpfeb:
-        return visitBpf(Rel, R, Value);
+        return resolveBpf;
       case Triple::mips64el:
       case Triple::mips64:
-        return visitMips64(Rel, R, Value);
+        return resolveMips64;
       case Triple::ppc64le:
       case Triple::ppc64:
-        return visitPPC64(Rel, R, Value);
+        return resolvePPC64;
       case Triple::systemz:
-        return visitSystemz(Rel, R, Value);
+        return resolveSystemz;
       case Triple::sparcv9:
-        return visitSparc64(Rel, R, Value);
+        return resolveSparc64;
       case Triple::amdgcn:
-        return visitAmdgpu(Rel, R, Value);
+        return resolveAmdgpu;
       default:
-        HasError = true;
-        return 0;
+        return nullptr;
       }
     }

RelocVisitor.h -> RelocationResolver.cpp there are so many differences that git doesn't consider it a rename.

Right - what I'm asking is if you could undo/not apply the file name change in this revision, so the patch/review is easier to read, possibly? (you could undo the file name change & upload a new revision for this review - then the delta might be smaller/easier to read)

git diff 'HEAD^':include/llvm/Object/RelocVisitor.h lib/Object/RelocationResolver.cpp

The member functions visit* have been made static functions. The changed indentation, changed signature probably make git think they are different chunks.

Anyway, we can see from below the structure of the architecture dispatch is kept.

patch
 +      switch (Obj.getArch()) {
        case Triple::x86_64:
 -        return visitX86_64(Rel, R, Value);
 +        return resolveX86_64;
        case Triple::aarch64:
        case Triple::aarch64_be:
 -        return visitAarch64(Rel, R, Value);
 +        return resolveAarch64;
        case Triple::bpfel:
        case Triple::bpfeb:
 -        return visitBpf(Rel, R, Value);
 +        return resolveBpf;
        case Triple::mips64el:
        case Triple::mips64:
 -        return visitMips64(Rel, R, Value);
 +        return resolveMips64;
        case Triple::ppc64le:
        case Triple::ppc64:
 -        return visitPPC64(Rel, R, Value);
 +        return resolvePPC64;
        case Triple::systemz:
 -        return visitSystemz(Rel, R, Value);
 +        return resolveSystemz;
        case Triple::sparcv9:
 -        return visitSparc64(Rel, R, Value);
 +        return resolveSparc64;
        case Triple::amdgcn:
 -        return visitAmdgpu(Rel, R, Value);
 +        return resolveAmdgpu;
        default:
 -        HasError = true;
 -        return 0;
 +        return nullptr;
        }
      }

RelocVisitor.h -> RelocationResolver.cpp there are so many differences that git doesn't consider it a rename.

Right - what I'm asking is if you could undo/not apply the file name change in this revision, so the patch/review is easier to read, possibly? (you could undo the file name change & upload a new revision for this review - then the delta might be smaller/easier to read)

I just created https://reviews.llvm.org/D58020 for exposition purposes. This patch and D58020 have no functional difference but D58020 puts all changes together in RelocVisitor.h

dblaikie added inline comments.Feb 14 2019, 11:56 AM
include/llvm/DebugInfo/DWARF/DWARFRelocMap.h
24 ↗(On Diff #185915)

Could this have a more verbose/self-descriptive name?

lib/DebugInfo/DWARF/DWARFDataExtractor.cpp
27 ↗(On Diff #185915)

Could RelocAddrEntry maybe have a "resolve" function that handles most of this? (eg: passes in E->Reloc and E->S, handles the "getValueOr" part too, probably - and/or maybe the "getValueOr" part could be handled by having a default/trivial Resolver that returns A? Less branching in the code required that way)

MaskRay updated this revision to Diff 186965.Feb 14 2019, 11:17 PM
MaskRay marked an inline comment as done.

make Resolver* more efficient

MaskRay marked 2 inline comments as done.Feb 14 2019, 11:18 PM
MaskRay added inline comments.
include/llvm/DebugInfo/DWARF/DWARFRelocMap.h
24 ↗(On Diff #185915)

Changed it to SymbolValue.

In ELF ABI, this is typically: "S Represents the value of the symbol whose index resides in the relocation entry"

lib/DebugInfo/DWARF/DWARFDataExtractor.cpp
27 ↗(On Diff #185915)

Changed it to return E->Resolver(E->Reloc, E->SymbolValue, A);

The check (whether Resolver can handle the relocation type) in DWARFContext.cpp is gross but efficient..

dblaikie added inline comments.Feb 18 2019, 4:22 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

What's the failure mode if this is not diagnosed here?

If there's a valid fallback/reliable failure later - could we remove this code entirely? (rather than having a best effort then fallback - seems like having two partial solutions in different places is liable to have things slip through)

MaskRay marked 3 inline comments as done.Feb 20 2019, 7:05 AM
MaskRay added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

What's the failure mode if this is not diagnosed here?

The unresolved relocation type will slip through and there will be no diagnostics later when the relocations are actually resolved. (Note that the return types of visit* (or resolve* after the patch) are uint64_t, instead of Error Optional or others).

And the usage (call sites of getRelocatedValue) does not allow it to do fancy error handling.

uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t *Off,
                                               uint64_t *SecNdx) const {
  if (SecNdx)
    *SecNdx = -1ULL;
  if (!Section)
    return getUnsigned(Off, Size);
  Optional<RelocAddrEntry> Rel = Obj->find(*Section, *Off);
  if (!Rel)
    return getUnsigned(Off, Size);
  if (SecNdx)
    *SecNdx = Rel->SectionIndex;
  return getUnsigned(Off, Size) + Rel->Value;
}

So I know that this is ugly and can be improved.. But to fix the bug I cannot think of a better way without being more intrusive (I do not want to touch many other places for this patch).

dblaikie added inline comments.Feb 21 2019, 9:52 AM
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

OK, it sounds like the comment might be a bit off, then:

" It isn't a big deal if this check fails. We just fail to catch this error early."

That last bit sounds like "well, we don't catch it /early/, I guess we catch it later" - but it sounds like we don't catch it later? So I'm not sure if it's not a big deal.

I'm not sure the partial solution is ideal either, as it seems to make it harder to see that it's incomplete/still problematic. Perhaps it'd be better to remove those Resolver -1ULL calls to demonstrate the general failure & implement a general fix - maybe not in this patch, maybe in a precursor or follow-up. Not sure about breaking this temporarily rather than laying out the patches to fix this without a period of regression.

@echristo - got any thoughts on this?

MaskRay updated this revision to Diff 187896.Feb 21 2019, 6:35 PM

Update the comments as the old comments might be a bit off.

echristo added inline comments.Mar 14 2019, 1:08 AM
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

Taking a look through this ... we've got terrible error handling here. I'm also not a huge fan of storing the resolver inside every part of the map. That said as an incremental step it's not too bad as long as we document it a bit better explaining that the -{1,2,3}ULL mean in the if statement at what the purpose is.

Then we can probably refactor some of our relocation handling to be able to better check valid relocation first and then go from there.

WDYT?

MaskRay updated this revision to Diff 190637.Mar 14 2019, 8:25 AM

Better comment

dblaikie added inline comments.Mar 14 2019, 12:00 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

Taking a look through this ... we've got terrible error handling here. I'm also not a huge fan of storing the resolver inside every part of the map. That said as an incremental step it's not too bad as long as we document it a bit better explaining that the -{1,2,3}ULL mean in the if statement at what the purpose is.

I think the bit I'm uncertain about is that this seems like a step backwards in terms of error coverage - not quite incremental progress. But if the existing error coverage isn't reliable in some way (it's not clear to me that it is) - yeah, I don't have strong feelings about preserving it.

Then we can probably refactor some of our relocation handling to be able to better check valid relocation first and then go from there.

If we can refactor the error checking first so it's preserved through this change, that sounds fine.

Validating relocations up-front might be problematic from a performance perspective ( comes back to some of the design discussions around error handling in lld, etc - up-front error checking that might check too many things (when some sections are discarded, etc)).

MaskRay updated this revision to Diff 191082.Mar 18 2019, 7:16 AM

Add supports*

MaskRay marked 5 inline comments as done.Mar 18 2019, 7:17 AM
MaskRay added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
1512–1515 ↗(On Diff #186965)

Deleted -2/-3 hack and added supports* functions.

MaskRay marked an inline comment as done.Mar 18 2019, 7:36 AM
echristo accepted this revision.Mar 18 2019, 2:50 PM

The duplication is a little sad, but I don't have a lot of ideas on how to remove that in the near term so LGTM. Dave, any other concerns on your end?

This revision is now accepted and ready to land.Mar 18 2019, 2:50 PM
MaskRay updated this revision to Diff 191816.Thu, Mar 21, 7:30 PM
MaskRay edited the summary of this revision. (Show Details)
MaskRay removed a subscriber: jdoerfert.

Update description

MaskRay updated this revision to Diff 191818.Thu, Mar 21, 7:34 PM
MaskRay retitled this revision from Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries to [DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries.
MaskRay edited the summary of this revision. (Show Details)

Update title

Harbormaster completed remote builds in B29472: Diff 191818.