This is an archive of the discontinued LLVM Phabricator instance.

Preliminary support for dynamically loadable coff objects
ClosedPublic

Authored by AndyAyers on Feb 20 2015, 9:36 AM.

Details

Summary

Provide basic support for dynamically loadable coff objects. Only handles a subset of x64 currently.

Diff Detail

Repository
rL LLVM

Event Timeline

AndyAyers updated this revision to Diff 20416.Feb 20 2015, 9:36 AM
AndyAyers retitled this revision from to Preliminary support for dynamically loadable coff objects.
AndyAyers updated this object.
AndyAyers edited the test plan for this revision. (Show Details)
AndyAyers added a reviewer: lhames.
AndyAyers set the repository for this revision to rL LLVM.
AndyAyers added a subscriber: Unknown Object (MLST).
chandlerc added inline comments.
include/llvm/Support/COFF.h
5 ↗(On Diff #20416)

Please don't modify the banner at the top.

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
5 ↗(On Diff #20416)

Same comment.

chandlerc added inline comments.Feb 20 2015, 5:42 PM
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
5 ↗(On Diff #20416)

Doh, and this one. Sorry. Meant to flag them all for you so they were easy to find.

Should be just those 3 places.

Is there language anywhere under llvm.org that I can point to as a reference about this policy? I and some others looked and didn't see any.

majnemer added inline comments.
lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
296–304 ↗(On Diff #20416)

Your change is correct, it's a shame we can't just ask SectionRef::isBSS...

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
187 ↗(On Diff #20416)

Any reason you chose to include the RelocationTypeAMD64 scope qualifier?

231–233 ↗(On Diff #20416)

Curly braces here aren't needed.

236–237 ↗(On Diff #20416)

Could you remove the extra scope qualifiers? I think the enumeration value is pretty specific on its own.

261–262 ↗(On Diff #20416)

This might be a little simpler as a range-based for loop.

265 ↗(On Diff #20416)

Is it not possible for this to fail in practice?

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
29 ↗(On Diff #20416)

Any reason why you chose to make this different from ELF? They use Check.

46–61 ↗(On Diff #20416)

Something tells me you aren't trying to support SystemZ ;)

I'd remove support for everything but the ARM and x86 variants.

lib/Object/COFFObjectFile.cpp
193–195 ↗(On Diff #20416)

Good catch :)

376 ↗(On Diff #20416)

What's the rationale for this change?

Should be just those 3 places.

Is there language anywhere under llvm.org that I can point to as a reference about this policy? I and some others looked and didn't see any.

http://llvm.org/docs/CodingStandards.html#file-headers

The more general policy for all contributions of course lives in http://llvm.org/docs/DeveloperPolicy.html

Yeah, we'd seen those bits.

I'll relay that not marking up headers is the custom and get an update out soon.

Thanks.

lhames edited edge metadata.Feb 21 2015, 10:45 AM

Hi Andy,

Thanks very much for working on this. It looks great.

If you plan to support more architectures in the future you should consider moving the target specific code into a subclass (along the same lines as RuntimeDyldMachO). In my experience mixing the relocation handling code for multiple architectures was a debugging and maintenance headache - I ended up expending a lot of effort to untangle that code last year.

Could you also add some tests under llvm/test/ExecutionEngine/RuntimeDyld/ ?

Cheers,
Lang.

Lang --

I'd be happy to add a test -- would checking in an object file be ok, or would you rather have the test input be an assembly file?

If the latter, I'll have to figure out how to express what I want in your assembly syntax.

I'll probably hold off refactoring for architecture unless you feel strongly we should do it now.

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
187 ↗(On Diff #20416)

Just force of habit... have trimmed this back.

236–237 ↗(On Diff #20416)

Yep.

261–262 ↗(On Diff #20416)

Made it so.

265 ↗(On Diff #20416)

I don't think coff allows nameless sections, but added a Check here anyways.

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
29 ↗(On Diff #20416)

Thanks for reminding me about this. I had a name collision and just renamed it as a workaround.

I'll moved Check into RuntimeDyldImpl.h so that both ELF and COFF can share.

46–61 ↗(On Diff #20416)

Yep. Thanks.

lib/Object/COFFObjectFile.cpp
376 ↗(On Diff #20416)

I thought it better matched the intent of the comment in ObjectFile.h:

// A section is 'virtual' if its contents aren't present in the object image.

For coff this means the section has no raw data, regardless of section characteristics. In practice it probably doesn't matter so I'd be happy to revert this if you prefer.

majnemer added inline comments.Feb 23 2015, 6:48 PM
lib/Object/COFFObjectFile.cpp
376 ↗(On Diff #20416)

Compiling char x[300]; as C++ with MSVC 2015 results in a .bss section with SizeOfRawData set to 0x12c.

The relevant text from the COFF spec is:

The size of the section (for object files) or the size of the initialized data on disk (for image files).

Heh, I guess my formulation of isSectionVirtual was more appropriate for object files while yours is better for executables.

AndyAyers added inline comments.Feb 24 2015, 10:42 AM
lib/Object/COFFObjectFile.cpp
376 ↗(On Diff #20416)

Perhaps a better check is if PointerToRawData is zero, this should work in both obj and pe cases?

majnemer added inline comments.Feb 24 2015, 11:05 AM
lib/Object/COFFObjectFile.cpp
376 ↗(On Diff #20416)

Yes, I think that would work perfectly.

zturner added inline comments.Feb 24 2015, 11:08 AM
lib/Object/COFFObjectFile.cpp
376 ↗(On Diff #20416)

I'm following this from the sidelines as I don't know much in this area, but I think it would be good to have a comment summarizing the takeaway from this discussion at the point you add the check for PointerToRawData == 0

Hi Andy,

I'd be happy to add a test -- would checking in an object file be ok, or would you rather have the test input be an assembly file?

If the latter, I'll have to figure out how to express what I want in your assembly syntax.

I usually just take some IR that requires the relocations that I want to test, then run it through llc to get the assembly. Then you can add a few rtdyld-check lines to verify that the relocations are correctly applied.

The various *.s files under llvm/test/ExecutionEngine/RuntimeDyld/ should give you an idea of the rtdyld-check syntax, but if anything's unclear I'd be happy to help out.

I'll probably hold off refactoring for architecture unless you feel strongly we should do it now.

I have a strong preference for the refactor being done before this is committed. You don't have to go as far with the refactor as RuntimeDyldMachO does (i.e using the curiously recursive template idiom). You can just subclass RuntimeDyldCOFF and implement processRelocationRef, resolveRelocation and finalizeObject in the the RuntimeDyldCOFF_X86_64 subclass if you like.

Ok, thanks. Will refactor to split out the architecture-specific bits.

On the testing front -- I started trying to load up one of our generated .objs via rtdyld and ran into some assumptions in the base loader code. In particular the object we generate has a zero-sized .data, and the loader code is a bit too eager to get at the contents of .bss. But our objs are kind of degenerate; I don't know if the intent is to have this code work with arbitrary objects or just the plausible ones you'd see from a compiler or assembler where zero-sized sections probably get squashed.

AndyAyers updated this revision to Diff 20691.Feb 25 2015, 11:08 AM
AndyAyers edited edge metadata.

This update should address all of the review feedback on the code. Still working on tests.

majnemer added inline comments.Feb 25 2015, 3:31 PM
lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
132 ↗(On Diff #20691)

Would report_fatal_error be more reasonable here?

141 ↗(On Diff #20691)

Likewise, would it be nicer to report an error in release builds as well?

145–146 ↗(On Diff #20691)

Please stick this curly brace after the colon.

166 ↗(On Diff #20691)

Please stick the ampersand next to the variable name.

zturner added inline comments.Feb 25 2015, 3:38 PM
lib/ExecutionEngine/RuntimeDyld/CMakeLists.txt
8 ↗(On Diff #20691)

Minor CMake style nit. The relationship between this folder and Targets is a little awkward (not because of your doing, seems historical). In any case, adding a file from a subfolder into this library can be strange to someone using an IDE because they will see RuntimeDyldCOFFX86_64.cpp in the same file list as the rest, even though it's in a different location on disk.

It might be a little better if "Targets" showed up as a folder in the IDE, with RuntimeDyldCOFFx86_64.cpp showing up under that folder. An example of the CMake needed to do this is in \lib\DebugInfo\PDB\CMakeLists.txt. It's only a couple of boiler plate lines.

AndyAyers added a comment.EditedFeb 25 2015, 4:29 PM

I've added this to CMakeLists.txt, but VS doesn't show things any differently. So not sure if I'm doing it wrong or other IDEs honor these groups....

macro(add_dyld_targets_folder group)
  list(APPEND DYLD_TARGETS_SOURCES ${ARGN})
  source_group(${group} FILES ${ARGN})
endmacro()

add_dyld_targets_folder(Targets
  Targets/RuntimeDyldCOFFX86_64.cpp
  )

add_llvm_library(LLVMRuntimeDyld
  RTDyldMemoryManager.cpp
  RuntimeDyld.cpp
  RuntimeDyldChecker.cpp
  RuntimeDyldCOFF.cpp
  RuntimeDyldELF.cpp
  RuntimeDyldMachO.cpp
  ${DYLD_TARGETS_SOURCES}
  )

I applied your patch. It shows up for me, but actually in this case your situation is different so I still end up with 3 folders Header Files, Source Files, and Targets. I'm not sure how (or even if) you can make it appear as a subfolder under Source Files. So feel free to ignore my suggestion now and revert that part of the commit, and leave it as it was before :)

Sorry for the noise.

andrew.w.kaylor added inline comments.
lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
94 ↗(On Diff #20691)

Shouldn't you be using Section.ObjAddress here?

132 ↗(On Diff #20691)

I know we're doing this all over the existing code, but RuntimeDyld should really avoid fatal errors, including llvm_unreachable, as much as possible. With a static compiler it is kind of reasonable to bring down the program if a catastrophic error occurs, but RuntimeDyld really shouldn't do that.

I'm not suggesting that this problem needs to be addressed in this change set. I'm just bringing it up as a general concern that needs to be dealt with sooner or later.

AndyAyers added inline comments.Feb 26 2015, 1:05 PM
lib/ExecutionEngine/RuntimeDyld/CMakeLists.txt
8 ↗(On Diff #20691)

Sounds like we're leaving this as is, per your investigation.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
94 ↗(On Diff #20691)

You should get the same result either way, but I agree the intent might be clearer if we read from the original object instead of its loaded copy.

However once we implement the rel forms (which I am ending up doing as part of adding test cases) we need to know the actual address of the location to fix up; then it seems simpler to use that address to fetch any possible target displacement too.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
94 ↗(On Diff #20691)

One of issues that ObjAddress is meant to address, and I agree you don't have this problem the way you implemented things here, is that after the relocation has been applied once you no longer have access to to original placeholder value by way of the loaded object. If relocations are applied a second time the second pass can't rely on the loaded object having the correct placeholder value.

If I've got everything straight, your implementation here is grabbing the placeholder value and rolling it into the relocation entry, which is a decent solution. So using Section.Address instead of Section.ObjAddress is really only very technically incorrect. I would still advocate for the use of Section.ObjAddress for the sake of any future refactoring that is meant to address this class of behavior in some currently unanticipated way.

I know Lang wanted to do something to eliminate the need to keep the original object image around. Are the "rel forms" related to that? I don't think I'm familiar with that term. Whether yes or no, how will the rel forms address this issue?

Thanks very much for doing the refactor. This is looking great.

Are you still having trouble running rtdyld on your objects? Assuming you've got that working, please feel free to go ahead and commit.

If you're still having trouble let me know - I'm busy for the next few days, but later next week I will have time to test this out.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
94 ↗(On Diff #20691)

I definitely want to remove the need to keep the original object around, but I don't have a timeline for it yet. I think using ObjAddress is the appropriate solution for now.

132 ↗(On Diff #20691)

Yep - I don't mind this going in as is. I'm going to do a thorough audit for these in all the RuntimeDyld subclasses fairly soon. I can change it to something more reasonable then.

Still working on tests. The verify option to rtdyld turned up some things needing fixing, which is great (we didn't see these since we are strictly in-proc).

Working on a test also brought up some other issues, in part since we're really loading up a coff object file (not a PE):

  • The way we use this code now we never see the REL32 relocations, but every test case I can create from text input creates these (and we really should too, we're just not doing it yet). So I'll add support for this class of relocs -- though maybe not testing for the full set of REL32_* variants yet. These relative relocations can only be used if the dynamic loader promises to keep sections within 32 bit reaches of each other. I currently don't see any way to ensure this will be true.
  • ADDR32NB is a bit of a hack, since (from what I can tell) there is no real notion of an image base, at least in the non-PE case. I don't need this to work since I only see these relocs in .pdata which I am ignoring, but it would be nice if it there would be some way to have a base and they are used more broadly. For instance some times jump tables are encoded this way to save size.
  • Zero sized sections cause trouble. I routinely see zero sized .data and .bss. I have marked any zero sized section as not required for execution and am deferring looking at section data in some places until we know the section is required. An alternative here would be to get the coff producer to stop emitting zero sized sections.

Hope to have an update out in a day or two, but I have some vacation here and there. So we'll see.

AndyAyers updated this revision to Diff 20945.Feb 28 2015, 5:12 PM
AndyAyers edited the test plan for this revision. (Show Details)

Add a rtdyld test case for coff object files.

Rework the x86_64 relocation processing so that it can pass the rtdyld's verification and check tests. In particular be more careful about Address vs ObjAddress vs LoadedAddress.

Implement REL32 forms of relocations. Add a test case showing that the dynamic linker does the proper update for the basic REL32 relocation. Streamline the relocation processing to remove the second of a matching set of switches.

Stub out ADDR32NB until we understand how the dynamic linker is going to provide an image base and constrain sections to be within a 4GB region.

Fix some issues where the dynamic linker was confused by zero sized sections (we now claim they don't need to be loaded for COFF) or was too eager to look at .bss.

majnemer added inline comments.Feb 28 2015, 11:01 PM
lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
271–272 ↗(On Diff #20945)

This makes sense but is a little subtle. Both checks are needed because:

  • VirtualSize may be zero in object files while SizeOfRawData is non-zero.
  • SizeOfRawData may be zero in executables while VirtualSize is non-zero.
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
29 ↗(On Diff #20945)

This space is superfluous.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
57–58 ↗(On Diff #20945)

I thought Value was a variable with automatic storage duration, please consider indenting this.

63 ↗(On Diff #20945)

This space is superfluous.

72 ↗(On Diff #20945)

This seems wrong for RelType != COFF::IMAGE_REL_AMD64_REL32. I think we would want + 4 for IMAGE_REL_AMD64_REL32, + 3 for IMAGE_REL_AMD64_REL32_1, etc.

138 ↗(On Diff #20945)

This space is superfluous.

146 ↗(On Diff #20945)

Please insert a space between the pointee type and the star token.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.h
26 ↗(On Diff #20945)

This space is superfluous.

37 ↗(On Diff #20945)

Variable names are capitalized in LLVM.

45–48 ↗(On Diff #20945)

This looks a little funny, would you mind reformatting this? You might want to look into using clang-format, many contributors use it to automate reformatting source code to the LLVM conventions.

50 ↗(On Diff #20945)

You could just go with unsigned.

53–54 ↗(On Diff #20945)

Likewise, the way RuntimeDyldELF.h formats this method declaration is more in line with LLVM's style.

lib/Object/COFFObjectFile.cpp
364–371 ↗(On Diff #20945)

It might be nicer to store the flags in a variable, perhaps:

const uint32_t BSSFlags = COFF::IMAGE_SCN_CNT_UNINITIALIZED_DATA |
                          COFF::IMAGE_SCN_MEM_READ | COFF::IMAGE_SCN_MEM_WRITE;
return (Sec->Characteristics & BSSFlags) == BSSFlags;
lhames accepted this revision.Mar 1 2015, 2:33 PM
lhames edited edge metadata.

LGTM pending addressing of majnemer's comments. Running clang-format over the source is a particularly good idea.

Thanks again for your work on this Andy, and your patience with the review process. It's really great to see the JIT getting COFF support.

This revision is now accepted and ready to land.Mar 1 2015, 2:33 PM

Will have an updated patch out shortly....

lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
271–272 ↗(On Diff #20945)

I'll add a comment here.

lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
29 ↗(On Diff #20945)

Am going to run clang-format as suggested to clear these up.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.cpp
72 ↗(On Diff #20945)

There was a "Delta" in processRelocationRef that was supposed to take care of this, but I had the sign wrong there... will move it up here since it probably fits better here than there.

Note the _N variants subtract additional N, not add it. Eg REL32_1 means there's a reloc that begins 5 bytes from the end of the instruction.

lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.h
37 ↗(On Diff #20945)

Will fix.

45–48 ↗(On Diff #20945)

Yep, clang-format it is.

lib/Object/COFFObjectFile.cpp
364–371 ↗(On Diff #20945)

Will do.

AndyAyers updated this revision to Diff 21033.Mar 2 2015, 12:55 PM
AndyAyers edited edge metadata.

Reformat using clang-format. Note in edited files, I only took formatting changes for the parts I had edited.

Incorporate all the review feedback from the last iteration.

Rework how the REL_N relocations handle the _N part -- move the adjustment into resolveRelocation, and fix the sign of the adjustment.

majnemer accepted this revision.Mar 6 2015, 3:12 PM
majnemer added a reviewer: majnemer.

LGTM!

Thanks, this looks excellent :)

Is there anything else I need to do here?

I don't have commit access, so if you wouldn't mind...

This revision was automatically updated to reflect the committed changes.

Excellent. Thanks Andy and David!