This is an archive of the discontinued LLVM Phabricator instance.

[LLD][COFF] Fix missing MergeChunk::Instances cleanup in COFF::link()
ClosedPublic

Authored by blackhole12 on Jun 7 2019, 8:31 PM.

Details

Summary

When calling COFF::link() with CanExitEarly set to false, the function needs to clean up several global variable caches to ensure that the next invocation of the function starts from a clean slate. The MergeChunk::Instances cache is missing from this cleanup code, and as a result will create nondeterministic memory access errors and sometimes infinite loops due to invalid memory being referenced on the next call to COFF::link().

This fix simply calls MergeChunk::Instances.clear() before exiting the function.

An additional review of the COFF library was made to try and find any other missing global caches, but I was unable to find any other than MergeChunk. Someone more familiar with the global variables might want to do their own check.

This fix was made to support inNative's .wast script compiler, which must build multiple incremental builds. It relies on statically linking LLD because the entire compiler must be a single statically embeddable library, thus preventing it from being able to call LLD as a new process.

Diff Detail

Repository
rL LLVM

Event Timeline

blackhole12 created this revision.Jun 7 2019, 8:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2019, 8:31 PM
ruiu accepted this revision.Jun 10 2019, 12:50 AM
ruiu added a subscriber: ruiu.

LGTM

Yes, I suspect that there are other globals that are not reset between multiple executions of this linker.

As to how to invoke the linker as a separate process, I wonder if you can run your compiler as a new subprocess of the compiler. If you can do that, you can invoke the linker with a special option to call lld::link immediately on process startup, which effectively transforms your compiler to a linker. This way, you still have to statically link lld to your compiler, but you can run the linker as a separate child process, which is generally preferred because bad thing like this or unknown bugs in the linker doesn't harm your compiler.

This revision is now accepted and ready to land.Jun 10 2019, 12:50 AM
ruiu added a comment.Jun 10 2019, 12:51 AM

LGTM

Yes, I suspect that there are other globals that are not reset between multiple executions of this linker.

As to how to invoke the linker as a separate process, I wonder if you can run your compiler as a new subprocess of the compiler. If you can do that, you can invoke the linker with a special option to call lld::link immediately on process startup, which effectively transforms your compiler to a linker. This way, you still have to statically link lld to your compiler, but you can run the linker as a separate child process, which is generally preferred because bad thing like this or unknown bugs in the linker doesn't harm your compiler.

s/you can invoke the linker with a special option/you can invoke the compiler with a special option/

That is not an acceptable solution. This compiler is an embedded, statically linked library that can be used by other programs. It is not a standalone compiler. If the library is statically linked, there is literally no way to start a new process without forcing every single user of my library to support their own program being called with a special option for the sole purpose of invoking a linker. Since one of the intended uses, for example, is a game engine script library, which would compile a webassembly script for a map, this compilation process cannot require the game to support calling it's own executable, or link the script engine as a DLL for the sole purpose of having the script engine then load it's own DLL in a new process. These are not reasonable restrictions to have on what is supposed to be a self-contained compiler.

The correct fix is to ensure that the global caches are cleared properly. Even better, perhaps the global variables should all be gathered in the same place instead of being sprinkled around in random locations. There are many, many reasonable ways that the LLD codebase could be cleaned up to support being called as a library, and I don't think it's an unreasonable expectation. If LLD is not interested in this, I will simply continue fixing these bugs as I find them.

ruiu added a comment.Jun 10 2019, 2:04 AM

I was not trying to force you to use lld in some particular way. I was just trying to help you by suggesting one solution that might work for some use cases, and I didn't know that that wouldn't work for you. If it doesn't work for you, you can continue using lld as a library as you are already doing. We support that as well, and a bug like this should be fixed.

As to "clean up", I honestly don't think that lld's code is that bad. We use global variables, and there are pros and cons about that, but the design decision to use them was carefully made. This is not a quick dirty hack or something like that.

By the way thank you for fixing this bug. Feel free to send more patches if you find any bugs.

ruiu added a comment.Jun 10 2019, 3:04 AM

Do you have a commit access?

I do not have access to anything. The only repo I currently have cloned is my fork of the official git repository.

ruiu added a comment.Jun 10 2019, 3:21 AM

Got it. I'll commit this patch on behalf of you.

This revision was automatically updated to reflect the committed changes.