This is an archive of the discontinued LLVM Phabricator instance.

Avoid undefined behavior in LinkAllPasses.h
ClosedPublic

Authored by dim on Jan 8 2016, 8:28 AM.

Details

Summary

The LinkAllPasses.h file is included in several main programs, to force
a large number of passes to be linked in. However, the ForcePassLinking
constructor uses undefined behavior, since it calls member functions on
nullptr, e.g.:

((llvm::Function*)nullptr)->viewCFGOnly();
llvm::RGPassManager RGM;
((llvm::RegionPass*)nullptr)->runOnRegion((llvm::Region*)nullptr, RGM);

When the optimization level is -O2 or higher, the code below the first
nullptr dereference is optimized away, and replaced by ud2 (on x86).

Therefore, the calls after that first dereference are never emitted. In
my case, I noticed there was no call to llvm::sys::RunningOnValgrind()!

Replace these two instances of nullptr by 1, which is ugly, but at
least defined, and accomplishes the goal of emitting the desired
references.

Diff Detail

Repository
rL LLVM

Event Timeline

dim updated this revision to Diff 44330.Jan 8 2016, 8:28 AM
dim retitled this revision from to Avoid undefined behavior in LinkAllPasses.h.
dim updated this object.
dim added reviewers: chandlerc, craig.topper, rnk.
dim added subscribers: llvm-commits, emaste.
dim updated this revision to Diff 44533.Jan 11 2016, 12:09 PM
dim added a subscriber: jroelofs.

Since @jroelofs reminded me that 1 is maybe not such a good idea for a
pointer, let's use a stack-allocated object instead. Since RGM (a
RGPassManager object) was already there, we might as well use it.

dim updated this revision to Diff 44544.Jan 11 2016, 1:23 PM

Trying to make both calls defined:

We can use llvm::Function::Create() to give us a nice, valid Function pointer.

For the llvm::RegionPass part, the problem is that there is only one descendant of this class in the tree, and it isn't exposed via a header. So create a bogus descendant, and let it call llvm::RegionPass::runOnRegion().

dim added a comment.Jan 11 2016, 1:34 PM
In D15996#323997, @dim wrote:

For the llvm::RegionPass part, the problem is that there is only one descendant of this class in the tree, and it isn't exposed via a header. So create a bogus descendant, and let it call llvm::RegionPass::runOnRegion().

Sigh, unfortunately clang now indeed emits a call to the pure virtual function llvm::RegionPass::runOnRegion(), and since that isn't defined, the program fails to link.

I guess the whole exercise to attempt to call this function is useless? Could we maybe just delete this part (e.g. lines 196 through 197 in the new version)?

dim added a subscriber: grosser.Jan 12 2016, 12:02 AM
dim added inline comments.
include/llvm/LinkAllPasses.h
191 ↗(On Diff #44544)

@grosser, you originally added this part in rL117263 ("Reference RegionPass to stop it being eliminated"), do you have any suggestions? If the goal is to to unsure RegionPass.cpp is linked in, I think we can call RegionPass::createPrinterPass() instead.

Do you need to call anything? Could you just take the address of the
function and return it, stuff it in a global, or otherwise escape it?

dim updated this revision to Diff 44652.Jan 12 2016, 10:35 AM

Two updates:

  • Use a wrapper class around RegionPass, and call createPrinterPass on it. This ensures RegionPass.cpp is linked in.
  • Use a raw_string_ostream to pass to functions requiring a raw_ostream, to avoid having to dereference yet another nullptr.
mehdi_amini added inline comments.
include/llvm/LinkAllPasses.h
198 ↗(On Diff #44652)

Isn't instantiating a RGPassManager enough to force link RegionPass.cpp?

dim added inline comments.Jan 12 2016, 12:38 PM
include/llvm/LinkAllPasses.h
198 ↗(On Diff #44652)

Well, this was already done in the original code, but then they tried to pass it as an argument to a call to runOnRegion. I am unsure what this gained, and why the RGPassManager appears to be a little different than the other pass managers...

That said, I think such a local object cannot be optimized out, certainly because the constructor is not inlined. The assembly shows:

leal    72(%esp), %esi
movl    %esi, (%esp)
calll   llvm::RGPassManager::RGPassManager()@PLT

so it's indeed definitely being called.

adding Michael, who should be able to test this change with Polly on Windows.

include/llvm/LinkAllPasses.h
198 ↗(On Diff #44652)

I looked at this again and I think instantiation RGPassManager is likely to be enough. runOnRegion should already be transitively preserved through llvm::createStructurizeCFGPass()
(StructurizeCFG has not been available when the original code was added)

In the unlikely case we remove too much, the linker error should allow us to understand without too much hassle what else needs to be preserved.

dim updated this revision to Diff 44682.Jan 12 2016, 2:57 PM

Updates:

  • Now simply instantiating RGPassManager
  • Also eliminate the last nullptr dereference.

Is it undefined behaviour if it never executes?

IMHO the complete function already relies on assumptions made on how intelligent today's compilers work as the initial comment states, so even whether any of the functions is linked in is undefined.

I have another suggestion how to to the same thing in a more conformant way:

(ForcePassLinging*)()[] = { &llvm::createAAEvalPass, &llvm::createAggressiveDCEPass, ... }
auto ForceRegionPassLinking = &llvm::Function::Create

Because the global variables are not static the compiler has to assume that these might be used somewhere else. LTO might remove it but that would be okay because it still has to execute the pass registration since the object file in question is linked-in.

Will still try with Polly on Windows

mehdi_amini accepted this revision.Jan 13 2016, 10:16 AM
mehdi_amini added a reviewer: mehdi_amini.

LGTM as is.

I like Michael suggestion of replacing this with an array, and I encourage a subsequent patch to tackle it.

This revision is now accepted and ready to land.Jan 13 2016, 10:16 AM
dim added a comment.Jan 13 2016, 10:29 AM

Is it undefined behaviour if it never executes?

At least for how the code is emitted, it is; e.g. everything after the first nullptr dereference was optimized away in the old code, defeating the purpose of the whole exercise.

On the other hand, the link failures I originally got were because Valgrind.cpp was not linked into the main clang executable, but apparently the only thing requiring functions from this file is this ForcePassLinking part! I looked through the whole llvm+clang source tree, and only bugpoint.cpp calls RunningOnValgrind().

So maybe at some point we may be able to ditch this header altogether? :-) I would be more inclined to do find a way to do that...

IMHO the complete function already relies on assumptions made on how intelligent today's compilers work as the initial comment states, so even whether any of the functions is linked in is undefined.

I have another suggestion how to to the same thing in a more conformant way:

(ForcePassLinging*)()[] = { &llvm::createAAEvalPass, &llvm::createAggressiveDCEPass, ... }
auto ForceRegionPassLinking = &llvm::Function::Create

Because the global variables are not static the compiler has to assume that these might be used somewhere else. LTO might remove it but that would be okay because it still has to execute the pass registration since the object file in question is linked-in.

Will still try with Polly on Windows

I'd be interested in the effects of simply removing the inclusion of this header in all files that currently use it, on Windows. In combination with LTO, obviously.

This revision was automatically updated to reflect the committed changes.

Links fine with Polly under windows, but the buildbots will probably say the same soon.

In D15996#325926, @dim wrote:

I'd be interested in the effects of simply removing the inclusion of this header in all files that currently use it, on Windows. In combination with LTO, obviously.

What compiler? clang? mingw(-64)? msvc?

Michael