Page MenuHomePhabricator

[IRBuilder] Virtualize IRBuilder
ClosedPublic

Authored by nikic on Feb 1 2020, 11:11 AM.

Details

Summary

This patch moves the IRBuilder from templating over the constant folder and inserter towards making both of these virtual. There are a couple of motivations for this:

  • Currently, it's hard/impossible to share code using IRBuilders with different folders/inserters. The original motivation was to make the SimplifyLibCalls utility (which further depends on the BuildLibCall utility) make use of the InstCombine IR builder (which uses a custom inserter). However, there isn't really any way to do that without bleeding the used inserter across the codebase.
  • There are some creation methods on BaseBuilder. However, these do not make use of the inserter and perform manual insertion instead, making the inserter callback unreliable. This happened because the BaseBuilder just doesn't have access to the inserter...
  • Nearly the entirety of IRBuilder is implemented as inline methods in the header. This is a hefty 3k loc file and very little of it looks like code that benefits from being inlined. However, because it depends on the template arguments, it cannot be moved into the source file.

This patch addresses the issue by making the following changes:

  • IRBuilderDefaultInserter::InsertHelper becomes virtual. The IRBuilderBase accepts a reference to it.
  • IRBuilderFolder is introduced as a virtual base class. It is implemented by ConstantFolder (default), NoFolder and TargetFolder. The IRBuilderBase has a reference to this as well.
  • All the logic is moved from IRBuilder to IRBuilderBase. This means that methods can in the future replace their IRBuilder<> & uses (or other specific IRBuilder types) with IRBuilderBase & and thus be usable with different IRBuilders.
  • The IRBuilder class is now a thin wrapper around IRBuilderBase. Essentially it only stores the folder and inserter and takes care of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

  • Fixing use of the inserter for creation methods originally defined on IRBuilderBase (this is a trivial change, keeping it out of here to keep things NFC).
  • Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
  • Moving code from the IRBuilder header to the source file.

From the user perspective, these changes are mostly transparent: The only thing that consumers may need to do is mark their InsertHelper as public instead of protected. (This patch currently only has the LLVM part, one or two adjustments like that will be needed in clang/polly).

What do you think?

Diff Detail

Event Timeline

nikic created this revision.Feb 1 2020, 11:11 AM
Herald added a project: Restricted Project. · View Herald Transcript
Meinersbur added a subscriber: Meinersbur.EditedFeb 4 2020, 7:22 AM

In my experience, LLVM tends more towards removing virtual inheritance and towards compile-time polymorphism (e.g. the pass manager, llvm::Instruction without vtable). This is a patch into the other direction. Personally, I prefer to use the language-provided mechanism.

There might be a performance impact (less inlining), it would help if you could measure it.

nikic updated this revision to Diff 242396.Feb 4 2020, 11:59 AM

Rebase over committed changes and add vtable anchors.

nikic added a comment.Feb 4 2020, 12:50 PM

In my experience, LLVM tends more towards removing virtual inheritance and towards compile-time polymorphism (e.g. the pass manager, llvm::Instruction without vtable). This is a patch into the other direction. Personally, I prefer to use the language-provided mechanism.

There might be a performance impact (less inlining), it would help if you could measure it.

Unfortunately I don't have a setup suitable to perform compile-time benchmarks. I can barely get LLVM built, my previous attempts at compile-time measurements were hopelessly noisy. I can't really imagine IRBuilder taking up a large fraction of compile-time though (I don't believe I've ever seen it show up in profiles.)

I'm not particularly good with C++, so if there's some other way to go about this that avoids using "virtual", I'd be happy to try that. I think that disregarding all the other considerations, the one issue here that must be fixed one way or another is that the "inserter" is not used if methods defined on IRBuilderBase are called. In particular this means that any intrinsics inserted by InstCombine right now do not get added to the worklist. Other passes that use a custom inserter probably experience similar subtle issues.

It might make sense to break this patch up to address just add that issue first, and leave the more involved "constant folder" changes to another patch. That would still require making something virtual though... either by making the inserter virtual (what this patch does), or making IRBuilder::Insert virtual (https://gist.github.com/nikic/7d6223bc0df4798181fa781ca771e78c for a quick prototype). I actually like the latter a bit more (it is less intrusive), but this approach doesn't allow us to perform vtable anchoring (I don't know how much we care about that). Another possibility would be to just add an std::function callback on IRBuilderBase -- which is essentially the same as using "virtual", but leaves us in control of the implementation details.

In my experience, LLVM tends more towards removing virtual inheritance and towards compile-time polymorphism (e.g. the pass manager, llvm::Instruction without vtable). This is a patch into the other direction. Personally, I prefer to use the language-provided mechanism.

There might be a performance impact (less inlining), it would help if you could measure it.

Unfortunately I don't have a setup suitable to perform compile-time benchmarks. I can barely get LLVM built, my previous attempts at compile-time measurements were hopelessly noisy. I can't really imagine IRBuilder taking up a large fraction of compile-time though (I don't believe I've ever seen it show up in profiles.)

It doesn't need to be an entire run of clang. Even a result 'difference is not significant' behind the noise is some supporting result over guessing.

I'm not particularly good with C++, so if there's some other way to go about this that avoids using "virtual", I'd be happy to try that. I think that disregarding all the other considerations, the one issue here that must be fixed one way or another is that the "inserter" is not used if methods defined on IRBuilderBase are called. In particular this means that any intrinsics inserted by InstCombine right now do not get added to the worklist. Other passes that use a custom inserter probably experience similar subtle issues.

The obvious solution would be to move these methods from IRBuilderBase to IRBuilder.

This is a significant change for the framework, I'd recommend writing an RFC to the llvm-dev list to get more feedback.

If people are concerned about virtual function calls, I do think it may be possible to have a generic builder class using virtual dispatch, but also a variant that doesn't use virtual dispatch. I've outlined a possible direction here: https://pastebin.com/i4mC6GAf

That said, I think it's plausible that just using virtual dispatch unconditionally is fine, some superficial compile-time benchmarks would be nice.

If people are concerned about virtual function calls,

To clarify: I am not, just pointing out that others might.

nikic updated this revision to Diff 243369.Feb 8 2020, 7:40 AM

Add clang part (change protected -> public).

nikic added a comment.Feb 8 2020, 7:51 AM

As a sanity check for compile-time I've compared build times for zend_execute.c (about 60k lines of C code before expansion). Raw perf stat results at https://gist.github.com/nikic/9a87083358d98f34e8e64851a84ff864, here retired instruction counts for three runs each (timing results are too noisy on my machine):

        | before         | after
-O2 -g  | 68.602.837.460 | 68.821.051.076
        | 68.566.922.163 | 68.538.048.917
        | 68.538.997.311 | 68.509.463.180
-O0 -g  | 25.162.639.627 | 25.135.102.049
        | 25.129.536.208 | 25.093.005.413
        | 25.146.124.560 | 25.119.503.250
-O0 -g0 | 16.232.387.222 | 16.215.831.899
        | 16.200.846.485 | 16.231.492.196
        | 16.216.916.110 | 16.192.568.580

It looks like the difference is in the noise, at least for this test case.

fhahn added a subscriber: fhahn.Feb 11 2020, 6:58 AM

As a sanity check for compile-time I've compared build times for zend_execute.c (about 60k lines of C code before expansion). Raw perf stat results at https://gist.github.com/nikic/9a87083358d98f34e8e64851a84ff864, here retired instruction counts for three runs each (timing results are too noisy on my machine):

        | before         | after
-O2 -g  | 68.602.837.460 | 68.821.051.076
        | 68.566.922.163 | 68.538.048.917
        | 68.538.997.311 | 68.509.463.180
-O0 -g  | 25.162.639.627 | 25.135.102.049
        | 25.129.536.208 | 25.093.005.413
        | 25.146.124.560 | 25.119.503.250
-O0 -g0 | 16.232.387.222 | 16.215.831.899
        | 16.200.846.485 | 16.231.492.196
        | 16.216.916.110 | 16.192.568.580

It looks like the difference is in the noise, at least for this test case.

It might be also interesting to gather numbers for the CTMark part of the test-suite.

Please can we have some perf comparisons for just -emit-llvm (not invoking any of the back-end infrastructure)? Given the numbers so far, I'd expect these to also be in the noise, but even at -O0 a quite significant proportion of the compile time is spent after the stages that use IRBuilder.

You can completely isolate the code generation time using the "Code Generation Time" timer from clang -ftime-report.

For the same test case as before, instruction counts with -emit-llvm -Xclang -disable-llvm-passes:

# before
8.409.134.507
8.439.851.250
8.436.145.165
# after
8.415.563.795
8.426.474.507
8.434.608.865

So this still looks like noise to me. And looking at a perf trace, IRBuilder methods show up with fractions of 0,01-0,04%, even with all optimizations and codegen disabled, so it's not very surprising that there is no observable difference.

simoll added a subscriber: simoll.Feb 12 2020, 12:17 AM
nhaehnle accepted this revision.Feb 16 2020, 4:02 AM

Two minor questions, apart from that LGTM.

llvm/include/llvm/Analysis/TargetFolder.h
32

Could we make the TargetFolder final?

llvm/include/llvm/IR/ConstantFolder.h
28–30

Same question here: could we make the ConstantFolder final?

This revision is now accepted and ready to land.Feb 16 2020, 4:02 AM
nikic updated this revision to Diff 244870.Feb 16 2020, 5:36 AM
nikic marked 2 inline comments as done.

Make ConstantFolder/TargetFolder/NoFolder final.

nikic marked an inline comment as done.Feb 16 2020, 5:37 AM
nikic added inline comments.
llvm/include/llvm/Analysis/TargetFolder.h
32

Yeah, don't see a reason not to... can always drop it again if someone is extending these for some reason.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2020, 5:40 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This causes this build warning:

../../llvm/include/llvm/IR/IRBuilder.h:77:16: warning: 'anchor' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
  virtual void anchor();
               ^
../../llvm/include/llvm/IR/IRBuilder.h:62:16: note: overridden virtual function is here
  virtual void anchor();
               ^
nikic reopened this revision.Feb 16 2020, 8:12 AM

Reverted because of failures on http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/22164. Not sure why this is the only builder that's failing, but I believe the problematic code is https://github.com/llvm/llvm-project/blob/af480e8c63b27ad247b8430cf124da8bcdf752f8/llvm/lib/IR/DIBuilder.cpp#L898-L910. Without NRVO this may copy IRBuilder<>, which is now self-referential due to the Folder/Inserter references in IRBuilderBase.

I'm not sure what the right way to fix this is. Should I add an explicit copy constructor? I guess that would also involve explicit move ctor and copy/move assignment, which seems pretty ugly. Maybe it makes more sense to explicitly delete it and just avoid the copy constructor in that particular code?

This revision is now accepted and ready to land.Feb 16 2020, 8:12 AM
nikic added a comment.Feb 16 2020, 8:51 AM

Tried what happens if the copy constructor is deleted: https://gist.github.com/nikic/1e6ab5bbf42cfe48e7b848e60a2df180 It looks like the referenced occurrence in DIBuilder was the only place that was using the copy constructor "for good reason", most of the other uses either copy the whole IRBuilder where they just want to pass a reference or copy the whole IRBuilder where they really just want an InsertPointGuard.

xbolva00 added a subscriber: xbolva00.EditedFeb 16 2020, 9:43 AM

@nikic: Can you please do compile-time benchmarks w/o this patch with e.g. php ?

This revision was automatically updated to reflect the committed changes.

This broke the LLVM_ENABLE_MODULES build because it introduces a cyclic dependency between LLVM_IR and LLVM_intrinsic_gen:

FAILED: tools/lld/lib/ReaderWriter/MachO/CMakeFiles/lldMachO.dir/MachONormalizedFileBinaryReader.cpp.o 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/lld/lib/ReaderWriter/MachO -I/Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO -I/Users/teemperor/1llvm/llvm-project/lld/include -Itools/lld/include -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk/usr/include/libxml2 -Iinclude -I/Users/teemperor/1llvm/llvm-project/llvm/include -I/Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/. -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -fmodules -fmodules-cache-path=/Users/teemperor/1llvm/rel/module.cache -fcxx-modules -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -O3  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk    -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT tools/lld/lib/ReaderWriter/MachO/CMakeFiles/lldMachO.dir/MachONormalizedFileBinaryReader.cpp.o -MF tools/lld/lib/ReaderWriter/MachO/CMakeFiles/lldMachO.dir/MachONormalizedFileBinaryReader.cpp.o.d -o tools/lld/lib/ReaderWriter/MachO/CMakeFiles/lldMachO.dir/MachONormalizedFileBinaryReader.cpp.o -c /Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp
While building module 'LLVM_Object' imported from /Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp:36:
While building module 'LLVM_IR' imported from /Users/teemperor/1llvm/llvm-project/llvm/include/llvm/Object/IRSymtab.h:29:
While building module 'LLVM_intrinsic_gen' imported from /Users/teemperor/1llvm/llvm-project/llvm/include/llvm/IR/IRBuilderFolder.h:18:
In file included from <module-includes>:1:
/Users/teemperor/1llvm/llvm-project/llvm/include/llvm/IR/Argument.h:19:10: fatal error: cyclic dependency in module 'LLVM_IR': LLVM_IR -> LLVM_intrinsic_gen -> LLVM_IR
#include "llvm/IR/Value.h"
         ^
While building module 'LLVM_Object' imported from /Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp:36:
While building module 'LLVM_IR' imported from /Users/teemperor/1llvm/llvm-project/llvm/include/llvm/Object/IRSymtab.h:29:
In file included from <module-includes>:13:
/Users/teemperor/1llvm/llvm-project/llvm/include/llvm/IR/IRBuilderFolder.h:18:10: fatal error: could not build module 'LLVM_intrinsic_gen'
#include "llvm/IR/InstrTypes.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
While building module 'LLVM_Object' imported from /Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp:36:
In file included from <module-includes>:4:
/Users/teemperor/1llvm/llvm-project/llvm/include/llvm/Object/IRSymtab.h:29:10: fatal error: could not build module 'LLVM_IR'
#include "llvm/IR/GlobalValue.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
/Users/teemperor/1llvm/llvm-project/lld/lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp:36:10: fatal error: could not build module 'LLVM_Object'
#include "llvm/Object/MachO.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
4 errors generated.
modocache added inline comments.
llvm/include/llvm/IR/IRBuilder.h
137

I experienced a tiny regression after applying this patch, and I think this line might be the reason why. I tried addressing the issue in D74754, please give that a review when you get a chance.