This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Details
- Reviewers
Bigcheese jfb rsmith RKSimon rnk - Commits
- rG950b70dcc7e3: [Clang Interpreter] Initial patch for the constexpr interpreter
rC371834: [Clang Interpreter] Initial patch for the constexpr interpreter
rL371834: [Clang Interpreter] Initial patch for the constexpr interpreter
rG32f82c9cbaf3: [Clang Interpreter] Initial patch for the constexpr interpreter
rC370839: [Clang Interpreter] Initial patch for the constexpr interpreter
rL370839: [Clang Interpreter] Initial patch for the constexpr interpreter
rG8327fed9475a: [Clang Interpreter] Initial patch for the constexpr interpreter
rL370636: [Clang Interpreter] Initial patch for the constexpr interpreter
rC370636: [Clang Interpreter] Initial patch for the constexpr interpreter
rGafcb3de11726: [Clang Interpreter] Initial patch for the constexpr interpreter
rC370584: [Clang Interpreter] Initial patch for the constexpr interpreter
rL370584: [Clang Interpreter] Initial patch for the constexpr interpreter
rGd4c1002e0bbb: [Clang Interpreter] Initial patch for the constexpr interpreter
rC370531: [Clang Interpreter] Initial patch for the constexpr interpreter
rL370531: [Clang Interpreter] Initial patch for the constexpr interpreter
rGa55909505497: [Clang Interpreter] Initial patch for the constexpr interpreter
rC370476: [Clang Interpreter] Initial patch for the constexpr interpreter
rL370476: [Clang Interpreter] Initial patch for the constexpr interpreter
Diff Detail
- Repository
- rL LLVM
Event Timeline
clang/include/clang/Basic/LangOptions.def | ||
---|---|---|
291–294 ↗ | (On Diff #213277) | The goal is to evaluate everything that is constant - I would like to avoid using constexpr specifically in the name. |
clang/lib/AST/ExprConstant.cpp | ||
5039–5040 ↗ | (On Diff #213277) | This is intended - if the interpreter bails and ForceInterp is set, we emit a diagnostic in Context, promoting the error to a failure. |
clang/lib/AST/Interp/ByteCodeGen.cpp | ||
352–381 ↗ | (On Diff #213277) | Same as with statement expressions in the evaluating emitter - some sort of closure which compiles and executes when triggered. |
536 ↗ | (On Diff #213277) | We absolutely need this to emit the warning: constexpr int callee_ptr(int *beg, int *end) { const int x = 2147483647; *beg = x + x; // expected-warning {{overflow in expression; result is -2 with type 'int'}}\ // expected-note 3{{value 4294967294 is outside the range of representable values of type 'int'}} return *beg; } |
546 ↗ | (On Diff #213277) | Refactored + fixed this by moving stuff to separate functions. |
647–662 ↗ | (On Diff #213277) | This is the only case where such a switch occurs - there is another for zero initialisation, but it's quite different. A def which would capture the little common functionality would be quite ugly, so I would like to avoid it. As for other integers, there won't be support for any. There are two more branches for the fixed-point catch-all fallback. |
clang/lib/AST/Interp/ByteCodeGen.h | ||
40 ↗ | (On Diff #213277) | The entry point is through the emitter which needs to call into the code generator, things don't really work if the emitter is a field. |
clang/lib/AST/Interp/EvalEmitter.cpp | ||
71–75 ↗ | (On Diff #213277) | Currently the interpreter cannot work side-by-side with the interpreter since reading from APValue is not supported at all... Since I am unsure whether I ever want to implement an APValue -> Interpreter mapping, I do not mind this, as long as diagnostic pinpoint the next feature to be implemented. |
105–111 ↗ | (On Diff #213277) | Theoretically not, but it helped me catch a few bugs in the past, so I would like to keep it conditional. |
clang/lib/AST/Interp/EvalEmitter.h | ||
69 ↗ | (On Diff #213277) | The conditional operator will require it to handle the fallthrough from the false branch to the following expression. |
112–114 ↗ | (On Diff #213277) | I'd like to revisit this at a later point in time - for now I think predication is the simplest option and I would like to stick with the simple options until the interpreter is feature-complete. Statement expressions would not be predicated, the current plan is to generate a closure-like method for them. |
clang/lib/AST/Interp/Interp.h | ||
213–222 ↗ | (On Diff #213277) | I don't think there's a zero-cost way around this. We'll need to track more metadata to check whether fields are public/private and traverse the pointer chain to find divergence points. The InlineDescriptors might have a few spare bits could be used to cache "tags" which classify pointers into comparable classes? At this point, it is quite hard to decide what case to make fast at what cost. |
clang/lib/AST/Interp/Opcodes.td | ||
1 ↗ | (On Diff #213277) | I was a bit afraid of documenting everything and then changing everything, but things seem to be quite stable now, so I will add some documentation. |
clang/lib/AST/Interp/Pointer.h | ||
69–78 ↗ | (On Diff #213277) | It's definitely on the TODO list - I would like to avoid such optimisations for now, until enough features are available to run sizeable benchmarks. |
245–249 ↗ | (On Diff #213277) | I haven't written things up yet since I changed quite a few things over the past few weeks. I am wrapping up support for unions and will produce a more detailed document afterwards. In the meantime, I have documented this field. |
251–261 ↗ | (On Diff #213277) | It's on the TODO list - I am aiming for C++11 conformance ASAP and this will be required. |
The old path-based approach is slow for most operations, while the new pointer-based approach is fast for the most common ones.
For read/write/update operations, it is enough to check the pointer and at most a single descriptor to ensure whether the operation
can be performed. For some operations, such as some pointer comparisons, in the worst case the old-style path can be reproduced by
following the pointers produced through the getBase method, allowing us to implement everything.
Extern objects are handled by allocating memory for them, allowing pointers to those regions to exist, but marking the actual storage
as extern/uninitialized and preventing reads/writes to that storage.
Some details of corner cases are not clear at this point, but the fact that we can reproduce full paths and have space to store any special
bits gives me the confidence that we can converge to a full implementation.
I want to stress again the fact that C++ has an insane number of features and I am focusing on implementing as many of them, leaving a
path towards future optimisations open. I do not want to clobber initial patches with optimisations that would increase complexity - the
few things we do with direct local reads/writes and improved loops already yield significant speedupds.
I think we're past the point of large-scale structural comments that are better addressed before the first commit, and it'd be much better to make incremental improvements from here.
Please go ahead and commit this, and we can iterate in-tree from now on. Thanks!
clang/include/clang/Basic/OptionalDiagnostic.h | ||
---|---|---|
40–74 ↗ | (On Diff #214828) | I think these three all belong in PartialDiagnostic rather than in OptionalDiagnostic. (It doesn't make sense that an OptionalDiagnostic can format an APSInt but a PartialDiagnostic cannot.) |
clang/include/clang/Driver/Options.td | ||
839 ↗ | (On Diff #214828) | const -> constant |
clang/lib/AST/Interp/ByteCodeGen.cpp | ||
25 ↗ | (On Diff #214828) | using llvm::Expected; |
536 ↗ | (On Diff #213277) | As noted, the evaluation should be treated as non-constant in this case. This is "evaluate for overflow" mode, in which we want to keep evaluating past non-constant operations (and merely skip sub-evaluations that we can't process because they have non-constant values). We will need a bytecode representation for "non-constant evaluation". We should use that here. |
clang/lib/AST/Interp/ByteCodeGen.h | ||
40 ↗ | (On Diff #213277) | I think the fact that the entry point to ByteCodeGen is in the Emitter base class is the cause of this design complexity (the inheritance, virtual functions, and layering cycle between the code generator and the emitter). For example, looking at ByteCodeEmitter, we have:
(And the same thing happens in EvalEmitter.) That to me suggests an opportunity to improve the layering:
No need to do this right away. |
85 ↗ | (On Diff #214828) | Please can you pick a different name for this, that's more different from the AccessKinds enum? AccessOp or something like that maybe? |
clang/lib/AST/Interp/Opcodes.td | ||
178–221 ↗ | (On Diff #214828) | Please document what the Args mean. |
Seems like this patch introduced a cyclic dependency between Basic and AST when building with LLVM_ENABLE_MODULES=On:
While building module 'Clang_AST' imported from llvm/lldb/include/lldb/Symbol/ClangUtil.h:14: While building module 'Clang_Basic' imported from llvm/llvm/../clang/include/clang/AST/Stmt.h:18: In file included from <module-includes>:73: clang/include/clang/Basic/OptionalDiagnostic.h:17:10: fatal error: cyclic dependency in module 'Clang_AST': Clang_AST -> Clang_Basic -> Clang_AST #include "clang/AST/APValue.h"
This patch broke my build as well (discovered by runnin git bisect), command line
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=/usr/local/bin/clang -DCMAKE_ASM_COMPILER=/usr/local/bin/clang -DCMAKE_CXX_COMPILER=/usr/local/bin/clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=ON -DLLVM_USE_SPLIT_DWARF=ON -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=ARC -DLLVM_ENABLE_PROJECTS=clang -DLLVM_BUILD_TESTS=ON -DLLVM_BUILD_DOCS=ON -DLLVM_ENABLE_WERROR=ON -H/redacted/git/llvm-project/llvm -B/tmp/llvm-project_dbg_compiled-with-clang -GNinja
ninja -C /tmp/llvm-project_dbg_compiled-with-clang
Error:
/usr/bin/ld: tools/clang/lib/AST/Interp/CMakeFiles/obj.clangInterp.dir/Block.cpp.o:(.data+0x0): undefined reference to `llvm::EnableABIBreakingChecks'
/usr/bin/ld: tools/clang/lib/AST/Interp/CMakeFiles/obj.clangInterp.dir/ByteCodeEmitter.cpp.o: in function `clang::interp::ByteCodeEmitter::emitAdd(clang::interp::PrimType, clang::interp::SourceInfo const&)':
/usr/bin/ld: tools/clang/lib/AST/Interp/CMakeFiles/obj.clangInterp.dir/ByteCodeEmitter.cpp.o: in function `clang::interp::ByteCodeEmitter::emitAddOffset(clang::interp::PrimType, clang::interp::SourceInfo const&)':
...
Shared library builds seem to be broken indeed. I tried fixing by adding Support and clangAST as dependencies for clangInterp, but that creates a cyclic dependency between clangAST <-> clangInterp. Which makes me wonder whether clangInterp should be a separate library or be part of clangAST?
And that was even pointed out several days ago before the patch got relanded last two times:
@nand Please don't reland this patch until these concerns are addressed.
Thanks for identifying these issues - I fixed the cycle detected by modules since then, however I wasn't aware of the issue with shared libs until now.
MSVC builds are seeing template instantiation issues with this patch: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/27972/steps/build/logs/warnings%20%2824%29
Merged clangInterp and clangAST into a single library while keeping Interp in a separate folder.
This is required since clangInterp needs access to the AST, but the intry points to the interpreter
are also in AST.
Does anyone have any suggestions on how to fix the MSVC problems? I am trying to get access to a Windows machine, but it's not guaranteed.
I don't have an actionable alternative suggestion but this seems like counter-correct layering, trying to paper over the issue instead of fixing it.
The existing evaluator is not a separate library to start with. Given the tangling between ExprConstants.cpp and the AST nodes, I don't really see any elegant way of dealing with this problem.
An example of the problem is FieldDecl::getBitWidthValue.
@nand The MSVC warnings are self explanatory - you've declared a number of methods (visitIndirectMember, emitConv and getPtrConstFn) but not provided definitions, as they're on template classes MSVC complains, a lot.
I am providing definitions in the C++ file - the problem is that they are not available in the header before the extern declaration. The methods are available at the site of the extern definition.
gcc and clang accept this, so does Visual Studio 2019. This feels like an incorrect implementation of extern templates in Visual Studio?
I see two ways to proceed: move everything into a header (would like to avoid this) or silence the warning on VC++ (not great either).
Is there a better way? Which option is less bad from these two?
I think I'm missing something - for instance when I search the patch for "visitIndirectMember", all I get is this in ByteCodeExprGen.h:
bool visitIndirectMember(const BinaryOperator *E);
I don't see it in ByteCodeExprGen.cpp
Totally missed that - thanks for noticing. I must have forgotten to remove stuff from the header since clang/gcc don't warn about it.
I'll get hold of a Windows machine soon-ish and I'll make sure to fix this problem.
Thanks!
If you remove those 3 declarations from the headers and update the patch I can test it on my windows machines for you.