This is an archive of the discontinued LLVM Phabricator instance.

[clang-repl] Land initial infrastructure for incremental parsing
ClosedPublic

Authored by v.g.vassilev on Feb 4 2021, 6:31 AM.

Details

Summary

In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we mentioned our plans to make some of the incremental compilation facilities available in llvm mainline.

This patch proposes a minimal version of a repl, clang-repl, which enables interpreter-like interaction for C++. For instance:

./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit

The patch provides very limited functionality; for example, it crashes on invalid C++. The design of the proposed patch closely follows the design of cling. The idea is to gather feedback and gradually evolve both clang-repl and cling towards what the community agrees upon.

The IncrementalParser class is responsible for driving the clang parser and codegen and allows the compiler infrastructure to process more than one input. Every input adds to the “ever-growing” translation unit. That model is enabled by an IncrementalAction, which prevents the teardown that would normally happen once HandleTranslationUnit is called.

The IncrementalExecutor class hides some of the underlying implementation details of the concrete JIT infrastructure. It exposes the minimal set of functionality required by our incremental compiler/interpreter.

The Transaction class keeps track of the AST and the LLVM IR for each incremental input. That tracking information will be later used to implement error recovery.

The Interpreter class orchestrates the IncrementalParser and the IncrementalExecutor to model interpreter-like behavior. It provides the public API which can be used (in the future) by clients of the interpreter library.
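
A hypothetical sketch of driving the interpreter library. The entry points (runSession, ParseAndExecute) are illustrative assumptions based on the class descriptions above, not necessarily the patch's exact API:

#include "clang/Interpreter/Interpreter.h"

// Each call extends the ever-growing TU by one incremental input,
// codegens a module for it, and hands that module to the JIT.
llvm::Error runSession(clang::Interpreter &Interp) {
  if (llvm::Error Err = Interp.ParseAndExecute("int i = 42;"))
    return Err;
  return Interp.ParseAndExecute(
      "extern \"C\" int printf(const char*,...);"
      "auto r1 = printf(\"i=%d\\n\", i);");
}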

Diff Detail

Event Timeline


@v.g.vassilev Great to see this getting upstreamed! @teemperor Thanks for adding me, I will take a look in the next few days.

v.g.vassilev marked 21 inline comments as done.

Address more review comments.

clang/include/clang/Frontend/FrontendAction.h
237

Done. Ideally I would like to avoid making these routines virtual. However, here we call EndSourceFile https://github.com/llvm/llvm-project/blob/1f6ec3d08f75dba6c93c291bd92552b807736eb3/clang/lib/Frontend/CompilerInstance.cpp#L952 and this shuts down some of the objects required by the IncrementalAction.

In fact, after some checking, it does not make sense to allow overriding BeginSourceFile -- it initializes the state of the Action class, and forwarding it, for instance in the WrapperFrontendAction, does not make sense. In principle, clients may be interested in initializing differently, but for clang-repl we only need EndSourceFile to be virtual.

Alternatively, we could somehow make the Inputs take an "external" iterator (https://github.com/llvm/llvm-project/blob/1f6ec3d08f75dba6c93c291bd92552b807736eb3/clang/lib/Frontend/CompilerInstance.cpp#L942) which would provide us with a hook to treat each input line as a separate input file. AFAICT, that has two disadvantages: 1) we will initialize/finalize more state of various objects, which can hinder performance; 2) each input file is considered a separate translation unit (TU), which clashes with the ever-growing single TU concept we introduce.

clang/include/clang/Interpreter/Transaction.h
2

I added some documentation.

Having a long name for the Transaction class will make the future code clunky. This class will be used extensively throughout the interpreter codebase and API. I'd be in favor of going for a namespace, but then I think we would end up with confusing naming such as clang::interpreter::Interpreter...

clang/lib/Interpreter/IncrementalExecutor.h
37

The case we have is when there is no JIT -- currently we have such a case in IncrementalProcessingTest, I think. Another example, which will show up in future patches, is the registration of atexit handlers. That is, before we runCtors we make a pass over the LLVM IR, collect some specific details, possibly change the IR, and then run.

I'd rather keep it separate for now if that's okay.

clang/lib/Interpreter/IncrementalParser.cpp
64

Cling, on which this patch is based, has a CodeCompleteConsumer and it seems to work.

I can leave it out but then we will have to remember where to put it. I have a slight preference to leave it as it is.

70

Indeed, should be fixed.

102

Nice catch! Thanks!

137

Not directly. A test case would be:

#pragma weak f
void f() {}

This is exercised by the standard tests; we "just" record some decls in the vector. Do you have a specific test in mind?

166

That is good to know. I will try to remember this and use it if needed.

190

Do you mean assert on the FID?

196

I can work on that but I'd be in favor of not holding up that patch due to this issue. I see two options:

  • remove the code -- that'd make it harder for cling to reuse this piece of code.
  • keep it as is -- the risk is that we have an untested codepath.

Either way, that may be some bug in clang that we need to fix and drop these lines altogether...

That particular issue is tracked here https://github.com/root-project/root/pull/7078 (some of the related context is here https://paste.ubuntu.com/p/bM2VRgmqSG/ in a very obscure way :) )

clang/lib/Interpreter/IncrementalParser.h
51

Yeah, much nicer.

clang/lib/Interpreter/Interpreter.cpp
44

Let's keep it here for now and we can always move somewhere else. Fixing the includes. Thanks!

83

I do not understand the problem entirely, could you propose wording for the FIXME?

141

No worries, fixed.

clang/tools/clang-repl/CMakeLists.txt
4

It compiles just fine for me. Should we add it back once we find the setup that requires it?

clang/tools/clang-repl/ClangRepl.cpp
60

Good point!

v.g.vassilev added inline comments.Feb 13 2021, 12:41 AM
clang/lib/CodeGen/CodeGenAction.cpp
888

@rjmccall, we were wondering if there is a better way to ask CodeGen to start a new module. The current approach seems to drill a hole through a number of abstraction layers.

In the past we have touched that area a little in https://reviews.llvm.org/D34444 and the answer may be already there but I fail to connect the dots.

Recently, we thought about having a new FrontendAction callback for beginning a new phase when compiling incremental input. We need to keep track of the created objects (needed for error recovery) in our Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen. The issue is that new JITs take ownership of the llvm::Module* which seems to make it impossible to support jitted code removal with that model (cc: @lhames, @rsmith).

Hi Vassil, thanks for upstreaming this! I think it goes into a good direction.

The last time I looked at the Cling sources downstream, it was based on LLVM release 5.0. The IncrementalJIT class was based on what we call OrcV1 today. OrcV1 is long deprecated and even though it's still in tree today, it will very likely be removed in the upcoming release cycle. So I guess, one of the challenges will be porting the Cling internals to OrcV2 -- a lot has changed, mostly for the better :) Not all of this is relevant for this patch, but maybe it's worth mentioning for upcoming additions.

OrcV2 works with Dylibs, basically symbol namespaces. When you add a module to a Dylib, all its symbols will be added. Conversely, if you want to remove something from a Dylib, you have to remove the symbols (for fine-tuning this you can reach for the Dylib's ResourceTracker). Symbols won't be materialized until you look them up. I guess for incremental compilation you would keep on adding symbols, one increment at a time.
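
To make that flow concrete, here is a minimal sketch against the LLJIT API of that era (error handling condensed; not code from this patch):

#include "llvm/ExecutionEngine/Orc/LLJIT.h"
#include "llvm/ExecutionEngine/Orc/ThreadSafeModule.h"

// Adding a module only registers its symbols in the JITDylib; nothing
// is compiled until the first lookup materializes the definition.
llvm::Expected<int> addAndRun(std::unique_ptr<llvm::Module> M,
                              std::unique_ptr<llvm::LLVMContext> Ctx) {
  auto JIT = llvm::orc::LLJITBuilder().create();
  if (!JIT)
    return JIT.takeError();
  if (auto Err = (*JIT)->addIRModule(
          llvm::orc::ThreadSafeModule(std::move(M), std::move(Ctx))))
    return std::move(Err);
  auto Sym = (*JIT)->lookup("f"); // materializes f and what it needs
  if (!Sym)
    return Sym.takeError();
  auto *F = (int (*)())Sym->getAddress(); // JITTargetAddress -> fn ptr
  return F();
}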

int var1 = 42; int f() { return var1; }
int var2 = f();

Let's say these are two inputs. The first line only adds definitions for symbols var1 and f() but won't materialize anything. The second line would have to lookup f(), execute it and emit a new definition for var2. I never got into Cling deep enough to find out how it works, but I assume it's high-level enough that it won't require large changes. One thing I'd recommend to double-check: if there is a third line that adds a static constructor, will LLJIT only run this one or will it run all previous static ctors again when calling initialize()? I assume the former but I wouldn't bet on it.

Another aspect is that downstream Cling is based on RuntimeDyld for linking Orc's output object files. I remember RemovableObjectLinkingLayer adding some object file removal code. Upstream OrcV2 grew its own linker in the meantime. It's called JITLink and gets pulled into LLJIT via ObjectLinkingLayer. RuntimeDyld-based linking is still supported with the RTDyldObjectLinkingLayer. JITLink is not complete for all platforms yet. Thus, LLJITBuilder defaults to JITLink on macOS and RuntimeDyld otherwise. Chances are that JITLink will get good enough for ELF to be enabled by default on Linux (at least x86-64). I guess that's your platform of concern? The related question is whether you are aiming for JITLink straight away or staying with RuntimeDyld for now.

For the moment, I added a few pointers inline. Some are referring to my general comments above.

clang/lib/CodeGen/CodeGenAction.cpp
888

When compiling incrementally, does a "new phase" mean that all subsequent code will go into a new module from then on? How will dependencies on previous symbols be handled? Are all symbols external?

The issue is that new JITs take ownership of the llvm::Module*

That's true, but you can still keep a raw pointer to it, which will be valid at least as long as the module hasn't been linked. Afterwards it depends on the linker:

  • RuntimeDyld can return ownership of the object's memory range via NotifyEmittedFunction
  • JITLink provides the ReturnObjectBufferFunction for the same purpose

seems to make it impossible to support jitted code removal with that model

Can you figure out what symbols are affected and remove these? A la: https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/include/llvm/ExecutionEngine/Orc/Core.h#L1020

I think @anarazel has ported a client with code removal to OrcV2 successfully in the past. Maybe there's something we can learn from it.

clang/lib/Interpreter/IncrementalExecutor.h
37

Should we maybe merge runCtors and addModule?

+1 even though there may be open questions regarding incremental initialization.

The case we have is when there is no JIT -- currently we have such a case in IncrementalProcessingTest

Can you run anything if there is no JIT? I think what you have in IncrementalProcessing.EmitCXXGlobalInitFunc is getGlobalInit(llvm::Module*), which checks for symbol names with a specific prefix.

before we runCtors we make a pass over the LLVM IR, collect some specific details, possibly change the IR, and then run.

The idiomatic solution for such modifications would use an IRTransformLayer as in:
https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/examples/OrcV2Examples/LLJITWithOptimizingIRTransform/LLJITWithOptimizingIRTransform.cpp#L108
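
For reference, the shape of that idiom, assuming an existing LLJIT instance J (a sketch, not this patch's code):

// Install a transform that sees every module before it is compiled;
// the callback may inspect or rewrite the IR and must return the TSM.
J->getIRTransformLayer().setTransform(
    [](llvm::orc::ThreadSafeModule TSM,
       llvm::orc::MaterializationResponsibility &R)
        -> llvm::Expected<llvm::orc::ThreadSafeModule> {
      TSM.withModuleDo([](llvm::Module &M) {
        // e.g. collect atexit registrations or tweak the IR here
      });
      return std::move(TSM);
    });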

Another example, which will show up in future patches, is the registration of atexit handlers

atexit handlers as well as global ctors/dtors should be covered by LLJIT PlatformSupport. The LLJITBuilder will inject a GenericLLVMIRPlatformSupport instance by default:
https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L125

It's not as comprehensive as e.g. the MachO implementation, but should be sufficient for your use-case as you have IR for all your JITed code. (It would NOT work if you cached object files, reloaded them in a subsequent session and wanted to run their ctors.) So, your below call to initialize() should do it already.
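
In other words, a sketch assuming an LLJIT instance J inside a function returning llvm::Error:

// Runs static ctors discovered so far (via GenericLLVMIRPlatformSupport).
if (auto Err = J->initialize(J->getMainJITDylib()))
  return Err;
// ... later, at interpreter shutdown: runs interposed atexit handlers.
if (auto Err = J->deinitialize(J->getMainJITDylib()))
  return Err;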

clang/unittests/Interpreter/InterpreterTest.cpp
30

Warning: std::move prevents copy elision

Hi Stefan,

Thanks a lot for the details you shared. They are really helpful to me.

Hi Vassil, thanks for upstreaming this! I think it goes into a good direction.

The last time I looked at the Cling sources downstream, it was based on LLVM release 5.0. The IncrementalJIT class was based on what we call OrcV1 today. OrcV1 is long deprecated and even though it's still in tree today, it will very likely be removed in the upcoming release cycle. So I guess, one of the challenges will be porting the Cling internals to OrcV2 -- a lot has changed, mostly for the better :) Not all of this is relevant for this patch, but maybe it's worth mentioning for upcoming additions.

Cling is currently being upgraded to llvm9 (https://github.com/vgvassilev/cling/tree/upgrade_llvm90). I expect it to be upgraded again, to llvm12, by the end of this year, and the big challenge will be making good use of OrcV2. We need to support code removal of the kind:

[cling] struct Adder { double Add(double a, double b) { return a - b; } }; // comes in a "script" file.
[cling] Adder adder; printf("%f\n", adder.Add(1, 2)); // realize that we have a mistake.
[cling] .undo 2
[cling] struct Adder { double Add(double a, double b) { return a + b; } }; // comes in a "script" file.
[cling] Adder adder; printf("%f\n", adder.Add(1, 2));

Some more details can be seen here -- https://blog.llvm.org/posts/2020-11-30-interactive-cpp-with-cling/

In the example above the JIT will need to remove objects from its state (including already created machine code). Until recently that was not entirely possible with OrcV2. Lang was actively working on this, and maybe it now works well.

The other aspect of this is that upon unloading of these pieces of code we need to run the destructors (that's why we need some non-canonical handling of when we run the atexit handlers).

I'd very much like to jump to OrcV2 with this patch, but that would cause me a longer-term problem, as the initial version of minimal cling (clang-repl) would then already be substantially different from cling itself. My upstreaming strategy was to make a minimal patch whose design is as close as possible to cling's. Then, depending on the comments, we will start evolving both systems in such a way that cling is not left substantially behind, as it still covers the majority of the interactive C++ cases which we care about.

OrcV2 works with Dylibs, basically symbol namespaces. When you add a module to a Dylib, all its symbols will be added. Conversely, if you want to remove something from a Dylib, you have to remove the symbols (for fine-tuning this you can reach for the Dylib's ResourceTracker). Symbols won't be materialized until you look them up. I guess for incremental compilation you would keep on adding symbols, one increment at a time.

All this sounds super nice and I am eager to start gradually using it.

int var1 = 42; int f() { return var1; }
int var2 = f();

Let's say these are two inputs. The first line only adds definitions for symbols var1 and f() but won't materialize anything. The second line would have to lookup f(), execute it and emit a new definition for var2. I never got into Cling deep enough to find out how it works, but I assume it's high-level enough that it won't require large changes. One thing I'd recommend to double-check: if there is a third line that adds a static constructor, will LLJIT only run this one or will it run all previous static ctors again when calling initialize()? I assume the former but I wouldn't bet on it.

Would that capture your concern?

./bin/clang-repl 
clang-repl> extern "C" int printf(const char*,...);
clang-repl> int var1 = 42; int f() { return printf("init_once\n"); }
clang-repl> int var2 = f();
init_once
clang-repl> int var3 = f();
init_once
clang-repl> struct S{ S(const char*) {} } s("");
clang-repl>

I think the reason we do not rerun the static constructors is this tweak we have in codegen https://github.com/llvm/llvm-project/commit/188ad3ac02d06

Another aspect is that downstream Cling is based on RuntimeDyld for linking Orc's output object files. I remember RemovableObjectLinkingLayer adding some object file removal code. Upstream OrcV2 grew its own linker in the meantime. It's called JITLink and gets pulled into LLJIT via ObjectLinkingLayer. RuntimeDyld-based linking is still supported with the RTDyldObjectLinkingLayer. JITLink is not complete for all platforms yet. Thus, LLJITBuilder defaults to JITLink on macOS and RuntimeDyld otherwise. Chances are that JITLink will get good enough for ELF to be enabled by default on Linux (at least x86-64). I guess that's your platform of concern?

We also care about COFF.

The related question is whether you are aiming for JITLink straight away or staying with RuntimeDyld for now.

I'd prefer to stay closer to cling, at least for this initial patch. That'd mean sticking with RuntimeDyld and switching when cling is ready (presumably by the end of this year).

clang/lib/CodeGen/CodeGenAction.cpp
888

When compiling incrementally, does a "new phase" mean that all subsequent code will go into a new module from then on? How will dependencies on previous symbols be handled? Are all symbols external?

There is some discussion on this here https://reviews.llvm.org/D34444#812418

I think the relevant bit is that 'we have just one ever growing TU [...] which we send to the RuntimeDyLD allowing only JIT to resolve symbols from it. We aid the JIT when resolving symbols with internal linkage by changing all internal linkage to external (We haven't seen issues with that approach)'.
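
A minimal sketch of the linkage promotion described in that quote (not the actual cling code, just the general shape of such a pass over a module):

// Promote every internal-linkage definition to external linkage so the
// JIT can resolve references to it from later incremental modules.
static void externalizeInternals(llvm::Module &M) {
  for (llvm::GlobalValue &GV : M.global_values())
    if (GV.hasLocalLinkage())
      GV.setLinkage(llvm::GlobalValue::ExternalLinkage);
  // A real implementation would also need to uniquify names to avoid
  // collisions between inputs; elided here.
}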

The issue is that new JITs take ownership of the llvm::Module*

That's true, but you can still keep a raw pointer to it, which will be valid at least as long as the module hasn't been linked.

That was my first implementation when I upgraded cling to llvm9 where the shared_ptrs went to unique_ptrs. This was quite problematic for many of the use cases we support as the JIT is somewhat unpredictable to the high-level API user.

Afterwards it depends on the linker:

  • RuntimeDyld can return ownership of the object's memory range via NotifyEmittedFunction
  • JITLink provides the ReturnObjectBufferFunction for the same purpose

That's exactly what we ended up doing (I would like to thank Lang here who gave a similar advice).

seems to make it impossible to support jitted code removal with that model

Can you figure out what symbols are affected and remove these? A la: https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/include/llvm/ExecutionEngine/Orc/Core.h#L1020

I think @anarazel has ported a client with code removal to OrcV2 successfully in the past. Maybe there's something we can learn from it.

Indeed. That's not yet on my radar, as it seemed somewhat distant in time.

clang/lib/Interpreter/IncrementalExecutor.h
37

Should we maybe merge runCtors and addModule?

+1 even though there may be open questions regarding incremental initialization.

The case we have is when there is no JIT -- currently we have such a case in IncrementalProcessingTest

Can you run anything if there is no JIT? I think what you have in IncrementalProcessing.EmitCXXGlobalInitFunc is getGlobalInit(llvm::Module*), which checks for symbol names with a specific prefix.

Yes, I'd think such a mode is useful for testing, but also for other cases where the user is handed a Transaction* and allowed to make some modifications before processing the llvm::Module.

before we runCtors we make a pass over the LLVM IR, collect some specific details, possibly change the IR, and then run.

The idiomatic solution for such modifications would use an IRTransformLayer as in:
https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/examples/OrcV2Examples/LLJITWithOptimizingIRTransform/LLJITWithOptimizingIRTransform.cpp#L108

That looks very nice. It assumes the JIT is exposed to the users; here we expose only the llvm::Module (I am not arguing whether that's a good idea in general).

Another example, which will show up in future patches, is the registration of atexit handlers

atexit handlers as well as global ctors/dtors should be covered by LLJIT PlatformSupport. The LLJITBuilder will inject a GenericLLVMIRPlatformSupport instance by default:
https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L125

Does that give me control over when the atexit handlers are called? Can the interpreter call them at a time of its choosing?

It's not as comprehensive as e.g. the MachO implementation, but should be sufficient for your use-case as you have IR for all your JITed code. (It would NOT work if you cached object files, reloaded them in a subsequent session and wanted to run their ctors.) So, your below call to initialize() should do it already.

teemperor added inline comments.Feb 17 2021, 12:56 AM
clang/lib/Interpreter/IncrementalParser.cpp
64

Alright, given that this is just passing along the CompletionConsumer from Ci to Sema, I think this can stay then.

185

The + 2 here is probably not what you want. This will just give you a pointer into Clang's source buffers and will eventually point to random source buffers (or worse) once InputCount is large enough.

I feel like the proper solution is to just use the StartOfFile Loc and don't add any offset to it. I think Clang should still be able to figure out a reasonable ordering for overload candidates etc.

(I thought I already commented on this line before, but I can't see my comment or any replies on Phab so I'm just commenting again).

190

I meant that the EnterSourceFile call returns a bool error (along with emitting a diagnostic). I think the only way it should fail is if the previous code somehow got messed up, hence the assert suggestion.

clang/lib/Interpreter/Interpreter.cpp
83

The PCM files generated with -gmodules are object files that contain the Clang AST inside them. To deal with the object-file wrapping and get to the AST inside, we need the ObjectFilePCHContainerReader. But that class is part of CodeGen, which isn't a dependency of the normal parsing logic, so these classes can't be hooked up automatically by Clang like the other 'ContainerReader' classes (well, there is only one other class, which just opens the file normally).

So, to work around the fact that parsing code now requires a CodeGen dependency, every clang tool (which usually depends on CodeGen + parsing code) has to manually register the ObjectFilePCHContainer* classes.

My point is just that it's not sustainable to have everyone copy these three lines around, as otherwise Clang won't handle any PCM file that was generated with -gmodules.
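
For context, the registration boilerplate in question looks roughly like this in existing clang tools (a sketch; Clang is assumed to be a CompilerInstance):

#include "clang/CodeGen/ObjectFilePCHContainerOperations.h"

// Register the object-file PCH container handlers so PCM files built
// with -gmodules can be read and written.
std::shared_ptr<clang::PCHContainerOperations> PCHOps =
    Clang->getPCHContainerOperations();
PCHOps->registerWriter(
    std::make_unique<clang::ObjectFilePCHContainerWriter>());
PCHOps->registerReader(
    std::make_unique<clang::ObjectFilePCHContainerReader>());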

I think the FIXME for this could be:
FIXME: Clang should register these container operations automatically.

clang/tools/clang-repl/CMakeLists.txt
4

On my Linux setup the dependencies were required (and it seems logical that they are required), so I would just add them.

lhames added inline comments.Feb 17 2021, 7:26 PM
clang/lib/CodeGen/CodeGenAction.cpp
888

Recently, we thought about having a new FrontendAction callback for beginning a new phase when compiling incremental input. We need to keep track of the created objects (needed for error recovery) in our Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen. The issue is that new JITs take ownership of the llvm::Module* which seems to make it impossible to support jitted code removal with that model (cc: @lhames, @rsmith).

In the new APIs, in order to enable removable code, you can associate Modules with ResourceTrackers when they're added to the JIT. The ResourceTrackers then allow for removal. Idiomatic usage looks like:

auto Mod = /* create module */;
auto RT = JD.createResourceTracker();
J.addModule(RT, std::move(Mod));
//...
if (auto Err = RT.remove())
  /* handle Err */;

we have just one ever growing TU [...] which we send to RuntimeDyld...

So is a TU the same as an llvm::Module in this context? If so, how do you reconcile that with the JIT taking ownership of modules? Are you just copying the Module each time before adding it?

We need to keep track of the created objects (needed for error recovery) in our Transaction.

Do you need the Module* for error recovery? Or just the Decls?

clang/lib/Interpreter/IncrementalExecutor.cpp
30–52

I think this can be shortened to:

using namespace llvm::orc;
llvm::ErrorAsOutParameter EAO(&Err);

if (auto JitOrErr = LLJITBuilder.create())
  Jit = std::move(*JitOrErr);
else {
  Err = JitOrErr.takeError();
  return;
}

const auto &DL = Jit->getDataLayout();
if (auto PSGOrErr = DynamicLibrarySearchGenerator::GetForCurrentProcess(DL.getGlobalPrefix()))
  Jit->getMainJITDylib().addGenerator(std::move(*PSGOrErr));
else {
  Err = PSGOrErr.takeError();
  return;
}

You don't need the call to llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); any more: DynamicLibrarySearchGenerator::GetForCurrentProcess does that for you.

57–60

This doesn't look right. The ThreadSafeContext has to contain the LLVMContext for the module, but here you're creating a new unrelated ThreadSafeContext.

clang/lib/Interpreter/IncrementalExecutor.h
37

Should we maybe merge runCtors and addModule?

+1 even though there may be open questions regarding incremental initialization.

In the long term constructors should be run via the Orc runtime (currently planned for initial release in LLVM 13 later this year). I like the idea of keeping "add module" and "run initializers" as two separate steps, with initializers being run only when you execute a top level expression. It would allow for workflows like this:

interpreter% :load a.cpp
interpreter% :load b.cpp

where an initializer in a.cpp depends on code in b.cpp. It would also allow for defining constructors with forward references in the REPL itself.

The Orc runtime is currently focused on emulating the usual execution environment: The canonical way to execute initializers is by calling jit_dlopen on the target JITDylib. I think the plan should be to generalize this behavior (either in the jit_dlopen contract, or by introducing a jit_dlopen_repl function) to allow for repeated calls to dlopen, with each subsequent dlopen call executing any discovered-but-not-yet-run initializers.

Does that give me control over when the atexit handlers are called? Can the interpreter call them at a time of its choosing?

It's not as comprehensive as e.g. the MachO implementation, but should be sufficient for your use-case as you have IR for all your JITed code. (It would NOT work if you cached object files, reloaded them in a subsequent session and wanted to run their ctors.) So, your below call to initialize() should do it already.

Yep -- initialize should run the constructors, which should call cxa_atexit. The cxa_atexit calls should be interposed by GenericLLVMIRPlatform, and the atexits run when you call LLJIT::deinitialize on the JITDylib. There are some basic regression tests for this, but it hasn't been stress tested yet.

GenericLLVMIRPlatform should actually support initializers in cached object files that were compiled from Modules added to LLJIT: the platform replaces llvm.global_ctors with an init function with a known name, then looks for that name in objects loaded from the cache. At least that was the plan; I don't recall whether it has actually been tested. What definitely doesn't work is running initializers in objects produced outside LLJIT. That will be fixed by JITLink/ELF and the Orc Runtime, though (and already works for MachO in the runtime prototype).

clang/lib/Interpreter/IncrementalParser.cpp
156–160

Wherever this CodeGenAction is created, it's probably the missing piece of the ThreadSafeContext puzzle: CodeGenAction's constructor takes an LLVMContext*, creating a new LLVMContext if the argument is null. In your system (at least to start with) I guess you will want to create one ThreadSafeContext associated with the interpreter and pass a pointer to that context to all CodeGenActions.
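
A sketch of that wiring, assuming the interpreter owns a single long-lived context (names illustrative, not the patch's code):

// One LLVMContext for the whole session, wrapped for the JIT...
auto TSCtx = std::make_unique<llvm::orc::ThreadSafeContext>(
    std::make_unique<llvm::LLVMContext>());
// ...and handed to every incremental CodeGenAction so all emitted
// modules share it.
auto Act =
    std::make_unique<clang::EmitLLVMOnlyAction>(TSCtx->getContext());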

v.g.vassilev marked 8 inline comments as done.

Address comments from Raphael and Stefan.

clang/lib/Interpreter/IncrementalParser.cpp
64

Cool, thanks!

185

The + 2 here is probably not what you want. This will just give you a pointer into Clang's source buffers and will eventually point to random source buffers (or worse) once InputCount is large enough.

Indeed.

I feel like the proper solution is to just use the StartOfFile Loc and don't add any offset to it. I think Clang should still be able to figure out a reasonable ordering for overload candidates etc.

That particular part of the input processing has been causing a lot of trouble for cling over the years. If we use StartOfFile, each new line will appear before the previous one, which can be problematic, as you say, for diagnostics, but also for template instantiations.

Cling ended up solving this by introducing a virtual file with an impossible-to-allocate size of 1U << 15U (https://github.com/vgvassilev/cling/blob/da1bb78f3dea4d2bf19b383aeb1872e9f2b117ad/lib/Interpreter/CIFactory.cpp#L1516-L1527 and https://github.com/vgvassilev/cling/blob/da1bb78f3dea4d2bf19b383aeb1872e9f2b117ad/lib/Interpreter/IncrementalParser.cpp#L394). Then we are safe to use Loc + 1 (I do not remember how that + 2 actually came about).

I wonder if that's something we should do here?
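
The resulting scheme would be roughly the following, assuming a virtual FileID VirtualFID registered with the SourceManager SM (names illustrative):

// Each incremental input gets a location one offset later than the
// previous one inside the huge virtual file.
clang::SourceLocation NewLoc =
    SM.getLocForStartOfFile(VirtualFID).getLocWithOffset(InputCount + 1);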

(I thought I already commented on this line before, but I can't see my comment or any replies on Phab so I'm just commenting again).

190

Got it now; I somehow overlooked that EnterSourceFile returns true on failure. I decided to return an error.

clang/tools/clang-repl/CMakeLists.txt
4

Ok, readded.

v.g.vassilev marked 7 inline comments as done.

Address Lang's comments.

clang/lib/CodeGen/CodeGenAction.cpp
888

Recently, we thought about having a new FrontendAction callback for beginning a new phase when compiling incremental input. We need to keep track of the created objects (needed for error recovery) in our Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen. The issue is that new JITs take ownership of the llvm::Module* which seems to make it impossible to support jitted code removal with that model (cc: @lhames, @rsmith).

In the new APIs, in order to enable removable code, you can associate Modules with ResourceTrackers when they're added to the JIT. The ResourceTrackers then allow for removal. Idiomatic usage looks like:

auto Mod = /* create module */;
auto RT = JD.createResourceTracker();
J.addModule(RT, std::move(Mod));
//...
if (auto Err = RT.remove())
  /* handle Err */;

Nice, thanks!

we have just one ever growing TU [...] which we send to RuntimeDyld...

So is a TU the same as an llvm::Module in this context? If so, how do you reconcile that with the JIT taking ownership of modules? Are you just copying the Module each time before adding it?

Each incremental chunk with which the TU grows has a corresponding llvm::Module. Once clang's CodeGen is done with the particular module, it transfers ownership to the Transaction, which, in turn, hands it to the JIT; once the JIT is done, the Transaction takes ownership again.

We need to keep track of the created objects (needed for error recovery) in our Transaction.

Do you need the Module* for error recovery? Or just the Decls?

Yes, we need a llvm::Module that corresponds to the Decls as sometimes CodeGen will decide not to emit a Decl.

clang/lib/Interpreter/IncrementalExecutor.cpp
30–52

Cool, thanks!

57–60

Thanks. I think I fixed it now. Can you take a look?

clang/lib/Interpreter/IncrementalExecutor.h
37

@sgraenitz, @lhames, thanks for the clarifications.

I am marking your comments as resolved (for easier tracking on my end). If the intent was to change something in this patch, could you elaborate a little more on what specifically I need to do here?

The other aspect of this is that upon unloading of these pieces of code we need to run the destructors (that's why we need some non-canonical handling of when we run the atexit handlers).

I just noticed this comment. I think long term you could handle this by introducing an "initialization generation" -- each time you run jit_dlopen_repl you would increment the generation. You'd point the __cxa_atexit alias at a custom function that keeps a map: __dso_handle -> (generation -> [ atexits ]). Then you could selectively run atexits for each generation before removing them.
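
A hypothetical sketch of that bookkeeping (every name here is an assumption for illustration, not LLJIT API):

#include <map>
#include <vector>

struct AtExitEntry {
  void (*Func)(void *);
  void *Arg;
};
// __dso_handle -> (generation -> registered handlers)
static std::map<void *, std::map<unsigned, std::vector<AtExitEntry>>>
    AtExits;
static unsigned CurrentGeneration = 0; // bumped on each jit_dlopen_repl

// The custom function the __cxa_atexit alias would point at: records
// the handler into the current generation.
extern "C" int repl_cxa_atexit(void (*Func)(void *), void *Arg,
                               void *DSOHandle) {
  AtExits[DSOHandle][CurrentGeneration].push_back({Func, Arg});
  return 0;
}

// Rolling back one generation runs just its handlers, newest first.
static void runAtExitsFor(void *DSOHandle, unsigned Gen) {
  auto &Handlers = AtExits[DSOHandle][Gen];
  for (auto It = Handlers.rbegin(); It != Handlers.rend(); ++It)
    It->Func(It->Arg);
  AtExits[DSOHandle].erase(Gen);
}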

clang/lib/CodeGen/CodeGenAction.cpp
888

Each incremental chunk with which the TU grows has a corresponding llvm::Module. Once clang's CodeGen is done with the particular module, it transfers ownership to the Transaction, which, in turn, hands it to the JIT; once the JIT is done, the Transaction takes ownership again.

Yes, we need a llvm::Module that corresponds to the Decls as sometimes CodeGen will decide not to emit a Decl.

Can you elaborate on this? (Or point me to the relevant discussion / code?)

Does CodeGen aggregate code into the Module as you CodeGen each incremental chunk? Or do you Link the previously CodeGen'd module into a new one?

clang/lib/Interpreter/IncrementalExecutor.cpp
57–60

Yep -- This looks right now.

clang/lib/Interpreter/IncrementalExecutor.h
37

I don't think there's anything to do here -- those notes were just background info.

v.g.vassilev marked 3 inline comments as done.Feb 23 2021, 4:27 AM
v.g.vassilev added inline comments.
clang/lib/CodeGen/CodeGenAction.cpp
888

Each incremental chunk with which the TU grows has a corresponding llvm::Module. Once clang's CodeGen is done with the particular module, it transfers ownership to the Transaction, which, in turn, hands it to the JIT; once the JIT is done, the Transaction takes ownership again.

Yes, we need a llvm::Module that corresponds to the Decls as sometimes CodeGen will decide not to emit a Decl.

Can you elaborate on this? (Or point me to the relevant discussion / code?)

Does CodeGen aggregate code into the Module as you CodeGen each incremental chunk? Or do you Link the previously CodeGen'd module into a new one?

Cling's "code unloading" rolls back the states of the various objects without any checkpointing. Consider the two subsequent incremental inputs: int f() { return 12; } and int i = f();; undo 1.

When we ask CodeGen to generate code for the first input, it will not, as f is not being used. Transaction1 will contain the FunctionDecl* for f, but the corresponding llvm::Module will be empty. Then, when we get the second input line, Transaction2 will contain the VarDecl*, but the corresponding llvm::Module will contain both IR definitions -- of f and of i.

Having the clang::Decl is useful because we can restore the previous state of the various internal frontend structures such as lookup tables. However, we cannot just drop the llvm::Module as it might contain deferred declarations which were emitted due to a use.

That's pretty much the rationale behind this and the design dates back to pre-MCJIT times. I am all for making this more robust but that's what we currently have. The "code unloading" is mostly done in cling's DeclUnloader.

There was some useful discussion about the model here quite some time ago.

rsmith added inline comments.Mar 3 2021, 1:42 PM
clang/test/Interpreter/execute.c
1 ↗(On Diff #325249)

Presumably here (and in all the interpreter tests) we will need to check that we configured Clang and LLVM so that they can actually JIT code for the host machine, and disable the test if not.

ychen added a subscriber: ychen.Mar 3 2021, 6:25 PM
v.g.vassilev marked 2 inline comments as done.

Add a lit feature to check whether llvm has JIT support, in order to selectively run tests.

  • Do not rely on process exit code for --host-supports-jit but parse the true/false flag.
  • Move the IncrementalProcessingTest unittest from unittests/CodeGen to unittests/Interpreter.

Address most of the formatting suggestions.

I think this is making good progress. I found two details in the test code that need attention. The stdout issue might be addressed now or in a follow-up patch soon. Otherwise, this seems ready to land.

@teemperor What about your notes. Are there any open issues remaining?

clang/lib/Interpreter/Interpreter.cpp
94

It looks like clang-repl always dumps errors to stdout currently. This is fine for the interactive use case, but it will be impractical for input/output tests. As a result unit tests e.g. dump:

Note: Google Test filter = InterpreterTest.Errors
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from InterpreterTest
[ RUN      ] InterpreterTest.Errors
In file included from <built-in>:0:
input_line_0:1:1: error: unknown type name 'intentional_error'
intentional_error v1 = 42; 
^
[       OK ] InterpreterTest.Errors (9024 ms)
[----------] 1 test from InterpreterTest (9024 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (9025 ms total)
[  PASSED  ] 1 test.

It would be useful to have an option for streaming diagnostics to an in-memory buffer (and maybe appending them to returned llvm::Error instances in the future). Instead of createDiagnostics() you could pass a TextDiagnosticPrinter via setDiagnostics() here to accomplish that.

Not insisting on having it in this review, but it would be a good follow-up task at least.
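
One possible shape of that follow-up, assuming a CompilerInstance CI (a sketch, not code from this patch):

#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
#include "llvm/Support/raw_ostream.h"

// Stream diagnostics into a string instead of stdout so tests (or a
// future llvm::Error payload) can inspect them.
std::string DiagText;
llvm::raw_string_ostream DiagOS(DiagText);
auto *DiagPrinter =
    new clang::TextDiagnosticPrinter(DiagOS, &CI.getDiagnosticOpts());
CI.createDiagnostics(DiagPrinter, /*ShouldOwnClient=*/true);
// Call DiagOS.flush() before reading DiagText.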

clang/test/Interpreter/execute.c
9 ↗(On Diff #329894)

*nit* this should be a cpp file right? Otherwise, you should write struct S *m = nullptr; here. Also, C doesn't understand the extern "C" above :)

11 ↗(On Diff #329894)

The %p format placeholder in printf is implementation-defined, so the output here varies. Maybe you can do something like this instead:

auto r2 = printf("S[f=%f, m=0x%llx]\n", s.f, (unsigned long long)s.m);
// CHECK-NEXT: S[f=1.000000, m=0x0]

Or reinterpret_cast<unsigned long long>(s.m) if you go the C++ way.

clang/test/lit.cfg.py
97

I couldn't test this on a host that doesn't support JIT, but it looks like a nice "duck typing" way of testing for the feature.

v.g.vassilev marked 2 inline comments as done.

Address comments -- rename test file; add a logging diagnostic consumer.

clang/lib/Interpreter/Interpreter.cpp
94

I should have addressed it; now I get:

[==========] Running 5 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from IncrementalProcessing
[ RUN      ] IncrementalProcessing.EmitCXXGlobalInitFunc
[       OK ] IncrementalProcessing.EmitCXXGlobalInitFunc (17 ms)
[----------] 1 test from IncrementalProcessing (17 ms total)

[----------] 4 tests from InterpreterTest
[ RUN      ] InterpreterTest.Sanity
[       OK ] InterpreterTest.Sanity (8 ms)
[ RUN      ] InterpreterTest.IncrementalInputTopLevelDecls
[       OK ] InterpreterTest.IncrementalInputTopLevelDecls (9 ms)
[ RUN      ] InterpreterTest.Errors
[       OK ] InterpreterTest.Errors (91 ms)
[ RUN      ] InterpreterTest.DeclsAndStatements
[       OK ] InterpreterTest.DeclsAndStatements (8 ms)
[----------] 4 tests from InterpreterTest (116 ms total)

[----------] Global test environment tear-down
[==========] 5 tests from 2 test cases ran. (133 ms total)
[  PASSED  ] 5 tests.
clang/test/Interpreter/execute.c
9 ↗(On Diff #329894)

Thanks! It is C++ indeed and I have changed the file extension.

sgraenitz accepted this revision.Mar 12 2021, 4:03 PM

Thanks. From my side this looks good now.

@rjmccall I'd like your input on this patch in particular, if you have time.

I'm nervous in general about the looming idea of declaration unloading, but the fact that it's been working in Cling for a long time gives me some confidence that we can do that in a way that's not prohibitively expensive and invasive.

@rjmccall I'd like your input on this patch in particular, if you have time.

I've been specifically avoiding paying any attention to this patch because it sounds like an enormous time-sink to review. :) That's not to say it'd be time poorly spent, because it's an intriguing feature, but I just don't have the time to engage with it fully. I can try to give snippets of time if you can pose specific questions. I did at least go back and read the RFC from the summer. I'm not sure I have time to read the review thread up until now; it's quite daunting.

I'm nervous in general about the looming idea of declaration unloading, but the fact that it's been working in Cling for a long time gives me some confidence that we can do that in a way that's not prohibitively expensive and invasive.

I don't really know the jargon here. The biggest problem that I foresee around having a full-featured C REPL is the impossibility of replacing existing types — you might be able to introduce a *new* struct with a particular name, break the redeclaration chain, and have it shadow the old one, and we could probably chase down all the engineering problems that that causes in the compiler, but it's never going to be a particularly satisfying model. If we don't have to worry about that, then I feel like the largest problem is probably the IRGen interaction — in particular, whether we're going to have to start serializing IRGen state the same way that Sema state has to be serialized across PCH boundaries. But I'm sure the people who are working on this have more knowledge of what issues they're seeing than I can just abstractly anticipate.

I'm nervous in general about the looming idea of declaration unloading, but the fact that it's been working in Cling for a long time gives me some confidence that we can do that in a way that's not prohibitively expensive and invasive.

I don't really know the jargon here.

"Code unloading" is mentioned here https://reviews.llvm.org/D96033?id=323531#inline-911819

The biggest problem that I foresee around having a full-featured C REPL is the impossibility of replacing existing types — you might be able to introduce a *new* struct with a particular name, break the redeclaration chain, and have it shadow the old one, and we could probably chase down all the engineering problems that that causes in the compiler, but it's never going to be a particularly satisfying model.

Indeed. Cling did not have this feature until recently. We indeed "shadow" the reachable declaration, maintaining the redecl chain invariants using inline namespaces (relevant code here). I am not sure that approach can work outside C++.

If we don't have to worry about that, then I feel like the largest problem is probably the IRGen interaction — in particular, whether we're going to have to start serializing IRGen state the same way that Sema state has to be serialized across PCH boundaries. But I'm sure the people who are working on this have more knowledge of what issues they're seeing than I can just abstractly anticipate.

We often find ourselves patching that area in CodeGen when upgrading llvm (e.g. here). The current cling model requires the state to be transferred to subsequent calls to IRGen. We briefly touched on that topic in https://reviews.llvm.org/D34444#812418 (and onward), and I thought we had a plan for how to move forward.

I'm nervous in general about the looming idea of declaration unloading, but the fact that it's been working in Cling for a long time gives me some confidence that we can do that in a way that's not prohibitively expensive and invasive.

I don't really know the jargon here.

"Code unloading" is mentioned here https://reviews.llvm.org/D96033?id=323531#inline-911819

I see. So you want to be able to sort of "roll back" Sema to a previous version of the semantic state, and yet you also need IRGen to *not* be rolled back because of lazy emission. That seems... tough. I don't think we're really architected to support that goal — frankly, I'm not sure C and C++ are architected to support that goal — and I'm concerned that it might require a ton of extra complexity and risk for us. I say this not to be dismissive, but to clarify how I see our various responsibilities here. Clang's primary mission is to be a static compiler, and it's been architected with that in mind, and it's acceptable for us to put that mission above other goals if we think they're in conflict. So as I see it, your responsibility here is to persuade us of one of the following:

  • that you can achieve your goals without introducing major problems for Clang's primary mission,
  • that the changes you'll need to make will ultimately have substantial benefits for Clang's primary mission, or
  • that we should change how we think about Clang's primary mission.

We probably need to talk about it.

The biggest problem that I foresee around having a full-featured C REPL is the impossibility of replacing existing types — you might be able to introduce a *new* struct with a particular name, break the redeclaration chain, and have it shadow the old one, and we could probably chase down all the engineering problems that that causes in the compiler, but it's never going to be a particularly satisfying model.

Indeed. Cling did not have this feature until recently. We indeed "shadow" the reachable declaration, maintaining the redecl chain invariants using inline namespaces (relevant code here). I am not sure that approach can work outside C++.

If we don't have to worry about that, then I feel like the largest problem is probably the IRGen interaction — in particular, whether we're going to have to start serializing IRGen state the same way that Sema state has to be serialized across PCH boundaries. But I'm sure the people who are working on this have more knowledge of what issues they're seeing than I can just abstractly anticipate.

We often find ourselves patching that area in CodeGen when upgrading llvm (e.g. here). The current cling model requires the state to be transferred to subsequent calls to IRGen. We briefly touched on that topic in https://reviews.llvm.org/D34444#812418 (and onward), and I thought we had a plan for how to move forward.

Ah, I sort of remember that conversation.

To be clear, what I'm trying to say is that — absent consensus that it's a major architecture shift is appropriate — we need to consider what functionality is reasonably achievable without it. I'm sure that covers most of what you're trying to do; it just may not include everything.

One of the fortunate things about working in a REPL is that many ABI considerations go completely out the window. The most important of those is the abstract need for symbol-matching; that is, there's practically no reason why a C REPL needs to use the simple C symbol mangling, because nobody expects code written outside the REPL to link up with user-entered declarations. So we can be quite aggressive about how we emit declarations "behind the scenes"; C mode really just means C's user-level semantics, calling convention, and type layout, but not any of C's ordinary interoperation/compatibility requirements.

v.g.vassilev added a comment.EditedMar 18 2021, 7:32 AM

I'm nervous in general about the looming idea of declaration unloading, but the fact that it's been working in Cling for a long time gives me some confidence that we can do that in a way that's not prohibitively expensive and invasive.

I don't really know the jargon here.

"Code unloading" is mentioned here https://reviews.llvm.org/D96033?id=323531#inline-911819

I see. So you want to be able to sort of "roll back" Sema to a previous version of the semantic state, and yet you also need IRGen to *not* be rolled back because of lazy emission.

The state should match the state of Sema (at least that's the current case for cling).

That seems... tough. I don't think we're really architected to support that goal — frankly, I'm not sure C and C++ are architected to support that goal — and I'm concerned that it might require a ton of extra complexity and risk for us. I say this not to be dismissive, but to clarify how I see our various responsibilities here. Clang's primary mission is to be a static compiler, and it's been architected with that in mind, and it's acceptable for us to put that mission above other goals if we think they're in conflict. So as I see it, your responsibility here is to persuade us of one of the following:

  • that you can achieve your goals without introducing major problems for Clang's primary mission,
  • that the changes you'll need to make will ultimately have substantial benefits for Clang's primary mission, or
  • that we should change how we think about Clang's primary mission.

We have, over the years, deliberately taken a very conservative stance and have tried to achieve that goal without major modifications to clang in general. Cling has been out-of-tree, so clang design changes were not on the table then, and are not now. In my opinion, we have managed to go a long way using this set of changes in clang, which in my view is fairly minimal considering the use case we enable. My understanding of clang is limited, so I cannot really judge whether the approach we took is sustainable, but over the years we have mostly suffered from having to selectively reset CodeGen state across incremental inputs. Plus some state in Sema, which comes from another feature that requires recursive parsing, but I would not consider that feature central to this particular discussion.

We probably need to talk about it.

+1. Do you use discord/slack/skype?

To be clear, what I'm trying to say is that — absent consensus that it's a major architecture shift is appropriate — we need to consider what functionality is reasonably achievable without it. I'm sure that covers most of what you're trying to do; it just may not include everything.

I understand and would like to thank you for bringing this up! I concur with that preference.

One of the fortunate things about working in a REPL is that many ABI considerations go completely out the window. The most important of those is the abstract need for symbol-matching; that is, there's practically no reason why a C REPL needs to use the simple C symbol mangling, because nobody expects code written outside the REPL to link up with user-entered declarations. So we can be quite aggressive about how we emit declarations "behind the scenes"; C mode really just means C's user-level semantics, calling convention, and type layout, but not any of C's ordinary interoperation/compatibility requirements.

I have never thought about ABI being a bi-directional problem, and indeed we can probably define away one direction. Do I understand that correctly? If I do, that would mean we can still embed the C REPL into third party code and be able to call compiled code from libraries?

We probably need to talk about it.

+1. Do you use discord/slack/skype?

I will try to summarize the discussion here. @rjmccall, @rsmith, please feel free to correct me if I am wrong or to add important points that I missed.
The discussion focused on supporting two major REPL-enabling features.

  1. Error recovery:

[cling] #include <vector> // #1
[cling] std::vector<int> v; v[0].error_here; // #2, we need to undo the template instantiation.
input_line_4:2:26: error: member reference base type 'std::__1::__vector_base<int, std::__1::allocator<int> >::value_type' (aka 'int') is not a structure or union
 std::vector<int> v; v[0].error_here;
                     ~~~~^~~~~~~~~~~

  2. Code unloading:

[cling] .L Adder.h // #1, similar to #include "Adder.h"
[cling] Add(3, 1) // int Add(int a, int b) { return a - b; }
(int) 2
[cling] .U Adder.h // reverts the state prior to #1
[cling] .L Adder.h
[cling] Add(3, 1) // int Add(int a, int b) { return a + b; }
(int) 4

The implementation of (1) requires tracking state in the clang Frontend; (2) requires tracking state in both the clang Frontend and the Backend. We discussed the current DeclUnloader implementation in Cling, which tracks the emission of declarations (from Sema & co) and, upon request, cleans up various data structures such as lookup tables, the AST, and CodeGen. It does not yet free the bump-allocated AST memory but rather makes the AST nodes unreachable. It has a conceptual design very similar to the ASTReader, which adds declarations to various data structures under the hood; the DeclUnloader does the opposite -- it removes declarations from various internal data structures. We agreed that the approach is not intrusive to the architecture of clang and can be very useful for a number of other tools and IDEs.

In addition, John pointed out that we can allow freeing chunks of memory if Sema and CodeGen track template instantiations and various other components more rigorously, which should not be a major challenge. What makes the implementation feasible are two assumptions:
a) each incremental input leaves the compiler in a valid state;
b) each state reversal transitions to a previously valid state.

We seem to have reached consensus that incremental compilation can be supported without major changes to clang's architecture or significant changes to its main mission of being a static compiler.

Sorry for the delay. That seems like a reasonable summary of our discussion. Let me try to lay out the right architecture for this as I see it.

Conceptually, we can split the translation unit into a sequence of partial translation units (PTUs). Every declaration will be associated with a unique PTU that owns it.

The first key insight here is that the owning PTU isn't always the "active" (most recent) PTU, and it isn't always the PTU that the declaration "comes from". A new declaration (that isn't a redeclaration or specialization of anything) does belong to the active PTU. A template specialization, however, belongs to the most recent PTU of all the declarations in its signature — mostly that means that it can be pulled into a more recent PTU by its template arguments.

The second key insight is that processing a PTU might extend an earlier PTU. Rolling back the later PTU shouldn't throw that extension away. For example, if the second PTU defines a template, and the third PTU requires that template to be instantiated at float, that template specialization is still part of the second PTU. Similarly, if the fifth PTU uses an inline function belonging to the fourth, that definition still belongs to the fourth. When we go to emit code in a new PTU, we map each declaration we have to emit back to its owning PTU and emit it in a new module for just the extensions to that PTU. We keep track of all the modules we've emitted for a PTU so that we can unload them all if we decide to roll it back.

Most declarations/definitions will only refer to entities from the same or earlier PTUs. However, it is possible (primarily by defining a previously-declared entity, but also through templates or ADL) for an entity that belongs to one PTU to refer to something from a later PTU. We will have to keep track of this and prevent unwinding to later PTU when we recognize it. Fortunately, this should be very rare; and crucially, we don't have to do the bookkeeping for this if we've only got one PTU, e.g. in normal compilation. Otherwise, PTUs after the first just need to record enough metadata to be able to revert any changes they've made to declarations belonging to earlier PTUs, e.g. to redeclaration chains or template specialization lists.

It should even eventually be possible for PTUs to provide their own slab allocators which can be thrown away as part of rolling back the PTU. We can maintain a notion of the active allocator and allocate things like Stmt/Expr nodes in it, temporarily changing it to the appropriate PTU whenever we go to do something like instantiate a function template. More care will be required when allocating declarations and types, though.

We would want the PTU to be efficiently recoverable from a Decl; I'm not sure how best to do that. An easy option that would cover most declarations would be to make multiple TranslationUnitDecls and parent the declarations appropriately, but I don't think that's good enough for things like member function templates, since an instantiation of that would still be parented by its original class. Maybe we can work this into the DC chain somehow, like how lexical DCs are.
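
A hypothetical shape of that bookkeeping (all names here are assumptions for illustration, not anything in the patch):

#include <vector>
namespace clang { class TranslationUnitDecl; }
namespace llvm { class Module; }

// One entry per partial translation unit in the sequence.
struct PartialTranslationUnit {
  clang::TranslationUnitDecl *TUPart;  // decls this PTU owns
  std::vector<llvm::Module *> Modules; // every module emitted for it,
                                       // including later extensions
  PartialTranslationUnit *Previous;    // the PTU before this one
};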

Thank you for the details -- they will be really useful to me, and that architecture will define away several hard problems we have tried to solve over the years.

Sorry for the delay. That seems like a reasonable summary of our discussion. Let me try to lay out the right architecture for this as I see it.

Conceptually, we can split the translation unit into a sequence of partial translation units (PTUs). Every declaration will be associated with a unique PTU that owns it.

Would it make sense to have each Decl point to its owning PTU, similarly to what we do for the owning module (Decl::getOwningModule)?

In terms of future steps, do you prefer to try implementing what you suggested as part of this patch? I would prefer to land this patch and then add what we discussed here rather than keep piling onto this already bulky patch.

Would it make sense to have each Decl point to its owning PTU, similarly to what we do for the owning module (Decl::getOwningModule)?

I think that's the interface we want, but actually storing the PTU in every Decl that way is probably prohibitive in memory overhead; we need some more compact way to recover it. But maybe it's okay to do something like that if we can spare a bit in Decl. Richard, thoughts here?

In terms of future steps, do you prefer to try implementing what you suggested as part of this patch? I would prefer to land this patch and then add what we discussed here rather than keep piling onto this already bulky patch.

It depends on how much you think your patch is working towards that architecture. Since this is just infrastructure without much in the way of Sema/IRGen changes, it's probably fine. I haven't reviewed it yet, though, sorry.

Would it make sense to have each Decl point to its owning PTU, similarly to what we do for the owning module (Decl::getOwningModule)?

I think that's the interface we want, but actually storing the PTU in every Decl that way is probably prohibitive in memory overhead; we need some more compact way to recover it. But maybe it's okay to do something like that if we can spare a bit in Decl. Richard, thoughts here?

Ha, each Decl has a getTranslationUnitDecl() which may be rewired to point to the PTU...
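
A sketch of that idea, assuming one TranslationUnitDecl per PTU (which is not what the current patch implements):

#include "clang/AST/Decl.h"

// Hypothetical: with one TranslationUnitDecl per PTU, the owning PTU
// falls out of the lexical DeclContext walk that
// Decl::getTranslationUnitDecl() already performs.
const clang::TranslationUnitDecl *getOwningPTU(const clang::Decl *D) {
  // Caveat noted above: an instantiated member is still parented by its
  // original class, so this walk can report the wrong PTU for it.
  return D->getTranslationUnitDecl();
}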

In terms of future steps, do you prefer to try implementing what you suggested as part of this patch? I would prefer to land this patch and then add what we discussed here rather than keep piling onto this already bulky patch.

It depends on how much you think your patch is working towards that architecture. Since this is just infrastructure without much in the way of Sema/IRGen changes, it's probably fine. I haven't reviewed it yet, though, sorry.

If you could skim through the patch it'd be great! I think the only bit that remotely touches on the new architecture is the Transaction class -- it is a pair of a vector of decls and an llvm::Module. I think the vector of Decls would become a PTU in the future.
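
For reference, the shape being described, paraphrased rather than quoted from the patch (the actual declaration lives in clang/include/clang/Interpreter/Transaction.h):

#include <memory>
#include <vector>

// Paraphrased sketch; the clang::Decl and llvm::Module headers are assumed.
struct Transaction {
  std::vector<clang::Decl *> Decls;        // decls from one incremental input
  std::unique_ptr<llvm::Module> TheModule; // IR generated for those decls
};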

@teemperor, could you take another look at this patch -- I believe most of your concerns are addressed now.

Sorry for the delay. I think all my points have been resolved besides the insertion SourceLoc.

clang/lib/Interpreter/IncrementalParser.cpp
185

I think my point then is: If we changed Clang's behaviour to consider the insertion order in the include tree when deciding the SourceLocation order, wouldn't that fix the problem? IIUC that's enough to make this case work the intended way without requiring some fake large source file. It would also make this work in other projects such as LLDB.

So IMHO we could just use StartOfFile as the loc here and then consider the wrong ordering as a Clang bug (and add a FIXME for that here).
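
For clarity, a minimal sketch of the suggested location, assuming SM is the CompilerInstance's SourceManager:

// Use the start of the main file as the insertion point for each
// incremental input, instead of a fake offset into a large buffer.
clang::SourceLocation Loc = SM.getLocForStartOfFile(SM.getMainFileID());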

Address source location comment of teemperor.

v.g.vassilev marked 2 inline comments as done.May 3 2021, 11:48 AM

Sorry for the delay. I think all my points have been resolved besides the insertion SourceLoc.

Apologies, I overlooked this comment. Should be fixed now.

teemperor accepted this revision.May 4 2021, 5:23 AM

I believe everything I pointed out so far has been resolved. I still have one more nit about the unit test though (just to not make it fail on some Windows setups).

FWIW, given that this is in quite the raw state at the moment, I wonder if we should actually keep it out of the list of installed binaries until it reaches some kind of MVP state. Once this lands, it otherwise gets shipped as part of the next clang release and shows up in people's shells.

clang/unittests/Interpreter/InterpreterTest.cpp
87

You still need a #ifdef GTEST_HAS_DEATH_TEST around this (death tests aren't supported on some specific setups)
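
That is, something along these lines (a sketch; the test name and body are placeholders):

#ifdef GTEST_HAS_DEATH_TEST
TEST(InterpreterTest, SomeDeathTest) {
  // Guarded because death tests are unavailable on some setups,
  // e.g. certain Windows configurations.
  EXPECT_DEATH({ /* code expected to abort */ }, "");
}
#endif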

This revision is now accepted and ready to land.May 4 2021, 5:23 AM

(Obviously this should still wait on Richard & John as I think they still have unaddressed objections)

v.g.vassilev marked an inline comment as done.
  • Check if we can use death tests;
  • Do not install clang-repl as being an early stage development.

Thanks, @teemperor. I have addressed your last comments, too.

I wanted to use the opportunity to ping @rjmccall and @rsmith.

rsmith added inline comments.May 10 2021, 6:45 PM
clang/lib/Interpreter/IncrementalParser.cpp
227–236

What are these tokens, exactly? Are we sure it's safe to discard them rather than parsing them?

rsmith accepted this revision.May 10 2021, 6:49 PM

Generally-speaking, we have a plan that I'm happy for us to work towards, and I'm happy for our progress towards that plan to be incremental. Even though this might not be fully in that direction right now, I think that's OK.

clang-tidy and clang-format

This revision was landed with ongoing or failed builds.May 12 2021, 9:24 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project.May 12 2021, 9:24 PM

Thanks everybody for reviewing and helping make this patch better!!

@v.g.vassilev, the test does not appear to be appropriately set up for builds that default to a non-native target:
https://lab.llvm.org/staging/#/builders/126/builds/371/steps/5/logs/FAIL__Clang__execute_cpp

Can you fix or revert until there is a fix?

Apologies. Do you know if REQUIRES: native is sufficient to fix this (trying to avoid churn)?

@lhames and I are working on a patch. We do not have easy access to such machines. Would you mind testing it on the bot?

A speculative application of the above probably helps and would be harmless (I think).

I have a local build I can apply a patch to.

Hi Hubert,

Could you apply the following patch and let me know the output from the failing test? I'm trying to work out whether the JIT is getting the triple or the data layout wrong.

diff --git a/clang/tools/clang-repl/ClangRepl.cpp b/clang/tools/clang-repl/ClangRepl.cpp
index b5b5bf6e0c6b..cbf67f0e163e 100644
--- a/clang/tools/clang-repl/ClangRepl.cpp
+++ b/clang/tools/clang-repl/ClangRepl.cpp
@@ -57,6 +57,12 @@ int main(int argc, const char **argv) {
   llvm::InitializeNativeTarget();
   llvm::InitializeNativeTargetAsmPrinter();
 
+  auto JTMB = ExitOnErr(llvm::orc::JITTargetMachineBuilder::detectHost());
+  llvm::errs() << "triple:     " << JTMB.getTargetTriple().str() << "\n";
+  llvm::errs() << "datalayout: "
+               << ExitOnErr(JTMB.getDefaultDataLayoutForTarget())
+                      .getStringRepresentation()
+               << "\n";
   if (OptHostSupportsJit) {
     auto J = llvm::orc::LLJITBuilder().create();
     if (J)

I've started the build. My tree was a bit stale, so it might not be the fastest.

******************** TEST 'Clang :: Interpreter/execute.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   cat /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/clang-repl | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/FileCheck /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp
--
Exit Code: 1

Command Output (stderr):
--
triple:     powerpc64-ibm-aix7.2.0.0
datalayout: E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
/home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp:7:11: error: CHECK: expected string not found in input
// CHECK: i = 42
          ^
<stdin>:1:1: note: scanning from here
clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> 
^
<stdin>:1:9: note: possible intended match here
clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> 
        ^

Input file: <stdin>
Check file: /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl>  
check:7'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:7'1             ?                                                                                                                                                                 possible intended match
>>>>>>

--

********************

...
Command Output (stderr):

triple: powerpc64-ibm-aix7.2.0.0
datalayout: E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512
error: Added modules have incompatible data layouts: e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)

Thanks @hubert.reinterpretcast!

Ok, looks like the JIT is getting the layout right, but clang-repl is constructing a module with a little-endian layout for some reason. I'm not sure why that would be but it's probably a question for @v.g.vassilev tomorrow.

In the meantime I have conditionally disabled this test on ppc64 in 71a0609a2b.

Ok, looks like the JIT is getting the layout right, but clang-repl is constructing a module with a little-endian layout for some reason.

clang-repl is generating a module consistent with LLVM_DEFAULT_TARGET_TRIPLE:STRING=powerpc64le-linux-gnu.
So it seems this test might need some way of telling clang-repl to use the host triple.

In the meantime I have conditionally disabled this test on ppc64 in 71a0609a2b.

That will help our builds; thanks.

Hi @hubert.reinterpretcast,

Would you mind testing this patch:

diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index 8de38c0afcd9..79acb5bd6898 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -157,7 +157,7 @@ IncrementalCompilerBuilder::create(std::vector<const char *> &ClangArgv) {
   ParseDiagnosticArgs(*DiagOpts, ParsedArgs, &Diags);
 
   driver::Driver Driver(/*MainBinaryName=*/ClangArgv[0],
-                        llvm::sys::getDefaultTargetTriple(), Diags);
+                        llvm::sys::getProcessTriple(), Diags);
   Driver.setCheckInputsExist(false); // the input comes from mem buffers
   llvm::ArrayRef<const char *> RF = llvm::makeArrayRef(ClangArgv);
   std::unique_ptr<driver::Compilation> Compilation(Driver.BuildCompilation(RF));
diff --git a/clang/test/Interpreter/execute.cpp b/clang/test/Interpreter/execute.cpp
index a9beed5714d0..81ab57e955cf 100644
--- a/clang/test/Interpreter/execute.cpp
+++ b/clang/test/Interpreter/execute.cpp
@@ -1,6 +1,5 @@
 // RUN: cat %s | clang-repl | FileCheck %s
 // REQUIRES: host-supports-jit
-// UNSUPPORTED: powerpc64
 
 extern "C" int printf(const char *, ...);
 int i = 42;

Running it now. I've applied the first diff here to the base of my previously reported result to minimize build time.

Does the test try to generate native object files in some way? There is functionality (with some limitations) for that under 32-bit AIX; however, we're running a 64-bit build and we don't have integrated-as capability for that at this time. This is what I'm seeing:

******************** TEST 'Clang :: Interpreter/execute.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   cat /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/clang-repl | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/FileCheck /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp
--
Exit Code: 2

Command Output (stderr):
--
clang-repl: Driver initialization failed. Incremental mode for action is not supported
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/FileCheck /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp

--

********************

That looks like progress! Thanks a lot for investing the time to do that. I tried to get access to such a machine but we don't have them off the shelf.

The test is supposed to create an incremental Clang and a JIT. clang-repl takes the original ProgramAction and tries to turn it into an incremental action using WrapperFrontendAction, adding an EmitLLVMOnlyAction. That is done here.

The diagnostic tells us that we cannot turn some ProgramAction into an incremental one. I am wondering what that action is. clang-repl -Xcc -v should be able to tell us.

@v.g.vassilev, thanks for working with me on this. I understand it is difficult to handle issues that appear on platforms and build configurations one does not have set up.

I've added -Xcc -v and the output is below. It seems it has to do with the implicit -fno-integrated-as currently used with AIX. I'll paste the result with -Xcc -fintegrated-as in my next comment.

$ cat /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/clang-repl -Xcc -v
clang version 13.0.0
Target: powerpc64-ibm-aix7.2.0.0
Thread model: posix
InstalledDir:
 "" -cc1 -triple powerpc64-ibm-aix7.2.0.0 -S -disable-free -main-file-name "<<< inputs >>>" -mrelocation-model pic -pic-level 2 -mframe-pointer=all -fmath-errno -fno-rounding-math -fno-verbose-asm -no-integrated-as -target-cpu pwr7 -mfloat-abi hard -mllvm -treat-scalable-fixed-error-as-warning -gstrict-dwarf -gno-column-info -debugger-tuning=dbx -v -fdata-sections -fcoverage-compilation-dir=/home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build -resource-dir lib/clang/13.0.0 -internal-isystem lib/clang/13.0.0/include -internal-isystem /usr/include -fdeprecated-macro -fno-dwarf-directory-asm -fdebug-compilation-dir=/home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build -ferror-limit 19 -fno-signed-char -fno-use-cxa-atexit -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -fxl-pragma-pack -o "/tmp/hstong/auto.2021W19/<<< inputs >>>-e534ce.s" -x c++ "<<< inputs >>>"
 "/usr/bin/as" -a64 -many -o "<<< inputs >>>.o" "/tmp/hstong/auto.2021W19/<<< inputs >>>-e534ce.s"
clang-repl: Driver initialization failed. Incremental mode for action is not supported

Once I add -Xcc -fintegrated-as, we get:

$ cat /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/clang-repl -Xcc -fintegrated-as
fatal error: error in backend: 64-bit XCOFF object files are not supported yet.
clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> 

I am not sure if there's something to avoid this other than to XFAIL the test somehow while the 64-bit XCOFF integrated-as capability is still pending.
If you have some ideas, please let me know. Meanwhile, I am trying out system-aix as the "feature" to XFAIL on.

I still think the switch to use the process triple in https://reviews.llvm.org/D96033#2759808 is perhaps correct (so even if it does not help currently for the configuration we have running, it could be worthwhile to commit).

If you have some ideas, please let me know. Meanwhile, I am trying out system-aix as the "feature" to XFAIL on.

https://reviews.llvm.org/D102560 posted to use system-aix so other PPC configurations will run the test. Possible further changes are for the code to choose the process triple, for the test to possibly specify -fintegrated-as, and (alternatively) for the system assembler step to be integrated into this use scenario.

Thanks for the fix. Indeed I will commit the change wrt getting the process triple although it may require further adjustments for out-of-process execution.

I'd like to avoid requiring the addition of the intricate -fintegrated-as in the test (but also from users). It seems that the clang driver creates a program action. In cases where we have -fintegrated-as it appends -emit-obj, which is then converted to the frontend::EmitObj action. In that case clang-repl just ignores this and replaces it with frontend::EmitLLVMOnly. IIUC, AIX defaults to -fno-integrated-as, and I am still puzzled (and cannot find the relevant code) about what the program action kind is for AIX in that case (frontend::???). FWIW, I tried to add some functionality to write it to the diagnostics message but it requires opening too many interfaces.

Maybe applying that diff can give us a hint:

diff --git a/clang/lib/Interpreter/IncrementalParser.cpp b/clang/lib/Interpreter/IncrementalParser.cpp
index 70baabfeb8fb..4c2292e0bde9 100644
--- a/clang/lib/Interpreter/IncrementalParser.cpp
+++ b/clang/lib/Interpreter/IncrementalParser.cpp
@@ -55,6 +55,7 @@ public:
                 std::errc::state_not_recoverable,
                 "Driver initialization failed. "
                 "Incremental mode for action is not supported");
+            printf("ActionKind=%d\n", CI.getFrontendOpts().ProgramAction);
             return Act;
           case frontend::ASTDump:
             LLVM_FALLTHROUGH;

IIUC, AIX has a default fno-integrated-as and I am still puzzled (and cannot find the relevant code) what is the program action kind for AIX in that case (frontend::???).

This is rooted in IsIntegratedAssemblerDefault and the relevant code is more a relevant lack of code. The default implementation, ToolChain::IsIntegratedAssemblerDefault, returns false.
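
For reference, the base implementation is essentially the following (paraphrased from clang/include/clang/Driver/ToolChain.h):

// Toolchains that do not override this get the external assembler,
// which on AIX results in the frontend::EmitAssembly program action
// (the ActionKind=7 printed below).
virtual bool IsIntegratedAssemblerDefault() const { return false; }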

Maybe applying that diff can give us a hint:

Output is:

$ cat /home/hstong/.Liodine/llvmproj/clang/test/Interpreter/execute.cpp | /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/build/bin/clang-repl
ActionKind=7
clang-repl: Driver initialization failed. Incremental mode for action is not supported

Thanks!

That patch should probably get us to a point where we can mark the test as XFAIL: system-aix

diff --git a/clang/lib/Interpreter/IncrementalParser.cpp b/clang/lib/Interpreter/IncrementalParser.cpp
index 70baabfeb8fb..84b4d779d43c 100644
--- a/clang/lib/Interpreter/IncrementalParser.cpp
+++ b/clang/lib/Interpreter/IncrementalParser.cpp
@@ -54,7 +54,8 @@ public:
             Err = llvm::createStringError(
                 std::errc::state_not_recoverable,
                 "Driver initialization failed. "
-                "Incremental mode for action is not supported");
+                "Incremental mode for action %d is not supported",
+                CI.getFrontendOpts().ProgramAction);
             return Act;
           case frontend::ASTDump:
             LLVM_FALLTHROUGH;
@@ -63,6 +64,8 @@ public:
           case frontend::ParseSyntaxOnly:
             Act = CreateFrontendAction(CI);
             break;
+          case frontend::EmitAssembly:
+            LLVM_FALLTHROUGH;
           case frontend::EmitObj:
             LLVM_FALLTHROUGH;
           case frontend::EmitLLVMOnly:

If that works on your platform I will happily open a review for the changes.

That patch should probably get us to a point where we can mark the test as XFAIL: system-aix

I've applied that patch to my 64-bit LLVM build and it does cause the object writer to be used (which generates the "64-bit XCOFF object files" error seen before).

If that works on your platform I will happily open a review for the changes.

From the behaviour observed in the 64-bit build, the 32-bit case might not fail similarly. I need to see if I can get a 32-bit build going. For now, the patch would be welcome (without changing from UNSUPPORTED to XFAIL).

The 32-bit case fails elsewhere:

clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> Not yet implemented!
UNREACHABLE executed at /home/hstong/.Nrtphome/.Liodine/llcrossbld/dev/llvm-project/llvm/lib/Object/XCOFFObjectFile.cpp:219!
IOT/Abort trap (core dumped)

So, I don't think the 32-bit and 64-bit cases are going to be synchronized.

Looks like this is also failing on s390x:

error: Added modules have incompatible data layouts: E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)

The problem here is that on s390x we use a different data layout on machines with vector registers vs. machines without. The (module) string above is the version without vector registers (which is presumably selected because there is no -march= argument and the compiler therefore defaults to an old machine), and the (jit) string is the version with vector registers (which is presumably because the JIT auto-detected that it is running on a new machine).

I guess we should either tell the JIT to not autodetect the current processor, or else tell the compiler to target the processor that the JIT autodetected?

@hubert.reinterpretcast, thanks for the feedback. I have created a patch as discussed -- https://reviews.llvm.org/D102688

@uweigand, thanks for reaching out. I believe the patch above should fix your setup. Could you confirm?

Unfortunately, it does not. Changing the triple doesn't affect the architecture the compiler generates code for. If you wanted to change the compiler to generate code for the architecture the JIT detects, the easiest way would probably be to use (the equivalent of) "-march=native", which causes the compiler to also auto-detect the current processor in the same way as the JIT does.
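
A minimal sketch of that second option, assuming the flag is injected into the argument vector handed to the driver (the injection point is this sketch's assumption; the spelling is -march= on s390x and -mcpu= on some other targets):

#include "llvm/ADT/Twine.h"
#include "llvm/Support/Host.h"
#include <string>

// Ask LLVM for the CPU the JIT will auto-detect, the moral equivalent
// of passing -march=native to the compiler.
static std::string getHostCPUFlag() {
  return ("-march=" + llvm::sys::getHostCPUName()).str();
}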

phosek added a subscriber: phosek.May 18 2021, 9:52 AM

We've started seeing LLVM ERROR: out of memory on our 2-stage LTO Linux builders after this change landed. It looks like linking clang-repl always fails on our bot, but I've also seen OOM when linking ClangCodeGenTests and FrontendTests. Do you have any idea why this could be happening? We'd appreciate any help since our bots have been broken for several days now.

Ouch. Are the bot logs public? If not maybe a stacktrace could be useful. clang-repl combines a lot of libraries across llvm and clang that usually are compiled separately. For instance we put in memory most of the clang frontend, the backend and the JIT. Could it be we are hitting some real limit?

Yes, they are, see https://luci-milo.appspot.com/p/fuchsia/builders/prod/clang-linux-x64, but there isn't much information in there unfortunately. It's possible that we're hitting some limit, but these bots use 32-core instances with 128GB RAM which I'd hope is enough even for the LTO build.

gulfem added a subscriber: gulfem.May 18 2021, 11:09 AM

Ah, okay. Could you try this patch:

diff --git a/clang/lib/Interpreter/IncrementalExecutor.cpp b/clang/lib/Interpreter/IncrementalExecutor.cpp
index f999e5eceaed..9a368d9122bc 100644
--- a/clang/lib/Interpreter/IncrementalExecutor.cpp
+++ b/clang/lib/Interpreter/IncrementalExecutor.cpp
@@ -26,12 +26,14 @@
 namespace clang {
 
 IncrementalExecutor::IncrementalExecutor(llvm::orc::ThreadSafeContext &TSC,
-                                         llvm::Error &Err)
+                                         llvm::Error &Err,
+                                         const llvm::Triple &Triple)
     : TSCtx(TSC) {
   using namespace llvm::orc;
   llvm::ErrorAsOutParameter EAO(&Err);
 
-  if (auto JitOrErr = LLJITBuilder().create())
+  auto JTMB = JITTargetMachineBuilder(Triple);
+  if (auto JitOrErr = LLJITBuilder().setJITTargetMachineBuilder(JTMB).create())
     Jit = std::move(*JitOrErr);
   else {
     Err = JitOrErr.takeError();
diff --git a/clang/lib/Interpreter/IncrementalExecutor.h b/clang/lib/Interpreter/IncrementalExecutor.h
index c4e33a390942..b4c6ddec1047 100644
--- a/clang/lib/Interpreter/IncrementalExecutor.h
+++ b/clang/lib/Interpreter/IncrementalExecutor.h
@@ -14,6 +14,7 @@
 #define LLVM_CLANG_LIB_INTERPRETER_INCREMENTALEXECUTOR_H
 
 #include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/Triple.h"
 #include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
 
 #include <memory>
@@ -34,7 +35,8 @@ class IncrementalExecutor {
   llvm::orc::ThreadSafeContext &TSCtx;
 
 public:
-  IncrementalExecutor(llvm::orc::ThreadSafeContext &TSC, llvm::Error &Err);
+  IncrementalExecutor(llvm::orc::ThreadSafeContext &TSC, llvm::Error &Err,
+                      const llvm::Triple &Triple);
   ~IncrementalExecutor();
 
   llvm::Error addModule(std::unique_ptr<llvm::Module> M);
diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index 79acb5bd6898..025bdb14c54f 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -16,6 +16,7 @@
 #include "IncrementalExecutor.h"
 #include "IncrementalParser.h"
 
+#include "clang/AST/ASTContext.h"
 #include "clang/Basic/TargetInfo.h"
 #include "clang/CodeGen/ModuleBuilder.h"
 #include "clang/CodeGen/ObjectFilePCHContainerOperations.h"
@@ -204,8 +205,11 @@ llvm::Expected<Transaction &> Interpreter::Parse(llvm::StringRef Code) {
 llvm::Error Interpreter::Execute(Transaction &T) {
   assert(T.TheModule);
   if (!IncrExecutor) {
+    const llvm::Triple &Triple =
+      getCompilerInstance()->getASTContext().getTargetInfo().getTriple();
     llvm::Error Err = llvm::Error::success();
-    IncrExecutor = std::make_unique<IncrementalExecutor>(*TSCtx, Err);
+    IncrExecutor = std::make_unique<IncrementalExecutor>(*TSCtx, Err, Triple);
+
     if (Err)
       return Err;
   }

Yes, they are, see https://luci-milo.appspot.com/p/fuchsia/builders/prod/clang-linux-x64, but there isn't much information in there unfortunately. It's possible that we're hitting some limit, but these bots use 32-core instances with 128GB RAM which I'd hope is enough even for the LTO build.

I think the specs are fine for just building with LTO, but I am not sure if that's enough for the worst case when running ninja -j320 with an LTO build (which is what your job is doing). Can you try limiting your link jobs to something like 16 or 32 (e.g., -DLLVM_PARALLEL_LINK_JOBS=32)

(FWIW, your go build script also crashes with OOM errors so you really are running low on memory on that node)
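
For instance, an illustrative second-stage configuration (not the bot's actual invocation):

cmake -G Ninja -DLLVM_ENABLE_LTO=On -DLLVM_PARALLEL_LINK_JOBS=32 ../llvm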

Yes, this patch fixes the problem for me. Thanks!

Right: On my system, linking clang with LTO takes 11.5 GB of RAM while clang-repl takes 8.4 GB. If the system has 128 GB, I agree that the issue is likely too many parallel links; the addition of clang-repl triggers this because it adds another large binary that the build system has to process at some point.

@uweigand, thanks again for confirming the patch works for you. The differential revision is here https://reviews.llvm.org/D102756

@teemperor and @Hahnfeld, thanks for the diagnosis. I forgot to mention that we had some exchange on Discord and @phosek kindly agreed to try this special flag.

-j320 is only used for the first stage compiler, which uses distributed compilation and no LTO; the second stage, which uses LTO and where we see this issue, uses the Ninja default, so -j32 in this case.

I admit I don't really know the CI system on your node, but I assumed you're using -j320 from this output which I got by clicking on "execution details" on the aborted stage of this build:

Executing command [
  '/b/s/w/ir/x/w/cipd/ninja',
  '-j320',
  'stage2-check-clang',
  'stage2-check-lld',
  'stage2-check-llvm',
  'stage2-check-polly',
]
escaped for shell: /b/s/w/ir/x/w/cipd/ninja -j320 stage2-check-clang stage2-check-lld stage2-check-llvm stage2-check-polly
in dir /b/s/w/ir/x/w/staging/llvm_build
at time 2021-05-18T20:53:37.215574

(Tabbed on the Submit button instead of finishing that quote block...)

So I assume the stage2- targets are just invoking some ninja invocations in sequence?

Anyway, I think it would be helpful to see what link jobs were in progress. But I guess even with 32 jobs you could end up consuming enough memory across 32 linker invocations that processes start getting OOM-killed.

We're still hitting the OOMs when building clang-repl with LTO even with -DLLVM_PARALLEL_LINK_JOBS=32. While we don't build this target explicitly in our toolchain, it is built when running tests via stage2-check-clang. Is there perhaps a simple cmake flag that allows us to not run clang-repl tests so we don't build it?

To clarify, this is on a machine with 512GB RAM.

@leonardchan, @phosek, I am not aware of such a flag. Do you know how much memory LTO + clang-repl consumes, and would it make sense to ping some of the LTO folks for their advice?

I played around a bit with adding a flag and it turned out to be not as difficult as I thought it would be: D108173. I think making this more of an "opt-in/out" tool like clang-staticanalyzer might be a good idea in general since not all downstream users may want to build all clang tools.
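
By analogy with the existing -DCLANG_ENABLE_STATIC_ANALYZER=OFF switch, that would allow something along these lines (the option name here is an assumption for illustration, not necessarily D108173's final spelling):

cmake -G Ninja -DCLANG_ENABLE_CLANG_REPL=OFF ../llvm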

I would rather fix the underlying issue. @samitolvanen, @pcc do you know who can help us with debugging excessive memory use when linking clang-repl using LTO?