Based on the design doc at
https://docs.google.com/document/d/112O-Q_XrbrU1I4P-oiLCN9u86Qg_BYBdcDsmh7Pn9Nw/edit?usp=sharing,
it handles non-member C++ const objects.
Extensions are discussed in the design doc and will be submitted separately.
Details
Diff Detail
Event Timeline
This definitely needs to be split up into a series of smaller patches to
be reviewable. I strongly suggest starting small, getting through one
cycle of review, then posting a couple more and recursing. Reading
through the diff, I see a lot of points which will need
discussion/clarification.
Also, a bit of background on what you're trying to accomplish is
required. A link to a private google doc is not sufficient.
Philip
include/llvm/IR/GlobalVariable.h | ||
---|---|---|
32–34 | Alphabetize. | |
48 | This is a change to the concept of what a global variable in LLVM IR is, and therefore would require a matching LangRef change (and bitcode reader/writer change). If you need to store information about a GV for a pass, please create a map<GlobalVariable*, IntrinsicInst*> in your pass. | |
include/llvm/IR/Instructions.h | ||
36–38 | Alphabetize. | |
79 | Again, this is a fundamental change to what an alloca instruction is. Question for you, what if there's more than one invariant call on this alloca? Note that this doesn't form a use, so what if someone deletes the invariant intrinsic? This is left hanging and pointing to nothing, or worse someone could reallocate a new different instruction at the same address. | |
lib/Analysis/BasicAliasAnalysis.cpp | ||
499 | What if the GV has an invariant start instruction, but the query is from before the GV initializer has finished running? A "false negative" in this if-expression fails unsafe, skipping this block will cause us to go into code which is written to assume GV->isConstant is true. If that were sufficient, then there's no need to have getInvariantStartInstruction at all, you could just set isConstant instead. You need to check that the invariant has already executed, and the way to do that is to see that it dominates (which currently we can only do in a function-local sense). I don't know what cases you're solving this way, but my stab-in-the-dark guess is that you should probably be getting them with changes to MemoryDependenceAnalysis instead; if you see an invariant.end you can skip the scan of instructions up to the matching invariant.start which it conveniently takes as its first argument. That way, you never even get into the point of querying basicaa about the instructions in between, therefore they don't appear to be potentially conflicting for any user of memdep (GVN, DSE, Memcpyopt). | |
lib/IR/Instructions.cpp | ||
1155–1158 | What about II->getParent() == this->getParent()? | |
lib/Transforms/IPO/GlobalOpt.cpp | ||
2246–2249 | I don't think you need this? GlobalOpt performs symbolic execution of each instruction in order. It will see, and "execute" an invariant start instruction by inserting it into the "Invariants" set above. This is a more accurate model, since before the execution occurs it is not yet constant, and it becomes constant afterwards. I'm not sure what cases processInvariantIntrinsics is catching which the existing code is not, and I'd like to review the GlobalOpt changes independently. | |
2640–2641 | This is what the Evaluator does (except not just for invariant intrinsics). This code is a second evaluator inside the evaluator, please don't do that, just integrate with the rest of GlobalOpt. | |
2646–2648 | for (Function::iterator BB = F->begin(), BE = F->end(); BB != BE; ++BB) { // Function::iterator implicitly converts into a BasicBlock*, so "BB->..." works fine } | |
2648 | "BasicBlock* NextBB" --> "BasicBlock *NextBB". | |
2683 | So the only things Eval.getVal() will return non-null for are Arguments and GlobalValues, because you've entered processInvariantIntrinsics() before evaluating the function. Good news, that's not wrong, it just doesn't have the values. The alternative of running it afterwards would fill in the values, but be actively wrong in a case where you're inside a loop; a loop value incrementing %x from 0 to 5 would show as '5' even if it were used inside the loop. | |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
1966 | I don't think that's right. Proposed counterexample: declare {}* @llvm.invariant.start(i64 %size, i8* nocapture %ptr) declare void @llvm.invariant.end({}* %start, i64 %size, i8* nocapture %ptr) declare i8* @malloc(i32) define void @foo() { %p = call i8* @malloc(i32 1) %inv = call {}* @llvm.invariant.start(i64 -1, i8* %p) %dead = icmp eq i8* %p, i8* null call void @llvm.invariant.end({}* %inv, i64 -1, i8* %p) ret void } If "isAllocSiteRemovable" is true, we should just delete invariant starts and ends. There may be a pile of them, like: %inv1 = call {}* @llvm.invariant.start(i64 -1, i8* %p) call void @llvm.invariant.end({}* %inv1, i64 -1, i8* %p) %inv2 = call {}* @llvm.invariant.start(i64 -1, i8* %p) call void @llvm.invariant.end({}* %inv2, i64 -1, i8* %p) %inv3 = call {}* @llvm.invariant.start(i64 -1, i8* %p) call void @llvm.invariant.end({}* %inv3, i64 -1, i8* %p) and we don't care since we already checked that it's safe to delete them all. Just nuke away! Also, don't dyn_cast + assert the result, just use cast<>. It has the assert inside. |
Hi Philip -- This *is* the initial patch. I can't think of another way to break it down more effectively.
Thanks for pointing out the inaccessibility of the design doc.
It is not meant to be private. If one is unable to access, I believe s/he should be able to request access, which I will promptly grant (?).
Alternatively, I have made a copy available from my personal (and more public-capable) email account.
Please let me know if you encounter any more issue.
include/llvm/IR/GlobalVariable.h | ||
---|---|---|
48 | I have been troubleshooting how to do this better. Will see what I can come up with... Thanks for the pointer. | |
include/llvm/IR/Instructions.h | ||
79 | Same as above. | |
lib/IR/Instructions.cpp | ||
1155–1158 | Good point. Thanks! | |
lib/Transforms/IPO/GlobalOpt.cpp | ||
2246–2249 | True. I was using it as a temporary informational tool. |
Er, no. This patch currently involves changes to 5 different transform passes and 3 analysis passes. This can and must be split into smaller changes. I'd suggest striping out everything not required for *one* of the transform passes and iterate on that.
Thanks for pointing out the inaccessibility of the design doc.
It is not meant to be private. If one is unable to access, I believe s/he should be able to request access, which I will promptly grant (?).
Alternatively, I have made a copy available from my personal (and more public-capable) email account.
Please let me know if you encounter any more issue.
Thank you for sharing the doc publicly. It did help to explain the problem you're trying to solve, but it didn't really give me any insight into how your trying to solve it. Can you sketch out an architectural level summary of how you intent this to work?
I'm with Philip on this one.
This patch looks like it can be split up.
As a rule, you don't have to make everything go at once, you can do it
piece by piece.
(IE it's fine to modify one transform/analysis pass, even if nothing
but tests use it yet, and then add the use in a separate patch).
More to the point, the feedback you are getting says: "If you want
this reviewed by reviewers, you should split it up more".
So unless you can point to concrete reasons it can't be split up, you
should do that.
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1966 | I'm confused by this comment. What is wrong about this? Note that I here represents one of the users collected by "isAllocSiteRemovable" and which are being iterated through... |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1966 | You're right that my examples don't work because I misunderstood what 'I' is. However, I think I can still break this assert: %inv1 = call {}* @llvm.invariant.start(i64 -1, i8* %p) br i1 undef, label %cond_true, label %cond_false cond_true: call void @llvm.invarient.end({}* %inv1, i64 -1, i8* %p) unreachable cond_false; call void @llvm.invarient.end({}* %inv1, i64 -1, i8* %p) unreachable In this case 'I' is %inv1 (one of the users of %p) and it's safe to remove according to isAllocSiteRemovable, yet it has neither zero nor one uses. |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1966 | Ah. Now I understand this one. Thanks! |
lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
2640–2641 | The real reason why this version of processInvariantIntrinsics() is necessary is that EvaluateBlock() could exit before processing the invariant intrinsic call (and marking global variables 'writeonce'), e.g., if a call to a function declaration that we cannot constant fold occurs before the intrinsic call. This processInvariantIntrinsics() complements Evaluator::EvaluateBlock() by pre-traversing the call-stack tree on the same Evaluator instance. The alternative would be to delay exits from Evaluator::EvaluateBlock() while remembering whether to exit with 'false' or 'true'; which would be a messier extension than separating the functions. I'm rewriting this function with respect to other comments below anyway... |
I think I've addressed all the comment by now. I'll be pushing in split-up patches (with corrections) next.
lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
499 | GVs, like AIs, are marked "writeonce written" (via 'setInvariantStartInstruction') immediately after initialization. |
lib/IR/Instructions.cpp | ||
---|---|---|
1155–1158 | Actually, I'm not sure that this works. For example, take an alloca outside of a branch that contains an invariant_start... |
lib/IR/Instructions.cpp | ||
---|---|---|
1155–1158 | Sorry, you're right. To check that they're the same function, it's "II->getParent()->getParent() == getParent()->getParent()". |
lib/IR/Instructions.cpp | ||
---|---|---|
1155–1158 | Thanks. |
Alphabetize.