I'm working on extending the OptimizeGlobalAddressOfMalloc to handle some more general cases. This is to add support of the ConstantExpr use of the global variables. The function allUsesOfLoadedValueWillTrapIfNull is now iterative with the added CE use of GV. Also, the recursive function valueIsOnlyUsedLocallyOrStoredToOneGlobal is changed to iterative using a worklist with the GEP case added.
Details
Diff Detail
Unit Tests
Event Timeline
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
737 | You are right. Will change this to only consider bitcast and gep. | |
962 | I prefer to use for loops as we need add additional code to handle CE uses, but I can change back to use while loop, and add some code to remove those CE uses to ensure the loop runs correctly. | |
1063 | Thanks for the suggestion. This is the original code as it is, I can change it as you suggested - but looks to me no difference as SI is available. |
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
962 | In general, I don't like the early-increment pattern on constants because it's easy to mess up if constant expressions are involved. | |
1063 | The SI->getOperand(1)->stripPointerCasts() != GV seems very suspicious to me; I don't understand why we want to special-case storing the address of a global to itself. For the SI->getOperand(0) == V part, I guess it doesn't really matter. |
This is to address comments: (1) limit the case for ConstantExpr to only bitcast and gep in allUsesOfLoadedValueWillTrapIfNull, (2) change back using while loop to traverse uses of GV in OptimizeGlobalAddressOfMalloc
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
1063 | I think this is to make sure the malloc site for GV is only for GV. |
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
738 | I'm not sure constant folding ensures that CE->stripPointerCasts() looks through exactly one pointer cast? | |
981 | A constant can have more than one use? | |
1063 | In case it wasn't clear, I meant, "For the SI->getOperand(0) == V part, I guess it doesn't really matter how you write it. SI->getOperand(0) == V is essentially equivalent to VUse.getOperandNo() == 0." I don't follow your explanation for the SI->getOperand(1)->stripPointerCasts() != GV part; can you describe it in a bit more detail? |
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
738 | Yes, Should be if (CE->stripPointerCasts() != V). | |
981 | Good point. Will rewrite the code to consider multiple uses of ConstantExpr. | |
1063 | V is from malloc. For stores, V can be used in the following 3 forms:
Case 1 is ok, as it stores through it. Case 2 is the one we care about - store to GV - we are trying to optimize this kind of malloc calls. This check is for case 3 - the malloc value is also stored to memory other than GV - simply return false, as there is no further code currently to handle this for safe transformation. |
This is to address comments: (1) fix for the ConstantExpr case in allUsesOfLoadedValueWillTrapIfNull, (2) Consider multiple uses of ConstantExpr for the loop to traverse uses of GV in OptimizeGlobalAddressOfMalloc. The load/store uses are collected in a set first and then loop over the set.
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
735 | SI->getPointerOperand() != P is alright. It's ok to store into global (actually the StoredOnce). | |
llvm/lib/Transforms/Utils/GlobalStatus.cpp | ||
111 | I only see one user of GlobalStatus - that is processGlobal in GlobalOpt.cpp. I checked all the transformations under this function. Other than tryToOptimizeStoreOfMallocToGlobal, this added code should be ok for all other transformations, either ConstantExpr is handled already or bailout early if there is any use of ConstantExpr. There is only one to bailout early - TryToShrinkGlobalToBoolean - I'll leave this as it is for now as I don't see any real case for this with ConstantExpr. |
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
735 | My concern is that with opaque pointers, it will become possible to write something like store ptr @g, ptr @g, at which point the difference between SI->getPointerOperand() != P and SI->getValueOperand() == P will start to matter. | |
llvm/lib/Transforms/Utils/GlobalStatus.cpp | ||
111 | Okay, thanks for checking. |
llvm/lib/Transforms/IPO/GlobalOpt.cpp | ||
---|---|---|
735 | When this function is called, we know there is only one store to the global, and it is from malloc. There are no other stores to the global. So we don't need to worry about "store @g @g". |
This input might look weird, but I don't think the pass should hit an assert anyway, which it does with this patch:
; RUN: opt < %s -passes='globalopt' -S | FileCheck %s @a162 = internal global i16* null, align 1 define void @f363() { entry: %0 = load i16*, i16** @a162, align 1 store i16 0, i16* bitcast (i16** @a162 to i16*), align 1 ret void }
I get
opt: ../lib/IR/Constants.cpp:2224: static llvm::Constant *llvm::ConstantExpr::getBitCast(llvm::Constant *, llvm::Type *, bool): Assertion `CastInst::castIsValid(Instruction::BitCast, C, DstTy) && "Invalid constantexpr bitcast!"' failed.
in
#9 0x00000000023e5478 processInternalGlobal(llvm::GlobalVariable*, llvm::GlobalStatus const&, llvm::function_ref<llvm::TargetLibraryInfo& (llvm::Function&)>, llvm::function_ref<llvm::DominatorTree& (llvm::Function&)>) GlobalOpt.cpp:0:0 #10 0x00000000023e4438 processGlobal(llvm::GlobalValue&, llvm::function_ref<llvm::TargetLibraryInfo& (llvm::Function&)>, llvm::function_ref<llvm::DominatorTree& (llvm::Function&)>) GlobalOpt.cpp:0:0 #11 0x00000000023e2394 optimizeGlobalsInModule(llvm::Module&, llvm::DataLayout const&, llvm::function_ref<llvm::TargetLibraryInfo& (llvm::Function&)>, llvm::function_ref<llvm::TargetTransformInfo& (llvm::Function&)>, llvm::function_ref<llvm::BlockFrequencyInfo& (llvm::Function&)>, llvm::function_ref<llvm::DominatorTree& (llvm::Function&)>) GlobalOpt.cpp:0:0 #12 0x00000000023e08ce llvm::GlobalOptPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)
This patch caused the following error for me:
error: Explicit load/store type does not match pointee type of pointer operand (Producer: 'LLVMgoogle3-trunk' Reader: 'LLVM google3-trunk')
Unfortunately, the error gives no source-code line and the input file is giant and highly optimized. The fix in D10730 doesn't help.
I'm working on a reduced test case, but it might take a while. If the error sparks any ideas, it would be good to hear them.
Thanks for your information and working on a reduced test case. I'm not sure what the problem is, I'll go through the code (transformation part) a little bit more to see if there is anything suspicious. But it would be great if you could provide a reduced case.
It looks like this patch introduced an assertion in GlobalOpt.
I filed a bug with reproducer here: https://bugs.llvm.org/show_bug.cgi?id=51608
Could you please take a look?
Instead of SI->getPointerOperand() != P, should be SI->getValueOperand() == P?