Add support for llvm.is.constant intrinsic (PR4898)
This adds llvm-side support for post-inline evaluation of the __builtin_constant_p GCC intrinsic.
Differential D4276
Added llvm.is.constant intrinsic jyknight on Jun 24 2014, 8:44 AM. Authored by
Details Add support for llvm.is.constant intrinsic (PR4898) This adds llvm-side support for post-inline evaluation of the __builtin_constant_p GCC intrinsic.
Diff Detail Event TimelineComment Actions I'd add a bit more of tests, especially those that inspired this change... --renato
Comment Actions Thanks Janusz, I'm adding Owen to review this patch, since it's not my area, but overall, the patch looks good. About the tests, though the inline check is important, I was thinking something more elaborate. Some examples that come to mind:
Those examples that *should* be constant and are not, should be marked as FIXME. Those examples that *should* NOT be constant and are, should be fixed before commit. cheers,
Comment Actions I agree that we should have more tests. I'd prefer to have an agreement that we want to add llvm.is.constant intrinsic before spending too much time on writing tests. A follow up work is to add support for the new intrinsic in clang (__builtin_constant_p -> llvm.is.constant) and write more tests in C. I have a few questions.
%a = zext i1 %b to i32 %v = call i1 @llvm.is.constant.i32(i32 %a) and %a = trunc i256 %b to i32 %v = call i1 @llvm.is.constant.i32(i32 %a) will do the job. Is there a preference in the llvm community?
Comment Actions Moved lowering llvm.is.constant to False from LegalizeDAG to SelectionDAG. Code to promote or expand llvm.is.constant is not required any more and the intrinsic works fine with different input integer types.
Comment Actions Depends on how hard would it be to infer constantness at a lower level with the added complexity. I'd say, as a general idea, leaving the IR untouched and adding logic to infer that later would be better, but I'm not sure of all the cases.
Yes, that'd be great. I think we could teach even Clang about that, so that pre-processor constantness is explited at the front-end level. Though, I understand if those things take their time to get implemented. Comment Actions Rewrote patch significantly.
Updated and added more tests; Comment Actions It seems like you're waiting to fold llvm.is.constant until really late in the optimization pipeline; we probably want to fold it sometime in the "middle", so we get better optimization. Maybe just after function simplification passes; at that point, we're unlikely to get any more useful information about whether the argument is constant, and we want to simplify the code as much as possible before we run transforms like loop vectorization.
Comment Actions The most important constraint is that it needs to occur after inlining. However, we already have llvm.objectsize intrinsic, and I wanted to handle the final failure in the same place, so that __builtin_constant_p works when given __builtin_object_size as an argument, even when the __builtin_object_size is unevaluable (and returns -1). E.g. in GCC, this code: int f1(char* x) { if (__builtin_constant_p(__builtin_object_size(x, 0))) { return __builtin_object_size(x, 0); } return -9999; } int f2() { char x[1000]; return f1(x); } compiles such that: f1 -> returns (constant) -1. f2 -> returns (constant) 1000. It's possible that both of them should move somewhere else, though...
Comment Actions I guess lowering llvm.objectsize and llvm.is.constant at the same time makes sense; okay. They should probably both be lowered earlier, though.
Comment Actions Ran clang-format.
Comment Actions I have no more concerns with this patch. Eli, Are you okay to approve this?
Comment Actions This feature would be very useful to us and testing this patch against our code base gave us an interesting case where llvm.is.constant (used in a conditional) is lowered to false in CodeGenPrepare and the dead branch that arises out of this lowering does not get eliminated. The code that gets generated after CodeGenPrepare pass has "br i1 false .." with both true and false conditions preserved and this propagates further and remains the same in the final assembly code. In CodeGenPrepare::runOnFunction, ConstantFoldTerminator (which handles the br i1 false condition) is called only once and if after the transformation of ConstantFoldTerminator() and DeleteDeadBlock(), if we end up with code like "br i1 false", there is no further opportunity to clean them up. So calling the code under ' if (!DisableBranchOpts)' in a loop that is conditioned on MadeChange (similar to other places), fixes the issue. Is this reasonable ? I wrote a patch to test this and I have one regression with llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll. I'm having some difficulty to determine if this warrants rewriting the test case to adjust with the new results or if this is a real issue. Any pointers would be greatly appreciated. My simple fix (without any indentation changes) is: --- a/llvm/lib/CodeGen/CodeGenPrepare.cpp +++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp @@ -316,7 +316,9 @@ bool CodeGenPrepare::runOnFunction(Function &F) { SunkAddrs.clear(); + MadeChange = true; if (!DisableBranchOpts) { + while (MadeChange) { MadeChange = false; SmallPtrSet<BasicBlock*, 8> WorkList; for (BasicBlock &BB : F) { @@ -352,6 +354,7 @@ bool CodeGenPrepare::runOnFunction(Function &F) { EverMadeChange |= MadeChange; } + } if (!DisableGCOpts) { SmallVector<Instruction *, 2> Statepoints; Thanks. Comment Actions What's the status of this patch? I'm running into a situation where evaluating __builtin_constant_p through inlining may be useful. Comment Actions Picking this back up again... I'm ready to actually submit this now that Bill has the clang-side code that'll use it almost in place. Woo! :) The SCCP bug is already be exhibited by the tests in test/CodeGen/Generic/is-constant.ll which pass a struct. Since I believe llvm.is.constant is the only thing we can constant-fold which takes struct args at the moment anyhow, that's really the only viable test. So I think there's nothing else needed here. It may make sense to iterate, as you say, but that can/should be a different patch. Comment Actions
I'd prefer a testcase that explicitly calls "opt -sccp", as opposed to implicitly relying on the fact that opt -O2 includes SCCP. (You can just add a small test to test/Transforms/SCCP/ipsccp-basic.ll .) Comment Actions
D49103 should take care of that, FWIW. Nothing to block this patch on (or vice-versa) IMO Comment Actions Done. And verified that without the SCCP change, the test fails, and with it, passes. |
a more robust way would be to have two run lines, one with O0, another with O3, and use FileCheck for the True/False results.