As shown in PR45084:
https://bugs.llvm.org/show_bug.cgi?id=45084
...we failed to combine a gep with constant indexes with a pointer operand that is a select of constants.
Details
Diff Detail
Event Timeline
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1816 | Should the select be limited to one use? I'm not sure whether this transform is favorable if we need to keep both selects. In either case, this should be tested. | |
1822 | It is better to construct these GEPs via IRBuilder, so the resulting constants get folded right away (GEPs are target-dependent). Also, aren't we losing inbounds here? | |
1823 | We can preserve branch weight metadata here. |
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1816 | Unless there's evidence that the select-of-constants form is less favorable, we want to allow the transform because it eliminates a use of the existing select. I'll add a test. | |
1822 | Can you describe the benefit of IRBuilder vs. ConstantExpr::get()? I'm not seeing any difference in the debug output for the examples that I have. I'm happy to change it (but we should probably change a bunch of existing similar code too?). I'm not clear on the inbounds rules here (it's a constant, so is it not required to be inbounds?). But yes, I'll fix this to inherit from the existing instruction. | |
1823 | Yes, good catch - I'll add to the test. |
Patch updated:
- Propagate metadata from select (test adjusted).
- Propagate 'inbounds' from gep (not sure how to test non-inbounds case).
- Added test for >1 use of the select.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1822 |
Sure. Take something like %struct.f = type { i32, i32 } @g0 = internal unnamed_addr constant %struct.f zeroinitializer, align 4 define i32* @PR45084(i1 %cond) { %sel = select i1 %cond, %struct.f* @g0, %struct.f* null %gep = getelementptr inbounds %struct.f, %struct.f* %sel, i64 0, i32 1 ret i32* %gep } In the log we see: IC: Old = %gep = getelementptr inbounds %struct.f, %struct.f* %sel, i64 0, i32 1 New = <badref> = select i1 %cond, i32* getelementptr inbounds (%struct.f, %struct.f* @g0, i64 0, i32 1), i32* getelementptr inbounds (%struct.f, %struct.f* null, i64 0, i32 1) ... IC: ConstFold operand of: %gep = select i1 %cond, i32* getelementptr inbounds (%struct.f, %struct.f* @g0, i64 0, i32 1), i32* getelementptr inbounds (%struct.f, %struct.f* null, i64 0, i32 1) Old = i32* getelementptr inbounds (%struct.f, %struct.f* null, i64 0, i32 1) New = i32* inttoptr (i64 4 to i32*) This is a target-depending fold, so it is not performed during ConstantExpr construction. It will be automatically performed if you go through IRBuilder, and save the need to do this on the next IC iteration.
Existing code seems to get this mostly right, though I did fix one occurrence in 9b5de84e274fe9df971e5d32cc4ffe1ab972519d.
If that were the case, I would assume that the inbounds flag wouldn't exist on ConstantExpr GEP in the first place. I think it's fine to have non-inbounds GEP on constants. E.g. my example above is wrong, the GEP on null can't be inbounds. I don't know if any of the ConstantExpr actually optimizes on inbounds though, I didn't find anything from a quick search. |
LGTM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
1829 | Surprised that IRBuilder has no overloads accepting bool... |
Should the select be limited to one use? I'm not sure whether this transform is favorable if we need to keep both selects. In either case, this should be tested.