Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
LGTM, but this should be further reviewed by someone with more familiarity of the inliner cost model.
test/Transforms/Inline/gep-cost.ll | ||
---|---|---|
1 | I apologize for my lack of understanding. Assuming this isn't obvious to someone more familiar with the inliner cost model, would you mind adding a few comments to this test case explaining exactly what you're testing? I diffed the output from this test with and without your change and the only cost that was changed was for inner1. Thus, I'm trying to understand the significance of inner2. |
test/Transforms/Inline/gep-cost.ll | ||
---|---|---|
1 | Thank you, Chad. I will add appropriate comments. The GEP in inner2() is reg+imm+reg which is not legal addressing mode for AArch64. I expect to see only ret can be simplified. The current implementation only considers GEP with all constant indices are foldable, so the GEP in inner2() is considered not foldable even without my patch. I add this test for completeness. The GEP in inner1(), however, is reg+reg and is legal for AArch64. It can be detected only with my patch. |
While the approach we have of doing cost modeling here has some problems, this does fit in well with what we're doing for zero extends, sign extends, and related. So I'm happy with that side. I'll defer the code side to Chad.
test/Transforms/Inline/gep-cost.ll | ||
---|---|---|
1 | Ah, I see. Suggestions below. | |
11 | Here you could probably say something like: // The GEP in inner1() is [insert legal addressing mode], which is a legal addressing mode for AArch64. Thus, both the gep and ret can be simplified. | |
19 | Here you could probably say something like: // The GEP in inner2() is reg+imm+reg, which is not a legal addressing mode for AArch64. Thus, only the ret can be simplified and not the gep. |
LGTM, assuming you've done the necessary performance testing for both AArch64 and X86.
I apologize for my lack of understanding. Assuming this isn't obvious to someone more familiar with the inliner cost model, would you mind adding a few comments to this test case explaining exactly what you're testing?
I diffed the output from this test with and without your change and the only cost that was changed was for inner1. Thus, I'm trying to understand the significance of inner2.