The Load/Store Optimizer runs before Machine Block Placement. At O3 the Tail Duplication Threshold is set to 4 instructions, so block placement can create new opportunities for the Load/Store Optimizer. It seems worthwhile to run the optimizer once more after block placement.
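To illustrate why tail duplication can expose new pairing candidates, here is a minimal toy model, not LLVM code. It assumes the optimizer only pairs accesses that sit next to each other inside one block, and the names `Load`, `Block`, `countPairs` and `tailDuplicate` are hypothetical helpers invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a load from [base + offset]; this is not LLVM's MachineInstr.
struct Load {
  int base;
  int offset;
};
using Block = std::vector<Load>;

// Count adjacent 8-byte loads from the same base within a single block.
// This is a simplified stand-in for merging two loads into one paired load.
int countPairs(const Block &b) {
  int pairs = 0;
  for (std::size_t i = 0; i + 1 < b.size(); ++i) {
    if (b[i].base == b[i + 1].base && b[i + 1].offset - b[i].offset == 8) {
      ++pairs;
      ++i; // the second load is consumed by the pair, so skip it
    }
  }
  return pairs;
}

// Toy tail duplication: append a copy of the tail block to its predecessor
// when the tail has at most `threshold` instructions.
Block tailDuplicate(const Block &pred, const Block &tail, std::size_t threshold) {
  assert(!pred.empty() && "toy model expects a non-empty predecessor");
  Block out = pred;
  if (tail.size() <= threshold)
    out.insert(out.end(), tail.begin(), tail.end());
  return out;
}
```

In this model, a predecessor holding a load from offset 0 and a tail holding a load from offset 8 yield no pairs while they are separate blocks, but one pair once the tail is duplicated into the predecessor with a threshold of 4. A pairing pass that ran only before the duplication would miss it.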
Event Timeline
I've tested the patch with native builds of the llvm-test-suite on an AArch64 Cortex-A72 and couldn't spot anything interesting in terms of compilation time.
This looks fine to me then, especially at an aggressive opt level. IIRC, there is a test for the pass pipeline that I would expect needs updating.
Other than that, LGTM
test/CodeGen/AArch64/ldst-opt-after-block-placement.ll | ||
---|---|---|
31 | %2 is never used here. |
Indeed that is test/CodeGen/AArch64/O3-pipeline.ll. Fixed now.
test/CodeGen/AArch64/ldst-opt-after-block-placement.ll | ||
---|---|---|
31 | Good catch. Removed now. |