When a function is called with a large argument passed by value, and the argument's size is above a certain threshold, memcpy is used to copy the argument onto the stack instead of a sequence of loads and stores. In that case, the callseq* nodes for memcpy are nested inside the callseq* nodes for the called function. This patch corrects this behavior by moving the callseq_start of the called function to after the arguments have been computed into temporary registers, so that the callseq* nodes in the resulting DAG are linear.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
+llvm-commits
That list should have been added when the patch was uploaded.
Can you re-upload the patch with full context? See http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
Can you also add -verify-machineinstrs to test/CodeGen/Mips/largeimmprinting.ll? This is the existing test case that breaks the machine verifier.
test/CodeGen/Mips/callseq_order.ll | ||
---|---|---|
18–52 | Rather than matching SelectionDAG's output, can you instead match the final output of -debug-only=isel and ensure that it is a sequence of ADJCALLSTACKDOWN and ADJCALLSTACKUP pairs that are not nested, and also match the memcpy calls where they occur? |
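To illustrate the property this comment asks the test to check, here is a minimal Python sketch of a nesting check over a list of instruction lines. The helper name callseq_is_linear and the sample lines are hypothetical and only approximate what an isel dump might contain; the real test would express this with FileCheck patterns rather than a script.

```python
def callseq_is_linear(lines):
    """Return True if no ADJCALLSTACKDOWN opens while another is still open.

    Each ADJCALLSTACKDOWN must be closed by an ADJCALLSTACKUP before the
    next ADJCALLSTACKDOWN appears, and every sequence must be closed.
    """
    depth = 0
    for line in lines:
        if "ADJCALLSTACKDOWN" in line:
            depth += 1
            if depth > 1:  # a second call sequence opened inside the first
                return False
        elif "ADJCALLSTACKUP" in line:
            depth -= 1
            if depth < 0:  # a close without a matching open
                return False
    return depth == 0


# Hypothetical, simplified dumps (not real llc output).
nested = [
    "ADJCALLSTACKDOWN 16",   # call sequence of the callee starts
    "ADJCALLSTACKDOWN 16",   # memcpy sequence starts inside it
    "JAL memcpy",
    "ADJCALLSTACKUP 16",
    "JAL callee",
    "ADJCALLSTACKUP 16",
]
linear = [
    "ADJCALLSTACKDOWN 16",   # memcpy sequence
    "JAL memcpy",
    "ADJCALLSTACKUP 16",
    "ADJCALLSTACKDOWN 16",   # callee sequence starts only afterwards
    "JAL callee",
    "ADJCALLSTACKUP 16",
]

print(callseq_is_linear(nested))
print(callseq_is_linear(linear))
```

The nested input models the behavior described in the summary before the patch, and the linear input models the ordering the patch is meant to produce.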
LGTM with nits addressed.
test/CodeGen/Mips/callseq_order.ll | ||
---|---|---|
2 | You can drop the CPU specification portion of the llc invocations here; the defaults of 32r2 and 64r2 should be sufficient to test the logic that has been changed. |
test/CodeGen/Mips/llvm-ir/mul.ll | ||
271 | You don't need to bind the register number to a FileCheck variable in this case, as it's unused afterwards. Either match it with {{[0-9a-z]+}} or drop the register portion, as we're only interested in matching the lw $25, %call16(__multi3) part of the instruction. This applies to the following test changes as well. |
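The two alternatives suggested here can be approximated with ordinary regular expressions, since the text inside FileCheck's {{...}} is a regex. The Python sketch below is only an illustration: the sample instruction lines, including the ($gp) and ($4) operands, are assumptions and are not taken from the test file.

```python
import re

# Hypothetical lines of llc output; the base register operand is assumed.
asm_lines = [
    "lw $25, %call16(__multi3)($gp)",
    "lw $25, %call16(__multi3)($4)",
]

# Option 1: match the register with a regex instead of binding it to a variable.
any_reg = re.compile(r"lw\s+\$25, %call16\(__multi3\)\(\$[0-9a-z]+\)")

# Option 2: drop the register portion and match only the part of interest.
no_reg = re.compile(r"lw\s+\$25, %call16\(__multi3\)")

for line in asm_lines:
    # Both patterns accept either register, so neither needs a bound variable.
    print(bool(any_reg.search(line)), bool(no_reg.search(line)))
```

Either pattern accepts both sample lines, which is the point of the comment: a bound variable only adds value when a later check refers back to it.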