- User Since
- Mar 18 2020, 10:22 PM (96 w, 18 h)
Sep 13 2020
Sep 11 2020
Sep 9 2020
Sep 2 2020
I was wondering if either of these approaches look good ?
(a) Add a default param to TargetInstrInfo::foldMemoryOperandImpl, say bool &AddMemoryOperands, and the target specific hook sets it accordingly.
This will require modifying other target hooks to set AddMemoryOperands to true. Or alternatively return std::pair ?
It seems the issue comes from TargetInstrInfo::foldMemoryOperand, which adds the memory operands of LoadMI to newly created MI.
It calls foldMemoryOperandImpl:
Sep 1 2020
Sorry for the breakage, I will take a look.
Aug 29 2020
Aug 28 2020
Added check in this version to return nullptr if CPI >= MCP->getConstants().size().
Does the patch look OK ? Tested with make check-llvm.
Aug 26 2020
Thanks for the catch! I tried your example, and it crashed llc with my patch. In the attached patch, I followed your suggestion to dyn_cast CPE.Val.ConstVal to Function * and that worked.
Aug 19 2020
Aug 11 2020
Great, thanks for testing! Is this patch OK to commit ?
Aug 10 2020
Thanks for the suggestions. I finally was able to build llvm natively on arm. Benchmarking SPEC2006 for code-size with -march=armv6 revealed minimal differences with previous patch. It works for the original test case from tinycrypt library, but causes a code-size regression of ~10% for another function from same library, resulting in overall worse code-size. Likewise disabling the transform, gives similar results.
Should we go instead with the approach suggested by John (reattaching in current patch) ?
Implement foldMemoryOperand hook to fold tLDRpci, tBLX pair to tBL.
This has one issue above if COPY insns come in between and foldMemoryOperand gets passed tLDRpci and COPY instead.
I am not sure how to handle that. However, this approach improves somewhat on the original behavior without causing regressions for any test-cases. Benchmarking showed minimal difference in code-size for SPEC2006, but improves the original test-case in tinycrypt and avoids the 10% code-size regression.
Jul 28 2020
Hi, Sorry for late response. Unfortunately, we are having some infra issues for benchmarking code-size natively on SPEC, and am getting consistent build failures (without patch) while building llvm natively on arm. Do you have any suggestions on how to cross compile LLVM testsuite with -march=armv6-m enabled ? I did a cross compilation with SPEC, and code-size difference across all object files that got built was around 6.14%. I will report more descriptive results for SPEC after our infra issues get resolved. In the interim, I am also trying to test my patch for chromium.
Jul 15 2020
Jul 6 2020
Sorry for late response. In the attached patch, I added a couple of more constraints if we're compiling for Thumb1:
- Number of args passed + caller's num of args < total number of available regs
- Each arg to callee, is either a "pass thru" arg, OR 8-bit imm OR a constant load. The intent is to allow only those args that need a single register for computation.
Jun 22 2020
Jun 16 2020
Jun 13 2020
Thanks for the review, and sorry I forgot to add the test earlier.
Does the attached version look OK ?
I verified that it works to remove indirect calls for the original test-case uECC_shared_secret from tinycrypt.
Tested with make check-llvm showing no unexpected failures.
Jun 8 2020
Thanks for the review and I am sorry for late response, I was on a leave.
I have the updated the patch with suggested changes, do they look in right direction ?
Testing with make check-llvm shows no unexpected failures.