Hi,
Compiling the following test case (reduced from the uECC_shared_secret function in the tinycrypt library) with -Oz on armv6-m:
typedef unsigned char uint8_t;
extern uint8_t x1;
extern uint8_t x2;
void foo(uint8_t *, unsigned, unsigned);
void uECC_shared_secret(uint8_t *private_key, unsigned num_bytes, unsigned num_words)
{
foo(private_key, num_bytes, num_words);
foo(private_key, num_bytes, num_words);
foo(private_key, num_bytes, num_words);
}
results in an ldr of the function's address before each blx call:
ldr r3, .LCPI0_0
blx r3
mov r0, r6
mov r1, r5
mov r2, r4
ldr r3, .LCPI0_0
blx r3
ldr r3, .LCPI0_0
mov r0, r6
mov r1, r5
mov r2, r4
blx r3
.LCPI0_0:
.long foo
As suggested by John Brawn in http://lists.llvm.org/pipermail/llvm-dev/2020-April/140712.html,
this happens because:
(1) ARMTargetLowering::LowerCall prefers an indirect call when there are 3 or more calls to the same function in the same basic block.
(2) For Thumb1, we have only 3 callee-saved registers available (r4-r6).
(3) The function has 3 arguments and needs one extra register to hold its address.
(4) So we end up having to spill one of the registers. The register holding the function's address gets split, since it can be rematerialized, and we end up with an ldr before each call.
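The register-pressure argument in (2)-(4) can be sketched as a small C check. This is only a simplified model, not how the LLVM register allocator actually decides: the function name needs_spill and its parameters are hypothetical, and it assumes every argument must stay live across the calls.

```c
/* Simplified model of points (2)-(4): the values that must survive the calls
 * are the arguments plus one register for the callee's address.
 * needs_spill is a hypothetical helper, not an LLVM function. */
int needs_spill(unsigned num_args, unsigned callee_saved_regs)
{
    unsigned live_across_calls = num_args + 1; /* +1 holds the function address */
    return live_across_calls > callee_saved_regs;
}
```

For the test case above, needs_spill(3, 3) is true: three arguments plus the address need four registers, but only r4-r6 are available.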
As per the suggestion, the patch implements the foldMemoryOperand hook in Thumb1InstrInfo to convert back to bl in case of a spill.
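To illustrate why falling back to bl helps at -Oz, here is a rough code-size model in C. It is a sketch under stated assumptions: it counts only the call-related instructions and the constant-pool entry, ignores the argument moves and any pool alignment padding, and the function names are hypothetical. The encoding sizes are the usual Thumb1 ones (bl is 32-bit; blx with a register operand and a literal ldr are 16-bit).

```c
/* Rough Thumb1 code-size model for calling the same function several times.
 * Ignores argument setup and constant-pool alignment padding. */
enum {
    SIZE_BL      = 4, /* bl <symbol>: 32-bit encoding */
    SIZE_BLX     = 2, /* blx <reg>: 16-bit encoding */
    SIZE_LDR_LIT = 2, /* ldr <reg>, <literal>: 16-bit encoding */
    SIZE_POOL    = 4  /* one .long constant-pool entry */
};

/* Direct calls: one bl per call. */
unsigned size_direct(unsigned calls)
{
    return calls * SIZE_BL;
}

/* Indirect calls with the address loaded once and kept in a register. */
unsigned size_indirect_kept(unsigned calls)
{
    return SIZE_LDR_LIT + SIZE_POOL + calls * SIZE_BLX;
}

/* Indirect calls with the address reloaded before every call. */
unsigned size_indirect_reloaded(unsigned calls)
{
    return SIZE_POOL + calls * (SIZE_LDR_LIT + SIZE_BLX);
}
```

Under this model, three calls cost 12 bytes with bl, 12 bytes with a single ldr and three blx, and 16 bytes when the ldr is repeated before each blx, which is the case shown in the assembly above.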
Does the patch look OK?
make check-llvm shows no unexpected failures.
Thanks,
Prathamesh
I wrote a check for the number of arguments in a different way at https://reviews.llvm.org/D49465 ; maybe you can borrow that? Checking the number of operands in the IR is a very inaccurate approximation.