This is an archive of the discontinued LLVM Phabricator instance.

[mips] Optimize stack pointer adjustments.
ClosedPublic

Authored by sdardis on Jun 14 2016, 2:17 AM.

Details

Summary

Instead of always using addu to adjust the stack pointer when the
size out is of the range of an addiu instruction, use subu so that
a smaller constant can be generated.

This can give savings of ~3 instructions whenever a function has a
a stack frame whose size is out of range of an addiu instruction.

This change may break some naive stack unwinders.

Partially resolves PR/26291.

Thanks to David Chisnall for reporting the issue.

Diff Detail

Event Timeline

sdardis updated this revision to Diff 60659.Jun 14 2016, 2:17 AM
sdardis retitled this revision from to [mips] Optimize stack pointer adjustments..
sdardis updated this object.
sdardis added a reviewer: dsanders.
sdardis set the repository for this revision to rL LLVM.
sdardis added subscribers: llvm-commits, theraven.

I mis-spoke there, this partially resolves PR/26291. It fixes cases where functions with a large stack frame have a long prolog section for decrementing the stack pointer.

dsanders edited edge metadata.Jun 14 2016, 4:05 AM

Added Nitesh, Mohit, Sagar, and Bhushan in case this requires changes to lldb.

Just one question about a fixme and a few minor nits.

lib/Target/Mips/MipsSEInstrInfo.cpp
446–456

Could you put braces around this now that it's multiple lines?

457–458

These suggestions are for another patch but just to mention them:

  • MIPS32R6/MIPS64R6 can add the immediate without materializing it first using AUI/DAUI/DATI/DAHI.
  • MIPS32R5/MIPS64R5 with MSA, and MIPS32R6/MIPS64R6 can improve on this using LSA/DLSA to add 17-20 bit immediates in two instructions instead of three as long as the amount is appropriately aligned (which is always true for 17-19 bit, and true on N32/N64 for 20-bit).
test/CodeGen/Mips/cstmaterialization/stack.ll
3–4

Could you add the N32 case?

30–32

Can you clarify what needs fixing here? Is it just the duplication or is there something else?

test/CodeGen/Mips/eh-dwarf-cfa.ll
16

Could you add a colon to each of these to reduce the chance of an accidental match on something like $f1 or 0xf1?

test/CodeGen/Mips/largeimm1.ll
9

This will match the 'f' in '.file ...' instead of the function label 'f:'

sdardis updated this revision to Diff 60666.Jun 14 2016, 5:21 AM
sdardis updated this object.
sdardis edited edge metadata.
sdardis removed rL LLVM as the repository for this revision.
sdardis marked 3 inline comments as done.

Addressed review comments

lib/Target/Mips/MipsSEInstrInfo.cpp
457–458

I hadn't thought about using (d)lsa to synthesize constants, but that's changes to MipsAnalyzeImmediate. This patch is to make a relatively tiny change to avoid some bad cases.

I'll look at R6ifying the return sequence after R6ifying constant synthesis.

test/CodeGen/Mips/cstmaterialization/stack.ll
30–32

For mips64 we repeatedly synthesize a large offset of the current stack pointer:

lui   $5, 16
daddu $5, $sp, $5
sd    $ra, 24($5)             # 8-byte Folded Spill
lui   $ra, 16
daddu $ra, $sp, $ra
sd    $gp, 16($ra)            # 8-byte Folded Spill

The second spill could have re-used $5 with the offset of 16. This also occurs when those values are reloaded. Turns out I missed one of them.

dsanders accepted this revision.Jun 14 2016, 5:34 AM
dsanders edited edge metadata.

LGTM with one more nit.

test/CodeGen/Mips/cstmaterialization/stack.ll
30–32

Thanks. In that case, can we phrase the comment in terms of an action to take in the future (e.g. 'fix the duplicated address generation').

This revision is now accepted and ready to land.Jun 14 2016, 5:34 AM
sdardis closed this revision.Jun 14 2016, 6:55 AM

Thanks, changed in question to:

; FIXME:
; These are here to match other lui's used in address computations. We need to
; investigate why address computations are not CSE'd. Or implement it.

Committed as rL272666.