This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use single add/sub for large stack offsets
ClosedPublic

Authored by rob.lougher on Jan 28 2015, 8:29 AM.

Details

Summary

The following code generates an object file which is over 30 MB in size:

void bar(char *a);
void foo (void) {
    char b[5000000000000000];
    bar(b);
}

The reason is that the code to push the space on the stack consists of over 2 million subtract
instructions of the form:

subq    $2147483647, %rsp       # imm = 0x7FFFFFFF

Instead of doing multiple immediate mode subtracts, the compiler should place the final amount to
be subtracted into a register and do a single subtraction.

Diff Detail

Repository
rL LLVM

Event Timeline

rob.lougher retitled this revision from to [X86] Use single add/sub for large stack offsets.
rob.lougher updated this object.
rob.lougher edited the test plan for this revision. (Show Details)
rob.lougher added a reviewer: grosbach.
rob.lougher added a subscriber: Unknown Object (MLST).
asl added a subscriber: asl.Jan 28 2015, 9:07 AM
asl added inline comments.
lib/Target/X86/X86FrameLowering.cpp
202 ↗(On Diff #18891)

This clobbers rax/eax w/o checking whether it's a livein or not. This may be a problem for functions with inreg attributes. I believe the code should always use findDeadCallerSavedReg().

rob.lougher added inline comments.Jan 28 2015, 9:53 AM
lib/Target/X86/X86FrameLowering.cpp
202 ↗(On Diff #18891)

I copied that from the code that generates a push/pop (lines 223-225), so the existing code always uses rax/eax in the prologue for a push. If this is a problem for inreg attributes, this code will also need to be fixed.

asl added inline comments.Jan 28 2015, 9:57 AM
lib/Target/X86/X86FrameLowering.cpp
202 ↗(On Diff #18891)

It's not a problem for push below, because we just need to put something on stack (we can use any register for this). For pop'ing stuff we need to find free register, since we're going to clobber it.

Situation here is completely different, because the code in the patch materializes a constant in the register thus clobbering it always.

rob.lougher added inline comments.Jan 28 2015, 10:29 AM
lib/Target/X86/X86FrameLowering.cpp
202 ↗(On Diff #18891)

OK, point taken - in the push case it's pushing whatever value is in rax/eax (as it's just making space, the value doesn't matter). I tried changing the code to always use findDeadCallerSavedReg(), but it returns zero when used in the prologue, so we fall back to using multiple instructions (as before). I need to do some more investigation.

I have uploaded a new diff which hopefully addresses the inreg attribute issue. I have also updated the tests to include a test for it.

asl added a comment.Jan 29 2015, 7:17 AM

LGTM after the fix of the test.

test/CodeGen/X86/huge-stack-offset.ll
15 ↗(On Diff #18956)

Typo? Should be CHECK-32-LABEL as I can see.

This revision was automatically updated to reflect the committed changes.

Thanks for the review! Typo fixed, and committed in r227458.