This is an archive of the discontinued LLVM Phabricator instance.

Account for partial stack slot spills (PR30821)
ClosedPublic

Authored by jmorse on Mar 22 2018, 8:15 AM.

Details

Summary

Currently, _any_ store or load instruction is considered to be operating
on a spill if it has a frameindex as an operand, and thus are fair game
for optimisations such as "StackSlotColoring". This usually works, except
on architectures where spills can be partially restored, for example on
X86 where a spilt vector can have a single component loaded (zeroing the
rest of the target register). This can be mis-interpreted and the zero
extension unsoundly eliminated, see pr30821.

To avoid this, optionally provide the caller to isLoadFromStackSlot and
isStoreToStackSlot with the number of bytes spilt/loaded by the given
instruction. Optimisations can then determine that a full spill followed
by a partial load (or vice versa), for example, cannot necessarily be
commuted.

(Adds regression test too)

Diff Detail

Repository
rL LLVM

Event Timeline

jmorse created this revision.Mar 22 2018, 8:15 AM
craig.topper added inline comments.Mar 26 2018, 12:24 PM
lib/Target/X86/X86InstrInfo.cpp
4270 ↗(On Diff #139446)

Why can't we always get this from the opcode or always from RegSize? What's the criteria for which way this is done? Why are the KMOV instructions implemented differently between Load and Store?

jmorse added inline comments.Mar 27 2018, 3:25 AM
lib/Target/X86/X86InstrInfo.cpp
4270 ↗(On Diff #139446)

Ah, I'd incorrectly assumed that information wasn't in the opcode at this stage, which turns out to be wrong, I'll revise. As for the KMOVs, that's pure error alas.

jmorse updated this revision to Diff 139902.Mar 27 2018, 5:21 AM

Revised to use instruction opcode information to select insn read/write size instead of register; fix erroneous KMOV handling.

craig.topper added inline comments.Apr 12 2018, 10:16 AM
lib/Target/X86/X86InstrInfo.cpp
3947 ↗(On Diff #139902)

These already exist at the bottom of the switch. This shouldn't even build.

4193 ↗(On Diff #139902)

Can we just have one version of these functions and just make the caller pass a dummy variable in for MemBytes when it doesn't need it? Or alternatively, pass MemBytes by pointer and pass a null pointer when it isn't needed. But that means a null pointer check before every assignment.

jmorse updated this revision to Diff 142363.Apr 13 2018, 2:23 AM

Revision here removes muppetry and duplicate isLoad.../isStore... logic. Callers now have to provide a MemBytes reference, dummies installed where it's not needed.

craig.topper accepted this revision.Apr 15 2018, 6:41 PM

LGTM

lib/Target/X86/X86InstrInfo.cpp
4126 ↗(On Diff #142363)

Capitalize variable names. Same for the other instances below.

This revision is now accepted and ready to land.Apr 15 2018, 6:41 PM
jmorse updated this revision to Diff 143269.Apr 20 2018, 2:47 AM

Corrects code style in accepted revision; NB, I'm not a committer, and would need someone else to commit this.

andreadb requested changes to this revision.Apr 23 2018, 4:08 AM

Hi Jeremy.

I was going to commit this patch for you. However, there are two problems with the test case.

First of all, the test should not be an LLI test. Ideally, you would want a MIR test (see this document: https://llvm.org/docs/MIRLangRef.html) to only test the changes introduced by the stack slot coloring pass. You can start with a simple mir test obtained from llc -stop-before=stack-slot-coloring.

Second: your test clearly requires an x86 target. So, you should have had a REQUIRES: x86-registered-target line in your test.
That being said, I think the LLI test should be replaced with a MIR test.

This revision now requires changes to proceed.Apr 23 2018, 4:08 AM
jmorse updated this revision to Diff 143764.Apr 24 2018, 9:16 AM

Changed test to use MIR, and put it in the X86 test directory.

The MIR test is definitely simpler and more specific, which is good, but requires tricking the register allocator to exercise the appropriate code path -- see inline comments in the MIR file. The net effect is just that we have to run three passes instead of just stack-slot-colouring, and have a few assignments copy-and-pasted.

This revision is now accepted and ready to land.Apr 24 2018, 12:07 PM
This revision was automatically updated to reflect the committed changes.