This is an archive of the discontinued LLVM Phabricator instance.

[ARM][LowOverheadLoops] Make some stack spills valid for tail predication
ClosedPublic

Authored by samtebbs on Jul 5 2021, 11:34 AM.

Details

Summary

This patch makes vector spills valid for tail predication when all loads from the same stack slot are within the loop.

Diff Detail

Event Timeline

samtebbs created this revision.Jul 5 2021, 11:34 AM
samtebbs requested review of this revision.Jul 5 2021, 11:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2021, 11:34 AM
dmgreen added inline comments.Jul 7 2021, 12:20 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1222

Is it worth splitting this out into a new function? To keep this stack logic separate in case it needs to grow even more sophisticated in the future.

1226

I think we may need to distinguish between a load that accesses SP and a stack spill slot. A SP address could be some other (potentially escaping) stack object. I think that info is usually available through the MMO, maybe via FrameInfo::isSpillSlotObjectIndex.

I was looking around for a way to get all the MachineInstr's that access a specific stack slot, but couldn't fine one better than searching through a lot of instructions.

1229–1231

I think this comment would be better if we explain what we do do, not why we don't use RDA (or maybe both). Explaining that we are searching all blocks after the loop for another access to the same stack slot.

1236

Use I or It or Idx or something?

samtebbs added inline comments.Jul 7 2021, 9:27 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1222

Good idea. Do you think a lambda or proper function would be best?

1226

Good spot, I'll have a look into that. Regarding other ways to find instructions that use a specific stack slot, I haven't been able to find a better way either.

1229–1231

👍

1236

👍

samtebbs updated this revision to Diff 357936.Jul 12 2021, 7:08 AM
samtebbs marked 4 inline comments as done.

Break-out code to function, check frame index and rename loop counter.

The searching feels gnarly, like it can end up searching a lot of instructions. I don't know of a better idea though, and it should be rare that it needs to search.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1087–1165

This function can be made static.

1090–1092

getPseudoValue can return nullptr on failure, which needs to be checked.
The type should be const PseudoSourceValue *, not auto (the type isn't obvious from the function call).
Then you might be able use if (const auto *FS = dyn_cast<FixedStackPseudoSourceValue>(PseudoValue))

Or you may be able to use if (const auto *FS = dyn_cast_or_null<FixedStackPseudoSourceValue>(PseudoValue)) to get the same effect?

1106

This will have to check that it has at least one memoperand.

1121

Frame indexes look like they are usually int's.

Does GetFrameIndex return the value by parameter reference because FrameIndexes can be negative? So we can't return -1 for not found?

1122

Check for isSpillSlotObjectIndex(FI) too?

1124

I think you can remove the ", 1", and let it choose a default small size. Same below.

1139

If we find another stack store, does that mean we can stop the search?

1148

Should this check that Succ is not already in Frontier too?

1222

As a function looks good. It's nice and self-contained.

samtebbs updated this revision to Diff 358296.Jul 13 2021, 9:16 AM
samtebbs marked 5 inline comments as done.

Make ValidateMVEStore static, use dyn_cast, check isSpillSlotObjectIndex, make GetFrameIndex return an int, remove the size for the SmallVector and check the size of memoperands.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1090–1092

The dyn_cast<...> option looks good to me. 👍

1106

Ah I thought that would be guaranteed by the instruction being a V[STR|LDR]WU32. I'll add a check.

1121

I chose for it to be taken as reference so that a boolean could be returned to indicate success, but if it can just return the frame index or -1 in case of error then that would be great. From looking through the codebase (this is a great use-case for more function documentation) it looks like frame indexes will be positive so I think returning -1 in case of failure is a good option.

1124

SmallPtrSet requires an initial size but I can remove it for SmallVector.

1139

If we have found another store to the same stack slot then we stop the search as any further loads in the same BB won't invalidate the store we're validating, since they are dependent on that second store instead.

1148

Yeah that can't hurt, however I can't find a nice contains function in SmallVector. I wish container classes had the same interface.

Thanks. This looks good, minus some mostly Nitpicks.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1090

const PseudoSourceValue* PseudoValue =
It still needs to check it's not null too. The store could have a non-pseudo MMO, for which this would return nullptr.

1104

== 1 is probably fine. I wouldn't expect it to be higher.

1121

I think they may be able to go negative, but a negative argument means a parameter passed via the stack (I think). So treating it the same as an invalid value sounds like it should work fine.

1124

Oh of course, yeah. I hadn't realized it didn't have the same interface.

SmallPtrSet sounds good, but you may want to increase the default to 4. It won't make a large difference, but 4 pointers are not going to hurt anyone.

1131

unsigned int -> unsigned, is more common in llvm.

1139

Sure. I was thinking in that case it wouldn't have to add any Successors, so we could potentially save from looking through a lot of other BB's.

1148

There is an is_contained(Frontier, Succ) in ADT, that is a wrapper around std::find.

samtebbs updated this revision to Diff 358985.Jul 15 2021, 8:26 AM
samtebbs marked 11 inline comments as done.

Check for null pseudo value, change smallptrset size, check for just one memoperand and avoid checking successors when a VSTRW is found.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1090

Cheers, I'll add that.

1139

Ah yes that is a very good point! I'll make it account for that.

1148

Sweet, thanks

dmgreen accepted this revision.Jul 15 2021, 8:38 AM

Thanks. LGTM

This revision is now accepted and ready to land.Jul 15 2021, 8:38 AM
This revision was landed with ongoing or failed builds.Jul 15 2021, 11:24 AM
This revision was automatically updated to reflect the committed changes.