Page MenuHomePhabricator

[TTI][ARM][MVE] Refine gather/scatter cost model
ClosedPublic

Authored by anwel on Mar 3 2020, 7:31 AM.

Details

Summary

This patch refines the gather/scatter cost model, but also changes the TTI function getIntrinsicInstrCost to as an additional parameter accept the instruction that is evaluated, which is needed for the gather/scatter cost evaluation.
The latter did require trivial changes in some non-ARM backends to adopt the new parameter.

Gathers can be extending, so if their result is sext or zext extended, the extend can be integrated. This ensures that in these cases the cost model looks at the type after extension and calculates the cost accordingly.
Similarly, a trunc of a value can be integrated into the scatter that stores it.

Diff Detail

Event Timeline

anwel created this revision.Mar 3 2020, 7:31 AM
RKSimon added inline comments.Mar 3 2020, 10:22 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
968–970

Please can you add the doxygen param entry?

1011–1013

Explain what I is

The fact that the cost of zext/sext's change so much with context is pretty annoying. This "per single instruction" costmodel is having trouble keeping up. We might need to eventually do something about that, come up with something better at a higher level. I'm not sure what that would look like though.

This looks like a nice improvement over what was already there.

llvm/include/llvm/Analysis/TargetTransformInfo.h
1016–1017

Is it possible to keep I last and give it a default argument of nullptr? Or does that cause other problems?

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
897

check -> Check

907

only -> Only

921

I don't think this I->getNumUses() == 0 is needed?

llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll
177–187

Can you move these into gep_v4i16. And add a trunc to the scatters. Maybe something like:

%res6 = call <4 x i16> @llvm.masked.gather.v4i16.v4p0i16(<4 x i16*> %gep5, i32 2, <4 x i1> %mask, <4 x i16> undef)
%res6sext = sext <4 x i16> %res6 to <4 x i32>
%res6trunc = trunc <4 x i32> %res6sext to <4 x i16>
call void @llvm.masked.scatter.v4i32.v4p0i32(<4 x i16> %res6trunc, <4 x i32*> %gep5.2, i32 4, <4 x i1> %mask)

If I haven't got that wrong, those scatters should be cheap(ish). It's probably worth adding similar v4i8 and v8i8 tests in too.

samparker added inline comments.Mar 4 2020, 2:28 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
910

Could we just return the ScalarCost if this isn't true?

921

Do we really need to check the uses of a store?

anwel marked 10 inline comments as done.Mar 4 2020, 6:01 AM
anwel added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfo.h
968–970

Yes, thanks for the reminder.

1016–1017

I have a slight aversion against multiple default arguments in one function, as they might lead to confusion.
I've checked though that C++ warns in this case if no VF but an instruction is passed, so as it reduces the amount of necessary changes I'll take your advice and add a default value.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
921

No we don't :)

anwel updated this revision to Diff 248161.Mar 4 2020, 6:05 AM
anwel marked 3 inline comments as done.

Implemented some of the changes suggested in comments.

Added the new instruction parameter to the getIntrinsicInstrCost version that only gets types, too, so the functions stay roughly in-sync and templates putting T instead of Value/Type are unaffected.

anwel updated this revision to Diff 249050.Mar 9 2020, 3:29 AM
anwel marked 4 inline comments as done.

Moved around some tests and add two more.
Also made sure that extends or truncs to 'incomplete' vector types are not considered valid candidates to be merged into the gather.

anwel added inline comments.Mar 9 2020, 3:29 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
910

Not always. A v8i16 gather that for some reason is extended to v8i32 would be checked here, too, and fail - although it's perfectly fine to build that gather

llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll
177–187

Have been moved. I moved those for v8i8, too, and created a set for v4i8.

dmgreen accepted this revision.Mar 10 2020, 3:13 AM

LGTM

llvm/include/llvm/Analysis/TargetTransformInfo.h
1025

Add an I doxygen comment again?

This revision is now accepted and ready to land.Mar 10 2020, 3:13 AM
anwel marked 2 inline comments as done.Mar 11 2020, 2:39 AM
anwel added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfo.h
1025

Yes of course ;)

This revision was automatically updated to reflect the committed changes.
anwel marked an inline comment as done.