This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Basic gather scatter cost model
ClosedPublic

Authored by dmgreen on Jan 20 2020, 2:51 AM.

Details

Summary

This is a very basic MVE gather/scatter cost model, based roughly on the code that we currently produce. It does not handle truncating scatters or extending loads correctly yet, as it is difficult to tell whether they are going to be correctly extended/truncated from the limited information available in the cost function.

This can be improved as we extend support for these in the future.

Based on code originally written by David Sherwood.

Event Timeline

dmgreen created this revision. Jan 20 2020, 2:51 AM
Herald added a project: Restricted Project. Jan 20 2020, 2:51 AM
SjoerdMeijer accepted this revision. Jan 20 2020, 7:50 AM

Looks very reasonable to me.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
912

nit: formatting?

This revision is now accepted and ready to land. Jan 20 2020, 7:50 AM
anwel added inline comments. Jan 20 2020, 7:54 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
897

nit: An ?

914

nit: needs

918–920

The gather/scatter pass can also handle a sext here: it simply leaves the sext as it is and uses the result if it is of the correct type. That increases VectorCost by the cost of the sext, but it doesn't necessarily mean that building a gather is more expensive than expanding.
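
To make the comparison above concrete, here is a minimal sketch. It is not code from the patch: the helper name gatherStillCheaper and its parameters are hypothetical, and any cost values passed in are made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper, not part of ARMTargetTransformInfo.cpp.
// Returns true when building the gather, with the sext left in place and
// charged on top of VectorCost, is still cheaper than expanding.
bool gatherStillCheaper(int VectorCost, int SExtCost, int ScalarCost) {
  assert(VectorCost >= 0 && SExtCost >= 0 && ScalarCost >= 0 &&
         "costs in this sketch are non-negative");
  return VectorCost + SExtCost < ScalarCost;
}
```

With made-up costs, gatherStillCheaper(4, 2, 16) is true, so a small sext cost does not tip the balance, while gatherStillCheaper(4, 14, 16) is false.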

dmgreen updated this revision to Diff 239301. Jan 21 2020, 6:18 AM
dmgreen marked 8 inline comments as done.
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
897

Ah, I moved some code around and it no longer made sense.

912

This is apparently how clang-format wants this. But I think this line was wrong anyway. I've re-done it using DL.

918–920

We will only get to this point if we are dealing with either an i16 or i8 gather, in which case we need the address to be a zext so that it can be folded into the instruction; otherwise we need to expand.

For i32s, I think we can always produce something. So up above there's a check that returns VectorCost if EltSize == 32.

It is true that the zext here will still show up as a cost where it shouldn't, since the instruction will fold it in. That is something we can fix later, I think, maybe at the same time that we improve the sext/zext/trunc handling.
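
The decision described in this reply can be sketched roughly as follows. This is a standalone illustration rather than the code in the patch: the AddrExt enum, the function name gatherCostSketch and its parameters are all assumptions made for the example, and the real cost function works on LLVM types instead of plain integers.

```cpp
#include <cassert>

// Hypothetical description of how the gather's address was extended.
enum class AddrExt { None, ZExt, SExt };

// Sketch only: returns the cost to use for a gather of the given element size.
// VectorCost is the cost of emitting a vector gather; ScalarCost is the cost
// of expanding it. Both are supplied by the caller in this sketch.
int gatherCostSketch(unsigned EltSizeBits, AddrExt Ext, int VectorCost,
                     int ScalarCost) {
  assert((EltSizeBits == 8 || EltSizeBits == 16 || EltSizeBits == 32) &&
         "sketch only models i8, i16 and i32 elements");
  // i32 elements: assumed that a vector gather can always be produced.
  if (EltSizeBits == 32)
    return VectorCost;
  // i8/i16 elements: the address must be a zext so it folds into the
  // instruction.
  if (Ext == AddrExt::ZExt)
    return VectorCost;
  // Anything else has to be expanded.
  return ScalarCost;
}
```

Note that the sketch does not add the cost of the zext itself, so it does not model the overcounting mentioned above.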

This revision was automatically updated to reflect the committed changes.