This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Use custom isel for vector indexed load/store intrinsics.
ClosedPublic

Authored by craig.topper on Feb 18 2021, 10:03 PM.

Details

Summary

There are many legal combinations of index and data VTs supported
for these intrinsics. This results in a lot of isel patterns in
RISCVGenDAGISel.inc.

By adding a separate table similar to what we use for segment
load/stores, we can more efficiently manually select these
intrinsics. We should also be able to reuse this table scalable
vector gather/scatter.

This reduces the llc binary size by ~56K.

Diff Detail

Event Timeline

craig.topper created this revision.Feb 18 2021, 10:03 PM
craig.topper requested review of this revision.Feb 18 2021, 10:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 18 2021, 10:03 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
khchen accepted this revision.Feb 19 2021, 12:41 AM

LGTM!!

I'm little curious, what's benefit to have two GenericTables for load/store?
Why not add a isLoad field in FilterClass and using one GenericTable for VLX and VSX?
because we have two identical FilterClass, RISCVVLX and RISCVVSX.

This revision is now accepted and ready to land.Feb 19 2021, 12:41 AM

I'm wondering if this has any implications on the masked scatter/gather intrinsics. I had previously copied this TableGen, knowing that we only use a small subset of the valid permutation of VTs for scatter/gather. I was going to look at that later. However, with this change I can see that an alternative lowering scheme would be to lower them to the intrinsics. Any thoughts about that?

LGTM!!

I'm little curious, what's benefit to have two GenericTables for load/store?
Why not add a isLoad field in FilterClass and using one GenericTable for VLX and VSX?
because we have two identical FilterClass, RISCVVLX and RISCVVSX.

I might do that. I think there is a free byte of padding in each row of the table so it doesn’t cost anything.

I'm wondering if this has any implications on the masked scatter/gather intrinsics. I had previously copied this TableGen, knowing that we only use a small subset of the valid permutation of VTs for scatter/gather. I was going to look at that later. However, with this change I can see that an alternative lowering scheme would be to lower them to the intrinsics. Any thoughts about that?

That’s a good idea. I’ve been thinking about the same thing for the other ISD opcodes we’ve added. But I haven’t come to any conclusions. Having 3 different ways of selecting the same instruction seems wasteful for the isel table. The isel table isn’t very efficient either for the number of types we have.

This revision was landed with ongoing or failed builds.Feb 19 2021, 10:18 AM
This revision was automatically updated to reflect the committed changes.

I'm wondering if this has any implications on the masked scatter/gather intrinsics. I had previously copied this TableGen, knowing that we only use a small subset of the valid permutation of VTs for scatter/gather. I was going to look at that later. However, with this change I can see that an alternative lowering scheme would be to lower them to the intrinsics. Any thoughts about that?

That’s a good idea. I’ve been thinking about the same thing for the other ISD opcodes we’ve added. But I haven’t come to any conclusions. Having 3 different ways of selecting the same instruction seems wasteful for the isel table. The isel table isn’t very efficient either for the number of types we have.

If we go the route of reusing these intrinsics for gather/scatter we should probably fix them to carry a memory operand into machine IR properly so we don't drop the one from the gather/scatter node.