If the legalized type is a legal interleaved access type (i.e. there is a
supported vlseg/vsseg instruction for it), the interleaved access pass
will pick up any interleaved memory op (wide load + shuffles, or shuffles +
wide store) and lower it into a vlseg/vsseg intrinsic.
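To make the check described above concrete, here is a minimal, self-contained sketch of the "legalize first, then ask whether a segment instruction exists" decision. It does not use the LLVM API: `FixedVec`, `legalizeToy`, `isLegalSegmentAccessToy` and `canLowerToSegmentOp` are hypothetical names, the widening to a power of two and the VLEN of 128 bits are assumptions, and the legality rule is a simplification (2 to 8 fields, with register group size times field count at most 8).

```cpp
// Toy stand-in for a fixed-length vector type such as <6 x i8>.
struct FixedVec {
  unsigned NumElts; // number of elements
  unsigned EltBits; // element width in bits
};

// Assumption: legalization widens the element count to the next power of two,
// so <6 x i8> becomes <8 x i8>.
FixedVec legalizeToy(FixedVec Ty) {
  unsigned N = 1;
  while (N < Ty.NumElts)
    N <<= 1;
  return {N, Ty.EltBits};
}

// Simplified legality rule for a segment load/store (hypothetical, not the
// real target hook): 2..8 fields, a supported element width, and the number
// of vector registers per field times the field count must not exceed 8.
bool isLegalSegmentAccessToy(FixedVec Ty, unsigned Factor,
                             unsigned VLenBits = 128) {
  if (Factor < 2 || Factor > 8)
    return false;
  if (Ty.EltBits != 8 && Ty.EltBits != 16 && Ty.EltBits != 32 &&
      Ty.EltBits != 64)
    return false;
  unsigned Bits = Ty.NumElts * Ty.EltBits;
  unsigned Regs = (Bits + VLenBits - 1) / VLenBits; // round up to whole registers
  return Regs * Factor <= 8;
}

// The gate from the summary: check legality on the *legalized* type.
bool canLowerToSegmentOp(FixedVec Wide, unsigned Factor) {
  return isLegalSegmentAccessToy(legalizeToy(Wide), Factor);
}
```

With these assumptions, `canLowerToSegmentOp({6, 8}, 3)` accepts the <6 x i8>, Factor=3 case, because the check runs on the widened <8 x i8> type.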
Repository: rG LLVM Github Monorepo
| llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | |
|---|---|
| 394–396 | If we just use the un-legalized type to cost model, then interleaves of <6 x i8> etc. which are common with Factor=3 are reported as really expensive, when in fact they can be selected as vlseg/vsseg. |
LGTM, though please add a FIXME with a short description of the illegal memory op cost issue, i.e. explain why the `if` is needed, since it's non-obvious from the code.
Your observation about the memory op cost for <6 x i8> is something we should follow up on. That does sound surprising, and I suspect it is negatively impacting e.g. SLP vectorization of short vectors. Once that follow-up is done, we can simplify this code again.
If we just use the un-legalized type to cost model, then interleaves of <6 x i8> etc. which are common with Factor=3 are reported as really expensive, when in fact they can be selected as vlseg/vsseg.
Perhaps there's a better way to account for this though: I was surprised that getMemoryOpCost reported such a high cost (num elements + 1) for these types.
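The size of the gap can be illustrated with a toy calculation. This is only a sketch: `ToyVec`, `toyMemoryOpCost` and `toyLegalizedMemoryOpCost` are hypothetical, the "num elements + 1" figure for non-power-of-two types is taken from the comment above, and the cost of 1 for a power-of-two type is an assumption rather than what getMemoryOpCost actually returns.

```cpp
// Toy stand-in for a fixed-length vector type such as <6 x i8>.
struct ToyVec {
  unsigned NumElts; // number of elements
  unsigned EltBits; // element width in bits
};

bool isPowerOfTwo(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Toy cost of a plain load/store: 1 for power-of-two element counts
// (assumption), NumElts + 1 otherwise (the figure quoted in the review).
unsigned toyMemoryOpCost(ToyVec Ty) {
  return isPowerOfTwo(Ty.NumElts) ? 1 : Ty.NumElts + 1;
}

// Same query, but on the type widened to the next power of two first
// (an assumed model of legalization).
unsigned toyLegalizedMemoryOpCost(ToyVec Ty) {
  unsigned N = 1;
  while (N < Ty.NumElts)
    N <<= 1;
  return toyMemoryOpCost({N, Ty.EltBits});
}
```

Under this model <6 x i8> costs 7 when queried directly and 1 when queried on the widened <8 x i8> type, which is the discrepancy the comment is pointing at.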