Extend shuffle canonicalization and conversion of shuffles fed by vectorized scalars to big endian subtargets. For big endian subtargets, loads and direct moves of scalars into vector registers put the data in the correct element for SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is narrower, the value still ends up in the wrong place - althouth a different wrong place than on little endian targets.
This patch extends the combine that keeps values where they are if they feed a shuffle to big endian targets.
Details
- Reviewers
pjeeva01 jsji - Group Reviewers
Restricted Project - Commits
- rG03e7fefff8ca: [PowerPC] Canonicalize shuffles on big endian targets as well
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM with some nits.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
9566 | Do we need to update the comments to be more specific about 8 bytes? | |
9583 | Is this affecting little endian as well? If so, any test that can show the behavior change? | |
14220 | unsigned ElemSizeInBytes? | |
14252 | Is it possible that getVectorNumElements is 1? Then we will overflow the NewMask here? | |
14265 | comments need update? | |
14309 | should be unsigned here.? getScalarSizeInBits returning unsigned. | |
14349 | nit: Should we use local var instead? bool isLittleEndian = Subtarget.isLittleEndian(); | |
14420–14421 | nit: How about something like: if(isLittleEndian) RHS = TheSplat; else LHS = TheSplat; Res = DAG.getVectorShuffle(SVN->getValueType(0), dl, LHS, RHS, ShuffV); |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
9566 | I will change it to: Unexpected size for permuted load on big endian target | |
9583 | In theory, yes. However, we cannot get any offset other than 0 here on LE. If the splat and load are 8 bytes, SplatIdx must be 1 because otherwise we wouldn't be splatting the value we loaded. If the splat and load are 4 bytes, SplatIdx must be 3 because otherwise we wouldn't be splatting the value we loaded. So the conditional above will ensure that 1 - SplatIdx/3 - SplatIdx is 0 respectively. | |
14220 | Sounds good. | |
14252 | I will add an assert for that. | |
14265 | I'll add a note about what this does on BE. | |
14272–14276 | // On big endian targets, this is still useful for SCALAR_TO_VECTOR // nodes with elements smaller than doubleword because all the ways // of getting scalar data into a vector register put the value in the // rightmost element of the left half of the vector. | |
14309 | OK. | |
14349 | Absolutely, will do. Thanks. | |
14420–14421 | Makes sense. |
Do we need to update the comments to be more specific about 8 bytes?