Given concat(zip1(a, b), zip2(a, b)), we can convert that to a 128bit zip1(a, b) if we widen a and b out first.
Fixes #54226
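To see why the fold is sound at the lane level, here is a minimal sketch that models the vectors as plain arrays. It is not the code from the patch: the helpers zip1, zip2, concat, widen and fold_holds are hypothetical names for this illustration, and the filler value stands in for the undefined upper lanes of the widened inputs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Lane-level model only. These helpers are illustrative, not LLVM APIs.

// ZIP1: interleave the low halves of a and b.
template <typename T, std::size_t N>
std::array<T, N> zip1(const std::array<T, N>& a, const std::array<T, N>& b) {
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[2 * i] = a[i];
        r[2 * i + 1] = b[i];
    }
    return r;
}

// ZIP2: interleave the high halves of a and b.
template <typename T, std::size_t N>
std::array<T, N> zip2(const std::array<T, N>& a, const std::array<T, N>& b) {
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[2 * i] = a[N / 2 + i];
        r[2 * i + 1] = b[N / 2 + i];
    }
    return r;
}

// Concatenate two N-lane vectors into one 2N-lane vector.
template <typename T, std::size_t N>
std::array<T, 2 * N> concat(const std::array<T, N>& lo, const std::array<T, N>& hi) {
    std::array<T, 2 * N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = lo[i];
        r[N + i] = hi[i];
    }
    return r;
}

// Widen an N-lane vector to 2N lanes. The upper lanes are "don't care";
// the filler is an assumption of this sketch standing in for undef.
template <typename T, std::size_t N>
std::array<T, 2 * N> widen(const std::array<T, N>& v, T filler) {
    std::array<T, 2 * N> r;
    r.fill(filler);
    for (std::size_t i = 0; i < N; ++i) r[i] = v[i];
    return r;
}

// True when concat(zip1(a, b), zip2(a, b)) equals zip1 of the widened inputs.
template <typename T, std::size_t N>
bool fold_holds(const std::array<T, N>& a, const std::array<T, N>& b, T filler) {
    auto narrow = concat(zip1(a, b), zip2(a, b));
    auto wide = zip1(widen(a, filler), widen(b, filler));
    return narrow == wide;
}
```

The wide zip1 only reads the low half of each widened input, which is exactly the original a and b, so the filler lanes never reach the result.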
Differential D121088
[AArch64] Concat zip1 and zip2 is a wider zip1
Closed, Public. Authored by dmgreen on Mar 7 2022, 12:50 AM.
Event Timeline

This combine seems a little fragile; I'd prefer a solution where we generate the zip1 earlier. I'm concerned some combine will hide the specific concat(zip1, zip2) pattern this patch checks for. But I'm not sure how hard that would be; the current patch is clearly an improvement, in any case.

For example, consider the following, based on a testcase from https://github.com/llvm/llvm-project/issues/54226

    u16 zip3(u8 a, u8 b) {
      char z;
      return u16{ a[0],b[0], a[1],b[1], a[2],b[2], a[3],b[3],
                  a[4],b[4], a[5],b[5], a[6],b[6], a[7],z };
    }

@efriedma I wrote this quickly and didn't have a lot of time to look into doing it differently. I got another report of the same thing. What do you think about getting this version in? It seems like a simple improvement, especially for intrinsic code, which is unlikely to include undef. And tests like @combine2_v8i16 already include the concat from the beginning in a separate shuffle. The patch still applies as-is (with updated tests), and I could add a new test for the extra case you mention with a FIXME comment?

This revision is now accepted and ready to land. Feb 14 2023, 3:15 AM
This revision was landed with ongoing or failed builds. Feb 18 2023, 11:54 AM
Closed by commit rG8e3dc1366fb8: [AArch64] Concat zip1 and zip2 is a wider zip1 (authored by dmgreen).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 498608
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/arm64-zip.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-mixed-cases.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll