Refer to commit ccffc27, the remaining types <2 x s8> and <4 x s8> should
also be promoted to <2 x s32> and <4 x s16>.
Details
Diff Detail
Event Timeline
This test, whilst technically correct, doesn't look right. I don't think it can just generate a SUBREG_TO_REG in the same way it does for integer. Can you change the test to return <i16 0, i16 1>, and make sure the returned value would be the same as SDAG.
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp | ||
---|---|---|
420 ↗ | (On Diff #533117) | Could you instead query OldLLT and NewLLT whether they are isScalar()? Looks odd to query MVTs in GISel. |
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp | ||
---|---|---|
420 ↗ | (On Diff #533117) | It is about integer and floats? |
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp | ||
---|---|---|
420 ↗ | (On Diff #533117) | Yes, It is about integer and floats (not about the scalar and vector). |
If I'm reading the test correctly, then SDAG version is returning (as bytes) 0,0,0,0,1,0,0,0. i.e the v2i16 is promoted to a v2i32. The GlobalISel version is returning 0,0,1,0,0,0,0,0, as the v2i16 is widened to a v4i16.
I don't think we can have a difference in the calling convention between the two, it would mean they are ABI incompatible. Either they both need to change to widen (which looks like a lot of work, including an ABI break for SDAG), or the GISel code need return values in the same way as SDAG does.
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp | ||
---|---|---|
3230 ↗ | (On Diff #533117) | I don't think this will handle vector extends correctly. |
Thanks, I find it seems to be an optimization in SDAG version. So GISel version needs to do something like
SelectionDAG::FoldConstantArithmetic and TryToFoldExtendOfConstant for const vector constant ?
I believe that you are mixing optimizations with ABI. The SDAG ABI result has to win indepent of the inefficient GISel code.
Can you add a MIR testcase?
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp | ||
---|---|---|
3212 ↗ | (On Diff #538557) | Might be good to update this comment Scalar G_ANYEXT on bank... |
5608 ↗ | (On Diff #538557) | Comment? |
5615 ↗ | (On Diff #538557) | Could use a comment explaining that you're looking for G_BUILD_VECTORs with all constant source operands? |
5634 ↗ | (On Diff #538557) | emitConstantVector should return a nullptr on failure, right? So then we can save one LOC: // Try to replace ExtI with a constant vector. MachineInstr *MaybeCVec = emitConstantVector(ExtI.getOperand(0).getReg(), CV, MIB, MRI); if (MaybeCVec) ExtI.eraseFromParent(); return MaybeCVec; |
address comment and add a new mir test llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-const.mir
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp | ||
---|---|---|
5621 ↗ | (On Diff #538896) | can you add a testcase that shows what happens when one of the G_BUILD_VECTOR sources is a constant? e.g %x = G_BUILD_VECTOR %constant, %not_a_constant |
5638 ↗ | (On Diff #538896) | can you add a test to the MIR testcase that shows what happens when emitConstantVector returns nullptr? |
llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-const.mir | ||
3 ↗ | (On Diff #538896) | you can delete the IR portion |
17 ↗ | (On Diff #538896) | you can delete the registers section |
22 ↗ | (On Diff #538896) | if you delete the IR section, then this will need to be renamed so that it does not reference the IR |
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp | ||
---|---|---|
5621 ↗ | (On Diff #538896) | it will crash before regbankselect, so it seems another independent issue, https://gcc.godbolt.org/z/e3Wq9Mdar, so I fire a issue https://github.com/llvm/llvm-project/issues/63826 |
5638 ↗ | (On Diff #538896) | I add this function refer to above function tryOptConstantBuildVec, and I don't have the idea how to construct a constant that meets the scenario where returns null , do you have any suggestion ? |
I haven't forgotten about this patch, I just need to find some time to look into this issue. In the mean time: as a rule, anything that's produced by the translator must be correct by itself, it can't rely on any optimizations to run in order to generate the correct code. It's easiest to check this by writing an additional MIR test for the irtranslator change. It should be clear from that test whether or not the change is correct. The optimizations, if needed and appropriate, can be a separate patch.
So I had a look at this particular test case, and from what I can tell there's nothing we're doing wrong in the IRTranslator. lowerReturn() is correctly widening the <2 x i16> return type to <2 x i32>. This leaves the following MIR:
body: | bb.1 (%ir-block.0): %1:_(s16) = G_CONSTANT i16 0 %2:_(s16) = G_CONSTANT i16 1 %0:_(<2 x s16>) = G_BUILD_VECTOR %1(s16), %2(s16) %3:_(<2 x s32>) = G_ANYEXT %0(<2 x s16>) $d0 = COPY %3(<2 x s32>) RET_ReallyLR implicit $d0
I think the problem is that G_BUILD_VECTOR of <2 x i16> needs to be widened to a supported type. Since this was a trivial change, I went ahead and did it in ccffc2705054
Thanks anyway for taking a look at it!
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp | ||
---|---|---|
723–729 | Now that we're going to do this multiple times, I think it's worth factoring out the logic to make it easier to re-use. The underlying logic is I believe: "vectors must be at least 64 bits wide", right? I think we could make this easier by adding a new action/predicates in LegalizerInfo.h, so that we could do something like: .promoteVectoreEltsToVectorMinSize(0, 64) I think there are other places in this file that could also use this new action to simplify the code. P.S. please attach more context to your diffs (-U9999 works). |
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp | ||
---|---|---|
723–729 | thanks for your detail suggestion, apply your comment. |
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h | ||
---|---|---|
944 ↗ | (On Diff #543173) | Nice! s/Ty/VectorSize/ . |
Now that we're going to do this multiple times, I think it's worth factoring out the logic to make it easier to re-use. The underlying logic is I believe: "vectors must be at least 64 bits wide", right? I think we could make this easier by adding a new action/predicates in LegalizerInfo.h, so that we could do something like:
I think there are other places in this file that could also use this new action to simplify the code.
P.S. please attach more context to your diffs (-U9999 works).