This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Legalize <2 x s8> and <4 x s8> for G_BUILD_VECTOR
ClosedPublic

Authored by Allen on Jun 20 2023, 7:01 PM.

Details

Summary

Refer to commit ccffc27, the remaining types <2 x s8> and <4 x s8> should
also be promoted to <2 x s32> and <4 x s16>.

Fixes https://github.com/llvm/llvm-project/issues/58274

Diff Detail

Event Timeline

Allen created this revision.Jun 20 2023, 7:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2023, 7:01 PM
Allen requested review of this revision.Jun 20 2023, 7:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2023, 7:01 PM

This test, whilst technically correct, doesn't look right. I don't think it can just generate a SUBREG_TO_REG in the same way it does for integer. Can you change the test to return <i16 0, i16 1>, and make sure the returned value would be the same as SDAG.

Allen updated this revision to Diff 533117.Jun 20 2023, 8:49 PM

update test case according comment

tschuett added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
420

Could you instead query OldLLT and NewLLT whether they are isScalar()? Looks odd to query MVTs in GISel.

tschuett added inline comments.Jun 25 2023, 12:51 AM
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
420

It is about integer and floats?

Allen added inline comments.Jun 27 2023, 5:48 AM
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
420

Yes, It is about integer and floats (not about the scalar and vector).

If I'm reading the test correctly, then SDAG version is returning (as bytes) 0,0,0,0,1,0,0,0. i.e the v2i16 is promoted to a v2i32. The GlobalISel version is returning 0,0,1,0,0,0,0,0, as the v2i16 is widened to a v4i16.

I don't think we can have a difference in the calling convention between the two, it would mean they are ABI incompatible. Either they both need to change to widen (which looks like a lot of work, including an ABI break for SDAG), or the GISel code need return values in the same way as SDAG does.

llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3230

I don't think this will handle vector extends correctly.

Allen added a comment.Jul 8 2023, 1:03 AM

If I'm reading the test correctly, then SDAG version is returning (as bytes) 0,0,0,0,1,0,0,0. i.e the v2i16 is promoted to a v2i32. The GlobalISel version is returning 0,0,1,0,0,0,0,0, as the v2i16 is widened to a v4i16.

I don't think we can have a difference in the calling convention between the two, it would mean they are ABI incompatible. Either they both need to change to widen (which looks like a lot of work, including an ABI break for SDAG), or the GISel code need return values in the same way as SDAG does.

Thanks, I find it seems to be an optimization in SDAG version. So GISel version needs to do something like
SelectionDAG::FoldConstantArithmetic and TryToFoldExtendOfConstant for const vector constant ?

tschuett added a comment.EditedJul 8 2023, 1:29 AM

If I'm reading the test correctly, then SDAG version is returning (as bytes) 0,0,0,0,1,0,0,0. i.e the v2i16 is promoted to a v2i32. The GlobalISel version is returning 0,0,1,0,0,0,0,0, as the v2i16 is widened to a v4i16.

I don't think we can have a difference in the calling convention between the two, it would mean they are ABI incompatible. Either they both need to change to widen (which looks like a lot of work, including an ABI break for SDAG), or the GISel code need return values in the same way as SDAG does.

Thanks, I find it seems to be an optimization in SDAG version. So GISel version needs to do something like
SelectionDAG::FoldConstantArithmetic and TryToFoldExtendOfConstant for const vector constant ?

I believe that you are mixing optimizations with ABI. The SDAG ABI result has to win indepent of the inefficient GISel code.

Allen updated this revision to Diff 538557.Jul 10 2023, 2:41 AM
Allen added a reviewer: tschuett.

Add tryToFoldExtendOfVectorConstant to adjust the ABI

Allen edited the summary of this revision. (Show Details)Jul 10 2023, 3:51 AM

Can you add a MIR testcase?

llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3212

Might be good to update this comment

Scalar G_ANYEXT on bank...

5608

Comment?

5615

Could use a comment explaining that you're looking for G_BUILD_VECTORs with all constant source operands?

5634

emitConstantVector should return a nullptr on failure, right?

So then we can save one LOC:

// Try to replace ExtI with a constant vector.
MachineInstr *MaybeCVec =
      emitConstantVector(ExtI.getOperand(0).getReg(), CV, MIB, MRI);
if (MaybeCVec)
  ExtI.eraseFromParent();
return MaybeCVec;
Allen updated this revision to Diff 538896.Jul 10 2023, 7:54 PM

address comment and add a new mir test llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-const.mir

Allen marked 5 inline comments as done.Jul 10 2023, 7:57 PM
Allen added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3212

Done, thanks

3230

Thanks, add a new function tryToFoldExtendOfVectorConstant to handle this case.

5634

Thanks, apply your comment

paquette added inline comments.Jul 10 2023, 11:11 PM
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
5621

can you add a testcase that shows what happens when one of the G_BUILD_VECTOR sources is a constant?

e.g

%x = G_BUILD_VECTOR %constant, %not_a_constant
5638

can you add a test to the MIR testcase that shows what happens when emitConstantVector returns nullptr?

llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-const.mir
3 ↗(On Diff #538896)

you can delete the IR portion

17 ↗(On Diff #538896)

you can delete the registers section

22 ↗(On Diff #538896)

if you delete the IR section, then this will need to be renamed so that it does not reference the IR

Allen marked 3 inline comments as done.Jul 12 2023, 5:54 AM
Allen added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
5621

it will crash before regbankselect, so it seems another independent issue, https://gcc.godbolt.org/z/e3Wq9Mdar, so I fire a issue https://github.com/llvm/llvm-project/issues/63826

5638

I add this function refer to above function tryOptConstantBuildVec, and I don't have the idea how to construct a constant that meets the scenario where returns null , do you have any suggestion ?

I haven't forgotten about this patch, I just need to find some time to look into this issue. In the mean time: as a rule, anything that's produced by the translator must be correct by itself, it can't rely on any optimizations to run in order to generate the correct code. It's easiest to check this by writing an additional MIR test for the irtranslator change. It should be clear from that test whether or not the change is correct. The optimizations, if needed and appropriate, can be a separate patch.

Allen added a comment.Jul 14 2023, 3:07 AM

I'm glad to know about your plans,thank you for your time.

So I had a look at this particular test case, and from what I can tell there's nothing we're doing wrong in the IRTranslator. lowerReturn() is correctly widening the <2 x i16> return type to <2 x i32>. This leaves the following MIR:

body:             |
  bb.1 (%ir-block.0):
    %1:_(s16) = G_CONSTANT i16 0
    %2:_(s16) = G_CONSTANT i16 1
    %0:_(<2 x s16>) = G_BUILD_VECTOR %1(s16), %2(s16)
    %3:_(<2 x s32>) = G_ANYEXT %0(<2 x s16>)
    $d0 = COPY %3(<2 x s32>)
    RET_ReallyLR implicit $d0

I think the problem is that G_BUILD_VECTOR of <2 x i16> needs to be widened to a supported type. Since this was a trivial change, I went ahead and did it in ccffc2705054

Thanks anyway for taking a look at it!

Allen added a comment.Jul 19 2023, 6:11 PM

Thank you for your guidance

Allen abandoned this revision.Jul 21 2023, 1:44 AM
Allen updated this revision to Diff 542866.Jul 21 2023, 5:08 AM
Allen retitled this revision from [AArch64][GlobalISel] Selection support for v2s16 G_ANYEXT to [AArch64][GlobalISel] Legalize <2 x s16> and <4 x s8> for G_BUILD_VECTOR.
Allen edited the summary of this revision. (Show Details)
aemerson added inline comments.Jul 21 2023, 1:11 PM
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
723–729 ↗(On Diff #542866)

Now that we're going to do this multiple times, I think it's worth factoring out the logic to make it easier to re-use. The underlying logic is I believe: "vectors must be at least 64 bits wide", right? I think we could make this easier by adding a new action/predicates in LegalizerInfo.h, so that we could do something like:

.promoteVectoreEltsToVectorMinSize(0, 64)

I think there are other places in this file that could also use this new action to simplify the code.

P.S. please attach more context to your diffs (-U9999 works).

Allen updated this revision to Diff 543173.Jul 22 2023, 4:59 AM
Allen retitled this revision from [AArch64][GlobalISel] Legalize <2 x s16> and <4 x s8> for G_BUILD_VECTOR to [AArch64][GlobalISel] Legalize <2 x s8> and <4 x s8> for G_BUILD_VECTOR.
Allen edited the summary of this revision. (Show Details)

address comment, add new function promoteVectorEltsToVectorMinSize

Allen marked an inline comment as done.Jul 22 2023, 5:01 AM
Allen added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
723–729 ↗(On Diff #542866)

thanks for your detail suggestion, apply your comment.

tschuett added inline comments.Jul 22 2023, 11:03 AM
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
944 ↗(On Diff #543173)

Nice! s/Ty/VectorSize/ .

aemerson accepted this revision.Jul 22 2023, 10:23 PM

LGTM with a few nits. Thanks for working on this!

llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
945 ↗(On Diff #543173)

Sorry, the name I suggested didn't fit with the rest of the naming scheme. I think widenVectorEltsToVectorMinSize is better.

953 ↗(On Diff #543173)

LLT::isScalable()

This revision is now accepted and ready to land.Jul 22 2023, 10:23 PM
Allen updated this revision to Diff 543349.Jul 23 2023, 7:53 PM
Allen marked an inline comment as done.

address comments

Allen marked 3 inline comments as done.Jul 23 2023, 7:55 PM
Allen added inline comments.
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
944 ↗(On Diff #543173)

Done, thanks

945 ↗(On Diff #543173)

Done, thanks

This revision was landed with ongoing or failed builds.Jul 23 2023, 8:28 PM
This revision was automatically updated to reflect the committed changes.
Allen marked 2 inline comments as done.