This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GlobalISel] Select op_sel modifiers for VOP3P
AbandonedPublic

Authored by mbrkusanin on Dec 31 2021, 5:17 AM.

Details

Diff Detail

Event Timeline

mbrkusanin created this revision.Dec 31 2021, 5:17 AM
mbrkusanin requested review of this revision.Dec 31 2021, 5:17 AM
arsenm added inline comments.Jan 10 2022, 9:40 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3415

Is this really necessary given that we're generally moving away from using G_EXTRACT at all?

  • Removed G_EXTRACT checks
  • Refactored parts of code dealing with setting OP_SEL_0 and OP_SEL_1 bits.
arsenm added inline comments.Jan 17 2022, 4:36 PM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3410

MI cannot be null

3496

Don't need = 0

3503

MI cannot be null

3511

Ditto

3514

Move comment into else if and fix weird line break

3528–3535

This matching is a bit heavy. Seems like we're missing some combines?

This at least could use some comments for the pattern being matched

3545–3546

Why would this pattern appear?

Refactor + updated comments

mbrkusanin added inline comments.Jan 20 2022, 2:40 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3528–3535

The Or+Shl+And come from RegBankSelect when it breaks down G_BUILD_VECTOR(_TRUNC) instructions that form <2 x 16> operands.
A combine could help but only in cases where those instructions would be redundant and no op_sel modifiers would need to be used.

There are other cases where such combine could help but it's a separate issue.

3545–3546

Added comment to clarify. Register is s32 but fneg that we want to match is for s16.

foad added a comment.Jan 21 2022, 3:00 AM

Seems reasonable overall, but I am a bit worried about testing all the possible corner cases.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3521

I don't understand how this can be safe, when you never even look at MI->getOperand(2). Did you mean to test the mask elements for equality, e.g. ShuffleMask[0] == 1 instead of ShuffleMask[0] != 0?

3539

This is NegLo ^= ...

3542

Are there tests for this case? If HiSrc is undef then I am a bit surprised that we would ever match m_GShl(m_Reg(HiSrc), m_SpecificICst(16)) above.

Also, do we need an equivalent case for when LoSrc is undef?

3554

NegHi ^= ...

mbrkusanin marked an inline comment as done.
  • Fix handling of G_SHUFFLE_VECTOR
  • Handled case where low part of <2 x 16> could also be undef.
  • Added tests for new cases
mbrkusanin marked an inline comment as not done.Jan 26 2022, 10:17 AM
mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3521

I mistakenly thought that ShuffleMask.size() == 2 would be enough of a restriction. That was a bug. Also, a lot more cases are covered now.

3542

Cases where hi part is undef are common when operations on <N x 16> vectors with odd number of elements are broken down into multiple packed instructions. Arguments for last packed instruction will be undef for high part.

Added new test insert_with_undef which has case with both lo and hi as undef.

As for matching above,
these come from RegBankSelect when it breaks down G_BUILD_VECTOR(_TRUNC) for <2 x 16>.
It does not check if either low or high part is undef and everything matched above is always constructed.

mbrkusanin updated this revision to Diff 405273.Feb 2 2022, 7:41 AM
mbrkusanin marked an inline comment as not done.

Rebase

Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 3:29 PM
Herald added a subscriber: kosarev. · View Herald Transcript
arsenm requested changes to this revision.Nov 16 2022, 3:30 PM

Oops, this was at least mostly re-implemented. Is there anything here main doesn't have already?

This revision now requires changes to proceed.Nov 16 2022, 3:30 PM
mbrkusanin abandoned this revision.Nov 17 2022, 6:47 AM

Oops, this was at least mostly re-implemented. Is there anything here main doesn't have already?

Nothing that I can see.