This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add and update scalar instructions
Accepted
Public

Authored by grahamsellers on Nov 19 2018, 10:48 AM.

Details

Reviewers
nhaehnle
arsenm
Summary

This patch adds support for the 32-bit and 64-bit S_ANDN2 and S_ORN2 instructions, and adds splits to move them to the vector unit, which has no equivalent instructions. It also changes how the more complex scalar instructions are lowered to vector instructions: they are first broken down into sequences of simpler scalar instructions, which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply the inversion to one input rather than to the output of the XOR, as the result is equivalent and may allow the NOT instruction to stay on the scalar unit.

New tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests that now hit the new instructions have been modified accordingly.
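The bitwise identities the summary relies on can be checked with a small standalone sketch. This is not code from the patch: the function names below are hypothetical, and the 32-bit semantics assumed are ANDN2 = a & ~b, ORN2 = a | ~b, NAND = ~(a & b), NOR = ~(a | b) and XNOR = ~(a ^ b). Each `*_split` function mirrors the idea of breaking a complex operation into a NOT plus a simpler operation.

```cpp
#include <cstdint>

// Assumed reference semantics of the combined scalar operations (32-bit).
constexpr uint32_t s_andn2(uint32_t a, uint32_t b) { return a & ~b; }
constexpr uint32_t s_orn2(uint32_t a, uint32_t b)  { return a | ~b; }
constexpr uint32_t s_nand(uint32_t a, uint32_t b)  { return ~(a & b); }
constexpr uint32_t s_nor(uint32_t a, uint32_t b)   { return ~(a | b); }
constexpr uint32_t s_xnor(uint32_t a, uint32_t b)  { return ~(a ^ b); }

// Hypothetical decompositions into a NOT plus one simpler operation.
constexpr uint32_t s_not(uint32_t a) { return ~a; }
constexpr uint32_t andn2_split(uint32_t a, uint32_t b) { return a & s_not(b); }
constexpr uint32_t orn2_split(uint32_t a, uint32_t b)  { return a | s_not(b); }
constexpr uint32_t nand_split(uint32_t a, uint32_t b)  { return s_not(a & b); }
constexpr uint32_t nor_split(uint32_t a, uint32_t b)   { return s_not(a | b); }
// XNOR with the inversion applied to one input instead of the XOR result.
constexpr uint32_t xnor_split(uint32_t a, uint32_t b)  { return s_not(a) ^ b; }

// Returns true when every decomposition matches its reference for (a, b).
constexpr bool identities_hold(uint32_t a, uint32_t b) {
  return s_andn2(a, b) == andn2_split(a, b) &&
         s_orn2(a, b)  == orn2_split(a, b)  &&
         s_nand(a, b)  == nand_split(a, b)  &&
         s_nor(a, b)   == nor_split(a, b)   &&
         s_xnor(a, b)  == xnor_split(a, b);
}

// Compile-time spot checks on a few bit patterns.
static_assert(identities_hold(0x00000000u, 0xFFFFFFFFu), "all-zero / all-one");
static_assert(identities_hold(0xF0F0F0F0u, 0xFF00FF00u), "mixed pattern");
static_assert(identities_hold(0x12345678u, 0x9ABCDEF0u), "arbitrary values");
```

The XNOR case holds because inverting one input of an XOR flips exactly the same bits as inverting the result. That is why the NOT can be placed on whichever operand is still available on the scalar unit.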

Event Timeline

grahamsellers created this revision. Nov 19 2018, 10:48 AM
arsenm added inline comments. Nov 19 2018, 10:54 AM
lib/Target/AMDGPU/SIInstrInfo.cpp
4533–4535

You shouldn't check the literal register class here; use one of the isSGPR* functions in SIRegisterInfo instead.

4538–4539

Should probably use SReg_32_XEXEC

4585–4586

Ditto

grahamsellers edited the summary of this revision.

Addressing review comments. Adding more tests.

grahamsellers marked an inline comment as done. Nov 20 2018, 9:48 AM
grahamsellers added inline comments.
lib/Target/AMDGPU/SIInstrInfo.cpp
4538–4539

I couldn't find a SReg_32_XEXEC class, only SReg_32_XEXEC_HI, which seems to only exclude EXEC_LO, or SReg_64_XEXEC, which is 64-bit. Is there any particular reason to avoid EXEC? I can't get the compiler to hit that anyway.

arsenm added inline comments. Nov 26 2018, 8:45 AM
lib/Target/AMDGPU/SIInstrInfo.cpp
4538–4539

I meant SReg_32_XM0, so that vcc_lo/vcc_hi will be allowed but not m0. Exec doesn't matter much: we reserve exec, so it can never be allocated.

4614–4615

SReg_32_XM0

Using SReg_32_XM0 to allocate temporary scalars as requested.

grahamsellers marked 3 inline comments as done. Nov 27 2018, 4:26 AM
arsenm accepted this revision. Nov 27 2018, 9:24 AM

LGTM

This revision is now accepted and ready to land. Nov 27 2018, 9:24 AM