This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU : Add intrinsics for compare with the full wavefront result, such as v_cmp_ne_i32, etc..
ClosedPublic

Authored by wdng on Jul 18 2016, 2:57 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

wdng updated this revision to Diff 64393.Jul 18 2016, 2:57 PM
wdng retitled this revision from to AMDGPU : Add an LLVM intrinsic / Clang Builtin to expose the v_cmp_ne_i32 instruction..
wdng updated this object.
wdng added reviewers: tstellarAMD, arsenm.
wdng set the repository for this revision to rL LLVM.
arsenm requested changes to this revision.Jul 18 2016, 2:58 PM
arsenm edited edge metadata.

We should only do a general icmp, not an intrinsic for specific compare opcodes

This revision now requires changes to proceed.Jul 18 2016, 2:58 PM

We should only do a general icmp, not an intrinsic for specific compare opcodes

Do you mean an LLVM IR icmp instruction or a generic intrinsic that takes a condition code as its third input?

We should only do a general icmp, not an intrinsic for specific compare opcodes

Do you mean an LLVM IR icmp instruction or a generic intrinsic that takes a condition code as its third input?

Yes

wdng updated this revision to Diff 64928.Jul 21 2016, 11:12 AM
wdng edited edge metadata.

Use a general way to implement v_cmp_ne_i32.

wdng updated this revision to Diff 64938.Jul 21 2016, 12:11 PM
wdng edited edge metadata.

Fixed a data type issue for fcmp intrinsic definition.

arsenm added inline comments.Jul 21 2016, 4:12 PM
include/llvm/IR/IntrinsicsAMDGPU.td
394

Remove the GCCBuiltins, they don't work with overloaded intrinsics

400

the 3rd parameter should be i32

lib/Target/AMDGPU/AMDGPUISelLowering.h
231

Needs a comment that this is setcc with the full mask result

lib/Target/AMDGPU/SIISelLowering.cpp
1655–1656

These should be put towards the end of the cases

1657

Variables should be capitalized and camel case.

What happens if the cond code is out of range? There should probably be a clamp

1658

Extra spaces between type and name

1659

This looks like it goes over 80 characters

lib/Target/AMDGPU/SIInstructions.td
2365–2366

This can just be a class. You can also try adding the pattern dag to the v_cmp instruction definition patterns list (although I'm not 100% sure if the multiple patterns actually work). A multiclass might help if you don't want to repeat for i32/i64

2372–2373

All compare types should be defined. Additionally i64 and the FP ones are missing

test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ne.ll
6–12 ↗(On Diff #64938)

There should be a test for every condition code and i32/i64

8 ↗(On Diff #64938)

nounwind should also be an attribute group

9 ↗(On Diff #64938)

Call site doesn't need the at tributes

include/llvm/IR/IntrinsicsAMDGPU.td
400

Also, should the return type be i64 instead of double?

wdng updated this revision to Diff 65358.Jul 25 2016, 9:25 AM
wdng marked 13 inline comments as done.

Code changes based on Matt's comments.

include/llvm/IR/IntrinsicsAMDGPU.td
400

Yes, code has been changed accordingly. Thanks!

arsenm added inline comments.Jul 25 2016, 12:09 PM
lib/Target/AMDGPU/AMDGPUISelLowering.h
231

Should have space after the //, and it should be capitalized and punctuated. Maybe clearer would be a compare with a result bit per item in the wavefront or something, mask result sounds more ambiguous maybe

lib/Target/AMDGPU/SIISelLowering.cpp
1909–1910

You should do the range check before the static_cast since I think it is undefined behavior to have an out of bounds enum value inserted.

This also won't work for fcmp, each should be handled in its own case with its own range check for the specific compare types' range

lib/Target/AMDGPU/SIInstructions.td
2374–2375

The unsigned should use the _U32 compare

2383–2384

Ditto

2385–2386

Ditto

2399–2411

This also needs to be done for the unordered compares

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
103–105

Missing unordered compares

wdng updated this revision to Diff 65427.Jul 25 2016, 3:09 PM
wdng marked 7 inline comments as done.

Changes based on Matt's comments.

arsenm added inline comments.Jul 25 2016, 3:14 PM
include/llvm/IR/IntrinsicsAMDGPU.td
392

You should remove this comment

397

And this one

lib/Target/AMDGPU/SIISelLowering.cpp
1907–1909

Instead of an assert, how about returning undef? this should also have a test. Same if the operand isn't really constant, you'll need to do the dyn_cast yourself

1918–1922

Should refer to FCmpInst

lib/Target/AMDGPU/SIInstructions.td
2413–2425

These are not the correct unordered comparison instructions, refer to the existing set of fcmp patterns for which to use

lib/Target/AMDGPU/SIInstructions.td
2413–2425

Unordered compares should select the V_CMP_N* instructions. Take a look at the instruction definitions to see which condition matches to which instruction.

arsenm added inline comments.Jul 25 2016, 3:18 PM
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
200

You should also add a test that uses fabs on the inputs to make sure that source modifiers are folded

wdng updated this revision to Diff 65536.Jul 26 2016, 9:08 AM
wdng marked 6 inline comments as done.
  1. Add dyn_cast for type converting.
  2. Fixed not using correct FCmpInst type.
  3. Fixed incorrectly use of unordered insutrctions
  4. Added fabs as input for fcmp test.
wdng updated this revision to Diff 65537.Jul 26 2016, 9:10 AM

Upload correct diff with cached LIT tests update.

wdng updated this revision to Diff 65538.Jul 26 2016, 9:12 AM

Upload correct diff file.

The title of the commit is also inaccurate, it should be intrinsics for compare with the full wavefront result

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
200

Still missing these tests

wdng added inline comments.Jul 26 2016, 4:13 PM
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
200

I have just created one "define void @v_fcmp_f32_oeq_with_fabs(i64 addrspace(1)* %out, float %src, float %a) #1" and put it one the top of tests. Should I write fabs tests for all fcmp comparisons?

wdng retitled this revision from AMDGPU : Add an LLVM intrinsic / Clang Builtin to expose the v_cmp_ne_i32 instruction. to AMDGPU : Add intrinsics for compare with the full wavefront result, such as v_cmp_ne_i32, etc...Jul 26 2016, 4:14 PM
arsenm added inline comments.Jul 26 2016, 4:21 PM
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
6

Use the attribute group

10–15

The call site does not need the attribute specified. Can you also test the other operand? The check line should check the actual operands, this currently does not actually check much

wdng updated this revision to Diff 65630.Jul 26 2016, 4:44 PM

Modified one LIT test to check operands.

wdng updated this revision to Diff 65632.Jul 26 2016, 4:50 PM

Fixed corrupted diff patch.

arsenm added inline comments.Jul 26 2016, 6:09 PM
lib/Target/AMDGPU/SIISelLowering.cpp
1907

Space before =

1921

Ditto

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
8

Missing test for invalid condition code value

10

You can move the |s outside of the regex and then you don't have to escape them

12

Don't need call site attributes

12–17

Still should test that both operands can have the source modifiers folded

30

it doesn't really matter, but there's no reason this test needs to under-align the stores, Fix these to be align 8 or remove the aligns

test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll
7

Missing test for invalid condition code value

wdng updated this revision to Diff 65761.Jul 27 2016, 9:55 AM
wdng marked 6 inline comments as done.

Add illegal cond code LIT tests.

arsenm added inline comments.Jul 27 2016, 12:01 PM
lib/Target/AMDGPU/SIInstructions.td
2391–2392

Spaces before the types and the next (

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
29

You should drop the suffix here to strengthen the test. It would be best to reduce to just v_cmp because something could commute the instruction

test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll
17

Ditto

wdng updated this revision to Diff 65788.Jul 27 2016, 12:14 PM
wdng marked 4 inline comments as done.
  1. Rename LIT test function names.
  2. Add space between type and parenthesis.
arsenm accepted this revision.Jul 27 2016, 4:00 PM
arsenm edited edge metadata.

LGTM, you can drop the "such as" part from your commit message

This revision is now accepted and ready to land.Jul 27 2016, 4:00 PM
This revision was automatically updated to reflect the committed changes.