This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GFX10: implement ds_ordered_count changes
Closed · Public

Authored by nhaehnle on Jun 24 2019, 7:18 AM.

Details

Summary

ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.

Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276

Diff Detail

Repository
rL LLVM

Event Timeline

nhaehnle created this revision. Jun 24 2019, 7:18 AM
Herald added a project: Restricted Project. Jun 24 2019, 7:18 AM
This revision is now accepted and ready to land. Jun 24 2019, 8:30 AM
arsenm added inline comments. Jun 24 2019, 8:34 AM
test/CodeGen/AMDGPU/llvm.amdgcn.ds.ordered.add.gfx10.ll
Line 2 (On Diff #206213)

Should merge this with the other test

nhaehnle marked 2 inline comments as done. Jun 25 2019, 4:42 AM
nhaehnle added inline comments.
test/CodeGen/AMDGPU/llvm.amdgcn.ds.ordered.add.gfx10.ll
Line 2 (On Diff #206213)

I've kept it separate on purpose because the meaning of the "ordered count index" argument has changed: it now encodes the number of threads / number of dwords that participate in the ordered operation.
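
To illustrate what it means for one integer argument to carry a dword count as well as an index, here is a minimal C++ sketch. It is not the committed code. The field positions (index in the low 6 bits, count in a 4-bit field starting at bit 24), the struct, and both helper names are assumptions made for this example and should be checked against the patch itself. Only the 1..4 dword limit comes from the summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field layout, for illustration only (assumed, not from the review text).
constexpr uint32_t kIndexMask = 0x3f;  // assumed: low 6 bits hold the ordered count index
constexpr uint32_t kCountShift = 24;   // assumed: dword count field starts at bit 24
constexpr uint32_t kCountMask = 0xf;   // assumed: dword count field is 4 bits wide

struct OrderedCountArg {
  uint32_t index;       // ordered count index
  uint32_t dwordCount;  // number of participating dwords (lanes 0..dwordCount-1)
};

// Pack an index and a dword count into a single argument value.
inline uint32_t packOrderedCountArg(uint32_t index, uint32_t dwordCount) {
  assert(index <= kIndexMask && "index does not fit in the assumed field");
  assert(dwordCount >= 1 && dwordCount <= 4 && "dword count must be 1..4");
  return index | (dwordCount << kCountShift);
}

// Unpack an argument value. Returns false if the dword count is outside 1..4
// or if any bit outside the two assumed fields is set.
inline bool unpackOrderedCountArg(uint32_t arg, OrderedCountArg &out) {
  const uint32_t known = kIndexMask | (kCountMask << kCountShift);
  if ((arg & ~known) != 0)
    return false;
  out.index = arg & kIndexMask;
  out.dwordCount = (arg >> kCountShift) & kCountMask;
  return out.dwordCount >= 1 && out.dwordCount <= 4;
}
```

Under this assumed layout, a value with an empty count field is rejected by the new decoding, which is one way an argument change like this can make old and new test inputs incompatible.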

This revision was automatically updated to reflect the committed changes.