
[AMDGPU] gfx10 atomic optimizer changes.

Authored by foad on Aug 2 2019, 3:33 AM.



Add support for gfx10, where all DPP operations are confined to a
single row of 16 lanes, and for wave32.
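The row restriction can be illustrated with a small CPU model. The following is only a sketch under stated assumptions, not the code from this revision: `RowSize` and `inclusiveScan` are hypothetical names, each lane is modelled as one `int` in a vector, and the cross-row step is written as a plain loop rather than whatever lane-read or permute instructions the pass actually emits. It shows a prefix sum in which every shift stays inside a row of 16 lanes, followed by a separate fix-up that carries totals across rows, for a wave of 32 or 64 lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one int per lane. RowSize = 16 mirrors the gfx10
// restriction that a DPP shift cannot cross a row boundary.
constexpr std::size_t RowSize = 16;

// Inclusive prefix sum over a wave whose size is a multiple of 16
// (32 for wave32, 64 for wave64).
std::vector<int> inclusiveScan(const std::vector<int> &lanes) {
  assert(lanes.size() % RowSize == 0 && "wave size must be a multiple of 16");
  std::vector<int> v = lanes;

  // Step 1: scan inside each row with shifts of 1, 2, 4 and 8 lanes.
  // A lane only reads from a lane in its own row.
  for (std::size_t shift = 1; shift < RowSize; shift *= 2) {
    const std::vector<int> prev = v;
    for (std::size_t i = 0; i < v.size(); ++i)
      if (i % RowSize >= shift)
        v[i] = prev[i] + prev[i - shift];
  }

  // Step 2: cross-row fix-up. Add the running total of all earlier rows
  // to every lane of the current row.
  int carry = 0;
  for (std::size_t row = 0; row < v.size() / RowSize; ++row) {
    const int rowTotal = v[row * RowSize + RowSize - 1];
    for (std::size_t i = 0; i < RowSize; ++i)
      v[row * RowSize + i] += carry;
    carry += rowTotal;
  }
  return v;
}
```

With targets that allow shifts across the whole wavefront, step 2 would not be needed; on gfx10 the row-local scan has to be combined across rows by some other means.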

Event Timeline

foad created this revision. Aug 2 2019, 3:33 AM
Herald added a project: Restricted Project. Aug 2 2019, 3:33 AM

I'm happy to split this up if the reviewers would like. There are a few NFC changes I could apply first, and/or I could try to split the wave32 changes out from the gfx10 DPP changes.

arsenm added inline comments. Aug 5 2019, 7:32 AM

I think it would end up shorter, with less line wrapping, if you got the declaration for the update_dpp intrinsic once and reused it in all of these places.


I'm trying to avoid explicit getGeneration checks everywhere and to restrict them all to the Subtarget.

foad marked an inline comment as done. Aug 5 2019, 7:55 AM
foad added inline comments.

You mean I should define and use some more specific properties like hasDPPBroadcasts and hasDPPWavefrontShifts?

foad updated this revision to Diff 213552. Aug 6 2019, 2:31 AM

Add new hasDPPBroadcasts and hasDPPWavefrontShifts.
Use CreateCall instead of CreateIntrinsic in new helper functions.
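The idea behind the new properties can be sketched as follows. This is a minimal standalone model, not the LLVM source: `SubtargetModel`, `ScanStrategy`, and `chooseScanStrategy` are hypothetical names, and defining both predicates as "has DPP and is older than GFX10" is an assumption made for illustration. The point is that the generation comparison lives in one place, and callers ask a named question instead.

```cpp
// Hypothetical, simplified stand-in for a subtarget description.
enum class Generation { GFX8, GFX9, GFX10 };

struct SubtargetModel {
  Generation Gen;
  bool HasDPP;

  // Assumption for this sketch: both capabilities are absent from GFX10 on.
  // The generation check is confined to these two predicates.
  bool hasDPPBroadcasts() const {
    return HasDPP && Gen < Generation::GFX10;
  }
  bool hasDPPWavefrontShifts() const {
    return HasDPP && Gen < Generation::GFX10;
  }
};

// Hypothetical caller: picks a strategy from capabilities alone,
// without ever naming a generation.
enum class ScanStrategy { WholeWaveDPP, RowDPPWithCrossRowFixup };

ScanStrategy chooseScanStrategy(const SubtargetModel &ST) {
  if (ST.hasDPPBroadcasts() && ST.hasDPPWavefrontShifts())
    return ScanStrategy::WholeWaveDPP;
  return ScanStrategy::RowDPPWithCrossRowFixup;
}
```

If a later target changes which DPP features exist, only the predicates need updating; callers such as `chooseScanStrategy` stay unchanged.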

foad marked 2 inline comments as done. Aug 6 2019, 2:32 AM
arsenm accepted this revision. Aug 18 2019, 8:21 AM


This revision is now accepted and ready to land. Aug 18 2019, 8:21 AM
This revision was automatically updated to reflect the committed changes.