Page MenuHomePhabricator

AMDGPU: Divergence-driven selection of scalar buffer load intrinsics

Authored by nhaehnle on Oct 15 2018, 4:40 AM.



Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance if
the load is really uniform. So select the scalar load intrinsics directly
to either VMEM or SMRD buffer loads based on divergence analysis.

If an offset happens to end up in a VGPR -- either because a floating
point calculation was involved, or due to other remaining deficiencies
in SIFixSGPRCopies -- we use v_readfirstlane.

There is some unrelated churn in tests since we now select MUBUF offsets
in a unified way with non-scalar buffer loads.

Change-Id: I170e6816323beb1348677b358c9d380865cd1a19

Diff Detail

Event Timeline

nhaehnle created this revision.Oct 15 2018, 4:40 AM
alex-t accepted this revision.Oct 16 2018, 12:26 AM


This revision is now accepted and ready to land.Oct 16 2018, 12:26 AM
arsenm added inline comments.Oct 16 2018, 10:40 AM

I don't love the hardcoded types here. Can you assert on the sizes, in case we fix casting all mem operations to int?

nhaehnle updated this revision to Diff 169868.Oct 16 2018, 11:28 AM
  • add an assertion for the VT
arsenm accepted this revision.Oct 16 2018, 11:34 AM
arsenm added inline comments.

This should probably be called soffset? I guess that's a preexisting condition

nhaehnle added inline comments.Oct 16 2018, 11:39 AM

Yeah, changing the OpName seems like a separate thing :)

This revision was automatically updated to reflect the committed changes.

Hi Nicolai,

Fyi, This introduced a regression with Mass Effect Andromeda with DXVK and RADV on Polaris10. See

Thanks for the heads up. I'll take a look.

Hi Nicolai,

Fyi, This introduced a regression with Mass Effect Andromeda with DXVK and RADV on Polaris10. See

I have root-caused the problem, which is that divergence info isn't passed correctly through the SelectionDAG in all cases. I'm going to look into a fix, but I'll likely have to touch common code to do so and have reverted this commit for now.

Hi Nicolai,

I'm sorry but this change (actually r348050) also breaks World Of Tanks, here's the apitrace!MOg2mSrD!aJHdSrimJBrVzv6c0ParmHDKIsduMq55CKPJjk0OgRI and the DXVK issue is here
Note that it's not a renderdoc capture because we failed to record one with that game, it worked with apitrace though. The caterpillar issue can be reproduced with both WineD3D/RadeonSI and DXVK/RADV.

The first bad commit is cc436fd26637b0629b95fd8e60fde61cec4b421f, then it works with 69f971eb1814487fc23ee092a69532a8d152c80d (because you reverted the original change) and e3924b1c15606bb5bf98392e0c20e731b4965311 breaks it again.

Can you look into this?

Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 5:23 AM