This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] need to insert wait between the scalar load and vector store to the same address to avoid WAR conflict.
ClosedPublic

Authored by alex-t on Dec 27 2019, 8:27 AM.

Details

Summary

Before divergence-driven ISel introduced scalar loads from the global address space, we relied on the ordering of VMEM operations enforced by the HW. Now we can easily get a WAR hazard when a scalar load is followed by a vector store to the same address.
The case is here: https://github.com/RadeonOpenCompute/ROCm/issues/500

The current fix relies on MachineMemOperand equality to check that the SMRD and VMEM instructions use the same address.
A proper fix would require alias analysis on the machine IR, which is obviously too big a hammer at the moment.
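
The equality-based check described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from the patch: `MemOperand`, `Instr`, `Kind`, and `needsWaitBeforeStore` are hypothetical names invented for illustration, and the real pass works on LLVM machine instructions rather than these simplified structs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a machine memory operand:
// the value the access is based on, a byte offset, and an access size.
struct MemOperand {
  const void *base;
  std::int64_t offset;
  std::uint64_t size;

  bool operator==(const MemOperand &o) const {
    return base == o.base && offset == o.offset && size == o.size;
  }
};

// Hypothetical instruction kinds; only the two relevant to the hazard matter.
enum class Kind { ScalarLoad, VectorStore, Other };

struct Instr {
  Kind kind;
  std::vector<MemOperand> memOps;
};

// Returns true if `store` has a memory operand equal to one used by a scalar
// load that is still outstanding, i.e. a wait is needed before the store.
// Assumption: `pendingScalarLoads` holds the loads not yet waited on.
bool needsWaitBeforeStore(const std::vector<Instr> &pendingScalarLoads,
                          const Instr &store) {
  assert(store.kind == Kind::VectorStore && "expected a vector store");
  for (const Instr &load : pendingScalarLoads)
    for (const MemOperand &loadOp : load.memOps)
      for (const MemOperand &storeOp : store.memOps)
        if (loadOp == storeOp)
          return true;
  return false;
}
```

Because the model compares operands only for exact equality, a store that partially overlaps a load at a different offset would not be detected. That is the gap alias analysis on the machine IR would close.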

Event Timeline

alex-t created this revision. Dec 27 2019, 8:27 AM
Herald added a project: Restricted Project. Dec 27 2019, 8:27 AM
alex-t edited the summary of this revision. Dec 27 2019, 8:30 AM
rampitec added inline comments. Dec 30 2019, 1:45 PM
llvm/test/CodeGen/AMDGPU/smrd_vmem_war.ll
Line 1: Please add -check-prefix=GCN.

Line 8: Test should have no numbered values. Please run opt -instnamer on it.

alex-t updated this revision to Diff 235890. Jan 2 2020, 9:18 AM

Test updated.

rampitec accepted this revision. Jan 2 2020, 10:40 AM

LGTM. Delete source_filename from the test before push.

llvm/test/CodeGen/AMDGPU/smrd_vmem_war.ll
Line 9: Delete this line.

This revision is now accepted and ready to land. Jan 2 2020, 10:40 AM
This revision was automatically updated to reflect the committed changes.