This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Handle non-temporal loads and stores
ClosedPublic

Authored by kzhuravl on Aug 17 2017, 9:59 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

kzhuravl created this revision.Aug 17 2017, 9:59 PM
t-tye added inline comments.Aug 17 2017, 11:32 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
123 ↗(On Diff #111614)

Do vector buffer/flat/global/scratch and scalar buffer/flat all have the slc bit? And not DS?

336–337 ↗(On Diff #111614)

According to http://llvm.org/docs/LangRef.html#load-instruction :

!nontemporal does not have any defined semantics for atomic loads.

My reading of this is that it is allowed to have nontemporal on atomic loads, but the meaning of it is not defined by LLVM. If so an assert is not correct here.

My suggestion is that AMDGPU target treats it the same as non-atomic.

376–377 ↗(On Diff #111614)

Same comment as for loads.

test/CodeGen/AMDGPU/memory-legalizer-nontemporal-load.ll
1–12 ↗(On Diff #111614)

Add tests for private address space so scratch buffer instructions can be tested.

Add tests with uniform values so scalar instructions are generated.

For gfx9 could global/scratch instructions be generated?

Does the LLVM IR validator allow NonTemporal on load atomic, if so add that?

test/CodeGen/AMDGPU/memory-legalizer-nontemporal-store.ll
1–11 ↗(On Diff #111614)

Same comment as load.

kzhuravl updated this revision to Diff 112729.Aug 25 2017, 12:20 PM
kzhuravl marked 5 inline comments as done.

Address review feedback.

b-sumner added inline comments.Aug 28 2017, 10:58 AM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
336–337 ↗(On Diff #111614)

I'd prefer ignoring the non-temporal in the case of atomics. Note that one would have to resort to coding in IR to even get it applied; no languages have a means to request non-temporal atomic operations.

kzhuravl updated this revision to Diff 112991.Aug 28 2017, 4:47 PM
kzhuravl marked an inline comment as done.

Address review feedback.

t-tye added inline comments.Aug 28 2017, 5:44 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
64 ↗(On Diff #112991)

Is this needed? Isn't it duplicating the information from AtomicOrdering != NotAtomic?

65 ↗(On Diff #112991)

Is having a default value a good thing? The only use seems to be for fences so why not simply pass in false?

77 ↗(On Diff #112991)

assert(Ordering != NotAtomic || IsNonTemporal)?

254–255 ↗(On Diff #112991)

What are examples of instructions that have multiple memory operands? Is is valid to treat them as system scope sequentially consistent atomic and !nontemporarl? Would it be clearer to call the constructor with the arguments explicit so can see what this is actually returning?

Same comment for other places doing this.

335 ↗(On Diff #112991)

Should this be:

if (MOI.IsNonTemporal) {
  assert(!MOI.IsAtomic);

Since above comments are that nontemporal is not allowed on atomic.

342 ↗(On Diff #112991)

Is there really a need for IsAtomic()? Isn't AtomicOrdering != NotAtomic giving the same information?

374 ↗(On Diff #112991)

Same comment as above.

336–337 ↗(On Diff #111614)

So another patch should update the LLVM IR documentation. The syntax for load/store atomic already does not list nontemprarl as an allowed attribute, but the text mentions it. And also add a validator test to enforce that non-temporal is not allowed on load/store/rmw atomic.

kzhuravl updated this revision to Diff 113126.Aug 29 2017, 11:29 AM
kzhuravl marked 8 inline comments as done.

Address review feedback.

lib/Target/AMDGPU/SIMemoryLegalizer.cpp
254–255 ↗(On Diff #112991)

I don't know of any.

t-tye added inline comments.Aug 29 2017, 12:06 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
71 ↗(On Diff #113126)

Since there are now separate constructors for each usage case this one seems to only be used for CAS and so should this also enforce:

assert(FailureOrdering != AtomicOrdering::NotAtomic);
262–264 ↗(On Diff #113126)

Should this be using the new 2 argument constructor?

kzhuravl updated this revision to Diff 113148.Aug 29 2017, 1:45 PM
kzhuravl marked 2 inline comments as done.

Address review feedback.

arsenm added inline comments.Aug 30 2017, 12:04 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
115 ↗(On Diff #113148)

It would be better to check if != 0. If you define an operand as an i1imm for example it is the sign extended constant value -1

b-sumner added inline comments.Aug 30 2017, 12:13 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
336–337 ↗(On Diff #111614)

Is a validator needed, or can we just update the documentation to state that any nontemporal on atomic operations is ignored?

t-tye added inline comments.Aug 30 2017, 5:20 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
336–337 ↗(On Diff #111614)

Currently the documentation gives the syntax for atomic instructions as not allowing the nontemporal attribute. So would seem that the validator should enforce that syntax, or the syntax should be changed.

kzhuravl updated this revision to Diff 113587.Sep 1 2017, 1:11 PM
kzhuravl marked an inline comment as done.

Address review feedback and split out cleanups discussed offline into separate changes.

t-tye added inline comments.Sep 1 2017, 3:02 PM
lib/Target/AMDGPU/SIMemoryLegalizer.cpp
235 ↗(On Diff #113587)

If there are are no MMOs then the default ought to be false.

415 ↗(On Diff #113587)

Suggest removing "else" from these "if"s as each one handles a specific case and returns. So no need for "else if" since plain "if" will be sufficient. At this point want to give the unhandled error since the preceding "if"s should have handled all the cases, so putting it outside the else makes that clearer.

Same comment for other expand functions.

420 ↗(On Diff #113587)

Suggest adding a comment along the lines of:

// Atomic instructions do not have the nontemporal attribute.

Same comment for other expand functions.

kzhuravl updated this revision to Diff 114051.Sep 6 2017, 12:36 PM
kzhuravl marked 3 inline comments as done.

Address review feedback.

lib/Target/AMDGPU/SIMemoryLegalizer.cpp
235 ↗(On Diff #113587)

Discussed offline.

415 ↗(On Diff #113587)

Taken care of in D37397.

t-tye accepted this revision.Sep 6 2017, 4:46 PM

LGTM

This revision is now accepted and ready to land.Sep 6 2017, 4:46 PM
This revision was automatically updated to reflect the committed changes.