This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Represent IntrHasSideEffects using inaccessiblemem read+write
Needs ReviewPublic

Authored by nikic on Nov 14 2022, 5:23 AM.

Details

Summary

Map IntrHasSideEffects to a read+write of inaccessible memory, which is the usual way to represent a side effect that cannot be modelled more precisely.

This means that IntrNoMem + IntrHasSideEffects is reduced from reading and writing all memory to only accessing inaccessible memory, while IntrReadMem + IntrHasSideEffects is now no longer readonly (which is clearly incorrect as it e.g. allows simply removing the side effect).

The fact that IntrNoMem + IntrHasSideEffects is no longer an arbitrary read and write may cause issues for existing intrinsics marked as such. In particular, I think the llvm.amdgcn.s.barrier test changes here look incorrect -- if I understand the semantics of that intrinsic correctly, then moving loads/stores across the intrinsic is not legal, and the current IntrNoMem marking is simply incorrect. Can somebody from the AMDGPU side please confirm what the intended semantics are?

Diff Detail

Unit TestsFailed

Event Timeline

nikic created this revision.Nov 14 2022, 5:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 5:23 AM
nikic requested review of this revision.Nov 14 2022, 5:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 5:23 AM
foad added a comment.Nov 14 2022, 6:34 AM

Map IntrHasSideEffects to a read+write of inaccessible memory, which is the usual way to represent a side effect that cannot be modelled more precisely. [emphasis mine]

Is it? Does that mean that things-with-side-effects only interact with other things-with-side-effects? Is that really the intention of the hasSideEffects flag on an intrinsic?

nikic added a comment.Nov 14 2022, 6:42 AM

Map IntrHasSideEffects to a read+write of inaccessible memory, which is the usual way to represent a side effect that cannot be modelled more precisely. [emphasis mine]

Is it? Does that mean that things-with-side-effects only interact with other things-with-side-effects? Is that really the intention of the hasSideEffects flag on an intrinsic?

It's the closest we can represent. Of course, it the intrinsic has other effects that we explicitly model (such as ordinary memory effects, unwinding effects or divergence effects) then those need to be explicitly stated, but if the effect doesn't fall into one of those categories, then we can currently only model it via inaccessible memory, which at least enforces some of the basic expectations for side-effecting instructions (no DCE, no reordering relative to other effects, no movement across control flow).

I think this is a step in the right direction.
If for intrinsics we need to change their memory effects now, we should be able to do so, I somewhat doubt we need it though.

LG from me. Others might want to chime in.

arsenm added inline comments.Nov 16 2022, 5:13 PM
llvm/test/CodeGen/AMDGPU/noclobber-barrier.ll
225

llvm.amdgcn.s.barrier and llvm.amdgcn.wave.barrier should both act like synchronizes without real memory effects, So I guess it should like a fence, but they're also typically emitted together with a fence

arsenm added inline comments.Nov 17 2022, 3:38 PM
llvm/test/Analysis/GlobalsModRef/nosync_nocallback.ll
34

As counter intuitive as it is, I think this is correct. Can you add a second copy of this test. that uses a fence to show it still isn't moved?

jdoerfert added inline comments.Nov 18 2022, 2:02 PM
llvm/test/Analysis/GlobalsModRef/nosync_nocallback.ll
34

If this is legal our GPU code isn't. Barrier should be sync, and this should not hoist. I'm confused.

arsenm added inline comments.Nov 18 2022, 2:10 PM
llvm/test/Analysis/GlobalsModRef/nosync_nocallback.ll
34

Yes, it should be sync.

nikic added a comment.Dec 15 2022, 1:49 AM

So I tried this, but it doesn't seem to be that simple. There's the diff: https://gist.github.com/nikic/507f6ee3276e66d76b0d4a0c2b9ad7ce

Notably, if we make the barrier read/write in IR, we also need to set mayLoad/mayStore in DAG, and this impact scheduling. I don't know enough about AMDGPU to really interpret those changes, but it looks like we were scheduling at least some loads across a barrier?

mayLoad and mayStore should be set to 0 for barriers. Setting either to one implies there should be a memory operand, which is not the case.

CodeGen has the explicit hasSideEffects for unmodeled / nonmemory side effects unlike the IR. If TableGen is complaining about requiring mayLoad/mayStore here, then that's a tablegen bug. IntrHasSideEffects should be enough to cover this