This is an archive of the discontinued LLVM Phabricator instance.

[EarlyCSE] Do not CSE convergent calls with memory effects
ClosedPublic

Authored by foad on Jun 16 2023, 9:13 AM.

Details

Summary

D149348 did this for readnone calls, which are handled by SimpleValue.
This patch does the same for all other CSEable calls, which are handled
by CallValue.

Diff Detail

Event Timeline

foad created this revision.Jun 16 2023, 9:13 AM
foad requested review of this revision.Jun 16 2023, 9:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2023, 9:13 AM
arsenm added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
144

Maybe should be checking nosync instead?

foad added inline comments.Jun 16 2023, 11:07 AM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
144

How is that related?

arsenm added inline comments.Jun 16 2023, 3:34 PM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
144

The comment here and special case of coroutines seems like it was hacking around the general check of nosync. In a roundabout way checking nosync is the same as convergent. IIRC the interaction between them is no-convergent + memory(none) implies nosync

nikic added a subscriber: nikic.Jun 17 2023, 12:46 AM
nikic added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
144

The coroutine check has no relation to nosync. It's hacking around a weakness in memory effect modelling that is unrelated to synchronization.

I'm not familiar with the code but the thrust of it makes sense. We should probably still try to CSE calls that are in the same basic block, though. Is that easy to do?

foad planned changes to this revision.Jun 19 2023, 5:02 AM

I need to rethink. The patch as-is does not make much sense because @llvm.read_register is not readnone.

arsenm added inline comments.Jun 19 2023, 5:02 AM
llvm/test/Transforms/EarlyCSE/AMDGPU/convergent-call.ll
23

I think something is wrong here but you should test with something normal and not read_register

foad updated this revision to Diff 532629.Jun 19 2023, 6:22 AM

Rethink.

foad retitled this revision from [EarlyCSE] Do not CSE convergent readnone calls to [EarlyCSE] Do not CSE convergent calls with memory effects.Jun 19 2023, 6:23 AM
foad edited the summary of this revision. (Show Details)

Can we have a test with 2 convergent calls in the same block, or do we have one already?

foad added inline comments.Jun 20 2023, 12:17 PM
llvm/test/Transforms/EarlyCSE/AMDGPU/convergent-call.ll
48

@jdoerfert it is here. (It's not upstream yet - this comes from a patch that I only have locally.)

arsenm added inline comments.Jun 21 2023, 12:54 PM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
336–337

CallBase? Are invokes ever CSEable?

llvm/test/Transforms/EarlyCSE/AMDGPU/convergent-call.ll
28

Should still add another test that doesn't involve read_register and uses a proper convergent call

arsenm requested changes to this revision.Jul 13 2023, 6:40 AM

still would like a test that's more normal than read_register

This revision now requires changes to proceed.Jul 13 2023, 6:40 AM
foad updated this revision to Diff 540055.Jul 13 2023, 8:36 AM

Add test with @llvm.amdgcn.live.mask

foad added a comment.Jul 13 2023, 8:38 AM

still would like a test that's more normal than read_register

There aren't really any normal readonly convergent intrinsics. @llvm.amdgcn.live.mask is currently readonly so I used that - but it seems to me it should probably be convergent readnone.

arsenm accepted this revision.Jul 13 2023, 9:47 AM
arsenm added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
545

Is this supposed to implicitly assume invokes and callbr can't reach here?

This revision is now accepted and ready to land.Jul 13 2023, 9:47 AM
foad added inline comments.Jul 14 2023, 3:45 AM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
545

Yes. This is checked in CallValue::canHandle.

This revision was landed with ongoing or failed builds.Jul 14 2023, 3:48 AM
This revision was automatically updated to reflect the committed changes.