Page MenuHomePhabricator

AMDGPU: Fix s.buffer.load being marked as readnone
Needs ReviewPublic

Authored by arsenm on Jun 17 2019, 5:29 AM.

Details

Diff Detail

Event Timeline

arsenm created this revision.Jun 17 2019, 5:29 AM
nhaehnle requested changes to this revision.Jun 17 2019, 5:33 AM

What does this actually fix?

This is likely to pessimize codegen for graphics quite badly. Graphics APIs make fairly strong aliasing guarantees which we don't properly express in LLVM at the moment, and we kind of get by without it by having s.buffer.load be readnone.

This revision now requires changes to proceed.Jun 17 2019, 5:33 AM

I should clarify that I'm not opposed to this kind of change in the long run, but as-is it needs a careful look at the performance implications.

What does this actually fix?

This is likely to pessimize codegen for graphics quite badly. Graphics APIs make fairly strong aliasing guarantees which we don't properly express in LLVM at the moment, and we kind of get by without it by having s.buffer.load be readnone.

When trying to do some global isel work, I ran into inconsistencies in the attributes (e.g. D63422), and I don't want to spread awareness of this hack to more places. We can't really fix this until we have fat pointers. I would rather have the declaration be accurate. If graphics users want to assume readnone as a performance hack until that is fixed, they can annotate every call site with readnone.

This breaks Mesa:
LLVM ERROR: Cannot select: intrinsic %llvm.amdgcn.s.buffer.load

What does this actually fix?

This is likely to pessimize codegen for graphics quite badly. Graphics APIs make fairly strong aliasing guarantees which we don't properly express in LLVM at the moment, and we kind of get by without it by having s.buffer.load be readnone.

When trying to do some global isel work, I ran into inconsistencies in the attributes (e.g. D63422), and I don't want to spread awareness of this hack to more places. We can't really fix this until we have fat pointers. I would rather have the declaration be accurate. If graphics users want to assume readnone as a performance hack until that is fixed, they can annotate every call site with readnone.

Okay, I think changing the intrinsic to readonly is fine. Mesa sets readnone on the callsite.

That said, I do see lit test failures which should be the same as the Mesa problems. Could you please also add a test (or modify a test) in llvm.amdgcn.s.buffer.load.ll which sets readnone on the callsite?

arsenm updated this revision to Diff 206565.Jun 25 2019, 5:44 PM

Fix selection

mareko added a comment.EditedJun 26 2019, 10:21 AM

This has a possibly negative impact on Mesa. Both SGPR usage and SGPR spilling increased. I don't see anything suspicious in the generated assembly other than instructions being reordered. I'd like to better understand the side effects of this patch.

Note that Mesa does set readnone on call sites.