Page MenuHomePhabricator

Don't sink ptrtoint/inttoptr sequences into non-noop addrspacecasts.
Needs ReviewPublic

Authored by maleadt on Nov 27 2020, 12:21 AM.

Details

Summary

In https://reviews.llvm.org/D30114, support for mismatching address spaces was introduced to CodeGenPrepare's optimizeMemoryInst, using addrspacecast as it was argued that only no-op addrspacecasts would be considered when constructing the address mode. However, by doing inttoptr/ptrtoint, it's possible to get CGP to emit an addrspace that's not actually no-op, introducing a miscompilation:

define void @kernel(i8* %julia_ptr) {
  %intptr = ptrtoint i8* %julia_ptr to i64
  %ptr = inttoptr i64 %intptr to i32 addrspace(3)*

  br label %end
end:

  store atomic i32 1, i32 addrspace(3)* %ptr unordered, align 4
  ret void
}

Gets compiled to:

define void @kernel(i8* %julia_ptr) {
end:
  %0 = addrspacecast i8* %julia_ptr to i32 addrspace(3)*
  store atomic i32 1, i32 addrspace(3)* %0 unordered, align 4
  ret void
}

In the case of NVPTX, this introduces a cvta.to.shared, whereas leaving out the %end block and branch doesn't trigger this optimization. This results in illegal memory accesses as seen in https://github.com/JuliaGPU/CUDA.jl/issues/558

In this change, I introduced a check before doing the pointer cast that verifies address spaces are the same. If not, it emits a ptrtoint/inttoptr combination to get a no-op cast between address spaces. I decided against disallowing ptrtoint/inttoptr with non-default AS in matchOperationAddr, because now its still possible to look through multiple sequences of them that ultimately do not result in a address space mismatch (i.e. the second lit test).

Diff Detail

Event Timeline

maleadt created this revision.Nov 27 2020, 12:21 AM
maleadt requested review of this revision.Nov 27 2020, 12:21 AM
maleadt edited the summary of this revision. (Show Details)Nov 27 2020, 12:29 AM
maleadt added a project: Restricted Project.Nov 27 2020, 12:37 AM
jlebar added a reviewer: tra.Nov 27 2020, 10:34 AM

This looks reasonable to me (and I appreciate all the debugging in the Julia bug!), but I have never touched this code, so I don't 100% feel comfortable approving the change.

tra added a comment.Nov 30 2020, 11:36 AM

This looks reasonable to me (and I appreciate all the debugging in the Julia bug!), but I have never touched this code, so I don't 100% feel comfortable approving the change.

Same here. The change makes sense, but there's more going on that I'd like the authors of the CGP to take a look.

efriedma added inline comments.Nov 30 2020, 12:04 PM
llvm/lib/CodeGen/CodeGenPrepare.cpp
5049

I'd state this comment differently:

There are two reasons the types might not match: a no-op addrspacecast, or a ptrtoint/inttoptr pair. Either way, we emit a ptrtoint/inttoptr pair, to ensure we match the original semantics.


Ideally, we don't want to convert an addrspacecast to a ptrtoint/inttoptr pair; it's semantically valid, but we lose information. At this point in the pipeline, it doesn't affect that much, but it does hurt alias analysis a bit. I'm okay with starting with a conservative fix, and revisiting later.

arsenm added a subscriber: arsenm.Nov 30 2020, 1:29 PM
arsenm added inline comments.
llvm/lib/CodeGen/CodeGenPrepare.cpp
5049

Ideally we would allow bitcast between different address space pointers with the same size, which would avoid the need for this special case

maleadt updated this revision to Diff 308962.Dec 2 2020, 7:16 AM

Update comment.

Thanks for the review, I've updated the comment.
FWIW, we have already applied this patch to the LLVM tree used by Julia, and it (unsurprisingly) hasn't resulted in any issues.

llvm/lib/CodeGen/CodeGenPrepare.cpp
5049

You mean "two reasons the *address spaces of* the types might not match"?

I'll update a comment and add a TODO about bitcasts between different address space pointers.

Bump, anything else needed here?

Bump, anything else needed here?

@arsenm I think this is to you.