This is an archive of the discontinued LLVM Phabricator instance.

SROA: Simplify addrspacecasted allocas with volatile accesses
ClosedPublic

Authored by arsenm on Jun 17 2019, 4:27 AM.

Details

Summary

If the alloca is accessed through an addrspacecasted pointer, allow
the normal changes on the alloca. Cast back to the original use
address space instead of the new alloca's natural address space.

Diff Detail

Event Timeline

arsenm created this revision.Jun 17 2019, 4:27 AM
sanjoy edited reviewers, added: reames; removed: sanjoy.Jun 17 2019, 8:46 AM
sanjoy added a subscriber: sanjoy.

I don't have opinions on this, but Philip might have comments around whether this is okay for GC pointers.

I don't have opinions on this, but Philip might have comments around whether this is okay for GC pointers.

No opinion GC wise, but mostly because I don't understand the intent of this patch. What is the desired outcome here?

I don't have opinions on this, but Philip might have comments around whether this is okay for GC pointers.

No opinion GC wise, but mostly because I don't understand the intent of this patch. What is the desired outcome here?

For an equivalent bitcast, the type of the underlying alloca is changed into a nicer type and possibly split into multiple allocas (with volatile accesses). I conservatively added checks in r363462 to avoid using the alloca's natural address space for the new operations. I figure it's safer to not change the address space of volatile accesses, but I don't particularly care about this property. InferAddressSpaces avoids changing volatiles, but that was also my idea.

I don't have opinions on this, but Philip might have comments around whether this is okay for GC pointers.

No opinion GC wise, but mostly because I don't understand the intent of this patch. What is the desired outcome here?

For an equivalent bitcast, the type of the underlying alloca is changed into a nicer type and possibly split into multiple allocas (with volatile accesses). I conservatively added checks in r363462 to avoid using the alloca's natural address space for the new operations. I figure it's safer to not change the address space of volatile accesses, but I don't particularly care about this property. InferAddressSpaces avoids changing volatiles, but that was also my idea.

So, just to say this back to you, this would be treating an addrspacecast more like a bitcast when the source is known to be an alloca?

Honestly, this feels more than a bit suspect to me, but I can't argue against it on a principled basis. Semantics for addrspacecast are so ill specified that it's hard to say what is and isn't legal in terms of transforms. Just to be clear, the critical detail here to me is "how do we handle addrspacecast", not "how to do handle volatile". From what I can tell, the existing code for non-volatiles would already have surprising semantics. I don't see any need to have *different* semantics for volatile vs non-volatile accesses to an addrspacecast derived pointer. The asc either had semantics, or it didn't. Whether an access is later volatile shouldn't change that.

So, if anything, I'd either lean towards a more-aggressive transform which changed the AS for the volatile access, or not splitting through an addrspacecast at all. The intermediate semantics feel odd.

I don't have opinions on this, but Philip might have comments around whether this is okay for GC pointers.

No opinion GC wise, but mostly because I don't understand the intent of this patch. What is the desired outcome here?

For an equivalent bitcast, the type of the underlying alloca is changed into a nicer type and possibly split into multiple allocas (with volatile accesses). I conservatively added checks in r363462 to avoid using the alloca's natural address space for the new operations. I figure it's safer to not change the address space of volatile accesses, but I don't particularly care about this property. InferAddressSpaces avoids changing volatiles, but that was also my idea.

So, just to say this back to you, this would be treating an addrspacecast more like a bitcast when the source is known to be an alloca?

Honestly, this feels more than a bit suspect to me, but I can't argue against it on a principled basis. Semantics for addrspacecast are so ill specified that it's hard to say what is and isn't legal in terms of transforms. Just to be clear, the critical detail here to me is "how do we handle addrspacecast", not "how to do handle volatile". From what I can tell, the existing code for non-volatiles would already have surprising semantics. I don't see any need to have *different* semantics for volatile vs non-volatile accesses to an addrspacecast derived pointer. The asc either had semantics, or it didn't. Whether an access is later volatile shouldn't change that.

So, if anything, I'd either lean towards a more-aggressive transform which changed the AS for the volatile access, or not splitting through an addrspacecast at all. The intermediate semantics feel odd.

In the original patch for addrspacecast support in SROA, the only questions that came up were surrounding overflow behavior which is currently not specified in the LangRef. As for interchanging accesses in different address spaces, I think that is clearly allowed. The original and result pointer still need to access the same memory, so changing the address space or eliminating the access should be OK. I think volatile is the question here because different address spaces may have different caching behavior or something along those lines

reames resigned from this revision.Mar 25 2020, 11:13 AM
arsenm updated this revision to Diff 471764.Oct 29 2022, 11:21 AM

Rebase and ping

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 29 2022, 11:21 AM
Herald added a subscriber: kosarev. · View Herald Transcript

It seems this patch contradicts the proposed LangRef patch. The LangRef patch says that you can't assume that a volatile operation yields the same result in different spaces.
And I think that's a fair condition. Otherwise we need to document that constraint for hardware vendors and change LangRef. Pick one of the patches basically :)

It seems this patch contradicts the proposed LangRef patch. The LangRef patch says that you can't assume that a volatile operation yields the same result in different spaces.
And I think that's a fair condition. Otherwise we need to document that constraint for hardware vendors and change LangRef. Pick one of the patches basically :)

This isn't contradictory. This is leaving the address space untouched, and doing the other simplifications SROA does (like replace byte arrays with int). A non-volatile access in this case would have been rewritten to point to the original alloca address space

nlopes added a comment.EditedOct 30 2022, 11:21 AM

It seems this patch contradicts the proposed LangRef patch. The LangRef patch says that you can't assume that a volatile operation yields the same result in different spaces.
And I think that's a fair condition. Otherwise we need to document that constraint for hardware vendors and change LangRef. Pick one of the patches basically :)

This isn't contradictory. This is leaving the address space untouched, and doing the other simplifications SROA does (like replace byte arrays with int). A non-volatile access in this case would have been rewritten to point to the original alloca address space

Ah, my bad, I misread the summary.
The patch keeps the AS of the original instructions.

So the concept looks correct to me. But I don't know enough of the SROA's code to know if all transformations that are done are ok. The test coverage doesn't look very extensive.

arsenm retitled this revision from SROA: Allow touching addrspacecast with volatile to SROA: Simplify addrspacecasted allocas with volatile accesses.Nov 1 2022, 6:49 PM
nikic added a comment.Nov 2 2022, 12:56 AM

Looks reasonable. I assume the case where this would be actually useful if an alloca gets split into parts that have volatile and non-volatile accesses, and the non-volatile accesses then get promoted?

llvm/lib/Transforms/Scalar/SROA.cpp
3033

Can't we use getPtrToNewAI here and keep the rest of the code structure?

llvm/test/CodeGen/AMDGPU/flat-address-space.ll
132

Can we use a proper pointer here to avoid making the store UB? (A null pointer is probably fine, given the non-zero address space.)

llvm/test/Transforms/SROA/addrspacecast.ll
177

Why does this load not get promoted (to undef)?

182

Test needs to be converted to opaque pointers. (We should probably add a check to guard against mixed opaque + typed pointer tests, where the typed pointers get promoted.)

arsenm marked 2 inline comments as done.Nov 10 2022, 3:14 PM

Looks reasonable. I assume the case where this would be actually useful if an alloca gets split into parts that have volatile and non-volatile accesses, and the non-volatile accesses then get promoted?

Yes, but I was mostly noticing the missing cleanup up of the aggregate types

llvm/lib/Transforms/Scalar/SROA.cpp
3033

I'm not sure what you're asking here, the new code does use it below

llvm/test/Transforms/SROA/addrspacecast.ll
177

I don't know, but I see the same thing without the patch and without address spaces or volatiles involved

arsenm updated this revision to Diff 474619.Nov 10 2022, 3:14 PM

Address comments

This looks good to me but I feel I'm not familiar enough with SROA to approve the patch.

Is there sufficient test coverage for this?
Did you check and ensure that this works as intended
on the addressspace-heavy code and does not miscompile it?

Is there sufficient test coverage for this?
Did you check and ensure that this works as intended
on the addressspace-heavy code and does not miscompile it?

Every AMDGPU alloca has a non-0 address space, and this passed regular testing at some point (which includes some specific I-want-to-break-the-stack volatile tests). Volatile stack objects aren't super common for GPU code since that kind of defeats the point

lebedev.ri accepted this revision.Dec 1 2022, 11:00 AM

LGTM, thank you.

This revision is now accepted and ready to land.Dec 1 2022, 11:00 AM