This is an archive of the discontinued LLVM Phabricator instance.

[flang] Add basic support for alloca address space handling in FIR->LLVMIR codegen
Needs ReviewPublic

Authored by jsjodin on Feb 16 2023, 10:14 AM.

Details

Summary

This patch adds basic support for targets to specify the address space for
alloca operations. This is used when converting from FIR->LLVMIR. The resulting pointer from
a mlir::LLVM::AllocaOp, will always be cast with a mlir::LLVM::AddrSpaceCastOp
to the generic address space, unless it is already generic.

Depends on https://reviews.llvm.org/D144657

Diff Detail

Event Timeline

jsjodin created this revision.Feb 16 2023, 10:14 AM
jsjodin requested review of this revision.Feb 16 2023, 10:14 AM
jsjodin updated this revision to Diff 498375.Feb 17 2023, 7:42 AM
jsjodin updated this revision to Diff 498474.Feb 17 2023, 12:06 PM

Fixed typo

I am not familiar with address spaces in LLVM. Could you give a pointer to how llvm handles address spaces? Also what is the role of datalayout in address space handling? Currently, it seems that the handling is fully based on the target.

flang/lib/Optimizer/CodeGen/CodeGen.cpp
350–351

Nit: Avoid auto if the type is not in the RHS. Here and in other locations.

528–530

This code for introducing the addrspace cast seems to always be hit. Is that what is intended? Why do we want to allocate in an address space but immediately cast it back to the default address space?

3878

Would it make sense to have separate conversion patterns like the above for this purpose. Note that this is only a question and not a change request.

flang/lib/Optimizer/CodeGen/Target.cpp
502

Is it possible to include this value from an existing file?
I see that it is there in the following file. We might have to check whether it is OK to include.

tools/mlir/include/mlir/Dialect/LLVMIR/ROCDLOpsDialect.h.inc:    static constexpr unsigned kPrivateMemoryAddressSpace = 5;
flang/test/Fir/convert-to-llvm.fir
1029

Nit: It is probably good to include the address space value in the Check for the alloca.

I am not familiar with address spaces in LLVM. Could you give a pointer to how llvm handles address spaces? Also what is the role of datalayout in address space handling? Currently, it seems that the handling is fully based on the target.

Here is the description about data layout and address spaces in LLVM:
https://llvm.org/docs/LangRef.html#data-layout
In the data layout you can specify the address space of a variety of objects., e.g. alloca and globals.

I am working on a patch to add the alloca address space support in DLTI dialect, and the DataLayout interface so that it can be queried.
This canl be used in mlir later on where alloca ops can be introduced by transforms and the address space needs to be known.

jsjodin added inline comments.Feb 21 2023, 7:56 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
350–351

Sure, I can specify the types.

528–530

Yes, that is the intent. If you look at how clang handles this, this is basically what happens when an alloca is generated (there's some special casing for constants and such that can be implemented in subsequent patches). The basic problem is that a pointer can point to any object (address space), the safest option is to cast the result from an alloca to the generic address space. From what I gathered this is true for CUDA/HIP/OpenMP, but in OpenCL things are separated, which probably means that pointers with different address spaces cannot be assigned/cast the same way.

3878

I think it could be done that way, but I'm not sure about the pros/cons. It seems natural to me that the address space is specified when the op is created.

flang/lib/Optimizer/CodeGen/Target.cpp
502

I will look into it. I believe it should be possible.

flang/test/Fir/convert-to-llvm.fir
1029

Yes, good point. The address space is captured in the addresspacecast, but it will make the difference between GENERIC and AMDGPU clearer.

jsjodin marked 5 inline comments as not done.Feb 21 2023, 7:56 AM
jsjodin updated this revision to Diff 501287.Feb 28 2023, 1:54 PM

Make sure all tests check alloca return type. Use constant value for alloca address space from the ROCDL dialect.

jsjodin marked 2 inline comments as done.Feb 28 2023, 1:54 PM
jsjodin updated this revision to Diff 501312.Feb 28 2023, 3:32 PM

Make types explicit.

jsjodin marked 2 inline comments as done.Feb 28 2023, 3:32 PM

Thanks for the changes and the replies.

Could you also comment on why the address-space cast need not be modelled in FIR?

There are some passes that convert between allocs and allocas. Could you comment on why this casting is only required for allocas? And wouldn't it cause issues when we convert from one to the other?
https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/MemoryAllocation.cpp
https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/StackArrays.cpp

flang/lib/Optimizer/CodeGen/Target.cpp
501

Nit: update this comment

Thanks for the changes and the replies.

Could you also comment on why the address-space cast need not be modelled in FIR?

It is somewhat arbitrary when they are introduced. Only the backend really needs them, but if address spaces are known earlier on it is possible do a better job with analyzes/optimizations e.g. alias analysis. If they are introduced in the LLVMIR dialect, then we should get the same benefits as clang with regards to optimizations in LLVM.

There are some passes that convert between allocs and allocas. Could you comment on why this casting is only required for allocas? And wouldn't it cause issues when we convert from one to the other?
https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/MemoryAllocation.cpp
https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/StackArrays.cpp

It won't cause any since memory spaces are basically optional until the backend, and not having to consider them will make transforms easer to write. LLVM already handles them correctly so it is safe to introduce them at that point, and may provide some performance benefits.

jsjodin updated this revision to Diff 501515.Mar 1 2023, 7:55 AM

Fix comment

jsjodin marked an inline comment as done.Mar 1 2023, 7:55 AM

@kiranchandramohan did you have any other feedback on this patch?

Sorry, I haven't had a chance to look at this in detail. I would need to get up to speed with addrspace in LLVM before proceeding with this. I would recommend someone from your organization review this patch and then @jeanPerier sign this off, if it is OK.

Sorry, I haven't had a chance to look at this in detail. I would need to get up to speed with addrspace in LLVM before proceeding with this. I would recommend someone from your organization review this patch and then @jeanPerier sign this off, if it is OK.

No problem, I will get someone to review it on our side. Thanks!

arsenm added inline comments.Mar 9 2023, 1:33 PM
flang/lib/Optimizer/CodeGen/Target.h
147

I'd assume you should be reproducing a language:target address space map like clang, rather than introducing a hook for every language addrspace

150

Don't see why you need this, can't you just read it from the datalayout?

jsjodin added inline comments.Mar 13 2023, 5:58 AM
flang/lib/Optimizer/CodeGen/Target.h
147

It was only on the LLVMIR side that we need the address space, so looking at the LLVMIR dialect data layout should work.

150

Yes, I should be able to use the LLVMIR dialect data layout. I got something mixed up with the clang TargetInfo I believe. I thought there was a corresponding function there that returned the alloca address space.

jsjodin edited the summary of this revision. (Show Details)Mar 13 2023, 7:29 AM
jsjodin updated this revision to Diff 513688.Apr 14 2023, 11:25 AM

Sorry for the very delayed update on this patch. It now uses the DataLayout to get the correct alloca mem/addr space.

A ping to keep this patch on peoples radar and to try and get some reviewer attention, Thanks!

A second ping to keep this patch on peoples radar and to try and get some reviewer attention, Thanks!

jeanPerier added inline comments.May 9 2023, 1:25 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3829

I think this analysis is walking every operation from the ModuleOp, and since it is only used with getAtOrAbove(mod), all the walking of the descendent is just wasted time, especially when data layout do not matter.
Isn't there a way to get the same info without walking the ModuleOp here?

jsjodin added inline comments.May 18 2023, 7:25 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
3829

Not sure if there is a faster way, but this seems to be the standard way of doing things if you look in FuncToLLVM.cpp and MemRefToLLVM.cpp.