This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add internal linkage to `memref.global`
Needs Review · Public

Authored by guraypp on Jun 29 2023, 5:36 AM.

Details

Summary

`memref.global` currently has no internal option, so this patch adds it.

Event Timeline

guraypp created this revision. Jun 29 2023, 5:36 AM
Herald added a reviewer: ftynse.
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
guraypp requested review of this revision. Jun 29 2023, 5:36 AM

The motivation for adding internal is to enable demotion of a global shared-memory variable into the kernel (there may be other ways to do this).

Here is the input MLIR code:

memref.global "private" @gv1 : memref<1xi64,3>
func.func @get_gv0_memref() {
  %0 = memref.get_global @gv1 : memref<1xi64,3>
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : i64
  memref.store %c4, %0[%c0] : memref<1xi64,3>
  return
}

When generating PTX code for memref.global "private", we get:

.shared .align 8 .b8 gv1[8];
.visible .func get_gv0_memref()
{
        .reg .b64       %rd<2>;
        mov.u64         %rd1, 4;
        st.shared.u64   [gv1], %rd1;
        ret;
}

With memref.global "internal", the PTX code would instead look like this:

.visible .func get_gv0_memref()
{
        .reg .b64       %rd<2>;
        // demoted variable
        .shared .align 8 .b8 gv1[8];
        mov.u64         %rd1, 4;
        st.shared.u64   [gv1], %rd1;
        ret;
}

These examples show the effect of internal linkage: the shared-memory variable gv1 is demoted from module scope into the body of the function that uses it.

LLVM linkage and MLIR symbol visibility are orthogonal concepts and should not be confused. For example, MLIR func.func declarations always have a private visibility, while their corresponding LLVM IR linkage can be external.

mlir/include/mlir/IR/SymbolTable.h
88

I do not understand this comment. Could you expand it and explicitly state the difference between internal and private?

guraypp added a comment. Edited · Jun 29 2023, 8:53 AM

Thanks for the feedback. As I mentioned, I am not sure this is the right way. Any ideas are welcome!

My goal is to create
memref.global "internal" @gv1 : memref<1xi64,3>

And lower into:
llvm.mlir.global internal @gv1() {addr_space = 3 : i32} : !llvm.array<1 x i64>

Currently, memref.global only supports private, public, and nested.

I'm not really familiar with PTX, but do you know why it treats internal and private differently? Their semantic difference is minimal and shouldn't be relevant for PTX, I assume.
Is this just an oversight in the PTX lowering, or are you aware of any semantic difference there?

My goal is to create
memref.global "internal" @gv1 : memref<1xi64,3>

And lower into:
llvm.mlir.global internal @gv1() {addr_space = 3 : i32} : !llvm.array<1 x i64>

One could also add a flag to the MemRefToLLVM conversion that ensures that private visibility is converted into internal linkage. Maybe it even makes sense to always emit internal linkage for these memrefs.

Currently, memref.global only supports private, public, and nested.

Again, this is symbol visibility, which is an MLIR internal construct and really only specifies where this symbol is visible inside the IR.

Thanks for the feedback. See the linkage types (internal vs. private) in the LLVM LangRef: https://llvm.org/docs/LangRef.html#linkage-types

I could get rid of Internal in MLIR symbols if it is not desired. In this case, I will need to have code like below in MemRefToLLVM.cpp. I am not sure which way is better.

LLVM::Linkage linkage;
if (global.isPublic()) {
  linkage = LLVM::Linkage::External;
} else {
  if (type.getMemorySpaceAsInt() == 3) {
    linkage = LLVM::Linkage::Internal;
  } else {
    linkage = LLVM::Linkage::Private;
  }
}

Thanks for the feedback. See the linkage types (internal vs. private) in the LLVM LangRef: https://llvm.org/docs/LangRef.html#linkage-types

The only difference is the visibility of the symbol in an ELF file. How does this affect the behaviour of PTX code generation? I would not expect PTX to care about such things,
especially since private does not expose the symbol in the symbol table at all. So why can a private global not be treated the same way as an internal one?

I could get rid of Internal in MLIR symbols if it is not desired. In this case, I will need to have code like below in MemRefToLLVM.cpp. I am not sure which way is better.

LLVM::Linkage linkage;
if (global.isPublic()) {
  linkage = LLVM::Linkage::External;
} else {
  if (type.getMemorySpaceAsInt() == 3) {
    linkage = LLVM::Linkage::Internal;
  } else {
    linkage = LLVM::Linkage::Private;
  }
}

I'm not a fan of encoding target-specific behaviour into general-purpose passes. Address spaces are fully target-specific, so giving address space 3 special treatment is not what we want.
For almost all intents and purposes, private and internal behave the same way, so we might simply lower private visibility to internal linkage if PTX cannot deal with private. Ideally, this is done only when a certain pass option is set, to ensure that we do not break other existing flows.

Alternatively, the linkage itself could be added to memref.global, but that would leak LLVM internals up into higher level dialects that shouldn't really care about things like linkage.
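
For illustration, the pass-option idea above could look roughly like the sketch below. It is not the actual MemRefToLLVM code: the enums are hypothetical stand-ins for the MLIR symbol visibility and LLVM::Linkage so that the example compiles without an LLVM checkout, and the option name `privateAsInternal` is an assumption made up for this sketch, not an existing flag.

```cpp
#include <cassert>

// Hypothetical stand-ins for MLIR symbol visibility and LLVM linkage,
// so that the sketch compiles without any MLIR/LLVM headers.
enum class Visibility { Public, Private, Nested };
enum class Linkage { External, Internal, Private };

// Hypothetical conversion options. `privateAsInternal` is an illustrative
// name (an assumption), not an option that exists in MemRefToLLVM.
struct LoweringOptions {
  bool privateAsInternal = false;
};

// Chooses the linkage for a lowered global from its symbol visibility alone.
// The address space is deliberately not consulted, so the mapping stays
// target-independent.
inline Linkage selectLinkage(Visibility visibility,
                             const LoweringOptions &options) {
  if (visibility == Visibility::Public)
    return Linkage::External;
  // Non-public symbols: private linkage by default, internal only on opt-in.
  return options.privateAsInternal ? Linkage::Internal : Linkage::Private;
}

// Usage: the default keeps today's behaviour, the opt-in switches
// non-public globals to internal linkage.
inline void demoSelectLinkage() {
  LoweringOptions defaults;
  LoweringOptions optIn;
  optIn.privateAsInternal = true;
  assert(selectLinkage(Visibility::Public, defaults) == Linkage::External);
  assert(selectLinkage(Visibility::Private, defaults) == Linkage::Private);
  assert(selectLinkage(Visibility::Private, optIn) == Linkage::Internal);
}
```

Keeping the default at private linkage means existing flows are unaffected unless the option is set, which is the concern raised above.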

Talked with @guraypp offline. I think this is a limitation in the PTX backend.

Talked with @guraypp offline. I think this is a limitation in the PTX backend.

Is it possible to fix this in the PTX backend then? A limitation in this backend should not be the reason to extend the symbol visibility that potentially affects all of MLIR.

Talked with @guraypp offline. I think this is a limitation in the PTX backend.

Is it possible to fix this in the PTX backend then? A limitation in this backend should not be the reason to extend the symbol visibility that potentially affects all of MLIR.

That’s what I asked @guraypp to check.
Sorry for not being clearer earlier.

I've sent a patch for review to fix the NVPTX backend limitation https://reviews.llvm.org/D154507