This is an archive of the discontinued LLVM Phabricator instance.

CUDA/HIP: Use kernel name to map to symbol
ClosedPublic

Authored by dcastagna on Dec 25 2022, 12:49 PM.

Details

Summary

Currently CGCUDANV uses an llvm::Function as a key to map kernels to a
symbol in host code. HIP adds one level of indirection and uses the
llvm::Function to map to a global variable that will be initialized to
the kernel stub ptr.

Unfortunately there is no garantee that the llvm::Function created
by GetOrCreateLLVMFunction will be the same. In fact, the first
time we encounter GetOrCrateLLVMFunction for a kernel, the type
might not be completed yet, and the type of llvm::Function will be
a generic {}, since the complete type is not required to get a symbol
to a function. In this case we end up creating two global variables,
one for the llvm::Function with the incomplete type and one for the
function with the complete type. The first global variable will be
declared by not defined, resulting in a linking error.

This change uses the mangled name of the llvm::Function as key in the
KernelHandles map, in this way the same llvm::Function will be
associated to the same kernel handle even if they types are different.

Diff Detail

Event Timeline

dcastagna created this revision.Dec 25 2022, 12:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 25 2022, 12:49 PM
Herald added subscribers: mattd, yaxunl. · View Herald Transcript
dcastagna requested review of this revision.Dec 25 2022, 12:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 25 2022, 12:49 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
yaxunl added inline comments.Jan 10 2023, 7:19 AM
clang/test/CodeGenCUDA/incomplete-func-ptr-type.cu
3

need to check _Z19__device_stub__kern7TempValIjE generates the correct call of hipLaunchKernel using the correct handle.

also need to check hipRegisterFunction uses the correct function name and handle.

dcastagna updated this revision to Diff 490559.Jan 19 2023, 9:35 AM

Address yaxunl comments.

dcastagna marked an inline comment as done.Jan 19 2023, 9:36 AM
yaxunl added inline comments.Jan 19 2023, 12:48 PM
clang/test/CodeGenCUDA/incomplete-func-ptr-type.cu
22

pls also check @0, which is the device side kernel name.

dcastagna updated this revision to Diff 490644.Jan 19 2023, 1:08 PM

Add a check for device side kernel name

yaxunl accepted this revision.Jan 19 2023, 1:13 PM

LGTM. Thanks.

This revision is now accepted and ready to land.Jan 19 2023, 1:13 PM
This revision was landed with ongoing or failed builds.Jan 19 2023, 3:02 PM
This revision was automatically updated to reflect the committed changes.
tra added a subscriber: tra.Mar 17 2023, 5:26 PM

It appears that this patch may be causing a use-after free when we attempt to generate kernel registration code.
The root cause is that the value we insert into KernelHandles by name is later on replaced by a different instance of the global value with the same name.
AFAICT, the invalidation issue was present before but we accidentally avoided it because we only looked up the still-valid new entries. The dangling references were still in the map, but not accessed.

It's reproducible on this example: https://godbolt.org/z/qGYTr3Ej5

Here's the stack trace for the call path which frees the old entry:

#3  0x000055e5564e75e2 in llvm::User::operator delete (Usr=0x55e55f6ff438) at /usr/local/google/home/tra/work/llvm/repo/llvm/lib/IR/User.cpp:190
#4  0x000055e5563a0a70 in llvm::ilist_alloc_traits<llvm::Function>::deleteNode (V=0x55e55f6ff438) at /usr/local/google/home/tra/work/llvm/repo/llvm/include/llvm/ADT/ilist.h:42
#5  0x000055e55639d875 in llvm::iplist_impl<llvm::simple_ilist<llvm::Function>, llvm::SymbolTableListTraits<llvm::Function> >::erase (this=0x55e55f75a3e8, where=...) at /usr/local/google/home/tra/work/llvm/repo/llvm/include/llvm/ADT/ilist.h:269
#6  0x000055e55637c173 in llvm::Function::eraseFromParent (this=0x55e55f6ff438) at /usr/local/google/home/tra/work/llvm/repo/llvm/lib/IR/Function.cpp:367
#7  0x000055e5563a4ad5 in llvm::GlobalValue::eraseFromParent (this=0x55e55f6ff438) at /usr/local/google/home/tra/work/llvm/repo/llvm/include/llvm/IR/Value.def:76
#8  0x000055e5570dd747 in clang::CodeGen::CodeGenModule::applyGlobalValReplacements (this=0x55e55f7a06d0) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/CodeGen/CodeGenModule.cpp:315
#9  0x000055e5570deb3a in clang::CodeGen::CodeGenModule::Release (this=0x55e55f7a06d0) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/CodeGen/CodeGenModule.cpp:540
#10 0x000055e5581e5ede in (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit (this=0x55e55f79aa40, Ctx=...) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/CodeGen/ModuleBuilder.cpp:287
#11 0x000055e5581de64e in clang::BackendConsumer::HandleTranslationUnit (this=0x55e55f79a7a0, C=...) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/CodeGen/CodeGenAction.cpp:308
#12 0x000055e55b063273 in clang::ParseAST (S=..., PrintStats=false, SkipFunctionBodies=false) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Parse/ParseAST.cpp:175
#13 0x000055e557ff26ec in clang::ASTFrontendAction::ExecuteAction (this=0x55e55f75aa20) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Frontend/FrontendAction.cpp:1168
#14 0x000055e5581da604 in clang::CodeGenAction::ExecuteAction (this=0x55e55f75aa20) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/CodeGen/CodeGenAction.cpp:1172
#15 0x000055e557ff20ec in clang::FrontendAction::Execute (this=0x55e55f75aa20) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Frontend/FrontendAction.cpp:1058
#16 0x000055e557f1b618 in clang::CompilerInstance::ExecuteAction (this=0x55e55f7564c0, Act=...) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Frontend/CompilerInstance.cpp:1048
#17 0x000055e5581c35c7 in clang::ExecuteCompilerInvocation (Clang=0x55e55f7564c0) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:264
#18 0x000055e5533464b0 in cc1_main (Argv=llvm::ArrayRef of length 86 = {...}, Argv0=0x7ffd95ce621e "/usr/local/google/home/tra/work/llvm/build/debug/bin/clang-15", MainAddr=0x55e5533305f0 <GetExecutablePath[abi:cxx11](char const*, bool)>)
    at /usr/local/google/home/tra/work/llvm/repo/clang/tools/driver/cc1_main.cpp:251
#19 0x000055e553331dca in ExecuteCC1Tool (ArgV=llvm::SmallVector of Size 87, Capacity 256 = {...}, ToolContext=...) at /usr/local/google/home/tra/work/llvm/repo/clang/tools/driver/driver.cpp:366
#20 0x000055e553330aec in clang_main (Argc=87, Argv=0x7ffd95ce4a68, ToolContext=...) at /usr/local/google/home/tra/work/llvm/repo/clang/tools/driver/driver.cpp:407
#21 0x000055e553369d5d in main (argc=87, argv=0x7ffd95ce4a68) at tools/clang/tools/driver/clang-driver.cpp:15

It appears that this patch may be causing a use-after free when we attempt to generate kernel registration code.
The root cause is that the value we insert into KernelHandles by name is later on replaced by a different instance of the global value with the same name.
AFAICT, the invalidation issue was present before but we accidentally avoided it because we only looked up the still-valid new entries. The dangling references were still in the map, but not accessed.

Agree. I think the reason is that the F we passed into CGNVCUDARuntime::getKernelHandle may be replaced by a new function with the same name. Luckily, the new function should be passed to CGNVCUDARuntime::getKernelHandle again, therefore we get a chance to update our maps.

clang/lib/CodeGen/CGCUDANV.cpp
1198

It is possible that F is replaced with a new function with the same name. In this case, we need to update our map, so add a condition && Loc->second == F to the above condition.

1207–1215

Add a condition if (Loc == KernelHandles.end()) to the above code for creating and modifying Var.