This is an archive of the discontinued LLVM Phabricator instance.

Differential D129694

[OPENMP] Make declare target static global externally visible
Needs Review, Public

Authored by ssquare08 on Jul 13 2022, 2:04 PM.

Details

Reviewers

jhuber6
jdoerfert
sandoval
dreachem
cchen
tianshilei1992

Summary

This patch supports static globals that are marked declare
target. By default, file-static globals are not externally
visible, but the OpenMP runtime needs to access these symbols,
so this change makes them externally visible unless they
have the "hidden" visibility attribute.
Making them externally visible, however, leads to symbol conflicts
when two files have variables with the same name. Thus, these
symbols need to be mangled on the device side of the compilation.
To do so, the host side mangles the symbol names and
passes that information as metadata to the device side. It also uses
these mangled names in the offload entry table so that the OpenMP
runtime can find these symbols during registration.
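
The collision described above can be illustrated with a minimal sketch. It is not the patch's code: `mangleStatic`, `defineExternal`, and the `<name>_<file ID in hex>` scheme are hypothetical stand-ins, and a `std::set` models the external symbols of a linked device image.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <sstream>
#include <string>

// Hypothetical mangling scheme (an assumption, not the format used by the
// patch): append the ID of the defining file, printed in hex.
std::string mangleStatic(const std::string &Name, uint32_t FileID) {
  assert(!Name.empty() && "a symbol needs a name");
  std::ostringstream OS;
  OS << Name << "_" << std::hex << FileID;
  return OS.str();
}

// Models the external symbol table of a linked device image.
// Returns false if the symbol is already defined, i.e. a link-time conflict.
bool defineExternal(std::set<std::string> &Symbols, const std::string &Name) {
  return Symbols.insert(Name).second;
}
```

With these helpers, defining `x` twice in the same symbol set fails on the second attempt, while `mangleStatic("x", 0x1a)` and `mangleStatic("x", 0x2b)` can both be defined because they yield distinct names.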

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ssquare08 created this revision. Jul 13 2022, 2:04 PM

Herald added a project: Restricted Project. Jul 13 2022, 2:04 PM

Herald added subscribers: mattd, asavonic, guansong, yaxunl.

ssquare08 requested review of this revision. Jul 13 2022, 2:04 PM

Herald added a reviewer: jdoerfert. Jul 13 2022, 2:04 PM

Herald added a project: Restricted Project.

Herald added subscribers: cfe-commits, sstefan1.

ssquare08 added reviewers: sandoval, dreachem, cchen, tianshilei1992. Jul 13 2022, 2:08 PM

Thanks for the patch. I still think this is a silly feature to support, but users will probably expect it. See comments.

clang/lib/CodeGen/CGOpenMPRuntime.cpp
10795–10820	It might be easier to just mangle the original definition, that would reduce a lot of churn here adding `origName` everywhere. Any reason that's not desirable?
10802	`CGM.printPostfixForExternalizedDecl` should ideally give the same output on the host and device, but it's somewhat limited since it just checks the file ID and environment, which is technically possible to change. The kernels use `getTargetEntryUniqueInfo`, which might make sense to re-use for this case.
clang/lib/CodeGen/TargetInfo.cpp
7295	Formatting looks weird, did you do `git clang-format HEAD~1`?
9431	Just spitballing, is it possible to do this when we make the global instead?
clang/test/OpenMP/declare_target_visibility_codegen.cpp
11–13	If there are no updates between the host and device we can keep these static without emitting an offloading entry.

Harbormaster completed remote builds in B175234: Diff 444415. Jul 13 2022, 5:09 PM

ssquare08 added inline comments. Jul 13 2022, 5:23 PM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
10795–10820	You are right, it'd have made the code cleaner but we didn't want to mangle the host side if we could avoid it.
10802	That was the point I raised in one of the Clang meetings, but someone mentioned that kernel names are created on the host side and the device side reads that information through the host IR. It seems kernel names could also run into a mismatch in some corner cases, then?
clang/lib/CodeGen/TargetInfo.cpp
7295	Looks like I didn't run git clang-format correctly, I'll fix it. Thanks
9431	This is something I was wondering as well. In CodeGenModule::GetOrCreateLLVMGlobal, when it creates a new global variable, it always uses the llvm::GlobalValue::ExternalLinkage. Seems like this changes somewhere later to internal for static globals. Do you know where that would be?
clang/test/OpenMP/declare_target_visibility_codegen.cpp
11–13	That's a good point. I'll fix that.

jhuber6 added inline comments. Jul 13 2022, 5:53 PM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
3285–3289	This comment needs to be adjusted accordingly
10795–10820	Others may want to comment, but personally I'm not too worried about mangling a name that wouldn't have been placed in the symbol table to begin with and it would make the code a lot cleaner.
10802	So the problem here is that the host and device need to agree on what the name is so that we can register the correct variable. The CUDA / HIP toolchains solved this by either performing a mangling that is stable between the host and device, or by having the driver generate a random hash that gets used on both. OpenMP instead solves this by writing the variable to the host IR first and then reading it on the device to see what the name needs to be. Since we have that dependency we can use any mangling we want, though it's still best for it to be somewhat stable unless we want tests to change every time we run them. It probably won't hurt anything to just use `printPostfixForExternalizedDecl` but it's not as strong of a mangling as what we can do with the OpenMP method since it needs to be common between the host and device.
clang/lib/CodeGen/TargetInfo.cpp
9431	I'm not exactly sure, I remember deleting some code in D117806 that did something like that, albeit incorrectly. But I'm not sure if you'd have the necessary information to check whether or not there are updates attached to it. We don't want to externalize things if we don't need to, otherwise we'd get a lot of our device runtime variables with external visibility that now can't be optimized out.
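
A minimal sketch of the host-writes, device-reads scheme described in the comment on line 10802, assuming a plain `std::map` as a stand-in for the information carried in the host IR. `NameTable`, `hostRecord`, and `deviceResolve` are hypothetical names. They loosely mirror what `enterDeviceGlobalVarMangledName` and `getOffloadEntryHostMangledName` in this diff appear to do, without using any Clang API. Falling back to the original name on a miss is a choice made for this sketch only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the name information the host compilation leaves behind for
// the device compilation (hypothetical; the real data lives in host IR).
using NameTable = std::map<std::string, std::string>;

// Host pass: remember the mangled name, but only when mangling changed it.
void hostRecord(NameTable &Table, const std::string &OrigName,
                const std::string &MangledName) {
  assert(!OrigName.empty() && "cannot record an unnamed variable");
  if (OrigName != MangledName)
    Table.emplace(OrigName, MangledName);
}

// Device pass: use the host's mangled name if one was recorded, otherwise
// keep the original name (an assumption of this sketch).
std::string deviceResolve(const NameTable &Table, const std::string &OrigName) {
  auto It = Table.find(OrigName);
  return It == Table.end() ? OrigName : It->second;
}
```

Because the device only ever reads what the host wrote, the two sides agree on the name regardless of how the host chose to mangle it.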

jdoerfert added inline comments. Jul 14 2022, 7:29 AM

clang/test/OpenMP/target_update_messages.cpp
17	There is no test to show you can actually write the update now, is there?

jhuber6 added inline comments. Jul 14 2022, 7:33 AM

clang/test/OpenMP/target_update_messages.cpp
17	We should probably take the deleted code above and put it in an OpenMP runtime test to make sure it actually works now.

Adding a test and fixing

This adds a new runtime test and also addresses some comments.

Herald added a project: Restricted Project. Jul 26 2022, 5:54 PM

Herald added a subscriber: openmp-commits.

ssquare08 added inline comments. Jul 26 2022, 6:21 PM

clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
61	I wasn't expecting this to change. For some reason G2 gets the `OMPDeclareTargetDeclAttr::DT_Any` attribute instead of `OMPDeclareTargetDeclAttr::DT_NoHost` and because of that the visibility changes. @jdoerfert, is `OMPDeclareTargetDeclAttr::DT_Any` attribute expected here?
clang/test/OpenMP/declare_target_visibility_codegen.cpp
11–13	I thought about this more and I think the behavior for these declare target static globals should be the same as for other declare target variables. Checking for updates is not enough because users could also map these variables. For an update, the variable could be mapped through a pointer, or the user could pass the address of the variable to an external function. Please let me know what you think of the cases below:
	#pragma omp declare target
	static int x[10];
	#pragma omp end declare target
	// case 1
	#pragma omp target update to(x)
	// case 2
	int* y = &x[2];
	#pragma omp target update to(y[0])
	// case 3
	#pragma omp target map(always to:x)
	{ x[0]= 111; }
	// case 4
	#pragma omp target
	{ foo(&x[3]); }
clang/test/OpenMP/target_update_messages.cpp
17	I have now added a test as suggested.

Harbormaster completed remote builds in B177768: Diff 447901. Jul 26 2022, 6:24 PM

ssquare08 added inline comments. Jul 26 2022, 6:24 PM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
10802	"`CGM.printPostfixForExternalizedDecl` should ideally give the same output on the host and device, but it's somewhat limited since it just checks the file ID and environment, which is technically possible to change. The kernels use `getTargetEntryUniqueInfo`, which might make sense to re-use for this case." This has been changed as suggested.

I still think we shouldn't bother making all the noise containing the original name. Just mangle it and treat it like every other declare target variable without introducing any extra complexity. These symbols never should've been emitted in the first place so I'm not concerned if someone cracks open a binary and sees some ugly names. CUDA and HIP just mangle the declaration directly as far as I'm aware.

clang/lib/CodeGen/TargetInfo.cpp
9431	Were you able to find a place for this when we generate the variable? You should be able to do something similar to the patch above if it's a declare target static to force it to have external visibility, but as mentioned before I would prefer we only do this if necessary which might take some extra analysis.
clang/test/OpenMP/declare_target_visibility_codegen.cpp
11–13	We should still be able to do this if there are either no updates at all in the module, or if the declare type is `nohost`. Doing anything more complicated would require some optimizations between the host and device we can't do yet. I'm making this point because making these statics external is a performance regression so we should only do it when needed. To that end we may even want a flag that entirely disables this feature.
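
The rule proposed in the comment on lines 11–13 can be sketched as a predicate. This is an assumption-laden model, not Clang code: `DevType`, `StaticGlobal`, and `needsExternalization` are hypothetical, the hidden-visibility exception comes from the revision summary, and the sketch deliberately ignores the mapping and address-taking cases raised earlier in the thread.

```cpp
// Hypothetical model of a declare target device type clause.
enum class DevType { Any, Host, NoHost };

// Hypothetical summary of what is known about a file-static global.
struct StaticGlobal {
  bool IsDeclareTarget;
  bool HasHiddenVisibility;
  DevType Type;
};

// Returns true if the static should be given external visibility on the
// device. It stays internal when it is not declare target, when it is hidden,
// when it is nohost, or when the module contains no target update at all.
bool needsExternalization(const StaticGlobal &G, bool ModuleHasTargetUpdate) {
  if (!G.IsDeclareTarget)
    return false;
  if (G.HasHiddenVisibility)
    return false;
  if (G.Type == DevType::NoHost)
    return false;
  return ModuleHasTargetUpdate;
}
```

Keeping the default answer at "stay internal" reflects the concern in the comment that externalizing statics prevents them from being optimized out.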

In D129694#3685207, @jhuber6 wrote:

I still think we shouldn't bother making all the noise containing the original name. Just mangle it and treat it like every other declare target variable without introducing any extra complexity. These symbols never should've been emitted in the first place so I'm not concerned if someone cracks open a binary and sees some ugly names. CUDA and HIP just mangle the declaration directly as far as I'm aware.

If that's the preference I can make changes as suggested. You mentioned CUDA and HIP mangle the declaration directly. To me it looks like they mangle it on host and device separately. Is that not correct? If so, can you point me to the source you are referring to?

ssquare08 added inline comments. Aug 8 2022, 3:15 PM

clang/lib/CodeGen/TargetInfo.cpp
9431	If you are asking about the GV, it is always created in `CodeGenModule::GetOrCreateLLVMGlobal` with external linkage:
	auto *GV = new llvm::GlobalVariable(
	    getModule(), Ty, false, llvm::GlobalValue::ExternalLinkage, nullptr,
	    MangledName, nullptr, llvm::GlobalVariable::NotThreadLocal,
	    getContext().getTargetAddressSpace(DAddrSpace));
	The linkage, however, changes in `CodeGenModule::EmitGlobalVarDefinition` based on the information in the `VarDecl`:
	llvm::GlobalValue::LinkageTypes Linkage = getLLVMLinkageVarDefinition(D, GV->isConstant());
	Maybe you are suggesting changing the linkage information in the `VarDecl` itself?
clang/test/OpenMP/declare_target_visibility_codegen.cpp
11–13	I'll add a check to see if there are any updates in the module.

In D129694#3708214, @ssquare08 wrote:

If that's the preference I can make changes as suggested. You mentioned CUDA and HIP mangle the declaration directly. To me it looks like they mangle it on host and device separately. Is that not correct? If so, can you point me to the source you are referring to?

You're right, they mangle them separately like in https://godbolt.org/z/r6hG4brqx, this is most likely because they already had separate "device side" names. For OpenMP we currently just use the same name for the variable on the host and device side like in https://godbolt.org/z/eaGo9qsW3 where we just use the same kernel names. Thinking again, I'm still wondering if there's any utility in keeping the names separate. Correct me if I'm wrong, but the host-side variable should be able to remain internal so this mangled device name shouldn't show up in the final executable. In that case the only benefit is slightly nicer IR, which I'm not super concerned with.

clang/lib/CodeGen/TargetInfo.cpp
9431	Yes, the patch I linked previously did something like that where it set the `LinkageValue` based on some information. Although I'm not sure if it would be excessively difficult to try to prune definitions that don't need to be externalized. I haven't looked too deep into this, but I believe CUDA does this inside of `adjustGVALinkageForAttributes`, there we also check some variable called `CUDADeviceVarODRUsedByHost` that I'm assuming tracks if we need to bother externalizing this.

In D129694#3708297, @jhuber6 wrote:

You're right, they mangle them separately like in https://godbolt.org/z/r6hG4brqx, this is most likely because they already had separate "device side" names. For OpenMP we currently just use the same name for the variable on the host and device side like in https://godbolt.org/z/eaGo9qsW3 where we just use the same kernel names. Thinking again, I'm still wondering if there's any utility in keeping the names separate. Correct me if I'm wrong, but the host-side variable should be able to remain internal so this mangled device name shouldn't show up in the final executable. In that case the only benefit is slightly nicer IR, which I'm not super concerned with.

Yes, the host-side variable should be able to remain internal.

In D129694#3717008, @ssquare08 wrote:

Yes, the host-side variable should be able to remain internal.

The OpenMP kernel names you mentioned are also generated separately by the host and the device. Would you be okay generating declare target mangled names separately by host and device using the same utility function getTargetEntryUniqueInfo?

If you still think it should be generated only once by the host, what is a good way of doing this since we can't modify the name in VarDecl?

clang/lib/CodeGen/TargetInfo.cpp
9431	The exter

ssquare08 marked an inline comment as not done. Aug 11 2022, 1:03 PM

ssquare08 added inline comments.

clang/lib/CodeGen/TargetInfo.cpp
9431	Thanks for the information, I'll take a look

In D129694#3717166, @ssquare08 wrote:

The OpenMP kernel names you mentioned are also generated separately by the host and the device. Would you be okay generating declare target mangled names separately by host and device using the same utility function getTargetEntryUniqueInfo?

If you still think it should be generated only once by the host, what is a good way of doing this since we can't modify the name in VarDecl?

I thought we already emitted the mangled name at least on the device side. I was suggesting that we just use the same name on the host so we don't need to worry about a host-side and device-side name difference and we can get rid of the extra argument to all the offload entry functions.

In D129694#3717208, @jhuber6 wrote:

I thought we already emitted the mangled name at least on the device side. I was suggesting that we just use the same name on the host so we don't need to worry about a host-side and device-side name difference and we can get rid of the extra argument to all the offload entry functions.

Yes, that is correct. My question is, is it okay to mangle the host and the device side independently using getTargetEntryUniqueInfo? The reason I am asking is because you had expressed some concerns regarding mangling them separately. Or, maybe there is a way to mangle the original name before the host and device compilation split?

In D129694#3718225, @ssquare08 wrote:

Yes, that is correct. My question is, is it okay to mangle the host and the device side independently using getTargetEntryUniqueInfo? The reason I am asking is because you had expressed some concerns regarding mangling them separately. Or, maybe there is a way to mangle the original name before the host and device compilation split?

You'll need to mangle them separately for the device and host, the difference is that we want to use a function that shares the input to create the mangled name. As far as I know, this is done using a metadata node in the host bitcode. So as long as we share the same method that kernels use it should be fine.
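
As a rough illustration of the point above, the sketch below derives a name purely from inputs that both compilations can share, so the host and device passes can each compute it and still agree. `UniqueInfo` and `mangleDeclareTarget` are hypothetical, the name format is invented for illustration, and the real `getTargetEntryUniqueInfo` signature is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical bundle of the inputs shared by the host and device passes.
struct UniqueInfo {
  uint32_t DeviceID;
  uint32_t FileID;
  unsigned Line;
};

// Pure function of (Name, Info): running it in two separate compilations
// with the same inputs yields the same mangled name. The format is an
// assumption of this sketch, not the one used by Clang.
std::string mangleDeclareTarget(const std::string &Name,
                                const UniqueInfo &Info) {
  assert(!Name.empty() && "a variable needs a name");
  std::ostringstream OS;
  OS << Name << "_" << std::hex << Info.DeviceID << "_" << Info.FileID
     << std::dec << "_l" << Info.Line;
  return OS.str();
}
```

The design choice is that nothing in the function depends on which side is compiling. Only the inputs need to be communicated, which in the thread is said to happen through a metadata node in the host bitcode.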

Revision Contents

Path	Size
clang/lib/CodeGen/CGOpenMPRuntime.h	32 lines
clang/lib/CodeGen/CGOpenMPRuntime.cpp	100 lines
clang/lib/CodeGen/CGOpenMPRuntimeGPU.h	3 lines
clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp	7 lines
clang/lib/CodeGen/CodeGenModule.cpp	39 lines
clang/lib/CodeGen/TargetInfo.cpp	1 line
clang/lib/Sema/SemaOpenMP.cpp	9 lines
clang/test/OpenMP/declare_target_codegen.cpp	6 lines
clang/test/OpenMP/declare_target_link_codegen.cpp	2 lines
clang/test/OpenMP/declare_target_only_one_side_compilation.cpp	2 lines
clang/test/OpenMP/declare_target_visibility_codegen.cpp	4 lines
clang/test/OpenMP/nvptx_allocate_codegen.cpp	2 lines
clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp	8 lines
clang/test/OpenMP/target_update_messages.cpp	7 lines
openmp/libomptarget/test/mapping/declare_target_static_var.c	21 lines

Diff 447901

clang/lib/CodeGen/CGOpenMPRuntime.h

@@ 313 lines hidden @@ protected:
   /// Constructor allowing to redefine the name separator for the variables.
   explicit CGOpenMPRuntime(CodeGenModule &CGM, StringRef FirstSeparator,
                            StringRef Separator);

   /// Creates offloading entry for the provided entry ID \a ID,
   /// address \a Addr, size \a Size, and flags \a Flags.
   virtual void createOffloadEntry(llvm::Constant *ID, llvm::Constant *Addr,
                                   uint64_t Size, int32_t Flags,
-                                  llvm::GlobalValue::LinkageTypes Linkage);
+                                  llvm::GlobalValue::LinkageTypes Linkage,
+                                  StringRef MangledName);

   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
   /// \param ParentName Name of the function that encloses the target region.
   /// \param OutlinedFn Outlined function value to be defined by this call.
   /// \param OutlinedFnID Outlined function ID value to be defined by this call.
   /// \param IsOffloadEntry True if the outlined function is an offload entry.
   /// \param CodeGen Lambda codegen specific to an accelerator device.
@@ 325 lines hidden @@ enum OMPTargetGlobalVarEntryKind : uint32_t {
       OMPTargetGlobalVarEntryLink = 0x1,
     };

     /// Device global variable entries info.
     class OffloadEntryInfoDeviceGlobalVar final : public OffloadEntryInfo {
       /// Type of the global variable.
       CharUnits VarSize;
       llvm::GlobalValue::LinkageTypes Linkage;
+      StringRef OrigName;

     public:
       OffloadEntryInfoDeviceGlobalVar()
           : OffloadEntryInfo(OffloadingEntryInfoDeviceGlobalVar) {}
       explicit OffloadEntryInfoDeviceGlobalVar(unsigned Order,
-                                               OMPTargetGlobalVarEntryKind Flags)
-          : OffloadEntryInfo(OffloadingEntryInfoDeviceGlobalVar, Order, Flags) {}
+                                               OMPTargetGlobalVarEntryKind Flags,
+                                               StringRef OrigName)
+          : OffloadEntryInfo(OffloadingEntryInfoDeviceGlobalVar, Order, Flags),
+            OrigName(OrigName) {}
       explicit OffloadEntryInfoDeviceGlobalVar(
           unsigned Order, llvm::Constant *Addr, CharUnits VarSize,
           OMPTargetGlobalVarEntryKind Flags,
-          llvm::GlobalValue::LinkageTypes Linkage)
+          llvm::GlobalValue::LinkageTypes Linkage, StringRef OrigName)
           : OffloadEntryInfo(OffloadingEntryInfoDeviceGlobalVar, Order, Flags),
-            VarSize(VarSize), Linkage(Linkage) {
+            VarSize(VarSize), Linkage(Linkage), OrigName(OrigName) {
         setAddress(Addr);
       }

       CharUnits getVarSize() const { return VarSize; }
       void setVarSize(CharUnits Size) { VarSize = Size; }
       llvm::GlobalValue::LinkageTypes getLinkage() const { return Linkage; }
       void setLinkage(llvm::GlobalValue::LinkageTypes LT) { Linkage = LT; }
       static bool classof(const OffloadEntryInfo *Info) {
         return Info->getKind() == OffloadingEntryInfoDeviceGlobalVar;
       }
+      StringRef getOrigName() const { return OrigName; }
+      void setOrigName(StringRef Name) { OrigName = Name; }
     };

     /// Initialize device global variable entry.
     void initializeDeviceGlobalVarEntryInfo(StringRef Name,
                                             OMPTargetGlobalVarEntryKind Flags,
-                                            unsigned Order);
+                                            unsigned Order, StringRef OrigName);
+    void enterDeviceGlobalVarMangledName(StringRef OrigName, StringRef Name);

     /// Register device global variable entry.
     void
-    registerDeviceGlobalVarEntryInfo(StringRef VarName, llvm::Constant *Addr,
-                                     CharUnits VarSize,
+    registerDeviceGlobalVarEntryInfo(StringRef VarName, StringRef OrigName,
+                                     llvm::Constant *Addr, CharUnits VarSize,
                                      OMPTargetGlobalVarEntryKind Flags,
                                      llvm::GlobalValue::LinkageTypes Linkage);
     /// Checks if the variable with the given name has been registered already.
     bool hasDeviceGlobalVarEntryInfo(StringRef VarName) const {
       return OffloadEntriesDeviceGlobalVar.count(VarName) > 0;
     }
     /// Applies action \a Action on all registered entries.
     typedef llvm::function_ref<void(StringRef,
                                     const OffloadEntryInfoDeviceGlobalVar &)>
         OffloadDeviceGlobalVarEntryInfoActTy;
     void actOnDeviceGlobalVarEntriesInfo(
         const OffloadDeviceGlobalVarEntryInfoActTy &Action);
+    /// Return host mangled name
+    StringRef getOffloadEntryHostMangledName(StringRef VarName);

   private:
     // Storage for target region entries kind. The storage is to be indexed by
     // file ID, device ID, parent function name and line number.
     typedef llvm::DenseMap<unsigned, OffloadEntryInfoTargetRegion>
         OffloadEntriesTargetRegionPerLine;
     typedef llvm::StringMap<OffloadEntriesTargetRegionPerLine>
         OffloadEntriesTargetRegionPerParentName;
     typedef llvm::DenseMap<unsigned, OffloadEntriesTargetRegionPerParentName>
         OffloadEntriesTargetRegionPerFile;
     typedef llvm::DenseMap<unsigned, OffloadEntriesTargetRegionPerFile>
         OffloadEntriesTargetRegionPerDevice;
     typedef OffloadEntriesTargetRegionPerDevice OffloadEntriesTargetRegionTy;
     OffloadEntriesTargetRegionTy OffloadEntriesTargetRegion;
     /// Storage for device global variable entries kind. The storage is to be
     /// indexed by mangled name.
     typedef llvm::StringMap<OffloadEntryInfoDeviceGlobalVar>
         OffloadEntriesDeviceGlobalVarTy;
     OffloadEntriesDeviceGlobalVarTy OffloadEntriesDeviceGlobalVar;
+    /// indexed by original name
+    llvm::StringMap<std::string> OffloadEntriesDeviceGlobalVarNameMap;
   };
   OffloadEntriesInfoManagerTy OffloadEntriesInfoManager;

   bool ShouldMarkAsGlobal = true;
   /// List of the emitted declarations.
   llvm::DenseSet<CanonicalDeclPtr<const Decl>> AlreadyEmittedTargetDecls;
   /// List of the global variables with their addresses that should not be
   /// emitted for the target.
@@ 1,182 lines hidden @@
   void emitUsesAllocatorsInit(CodeGenFunction &CGF, const Expr *Allocator,
                               const Expr *AllocatorTraits);

   /// Destroys user defined allocators specified in the uses_allocators clause.
   void emitUsesAllocatorsFini(CodeGenFunction &CGF, const Expr *Allocator);

   /// Returns true if the variable is a local variable in untied task.
   bool isLocalVarInUntiedTask(CodeGenFunction &CGF, const VarDecl *VD) const;

+  /// Returns the mangled name for declare target global
+  StringRef getHostMangledDeclareTargetGlobal(StringRef VarName);
 };

 /// Class supports emissionof SIMD-only code.
 class CGOpenMPSIMDRuntime final : public CGOpenMPRuntime {
 public:
   explicit CGOpenMPSIMDRuntime(CodeGenModule &CGM) : CGOpenMPRuntime(CGM) {}
   ~CGOpenMPSIMDRuntime() override {}

@@ 604 lines hidden @@

clang/lib/CodeGen/CGOpenMPRuntime.cpp

@@ 3,027 lines hidden @@ void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::actOnTargetRegionEntriesInfo(
   for (const auto &D : OffloadEntriesTargetRegion)
     for (const auto &F : D.second)
       for (const auto &P : F.second)
         for (const auto &L : P.second)
           Action(D.first, F.first, P.first(), L.first, L.second);
 }

 void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::
+    enterDeviceGlobalVarMangledName(StringRef OrigName, StringRef MangledName) {
+  if (!OrigName.equals(MangledName)) {
+    OffloadEntriesDeviceGlobalVarNameMap.try_emplace(OrigName,
+                                                     MangledName.str());
+  }
+}
+
+void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::
     initializeDeviceGlobalVarEntryInfo(StringRef Name,
                                        OMPTargetGlobalVarEntryKind Flags,
-                                       unsigned Order) {
+                                       unsigned Order, StringRef OrigName) {
   assert(CGM.getLangOpts().OpenMPIsDevice && "Initialization of entries is "
                                              "only required for the device "
                                              "code generation.");
-  OffloadEntriesDeviceGlobalVar.try_emplace(Name, Order, Flags);
+  OffloadEntriesDeviceGlobalVar.try_emplace(Name, Order, Flags, OrigName);
   ++OffloadingEntriesNum;
 }

+StringRef
+CGOpenMPRuntime::OffloadEntriesInfoManagerTy::getOffloadEntryHostMangledName(
+    StringRef VarName) {
+  if (OffloadEntriesDeviceGlobalVarNameMap.find(VarName) !=
+      OffloadEntriesDeviceGlobalVarNameMap.end()) {
+    return OffloadEntriesDeviceGlobalVarNameMap[VarName];
+  }
+  return StringRef();
+}
+
 void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::
-    registerDeviceGlobalVarEntryInfo(StringRef VarName, llvm::Constant *Addr,
-                                     CharUnits VarSize,
+    registerDeviceGlobalVarEntryInfo(StringRef VarName, StringRef OrigName,
+                                     llvm::Constant *Addr, CharUnits VarSize,
                                      OMPTargetGlobalVarEntryKind Flags,
                                      llvm::GlobalValue::LinkageTypes Linkage) {
   if (CGM.getLangOpts().OpenMPIsDevice) {
     // This could happen if the device compilation is invoked standalone.
     if (!hasDeviceGlobalVarEntryInfo(VarName))
       return;
     auto &Entry = OffloadEntriesDeviceGlobalVar[VarName];
     if (Entry.getAddress() && hasDeviceGlobalVarEntryInfo(VarName)) {
       if (Entry.getVarSize().isZero()) {
         Entry.setVarSize(VarSize);
         Entry.setLinkage(Linkage);
       }
       return;
     }
     Entry.setVarSize(VarSize);
     Entry.setLinkage(Linkage);
     Entry.setAddress(Addr);
+    Entry.setOrigName(OrigName);
   } else {
     if (hasDeviceGlobalVarEntryInfo(VarName)) {
       auto &Entry = OffloadEntriesDeviceGlobalVar[VarName];
       assert(Entry.isValid() && Entry.getFlags() == Flags &&
              "Entry not initialized!");
       if (Entry.getVarSize().isZero()) {
         Entry.setVarSize(VarSize);
         Entry.setLinkage(Linkage);
       }
       return;
     }
     OffloadEntriesDeviceGlobalVar.try_emplace(
-        VarName, OffloadingEntriesNum, Addr, VarSize, Flags, Linkage);
+        VarName, OffloadingEntriesNum, Addr, VarSize, Flags, Linkage, OrigName);
     ++OffloadingEntriesNum;
   }
 }

 void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::
     actOnDeviceGlobalVarEntriesInfo(
         const OffloadDeviceGlobalVarEntryInfoActTy &Action) {
   // Scan all target region entries and perform the provided action.
for (const auto &E : OffloadEntriesDeviceGlobalVar)		for (const auto &E : OffloadEntriesDeviceGlobalVar)
Action(E.getKey(), E.getValue());		Action(E.getKey(), E.getValue());
}		}

void CGOpenMPRuntime::createOffloadEntry(		void CGOpenMPRuntime::createOffloadEntry(
llvm::Constant *ID, llvm::Constant *Addr, uint64_t Size, int32_t Flags,		llvm::Constant *ID, llvm::Constant *Addr, uint64_t Size, int32_t Flags,
llvm::GlobalValue::LinkageTypes Linkage) {		llvm::GlobalValue::LinkageTypes Linkage, StringRef MangledName) {
OMPBuilder.emitOffloadingEntry(ID, Addr->getName(), Size, Flags);		StringRef VarName = (MangledName.empty()) ? Addr->getName() : MangledName;
		OMPBuilder.emitOffloadingEntry(ID, VarName, Size, Flags);
}		}

void CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata() {		void CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata() {
// Emit the offloading entries and metadata so that the device codegen side		// Emit the offloading entries and metadata so that the device codegen side
// can easily figure out what to emit. The produced metadata looks like		// can easily figure out what to emit. The produced metadata looks like
// this:		// this:
//		//
// !omp_offload.info = !{!1, ...}		// !omp_offload.info = !{!1, ...}
[... 76 lines not shown ...]	auto &&DeviceGlobalVarMetadataEmitter =
const OffloadEntriesInfoManagerTy::OffloadEntryInfoDeviceGlobalVar		const OffloadEntriesInfoManagerTy::OffloadEntryInfoDeviceGlobalVar
&E) {		&E) {
// Generate metadata for global variables. Each entry of this metadata		// Generate metadata for global variables. Each entry of this metadata
// contains:		// contains:
// - Entry 0 -> Kind of this type of metadata (1).		// - Entry 0 -> Kind of this type of metadata (1).
// - Entry 1 -> Mangled name of the variable.		// - Entry 1 -> Mangled name of the variable.
// - Entry 2 -> Declare target kind.		// - Entry 2 -> Declare target kind.
// - Entry 3 -> Order the entry was created.		// - Entry 3 -> Order the entry was created.
		// - Entry 4 -> Original name of the variable.
// The first element of the metadata node is the kind.		// The first element of the metadata node is the kind.
llvm::Metadata *Ops[] = {		llvm::Metadata *Ops[] = {GetMDInt(E.getKind()),
GetMDInt(E.getKind()), GetMDString(MangledName),		GetMDString(MangledName),
GetMDInt(E.getFlags()), GetMDInt(E.getOrder())};		GetMDInt(E.getFlags()), GetMDInt(E.getOrder()),
		GetMDString(E.getOrigName())};

// Save this entry in the right position of the ordered entries array.		// Save this entry in the right position of the ordered entries array.
OrderedEntries[E.getOrder()] =		OrderedEntries[E.getOrder()] =
std::make_tuple(&E, SourceLocation(), MangledName);		std::make_tuple(&E, SourceLocation(), MangledName);

// Add metadata to the named metadata node.		// Add metadata to the named metadata node.
MD->addOperand(llvm::MDNode::get(C, Ops));		MD->addOperand(llvm::MDNode::get(C, Ops));
};		};
[... 14 lines not shown ...]	if (const auto *CE =
unsigned DiagID = CGM.getDiags().getCustomDiagID(		unsigned DiagID = CGM.getDiags().getCustomDiagID(
DiagnosticsEngine::Error,		DiagnosticsEngine::Error,
"Offloading entry for target region in %0 is incorrect: either the "		"Offloading entry for target region in %0 is incorrect: either the "
"address or the ID is invalid.");		"address or the ID is invalid.");
CGM.getDiags().Report(std::get<1>(E), DiagID) << FnName;		CGM.getDiags().Report(std::get<1>(E), DiagID) << FnName;
continue;		continue;
}		}
createOffloadEntry(CE->getID(), CE->getAddress(), /*Size=*/0,		createOffloadEntry(CE->getID(), CE->getAddress(), /*Size=*/0,
CE->getFlags(), llvm::GlobalValue::WeakAnyLinkage);		CE->getFlags(), llvm::GlobalValue::WeakAnyLinkage,
		/*MangledName*/ StringRef());
} else if (const auto *CE = dyn_cast<OffloadEntriesInfoManagerTy::		} else if (const auto *CE = dyn_cast<OffloadEntriesInfoManagerTy::
OffloadEntryInfoDeviceGlobalVar>(		OffloadEntryInfoDeviceGlobalVar>(
std::get<0>(E))) {		std::get<0>(E))) {
OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind Flags =		OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind Flags =
static_cast<OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind>(		static_cast<OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind>(
CE->getFlags());		CE->getFlags());
switch (Flags) {		switch (Flags) {
case OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryTo: {		case OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryTo: {
[... 24 lines not shown ...]	if (const auto *CE =
DiagnosticsEngine::Error,		DiagnosticsEngine::Error,
"Offloading entry for declare target variable is incorrect: the "		"Offloading entry for declare target variable is incorrect: the "
"address is invalid.");		"address is invalid.");
CGM.getDiags().Report(DiagID);		CGM.getDiags().Report(DiagID);
continue;		continue;
}		}
break;		break;
}		}

// Hidden or internal symbols on the device are not externally visible. We		// Hidden symbols on the device are not externally visible and constants
// should not attempt to register them by creating an offloading entry.		// don't need to be modified. We should not attempt to register them by
		// creating an offloading entry.
if (auto *GV = dyn_cast<llvm::GlobalValue>(CE->getAddress()))		if (auto *GV = dyn_cast<llvm::GlobalValue>(CE->getAddress()))
		jhuber6: This comment needs to be adjusted accordingly.
if (GV->hasLocalLinkage() || GV->hasHiddenVisibility())		if (GV->hasHiddenVisibility() ||
		dyn_cast<llvm::GlobalVariable>(GV)->isConstant())
continue;		continue;

		StringRef MangledName = std::get<2>(E);
createOffloadEntry(CE->getAddress(), CE->getAddress(),		createOffloadEntry(CE->getAddress(), CE->getAddress(),
CE->getVarSize().getQuantity(), Flags,		CE->getVarSize().getQuantity(), Flags,
CE->getLinkage());		CE->getLinkage(), MangledName);
} else {		} else {
llvm_unreachable("Unsupported entry kind.");		llvm_unreachable("Unsupported entry kind.");
}		}
}		}
}		}

/// Loads all the offload entries information from the host IR		/// Loads all the offload entries information from the host IR
/// metadata.		/// metadata.
[... 53 lines not shown ...]	case OffloadEntriesInfoManagerTy::OffloadEntryInfo::
/*Order=*/GetMDInt(5));		/*Order=*/GetMDInt(5));
break;		break;
case OffloadEntriesInfoManagerTy::OffloadEntryInfo::		case OffloadEntriesInfoManagerTy::OffloadEntryInfo::
OffloadingEntryInfoDeviceGlobalVar:		OffloadingEntryInfoDeviceGlobalVar:
OffloadEntriesInfoManager.initializeDeviceGlobalVarEntryInfo(		OffloadEntriesInfoManager.initializeDeviceGlobalVarEntryInfo(
/*MangledName=*/GetMDString(1),		/*MangledName=*/GetMDString(1),
static_cast<OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind>(		static_cast<OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind>(
/*Flags=*/GetMDInt(2)),		/*Flags=*/GetMDInt(2)),
/*Order=*/GetMDInt(3));		/*Order=*/GetMDInt(3),
		/*OrigName=*/GetMDString(4));
		OffloadEntriesInfoManager.enterDeviceGlobalVarMangledName(
		/*OrigName=*/GetMDString(4),
		/*MangledName=*/GetMDString(1));
break;		break;
}		}
}		}
}		}

		StringRef
		CGOpenMPRuntime::getHostMangledDeclareTargetGlobal(StringRef VarName) {
		return OffloadEntriesInfoManager.getOffloadEntryHostMangledName(VarName);
		}

void CGOpenMPRuntime::emitKmpRoutineEntryT(QualType KmpInt32Ty) {		void CGOpenMPRuntime::emitKmpRoutineEntryT(QualType KmpInt32Ty) {
if (!KmpRoutineEntryPtrTy) {		if (!KmpRoutineEntryPtrTy) {
// Build typedef kmp_int32 (* kmp_routine_entry_t)(kmp_int32, void *); type.		// Build typedef kmp_int32 (* kmp_routine_entry_t)(kmp_int32, void *); type.
ASTContext &C = CGM.getContext();		ASTContext &C = CGM.getContext();
QualType KmpRoutineEntryTyArgs[] = {KmpInt32Ty, C.VoidPtrTy};		QualType KmpRoutineEntryTyArgs[] = {KmpInt32Ty, C.VoidPtrTy};
FunctionProtoType::ExtProtoInfo EPI;		FunctionProtoType::ExtProtoInfo EPI;
KmpRoutineEntryPtrQTy = C.getPointerType(		KmpRoutineEntryPtrQTy = C.getPointerType(
C.getFunctionType(KmpInt32Ty, KmpRoutineEntryTyArgs, EPI));		C.getFunctionType(KmpInt32Ty, KmpRoutineEntryTyArgs, EPI));
[... 7,390 lines not shown ...]	if (!Res) {
}		}
return;		return;
}		}
// Register declare target variables.		// Register declare target variables.
OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind Flags;		OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryKind Flags;
StringRef VarName;		StringRef VarName;
CharUnits VarSize;		CharUnits VarSize;
llvm::GlobalValue::LinkageTypes Linkage;		llvm::GlobalValue::LinkageTypes Linkage;
		StringRef OrigName = VD->getName();

		SmallString<256> Buffer;
		llvm::raw_svector_ostream Out(Buffer);
if (*Res == OMPDeclareTargetDeclAttr::MT_To &&		if (*Res == OMPDeclareTargetDeclAttr::MT_To &&
!HasRequiresUnifiedSharedMemory) {		!HasRequiresUnifiedSharedMemory) {
Flags = OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryTo;		Flags = OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryTo;

		// We don't need to mangle the host side of declare target global variables
		// but we need to create offload entry that matches the device side which
		// gets mangled.
		auto *GV = dyn_cast<llvm::GlobalValue>(Addr);
		if (!CGM.getLangOpts().OpenMPIsDevice && !VD->isExternallyVisible() &&
		!GV->hasHiddenVisibility() &&
		!dyn_cast<llvm::GlobalVariable>(GV)->isConstant()) {
		jhuber6: `CGM.printPostfixForExternalizedDecl` should ideally give the same output on the host and device, but it's somewhat limited since it just checks the file ID and environment, which is technically possible to change. The kernels use `getTargetEntryUniqueInfo`, which might make sense to re-use for this case.
		ssquare08: That was the point I had raised in one of the Clang meetings, but someone mentioned that kernel names are created on the host side and the device side reads the information through the host IR. It seems kernel names could also run into a mismatch in some corner cases, then?
		jhuber6: So the problem here is that the host and device need to agree on what the name is so that we can register the correct variable. The CUDA / HIP toolchains solved this by either performing a mangling that is stable between the host and device, or by having the driver generate a random hash that gets used on both. OpenMP instead solves this by writing the variable to the host IR first and then reading it on the device to see what the name needs to be. Since we have that dependency we can use any mangling we want, though it's still best for it to be somewhat stable unless we want tests to change every time we run them. It probably won't hurt anything to just use `printPostfixForExternalizedDecl`, but it's not as strong of a mangling as what we can do with the OpenMP method since it needs to be common between the host and device.
		ssquare08: > `CGM.printPostfixForExternalizedDecl` should ideally give the same output on the host and device, but it's somewhat limited since it just checks the file ID and environment, which is technically possible to change. The kernels use `getTargetEntryUniqueInfo`, which might make sense to re-use for this case.
		This has been changed as suggested.
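The naming scheme discussed in this thread can be sketched outside of Clang. The following is a minimal, standalone model, not the patch's code: `makeStaticDeclareTargetName` and `HostMangledNames` are hypothetical names, and plain `unsigned` values stand in for what `getTargetEntryUniqueInfo` returns. It assumes the format used in the hunk below (`<name>__static__<device id in hex>_<file id in hex>_l<line>`) and models the host recording an original-to-mangled mapping that the device side later looks up.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the name built on the host side:
// <name>__static__<device-id hex>_<file-id hex>_l<line>.
std::string makeStaticDeclareTargetName(const std::string &name,
                                        unsigned deviceID, unsigned fileID,
                                        unsigned line) {
  assert(!name.empty() && "expected a named variable");
  std::ostringstream os;
  // IDs are printed in lowercase hex, the line number in decimal.
  os << name << "__static__" << std::hex << deviceID << "_" << fileID
     << "_l" << std::dec << line;
  return os.str();
}

// Hypothetical stand-in for the original-name -> mangled-name map that the
// device side fills from the host IR metadata.
class HostMangledNames {
public:
  // Record a mapping only when the name was actually changed; the first
  // mapping recorded for a given original name wins.
  void record(const std::string &orig, const std::string &mangled) {
    if (orig != mangled)
      names.emplace(orig, mangled);
  }

  // Return the host-chosen name, or an empty string if none was recorded.
  std::string lookup(const std::string &orig) const {
    auto it = names.find(orig);
    return it == names.end() ? std::string() : it->second;
  }

private:
  std::unordered_map<std::string, std::string> names;
};
```

Because the device only reads back what the host wrote, the two sides agree on the name even if the inputs to the mangling would differ between the two compilations, which is the property jhuber6 describes above.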
		VarName =
		OffloadEntriesInfoManager.getOffloadEntryHostMangledName(OrigName);
		if (VarName.empty()) {
		unsigned DeviceID;
		unsigned FileID;
		unsigned Line;
		SourceLocation Loc = VD->getCanonicalDecl()->getBeginLoc();
		getTargetEntryUniqueInfo(CGM.getContext(), Loc, DeviceID, FileID, Line);
		{
		Out << VD->getName() << "__static__" << llvm::format("%x", DeviceID)
		<< llvm::format("_%x_", FileID) << "l" << Line;
		}
		VarName = Buffer;
		}
		} else {
VarName = CGM.getMangledName(VD);		VarName = CGM.getMangledName(VD);
		}

		jhuber6: It might be easier to just mangle the original definition; that would reduce a lot of churn here adding `origName` everywhere. Any reason that's not desirable?
		ssquare08: You are right, it would have made the code cleaner, but we didn't want to mangle the host side if we could avoid it.
		jhuber6: Others may want to comment, but personally I'm not too worried about mangling a name that wouldn't have been placed in the symbol table to begin with, and it would make the code a lot cleaner.
if (VD->hasDefinition(CGM.getContext()) != VarDecl::DeclarationOnly) {		if (VD->hasDefinition(CGM.getContext()) != VarDecl::DeclarationOnly) {
VarSize = CGM.getContext().getTypeSizeInChars(VD->getType());		VarSize = CGM.getContext().getTypeSizeInChars(VD->getType());
assert(!VarSize.isZero() && "Expected non-zero size of the variable");		assert(!VarSize.isZero() && "Expected non-zero size of the variable");
} else {		} else {
VarSize = CharUnits::Zero();		VarSize = CharUnits::Zero();
}		}
Linkage = CGM.getLLVMLinkageVarDefinition(VD, /*IsConstant=*/false);		Linkage = CGM.getLLVMLinkageVarDefinition(VD, /*IsConstant=*/false);
// Temp solution to prevent optimizations of the internal variables.		// Temp solution to prevent optimizations of the internal variables.
[... 30 lines not shown ...]	if (CGM.getLangOpts().OpenMPIsDevice) {
VarName = getAddrOfDeclareTargetVar(VD).getName();		VarName = getAddrOfDeclareTargetVar(VD).getName();
Addr = cast<llvm::Constant>(getAddrOfDeclareTargetVar(VD).getPointer());		Addr = cast<llvm::Constant>(getAddrOfDeclareTargetVar(VD).getPointer());
}		}
VarSize = CGM.getPointerSize();		VarSize = CGM.getPointerSize();
Linkage = llvm::GlobalValue::WeakAnyLinkage;		Linkage = llvm::GlobalValue::WeakAnyLinkage;
}		}

OffloadEntriesInfoManager.registerDeviceGlobalVarEntryInfo(		OffloadEntriesInfoManager.registerDeviceGlobalVarEntryInfo(
VarName, Addr, VarSize, Flags, Linkage);		VarName, OrigName, Addr, VarSize, Flags, Linkage);
}		}

bool CGOpenMPRuntime::emitTargetGlobal(GlobalDecl GD) {		bool CGOpenMPRuntime::emitTargetGlobal(GlobalDecl GD) {
if (isa<FunctionDecl>(GD.getDecl()) ||		if (isa<FunctionDecl>(GD.getDecl()) ||
isa<OMPDeclareReductionDecl>(GD.getDecl()))		isa<OMPDeclareReductionDecl>(GD.getDecl()))
return emitTargetFunctions(GD);		return emitTargetFunctions(GD);

return emitTargetGlobalVariable(GD);		return emitTargetGlobalVariable(GD);
[... 369 lines not shown ...]	void CGOpenMPRuntime::emitTargetDataStandAloneCall(
const Expr *Device) {		const Expr *Device) {
if (!CGF.HaveInsertPoint())		if (!CGF.HaveInsertPoint())
return;		return;

assert((isa<OMPTargetEnterDataDirective>(D) ||		assert((isa<OMPTargetEnterDataDirective>(D) ||
isa<OMPTargetExitDataDirective>(D) ||		isa<OMPTargetExitDataDirective>(D) ||
isa<OMPTargetUpdateDirective>(D)) &&		isa<OMPTargetUpdateDirective>(D)) &&
"Expecting either target enter, exit data, or update directives.");		"Expecting either target enter, exit data, or update directives.");

CodeGenFunction::OMPTargetDataInfo InputInfo;		CodeGenFunction::OMPTargetDataInfo InputInfo;
llvm::Value *MapTypesArray = nullptr;		llvm::Value *MapTypesArray = nullptr;
llvm::Value *MapNamesArray = nullptr;		llvm::Value *MapNamesArray = nullptr;
// Generate the code for the opening of the data environment.		// Generate the code for the opening of the data environment.
auto &&ThenGen = [this, &D, Device, &InputInfo, &MapTypesArray,		auto &&ThenGen = [this, &D, Device, &InputInfo, &MapTypesArray,
&MapNamesArray](CodeGenFunction &CGF, PrePostActionTy &) {		&MapNamesArray](CodeGenFunction &CGF, PrePostActionTy &) {
// Emit device ID if any.		// Emit device ID if any.
llvm::Value *DeviceID = nullptr;		llvm::Value *DeviceID = nullptr;
[... 1,874 lines not shown ...]

clang/lib/CodeGen/CGOpenMPRuntimeGPU.h

[... 64 lines not shown ...]	private:
//		//
// Base class overrides.		// Base class overrides.
//		//

/// Creates offloading entry for the provided entry ID \a ID,		/// Creates offloading entry for the provided entry ID \a ID,
/// address \a Addr, size \a Size, and flags \a Flags.		/// address \a Addr, size \a Size, and flags \a Flags.
void createOffloadEntry(llvm::Constant *ID, llvm::Constant *Addr,		void createOffloadEntry(llvm::Constant *ID, llvm::Constant *Addr,
uint64_t Size, int32_t Flags,		uint64_t Size, int32_t Flags,
llvm::GlobalValue::LinkageTypes Linkage) override;		llvm::GlobalValue::LinkageTypes Linkage,
		StringRef MangledName) override;

/// Emit outlined function specialized for the Fork-Join		/// Emit outlined function specialized for the Fork-Join
/// programming model for applicable target directives on the NVPTX device.		/// programming model for applicable target directives on the NVPTX device.
/// \param D Directive to emit.		/// \param D Directive to emit.
/// \param ParentName Name of the function that encloses the target region.		/// \param ParentName Name of the function that encloses the target region.
/// \param OutlinedFn Outlined function value to be defined by this call.		/// \param OutlinedFn Outlined function value to be defined by this call.
/// \param OutlinedFnID Outlined function ID value to be defined by this call.		/// \param OutlinedFnID Outlined function ID value to be defined by this call.
/// \param IsOffloadEntry True if the outlined function is an offload entry.		/// \param IsOffloadEntry True if the outlined function is an offload entry.
[... 372 lines not shown ...]

clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp

[... 1,114 lines not shown ...]	auto *GVMode = new llvm::GlobalVariable(
llvm::GlobalValue::WeakAnyLinkage,		llvm::GlobalValue::WeakAnyLinkage,
llvm::ConstantInt::get(CGM.Int8Ty, Mode ? OMP_TGT_EXEC_MODE_SPMD		llvm::ConstantInt::get(CGM.Int8Ty, Mode ? OMP_TGT_EXEC_MODE_SPMD
: OMP_TGT_EXEC_MODE_GENERIC),		: OMP_TGT_EXEC_MODE_GENERIC),
Twine(Name, "_exec_mode"));		Twine(Name, "_exec_mode"));
CGM.addCompilerUsedGlobal(GVMode);		CGM.addCompilerUsedGlobal(GVMode);
}		}

void CGOpenMPRuntimeGPU::createOffloadEntry(llvm::Constant *ID,		void CGOpenMPRuntimeGPU::createOffloadEntry(llvm::Constant *ID,
llvm::Constant *Addr,		llvm::Constant *Addr, uint64_t Size,
uint64_t Size, int32_t,		int32_t,
llvm::GlobalValue::LinkageTypes) {		llvm::GlobalValue::LinkageTypes,
		StringRef) {
// TODO: Add support for global variables on the device after declare target		// TODO: Add support for global variables on the device after declare target
// support.		// support.
llvm::Function *Fn = dyn_cast<llvm::Function>(Addr);		llvm::Function *Fn = dyn_cast<llvm::Function>(Addr);
if (!Fn)		if (!Fn)
return;		return;

llvm::Module &M = CGM.getModule();		llvm::Module &M = CGM.getModule();
llvm::LLVMContext &Ctx = CGM.getLLVMContext();		llvm::LLVMContext &Ctx = CGM.getLLVMContext();
[... 2,895 lines not shown ...]

clang/lib/CodeGen/CodeGenModule.cpp


[... 1,513 lines not shown ...]	if (!LangOpts.CUDAIsDevice || !getContext().mayExternalize(GD.getDecl())) {
if (FoundName != MangledDeclNames.end())		if (FoundName != MangledDeclNames.end())
return FoundName->second;		return FoundName->second;
}		}

// Keep the first result in the case of a mangling collision.		// Keep the first result in the case of a mangling collision.
const auto *ND = cast<NamedDecl>(GD.getDecl());		const auto *ND = cast<NamedDecl>(GD.getDecl());
std::string MangledName = getMangledNameImpl(*this, GD, ND);		std::string MangledName = getMangledNameImpl(*this, GD, ND);

		if (getLangOpts().OpenMPIsDevice) {
		if (isa<VarDecl>(GD.getDecl())) {
		const auto *VD = dyn_cast<VarDecl>(GD.getDecl());
		llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
		OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);

		if (Res && (*Res == OMPDeclareTargetDeclAttr::MT_To) &&
		!getOpenMPRuntime().hasRequiresUnifiedSharedMemory() &&
		!VD->isExternallyVisible()) {
		StringRef HostMangledName =
		getOpenMPRuntime().getHostMangledDeclareTargetGlobal(VD->getName());
		if (!HostMangledName.empty())
		MangledName = HostMangledName.str();
		}
		}
		}

// Ensure either we have different ABIs between host and device compilations,		// Ensure either we have different ABIs between host and device compilations,
// says host compilation following MSVC ABI but device compilation follows		// says host compilation following MSVC ABI but device compilation follows
// Itanium C++ ABI or, if they follow the same ABI, kernel names after		// Itanium C++ ABI or, if they follow the same ABI, kernel names after
// mangling should be the same after name stubbing. The later checking is		// mangling should be the same after name stubbing. The later checking is
// very important as the device kernel name being mangled in host-compilation		// very important as the device kernel name being mangled in host-compilation
// is used to resolve the device binaries to be executed. Inconsistent naming		// is used to resolve the device binaries to be executed. Inconsistent naming
// result in undefined behavior. Even though we cannot check that naming		// result in undefined behavior. Even though we cannot check that naming
// directly between host- and device-compilations, the host- and		// directly between host- and device-compilations, the host- and
[... 2,739 lines not shown ...]	if (DDI != DeferredDecls.end()) {
// Move the potentially referenced deferred decl to the DeferredDeclsToEmit		// Move the potentially referenced deferred decl to the DeferredDeclsToEmit
// list, and remove it from DeferredDecls (since we don't need it anymore).		// list, and remove it from DeferredDecls (since we don't need it anymore).
addDeferredDeclToEmit(DDI->second);		addDeferredDeclToEmit(DDI->second);
DeferredDecls.erase(DDI);		DeferredDecls.erase(DDI);
}		}

// Handle things which are present even on external declarations.		// Handle things which are present even on external declarations.
if (D) {		if (D) {
if (LangOpts.OpenMP && !LangOpts.OpenMPSimd)
getOpenMPRuntime().registerTargetGlobalVariable(D, GV);

// FIXME: This code is overly simple and should be merged with other global		// FIXME: This code is overly simple and should be merged with other global
// handling.		// handling.
GV->setConstant(isTypeConstant(D->getType(), false));		GV->setConstant(isTypeConstant(D->getType(), false));

		if (LangOpts.OpenMP && !LangOpts.OpenMPSimd)
		getOpenMPRuntime().registerTargetGlobalVariable(D, GV);

GV->setAlignment(getContext().getDeclAlign(D).getAsAlign());		GV->setAlignment(getContext().getDeclAlign(D).getAsAlign());

setLinkageForGV(GV, D);		setLinkageForGV(GV, D);

if (D->getTLSKind()) {		if (D->getTLSKind()) {
if (D->getTLSKind() == VarDecl::TLS_Dynamic)		if (D->getTLSKind() == VarDecl::TLS_Dynamic)
CXXThreadLocals.push_back(D);		CXXThreadLocals.push_back(D);
setTLSMode(GV, *D);		setTLSMode(GV, *D);
[... 565 lines not shown ...]	#endif
// weak or linkonce, the de-duplication semantics are important to preserve,		// weak or linkonce, the de-duplication semantics are important to preserve,
// so we don't change the linkage.		// so we don't change the linkage.
if (D->getTLSKind() == VarDecl::TLS_Dynamic &&		if (D->getTLSKind() == VarDecl::TLS_Dynamic &&
Linkage == llvm::GlobalValue::ExternalLinkage &&		Linkage == llvm::GlobalValue::ExternalLinkage &&
Context.getTargetInfo().getTriple().isOSDarwin() &&		Context.getTargetInfo().getTriple().isOSDarwin() &&
!D->hasAttr<ConstInitAttr>())		!D->hasAttr<ConstInitAttr>())
Linkage = llvm::GlobalValue::InternalLinkage;		Linkage = llvm::GlobalValue::InternalLinkage;

		// Make sure any variable with OpenMP declare target is visible to the runtime
		// except for constants and those with hidden visibility
		Optional<OMPDeclareTargetDeclAttr::DevTypeTy> DevTy =
		OMPDeclareTargetDeclAttr::getDeviceType(D);
		if (DevTy && (*DevTy == OMPDeclareTargetDeclAttr::DT_Any) &&
		getLangOpts().OpenMPIsDevice && D && !GV->hasHiddenVisibility() &&
		!GV->isConstant() &&
		!getOpenMPRuntime().hasRequiresUnifiedSharedMemory()) {
		GV->setLinkage(llvm::GlobalValue::ExternalLinkage);
		GV->setDSOLocal(false);
		} else {
GV->setLinkage(Linkage);		GV->setLinkage(Linkage);
		}

if (D->hasAttr<DLLImportAttr>())		if (D->hasAttr<DLLImportAttr>())
GV->setDLLStorageClass(llvm::GlobalVariable::DLLImportStorageClass);		GV->setDLLStorageClass(llvm::GlobalVariable::DLLImportStorageClass);
else if (D->hasAttr<DLLExportAttr>())		else if (D->hasAttr<DLLExportAttr>())
GV->setDLLStorageClass(llvm::GlobalVariable::DLLExportStorageClass);		GV->setDLLStorageClass(llvm::GlobalVariable::DLLExportStorageClass);
else		else
GV->setDLLStorageClass(llvm::GlobalVariable::DefaultStorageClass);		GV->setDLLStorageClass(llvm::GlobalVariable::DefaultStorageClass);

if (Linkage == llvm::GlobalVariable::CommonLinkage) {		if (Linkage == llvm::GlobalVariable::CommonLinkage) {
[... 2,089 lines not shown ...]	if (getLangOpts().CUID.empty()) {
llvm::sys::fs::UniqueID ID;		llvm::sys::fs::UniqueID ID;
if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID)) {		if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID)) {
PLoc = SM.getPresumedLoc(D->getLocation(), /*UseLineDirectives=*/false);		PLoc = SM.getPresumedLoc(D->getLocation(), /*UseLineDirectives=*/false);
assert(PLoc.isValid() && "Source location is expected to be valid.");		assert(PLoc.isValid() && "Source location is expected to be valid.");
if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID))		if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID))
SM.getDiagnostics().Report(diag::err_cannot_open_file)		SM.getDiagnostics().Report(diag::err_cannot_open_file)
<< PLoc.getFilename() << EC.message();		<< PLoc.getFilename() << EC.message();
}		}

OS << llvm::format("%x", ID.getFile()) << llvm::format("%x", ID.getDevice())		OS << llvm::format("%x", ID.getFile()) << llvm::format("%x", ID.getDevice())
<< "_" << llvm::utohexstr(Result.low(), /*LowerCase=*/true, /*Width=*/8);		<< "_" << llvm::utohexstr(Result.low(), /*LowerCase=*/true, /*Width=*/8);
} else {		} else {
OS << getContext().getCUIDHash();		OS << getContext().getCUIDHash();
}		}
}		}

void CodeGenModule::moveLazyEmissionStates(CodeGenModule *NewBuilder) {		void CodeGenModule::moveLazyEmissionStates(CodeGenModule *NewBuilder) {
[... 28 lines not shown ...]

clang/lib/CodeGen/TargetInfo.cpp


[... 7,286 lines not shown ...]

Address NVPTXABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,		Address NVPTXABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
QualType Ty) const {		QualType Ty) const {
llvm_unreachable("NVPTX does not support varargs");		llvm_unreachable("NVPTX does not support varargs");
}		}

void NVPTXTargetCodeGenInfo::setTargetAttributes(		void NVPTXTargetCodeGenInfo::setTargetAttributes(
const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {		const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {
if (GV->isDeclaration())		if (GV->isDeclaration())
		jhuber6: Formatting looks weird, did you do `git clang-format HEAD~1`?
		ssquare08: Looks like I didn't run git clang-format correctly, I'll fix it. Thanks.
return;		return;

const VarDecl *VD = dyn_cast_or_null<VarDecl>(D);		const VarDecl *VD = dyn_cast_or_null<VarDecl>(D);
if (VD) {		if (VD) {
if (M.getLangOpts().CUDA) {		if (M.getLangOpts().CUDA) {
if (VD->getType()->isCUDADeviceBuiltinSurfaceType())		if (VD->getType()->isCUDADeviceBuiltinSurfaceType())
addNVVMMetadata(GV, "surface", 1);		addNVVMMetadata(GV, "surface", 1);
else if (VD->getType()->isCUDADeviceBuiltinTextureType())		else if (VD->getType()->isCUDADeviceBuiltinTextureType())
addNVVMMetadata(GV, "texture", 1);		addNVVMMetadata(GV, "texture", 1);
return;		return;
[... 2,117 lines not shown ...]	if (const auto *Attr = FD->getAttr<AMDGPUNumVGPRAttr>()) {

if (NumVGPR != 0)		if (NumVGPR != 0)
F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR));		F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR));
}		}
}		}

void AMDGPUTargetCodeGenInfo::setTargetAttributes(		void AMDGPUTargetCodeGenInfo::setTargetAttributes(
const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {		const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {
if (requiresAMDGPUProtectedVisibility(D, GV)) {		if (requiresAMDGPUProtectedVisibility(D, GV)) {
		jhuber6: Just spitballing, is it possible to do this when we make the global instead?
		ssquare08 (author): This is something I was wondering as well. In CodeGenModule::GetOrCreateLLVMGlobal, when it creates a new global variable, it always uses llvm::GlobalValue::ExternalLinkage. It seems this changes to internal somewhere later for static globals. Do you know where that would be?
		jhuber6: I'm not exactly sure. I remember deleting some code in D117806 that did something like that, albeit incorrectly. But I'm not sure you'd have the necessary information to check whether or not there are updates attached to it. We don't want to externalize things if we don't need to; otherwise a lot of our device runtime variables would get external visibility and could no longer be optimized out.
		jhuber6: Were you able to find a place for this when we generate the variable? You should be able to do something similar to the patch above if it's a declare target static to force it to have external visibility, but as mentioned before I would prefer we only do this if necessary, which might take some extra analysis.
		ssquare08 (author): If you are asking about the GV, it is always created in 'CodeGenModule::GetOrCreateLLVMGlobal' with external linkage:
			auto *GV = new llvm::GlobalVariable(
			    getModule(), Ty, false, llvm::GlobalValue::ExternalLinkage, nullptr,
			    MangledName, nullptr, llvm::GlobalVariable::NotThreadLocal,
			    getContext().getTargetAddressSpace(DAddrSpace));
		The linkage, however, changes in 'CodeGenModule::EmitGlobalVarDefinition' based on the information in the VarDecl:
			llvm::GlobalValue::LinkageTypes Linkage = getLLVMLinkageVarDefinition(D, GV->isConstant());
		Maybe you are suggesting changing the linkage information in the 'VarDecl' itself?
		jhuber6: Yes, the patch I linked previously did something like that, where it set the `LinkageValue` based on some information. I'm not sure, though, whether it would be excessively difficult to prune definitions that don't need to be externalized. I haven't looked too deeply into this, but I believe CUDA does this inside of `adjustGVALinkageForAttributes`; there we also check a variable called `CUDADeviceVarODRUsedByHost` that I'm assuming tracks whether we need to bother externalizing this.
		ssquare08 (author): The exter
		ssquare08 (author): Thanks for the information, I'll take a look.
GV->setVisibility(llvm::GlobalValue::ProtectedVisibility);		GV->setVisibility(llvm::GlobalValue::ProtectedVisibility);
GV->setDSOLocal(true);		GV->setDSOLocal(true);
}		}

if (GV->isDeclaration())		if (GV->isDeclaration())
return;		return;

llvm::Function *F = dyn_cast<llvm::Function>(GV);		llvm::Function *F = dyn_cast<llvm::Function>(GV);
(2,417 lines not shown)
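
The inline discussion in this file turns on one question: when should a file-scope static that is declare target be given external linkage on the device? The following is a minimal sketch of that decision only, not the patch's implementation. Every name in it (`Linkage`, `GlobalInfo`, `chooseDeviceLinkage`, `HostNeedsAccess`) is hypothetical and none of it is a Clang or LLVM API. `HostNeedsAccess` stands in for whatever analysis would establish that the host has to look the symbol up.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; none of these are Clang/LLVM APIs.
enum class Linkage { Internal, External };

struct GlobalInfo {
  bool IsFileStatic;    // declared `static` at file scope
  bool IsDeclareTarget; // covered by `#pragma omp declare target`
  bool IsHidden;        // carries visibility("hidden")
};

// HostNeedsAccess is an assumed input: the result of some analysis deciding
// that the host must be able to find the device symbol (for example, because
// the module contains a `target update` naming the variable).
Linkage chooseDeviceLinkage(const GlobalInfo &G, bool HostNeedsAccess) {
  if (!G.IsFileStatic)
    return Linkage::External; // ordinary globals are unaffected
  if (G.IsDeclareTarget && !G.IsHidden && HostNeedsAccess)
    return Linkage::External; // externalize only when required
  return Linkage::Internal;   // otherwise keep the variable optimizable
}

// Small usage example for the cases raised in the review.
inline void linkageExamples() {
  GlobalInfo StaticDeclTarget{true, true, false};
  assert(chooseDeviceLinkage(StaticDeclTarget, true) == Linkage::External);
  assert(chooseDeviceLinkage(StaticDeclTarget, false) == Linkage::Internal);
}
```

Keeping the "is it needed" input separate from the declaration's own properties mirrors the reviewer's concern that unconditional externalization would block optimization of device runtime variables.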

clang/lib/Sema/SemaOpenMP.cpp


	(12,953 lines not shown)
	}			}

	template <typename... Params>			template <typename... Params>
	static bool hasClauses(ArrayRef<OMPClause *> Clauses, const OpenMPClauseKind K,			static bool hasClauses(ArrayRef<OMPClause *> Clauses, const OpenMPClauseKind K,
	const Params... ClauseTypes) {			const Params... ClauseTypes) {
	return hasClauses(Clauses, K) || hasClauses(Clauses, ClauseTypes...);			return hasClauses(Clauses, K) || hasClauses(Clauses, ClauseTypes...);
	}			}

	/// Check if the variables in the mapping clause are externally visible.			/// Check if the variables in the mapping clause have hidden visibility
				/// attribute
	static bool isClauseMappable(ArrayRef<OMPClause *> Clauses) {			static bool isClauseMappable(ArrayRef<OMPClause *> Clauses) {
	for (const OMPClause *C : Clauses) {			for (const OMPClause *C : Clauses) {
	if (auto *TC = dyn_cast<OMPToClause>(C))			if (auto *TC = dyn_cast<OMPToClause>(C))
	return llvm::all_of(TC->all_decls(), [](ValueDecl *VD) {			return llvm::all_of(TC->all_decls(), [](ValueDecl *VD) {
	return !VD || !VD->hasAttr<OMPDeclareTargetDeclAttr>() ||			return !VD || !VD->hasAttr<OMPDeclareTargetDeclAttr>() ||
	(VD->isExternallyVisible() &&			(VD->getVisibility() != HiddenVisibility);
	VD->getVisibility() != HiddenVisibility);
	});			});
	else if (auto *FC = dyn_cast<OMPFromClause>(C))			else if (auto *FC = dyn_cast<OMPFromClause>(C))
	return llvm::all_of(FC->all_decls(), [](ValueDecl *VD) {			return llvm::all_of(FC->all_decls(), [](ValueDecl *VD) {
	return !VD || !VD->hasAttr<OMPDeclareTargetDeclAttr>() ||			return !VD || !VD->hasAttr<OMPDeclareTargetDeclAttr>() ||
	(VD->isExternallyVisible() &&			(VD->getVisibility() != HiddenVisibility);
	VD->getVisibility() != HiddenVisibility);
	});			});
	}			}

	return true;			return true;
	}			}

	StmtResult Sema::ActOnOpenMPTargetDataDirective(ArrayRef<OMPClause *> Clauses,			StmtResult Sema::ActOnOpenMPTargetDataDirective(ArrayRef<OMPClause *> Clauses,
	Stmt *AStmt,			Stmt *AStmt,
	(10,499 lines not shown)
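
The change to `isClauseMappable` above drops the `isExternallyVisible()` requirement and keeps only the hidden-visibility check. As a rough illustration of what that does to a `to`/`from` clause, here is a small standalone model. `VarModel`, `updatableBefore`, `updatableAfter`, and `allUpdatable` are hypothetical names, and the three booleans are simplified stand-ins for the corresponding Clang queries.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified model of a variable named in a to()/from() clause.
struct VarModel {
  bool IsDeclareTarget;     // has the declare target attribute
  bool IsExternallyVisible; // false for a file-scope `static`
  bool IsHidden;            // visibility("hidden")
};

// Predicate as in the left (old) column of the diff.
bool updatableBefore(const VarModel &V) {
  return !V.IsDeclareTarget || (V.IsExternallyVisible && !V.IsHidden);
}

// Predicate as in the right (new) column of the diff.
bool updatableAfter(const VarModel &V) {
  return !V.IsDeclareTarget || !V.IsHidden;
}

// A clause is accepted only if every variable in it passes the predicate.
template <typename Pred>
bool allUpdatable(const std::vector<VarModel> &Vars, Pred P) {
  return std::all_of(Vars.begin(), Vars.end(), P);
}
```

Under this model a declare target `static int y` is rejected by the old predicate and accepted by the new one, while a hidden declare target variable is rejected by both, which matches the expectations in `target_update_messages.cpp`.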

clang/test/OpenMP/declare_target_codegen.cpp

	(37 lines not shown)
	// CHECK-DAG: @eee_decl_tgt_ref_ptr = weak global i32* null			// CHECK-DAG: @eee_decl_tgt_ref_ptr = weak global i32* null
	// CHECK-DAG: @{{.*}}maini1{{.*}}aaa = internal global i64 23,			// CHECK-DAG: @{{.*}}maini1{{.*}}aaa = internal global i64 23,
	// CHECK-DAG: @pair = {{.*}}addrspace(3) global %struct.PAIR undef			// CHECK-DAG: @pair = {{.*}}addrspace(3) global %struct.PAIR undef
	// CHECK-DAG: @_ZN2SS3SSSE ={{ protected | }}global i32 1,			// CHECK-DAG: @_ZN2SS3SSSE ={{ protected | }}global i32 1,
	// CHECK-DAG: @b ={{ protected | }}global i32 15,			// CHECK-DAG: @b ={{ protected | }}global i32 15,
	// CHECK-DAG: @d ={{ protected | }}global i32 0,			// CHECK-DAG: @d ={{ protected | }}global i32 0,
	// CHECK-DAG: @c = external global i32,			// CHECK-DAG: @c = external global i32,
	// CHECK-DAG: @globals ={{ protected | }}global %struct.S zeroinitializer,			// CHECK-DAG: @globals ={{ protected | }}global %struct.S zeroinitializer,
	// CHECK-DAG: [[STAT:@.+stat]] = internal global %struct.S zeroinitializer,			// CHECK-DAG: [[STAT:@stat__static__.+]] = global %struct.S zeroinitializer,
	// CHECK-DAG: [[STAT_REF:@.+]] = internal constant %struct.S* [[STAT]]			// CHECK-DAG: [[STAT_REF:@.+]] = internal constant %struct.S* [[STAT]]
	// CHECK-DAG: @out_decl_target ={{ protected | }}global i32 0,			// CHECK-DAG: @out_decl_target ={{ protected | }}global i32 0,
	// CHECK-DAG: @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (%struct.S** [[STAT_REF]] to i8*)],			// CHECK-DAG: @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (%struct.S** [[STAT_REF]] to i8*)],

	// CHECK-DAG: define {{.*}}i32 @{{.*}}{{foo|bar|baz2|baz3|FA|f_method}}{{.*}}()			// CHECK-DAG: define {{.*}}i32 @{{.*}}{{foo|bar|baz2|baz3|FA|f_method}}{{.*}}()
	// CHECK-DAG: define {{.*}}void @{{.*}}TemplateClass{{.*}}(%class.TemplateClass* {{[^,]*}} %{{.*}})			// CHECK-DAG: define {{.*}}void @{{.*}}TemplateClass{{.*}}(%class.TemplateClass* {{[^,]*}} %{{.*}})
	// CHECK-DAG: define {{.*}}i32 @{{.*}}TemplateClass{{.*}}f_method{{.*}}(%class.TemplateClass* {{[^,]*}} %{{.*}})			// CHECK-DAG: define {{.*}}i32 @{{.*}}TemplateClass{{.*}}f_method{{.*}}(%class.TemplateClass* {{[^,]*}} %{{.*}})
	// CHECK-DAG: define {{.*}}void @__omp_offloading__{{.*}}_globals_l[[@LINE+78]]_ctor()			// CHECK-DAG: define {{.*}}void @__omp_offloading__{{.*}}_globals_l[[@LINE+78]]_ctor()
	(187 lines not shown)
	};			};

	// CHECK-DAG: define {{.*}}void @__omp_offloading_{{.*}}emitted{{.*}}_l[[@LINE-5]]()			// CHECK-DAG: define {{.*}}void @__omp_offloading_{{.*}}emitted{{.*}}_l[[@LINE-5]]()

	// CHECK-DAG: declare extern_weak noundef signext i32 @__create()			// CHECK-DAG: declare extern_weak noundef signext i32 @__create()

	// CHECK-NOT: define {{.*}}{{baz1|baz4|maini1|Base|virtual_}}			// CHECK-NOT: define {{.*}}{{baz1|baz4|maini1|Base|virtual_}}

	// CHECK-DAG: !{i32 1, !"aaa", i32 0, i32 {{[0-9]+}}}			// CHECK-DAG: !{i32 1, !"aaa", i32 0, i32 {{[0-9]+}}, !"aaa"}
	// CHECK-DAG: !{i32 1, !"ccc", i32 0, i32 {{[0-9]+}}}			// CHECK-DAG: !{i32 1, !"ccc", i32 0, i32 {{[0-9]+}}, !"ccc"}
	// CHECK-DAG: !{{{.+}}virtual_foo			// CHECK-DAG: !{{{.+}}virtual_foo

	#ifdef OMP5			#ifdef OMP5
	void host_fun() {}			void host_fun() {}
	#pragma omp declare target to(host_fun) device_type(host)			#pragma omp declare target to(host_fun) device_type(host)
	void device_fun() {}			void device_fun() {}
	#pragma omp declare target to(device_fun) device_type(nohost)			#pragma omp declare target to(device_fun) device_type(nohost)
	// HOST5-NOT: define {{.*}}void {{.*}}device_fun{{.*}}			// HOST5-NOT: define {{.*}}void {{.*}}device_fun{{.*}}
	(35 lines not shown)

clang/test/OpenMP/declare_target_link_codegen.cpp

	(79 lines not shown)
	// HOST: call i32 @__tgt_target_kernel(%struct.ident_t* @{{.+}}, i64 -1, i32 -1, i32 0, i8* @.{{.+}}.region_id, %struct.__tgt_kernel_arguments* %{{.+}})			// HOST: call i32 @__tgt_target_kernel(%struct.ident_t* @{{.+}}, i64 -1, i32 -1, i32 0, i8* @.{{.+}}.region_id, %struct.__tgt_kernel_arguments* %{{.+}})
	// HOST: call void @__omp_offloading_{{.*}}_{{.*}}_{{.*}}maini1{{.*}}_l42(i32* %{{[^,]+}})			// HOST: call void @__omp_offloading_{{.*}}_{{.*}}_{{.*}}maini1{{.*}}_l42(i32* %{{[^,]+}})
	// HOST: call i32 @__tgt_target_kernel(%struct.ident_t* @{{.+}}, i64 -1, i32 0, i32 0, i8* @.{{.+}}.region_id, %struct.__tgt_kernel_arguments* %{{.+}})			// HOST: call i32 @__tgt_target_kernel(%struct.ident_t* @{{.+}}, i64 -1, i32 0, i32 0, i8* @.{{.+}}.region_id, %struct.__tgt_kernel_arguments* %{{.+}})

	// HOST: define internal void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l42(i32* noundef nonnull align {{[0-9]+}} dereferenceable{{.*}})			// HOST: define internal void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l42(i32* noundef nonnull align {{[0-9]+}} dereferenceable{{.*}})
	// HOST: [[C:%.*]] = load i32, i32* @c,			// HOST: [[C:%.*]] = load i32, i32* @c,
	// HOST: store i32 [[C]], i32* %			// HOST: store i32 [[C]], i32* %

	// CHECK: !{i32 1, !"c_decl_tgt_ref_ptr", i32 1, i32 {{[0-9]+}}}			// CHECK: !{i32 1, !"c_decl_tgt_ref_ptr", i32 1, i32 {{[0-9]+}}, !"c"}
	#endif // HEADER			#endif // HEADER

clang/test/OpenMP/declare_target_only_one_side_compilation.cpp

	(52 lines not shown)
	static int GY;			static int GY;
	#pragma omp end declare variant			#pragma omp end declare variant
	#pragma omp end declare target			#pragma omp end declare target
	#endif			#endif

	// TODO: It is odd, probably wrong, that we don't mangle all variables.			// TODO: It is odd, probably wrong, that we don't mangle all variables.

	// DEVICE-DAG: @G1 = {{.*}}global i32 0, align 4			// DEVICE-DAG: @G1 = {{.*}}global i32 0, align 4
	// DEVICE-DAG: @_ZL2G2 = internal {{.*}}global i32 0, align 4			// DEVICE-DAG: @_ZL2G2 = {{.*}}global i32 0, align 4
				ssquare08 (author): I wasn't expecting this to change. For some reason G2 gets the `OMPDeclareTargetDeclAttr::DT_Any` attribute instead of `OMPDeclareTargetDeclAttr::DT_NoHost`, and because of that the visibility changes. @jdoerfert, is the `OMPDeclareTargetDeclAttr::DT_Any` attribute expected here?
	// DEVICE-DAG: @G3 = {{.*}}global i32 0, align 4			// DEVICE-DAG: @G3 = {{.*}}global i32 0, align 4
	// DEVICE-DAG: @_ZL2G4 = internal {{.*}}global i32 0, align 4			// DEVICE-DAG: @_ZL2G4 = internal {{.*}}global i32 0, align 4
	// DEVICE-DAG: @G5 = {{.*}}global i32 0, align 4			// DEVICE-DAG: @G5 = {{.*}}global i32 0, align 4
	// DEVICE-DAG: @_ZL2G6 = internal {{.*}}global i32 0, align 4			// DEVICE-DAG: @_ZL2G6 = internal {{.*}}global i32 0, align 4
	// DEVICE-NOT: ref			// DEVICE-NOT: ref
	// DEVICE-NOT: llvm.used			// DEVICE-NOT: llvm.used
	// DEVICE-NOT: omp_offload			// DEVICE-NOT: omp_offload

	// HOST-DAG: @G7 = global i32 0, align 4			// HOST-DAG: @G7 = global i32 0, align 4
	// HOST-DAG: @_ZL2G8 = internal global i32 0, align 4			// HOST-DAG: @_ZL2G8 = internal global i32 0, align 4
	// HOST-DAG: @G9 = global i32 0, align 4			// HOST-DAG: @G9 = global i32 0, align 4
	// HOST-DAG: @_ZL3G10 = internal global i32 0, align 4			// HOST-DAG: @_ZL3G10 = internal global i32 0, align 4
	// HOST-DAG: @G11 = global i32 0, align 4			// HOST-DAG: @G11 = global i32 0, align 4
	// HOST-DAG: @_ZL3G12 = internal global i32 0, align 4			// HOST-DAG: @_ZL3G12 = internal global i32 0, align 4

clang/test/OpenMP/declare_target_visibility_codegen.cpp

	// RUN: %clang_cc1 -verify -fopenmp -fopenmp-version=45 -fopenmp-targets=powerpc64le-ibm-linux-gnu -x c++ -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck %s --check-prefix=HOST			// RUN: %clang_cc1 -verify -fopenmp -fopenmp-version=45 -fopenmp-targets=powerpc64le-ibm-linux-gnu -x c++ -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck %s --check-prefix=HOST

	// expected-no-diagnostics			// expected-no-diagnostics
	class C {			class C {
	public:			public:
	//.			//.
	// HOST: @[[C:.+]] = internal global %class.C zeroinitializer, align 4			// HOST: @[[C:.+]] = internal global %class.C zeroinitializer, align 4
	// HOST: @[[X:.+]] = internal global i32 0, align 4			// HOST: @[[X:.+]] = internal global i32 0, align 4
	// HOST: @y = hidden global i32 0			// HOST: @y = hidden global i32 0
	// HOST: @z = global i32 0			// HOST: @z = global i32 0
	// HOST-NOT: @.omp_offloading.entry.c			// HOST: @.omp_offloading.entry.c__static__{{[0-9a-z]+_[0-9a-z]+_l[0-9]+}}
	// HOST-NOT: @.omp_offloading.entry.x			// HOST: @.omp_offloading.entry.x__static__{{[0-9a-z]+_[0-9a-z]+_l[0-9]+}}
	// HOST-NOT: @.omp_offloading.entry.y			// HOST-NOT: @.omp_offloading.entry.y
				jhuber6: If there are no updates between the host and device we can keep these static without emitting an offloading entry.
				ssquare08 (author): That's a good point. I'll fix that.
				ssquare08 (author): I thought about this more, and I think the behavior for these declare target static globals should be the same as for other declare target variables. Checking for updates is not enough, because users could also map these variables. An update could also go through a pointer, or users could pass the address of these variables to an external function. Please let me know what you think of the cases below:
					#pragma omp declare target
					static int x[10];
					#pragma omp end declare target

					// case 1
					#pragma omp target update to(x)

					// case 2
					int* y = &x[2];
					#pragma omp target update to(y[0])

					// case 3
					#pragma omp target map(always to:x)
					{ x[0] = 111; }

					// case 4
					#pragma omp target
					{ foo(&x[3]); }
				jhuber6: We should still be able to do this if there are either no updates at all in the module, or if the declare type is `nohost`. Doing anything more complicated would require some optimizations between the host and device that we can't do yet. I'm making this point because making these statics external is a performance regression, so we should only do it when needed. To that end we may even want a flag that entirely disables this feature.
				ssquare08 (author): I'll add a check to see if there are any updates in the module.
	// HOST: @.omp_offloading.entry.z			// HOST: @.omp_offloading.entry.z
	C() : x(0) {}			C() : x(0) {}

	int x;			int x;
	};			};

	static C c;			static C c;
	#pragma omp declare target(c)			#pragma omp declare target(c)
	(9 lines not shown)
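
The new HOST checks above expect offload entry names of the form `<name>__static__` followed by two lowercase alphanumeric fields and a `_l<line>` suffix. The sketch below shows one way such a name could be assembled so that same-named statics in different files do not collide. It is an illustration under assumptions, not the patch's code: the helper `mangleStaticName` is hypothetical, and treating the two fields as a device ID and a file ID printed in hexadecimal is a guess based on how `__omp_offloading_` kernel names are usually built.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<name>__static__<dev>_<file>_l<line>".
// DeviceID and FileID are assumed to identify the source file uniquely;
// they are printed in lowercase hex, the line number in decimal.
std::string mangleStaticName(const std::string &Name, std::uint64_t DeviceID,
                             std::uint64_t FileID, unsigned Line) {
  std::ostringstream OS;
  OS << Name << "__static__" << std::hex << DeviceID << '_' << FileID << "_l"
     << std::dec << Line;
  return OS.str();
}

// Usage example: two files that both declare `static int x` get distinct names.
inline void mangleExample() {
  std::string A = mangleStaticName("x", 0x2b, 0xff, 12);
  std::string B = mangleStaticName("x", 0x2b, 0x100, 12);
  assert(A != B);
}
```

Because the suffix depends on the file rather than on the variable name, the host and device compilations of the same file can agree on the mangled name as long as they are given the same identifiers.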

clang/test/OpenMP/nvptx_allocate_codegen.cpp

	(83 lines not shown)
	#pragma omp end declare target			#pragma omp end declare target
	#endif			#endif
	// CHECK1-LABEL: define {{[^@]+}}@main			// CHECK1-LABEL: define {{[^@]+}}@main
	// CHECK1-SAME: () #[[ATTR0:[0-9]+]] {			// CHECK1-SAME: () #[[ATTR0:[0-9]+]] {
	// CHECK1-NEXT: entry:			// CHECK1-NEXT: entry:
	// CHECK1-NEXT: [[RETVAL:%.*]] = alloca i32, align 4			// CHECK1-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
	// CHECK1-NEXT: [[B:%.*]] = alloca double, align 8			// CHECK1-NEXT: [[B:%.*]] = alloca double, align 8
	// CHECK1-NEXT: store i32 0, i32* [[RETVAL]], align 4			// CHECK1-NEXT: store i32 0, i32* [[RETVAL]], align 4
	// CHECK1-NEXT: store i32 2, i32* @_ZZ4mainE1a, align 4			// CHECK1-NEXT: store i32 2, i32* @_ZN2ns1aE1, align 4
	// CHECK1-NEXT: store double 3.000000e+00, double* [[B]], align 8			// CHECK1-NEXT: store double 3.000000e+00, double* [[B]], align 8
	// CHECK1-NEXT: [[CALL:%.*]] = call noundef i32 @_Z3fooIiET_v() #[[ATTR7:[0-9]+]]			// CHECK1-NEXT: [[CALL:%.*]] = call noundef i32 @_Z3fooIiET_v() #[[ATTR7:[0-9]+]]
	// CHECK1-NEXT: ret i32 [[CALL]]			// CHECK1-NEXT: ret i32 [[CALL]]
	//			//
	//			//
	// CHECK1-LABEL: define {{[^@]+}}@_Z3fooIiET_v			// CHECK1-LABEL: define {{[^@]+}}@_Z3fooIiET_v
	// CHECK1-SAME: () #[[ATTR1:[0-9]+]] comdat {			// CHECK1-SAME: () #[[ATTR1:[0-9]+]] comdat {
	// CHECK1-NEXT: entry:			// CHECK1-NEXT: entry:
	(47 lines not shown)

clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp

	(9 lines not shown)
	// RUN: %clang_cc1 -no-opaque-pointers -fopenmp-simd -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-ppc-host.bc -emit-pch -o %t			// RUN: %clang_cc1 -no-opaque-pointers -fopenmp-simd -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-ppc-host.bc -emit-pch -o %t
	// RUN: %clang_cc1 -no-opaque-pointers -fopenmp-simd -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --check-prefix SIMD-ONLY			// RUN: %clang_cc1 -no-opaque-pointers -fopenmp-simd -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --check-prefix SIMD-ONLY

	#ifndef HEADER			#ifndef HEADER
	#define HEADER			#define HEADER

	// SIMD-ONLY-NOT: {{__kmpc|__tgt}}			// SIMD-ONLY-NOT: {{__kmpc|__tgt}}

	// DEVICE-DAG: [[C_ADDR:.+]] = internal global i32 0,			// DEVICE-DAG: [[C_ADDR:.+]] = global i32 0,
	// DEVICE-DAG: [[CD_ADDR:@.+]] ={{ protected | }}global %struct.S zeroinitializer,			// DEVICE-DAG: [[CD_ADDR:@.+]] ={{ protected | }}global %struct.S zeroinitializer,
	// HOST-DAG: @[[C_ADDR:.+]] = internal global i32 0,			// HOST-DAG: @[[C_ADDR:.+]] = internal global i32 0,
	// HOST-DAG: @[[CD_ADDR:.+]] ={{( protected | dso_local)?}} global %struct.S zeroinitializer,			// HOST-DAG: @[[CD_ADDR:.+]] ={{( protected | dso_local)?}} global %struct.S zeroinitializer,

	#pragma omp declare target			#pragma omp declare target
	int foo() { return 0; }			int foo() { return 0; }
	#pragma omp end declare target			#pragma omp end declare target
	int bar() { return 0; }			int bar() { return 0; }
	(40 lines not shown)
	// DEVICE-DAG: call noundef i32 [[CAZ]]()			// DEVICE-DAG: call noundef i32 [[CAZ]]()
	// DEVICE-DAG: ret void			// DEVICE-DAG: ret void

	// HOST-DAG: @[[CD_DTOR:__omp_offloading__.+_cd_l61_dtor]] = private constant i8 0			// HOST-DAG: @[[CD_DTOR:__omp_offloading__.+_cd_l61_dtor]] = private constant i8 0
	// DEVICE-DAG: define weak_odr void [[CD_DTOR:@__omp_offloading__.+_cd_l61_dtor]]()			// DEVICE-DAG: define weak_odr void [[CD_DTOR:@__omp_offloading__.+_cd_l61_dtor]]()
	// DEVICE-DAG: call void			// DEVICE-DAG: call void
	// DEVICE-DAG: ret void			// DEVICE-DAG: ret void

				// HOST-DAG: @.omp_offloading.entry_name = internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[C_ENTRY_NAME:c__static__.+]]\00"
				// HOST-DAG: @.omp_offloading.entry.[[C_ENTRY_NAME]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* bitcast (i32* @[[C_ADDR]] to i8*), i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name, i32 0, i32 0), i64 4, i32 0, i32 0 }, section "omp_offloading_entries", align 1
	// HOST-DAG: @.omp_offloading.entry_name{{.*}} = internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_ADDR]]\00"			// HOST-DAG: @.omp_offloading.entry_name{{.*}} = internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_ADDR]]\00"
	// HOST-DAG: @.omp_offloading.entry.[[CD_ADDR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* bitcast (%struct.S* @[[CD_ADDR]] to i8*), i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 4, i32 0, i32 0 }, section "omp_offloading_entries", align 1			// HOST-DAG: @.omp_offloading.entry.[[CD_ADDR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* bitcast (%struct.S* @[[CD_ADDR]] to i8*), i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 4, i32 0, i32 0 }, section "omp_offloading_entries", align 1
	// HOST-DAG: @.omp_offloading.entry_name{{.*}} = internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[C_CTOR]]\00"			// HOST-DAG: @.omp_offloading.entry_name{{.*}} = internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[C_CTOR]]\00"
	// HOST-DAG: @.omp_offloading.entry.[[C_CTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[C_CTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 2, i32 0 }, section "omp_offloading_entries", align 1			// HOST-DAG: @.omp_offloading.entry.[[C_CTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[C_CTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 2, i32 0 }, section "omp_offloading_entries", align 1
	// HOST-DAG: @.omp_offloading.entry_name{{.*}}= internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_CTOR]]\00"			// HOST-DAG: @.omp_offloading.entry_name{{.*}}= internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_CTOR]]\00"
	// HOST-DAG: @.omp_offloading.entry.[[CD_CTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[CD_CTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 2, i32 0 }, section "omp_offloading_entries", align 1			// HOST-DAG: @.omp_offloading.entry.[[CD_CTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[CD_CTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 2, i32 0 }, section "omp_offloading_entries", align 1
	// HOST-DAG: @.omp_offloading.entry_name{{.*}}= internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_DTOR]]\00"			// HOST-DAG: @.omp_offloading.entry_name{{.*}}= internal unnamed_addr constant [{{[0-9]+}} x i8] c"[[CD_DTOR]]\00"
	// HOST-DAG: @.omp_offloading.entry.[[CD_DTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[CD_DTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 4, i32 0 }, section "omp_offloading_entries", align 1			// HOST-DAG: @.omp_offloading.entry.[[CD_DTOR]] = weak{{.*}} constant %struct.__tgt_offload_entry { i8* @[[CD_DTOR]], i8* getelementptr inbounds ([{{[0-9]+}} x i8], [{{[0-9]+}} x i8]* @.omp_offloading.entry_name{{.*}}, i32 0, i32 0), i64 0, i32 4, i32 0 }, section "omp_offloading_entries", align 1
	(9 lines not shown)
	// DEVICE-DAG: define weak{{.*}} void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l[[@LINE-7]](i32* noundef nonnull align {{[0-9]+}} dereferenceable{{[^,]*}}			// DEVICE-DAG: define weak{{.*}} void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l[[@LINE-7]](i32* noundef nonnull align {{[0-9]+}} dereferenceable{{[^,]*}}
	// DEVICE-DAG: [[C:%.+]] = load i32, i32* [[C_ADDR]],			// DEVICE-DAG: [[C:%.+]] = load i32, i32* [[C_ADDR]],
	// DEVICE-DAG: store i32 [[C]], i32* %			// DEVICE-DAG: store i32 [[C]], i32* %

	// HOST: define internal void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l[[@LINE-11]](i32* noundef nonnull align {{[0-9]+}} dereferenceable{{.*}})			// HOST: define internal void @__omp_offloading_{{.*}}_{{.*}}maini1{{.*}}_l[[@LINE-11]](i32* noundef nonnull align {{[0-9]+}} dereferenceable{{.*}})
	// HOST: [[C:%.*]] = load i32, i32* @[[C_ADDR]],			// HOST: [[C:%.*]] = load i32, i32* @[[C_ADDR]],
	// HOST: store i32 [[C]], i32* %			// HOST: store i32 [[C]], i32* %

	// HOST-DAG: !{i32 1, !"[[CD_ADDR]]", i32 0, i32 {{[0-9]+}}}			// HOST-DAG: !{i32 1, !"[[CD_ADDR]]", i32 0, i32 {{[0-9]+}}, !"cd"}
	// HOST-DAG: !{i32 1, !"[[C_ADDR]]", i32 0, i32 {{[0-9]+}}}			// HOST-DAG: !{i32 1, !"[[C_ENTRY_NAME]]", i32 0, i32 {{[0-9]+}}, !"c"}

	// DEVICE: !nvvm.annotations			// DEVICE: !nvvm.annotations
	// DEVICE-DAG: !{void ()* [[C_CTOR]], !"kernel", i32 1}			// DEVICE-DAG: !{void ()* [[C_CTOR]], !"kernel", i32 1}
	// DEVICE-DAG: !{void ()* [[CD_CTOR]], !"kernel", i32 1}			// DEVICE-DAG: !{void ()* [[CD_CTOR]], !"kernel", i32 1}
	// DEVICE-DAG: !{void ()* [[CD_DTOR]], !"kernel", i32 1}			// DEVICE-DAG: !{void ()* [[CD_DTOR]], !"kernel", i32 1}

	#endif // HEADER			#endif // HEADER
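
The updated HOST-DAG metadata checks in this test carry one extra trailing operand: next to the entry name (which for the static `c` is now the mangled `C_ENTRY_NAME`) the node also records the original name (`!"c"`). A rough sketch of how a consumer could use that pairing to find the mangled symbol for a given source-level name follows. The struct layout, the field names, and `deviceSymbolFor` are all hypothetical, and the example entry name is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of one variable entry in the offload metadata.
struct OffloadVarEntry {
  std::string EntryName;    // name used in the offload entry table
  unsigned Flags;           // entry flags (0 in the checks above)
  unsigned Order;           // emission order
  std::string OriginalName; // new trailing operand: the unmangled name
};

// Returns the entry name recorded by the host for a source-level variable,
// or the original name unchanged if the host recorded nothing for it.
std::string deviceSymbolFor(const std::vector<OffloadVarEntry> &HostEntries,
                            const std::string &Original) {
  for (const OffloadVarEntry &E : HostEntries)
    if (E.OriginalName == Original)
      return E.EntryName;
  return Original;
}

// Usage example with a made-up mangled name.
inline void lookupExample() {
  std::vector<OffloadVarEntry> Entries{
      {"c__static__2b_ff_l12", 0, 0, "c"}, {"cd", 0, 1, "cd"}};
  assert(deviceSymbolFor(Entries, "c") == "c__static__2b_ff_l12");
}
```

A linear scan is enough for a sketch; the point is only that the pair (entry name, original name) lets the device compilation reuse the name the host chose.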

clang/test/OpenMP/target_update_messages.cpp

	// RUN: %clang_cc1 -verify=expected,lt50,lt51 -fopenmp -fopenmp-version=45 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,lt50,lt51 -fopenmp -fopenmp-version=45 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized
	// RUN: %clang_cc1 -verify=expected,ge50,lt51 -fopenmp -fopenmp-version=50 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,ge50,lt51 -fopenmp -fopenmp-version=50 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized
	// RUN: %clang_cc1 -verify=expected,ge50,ge51 -fopenmp -fopenmp-version=51 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,ge50,ge51 -fopenmp -fopenmp-version=51 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized

	// RUN: %clang_cc1 -verify=expected,lt50,lt51 -fopenmp-simd -fopenmp-version=45 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,lt50,lt51 -fopenmp-simd -fopenmp-version=45 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized
	// RUN: %clang_cc1 -verify=expected,ge50,lt51 -fopenmp-simd -fopenmp-version=50 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,ge50,lt51 -fopenmp-simd -fopenmp-version=50 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized
	// RUN: %clang_cc1 -verify=expected,ge50,ge51 -fopenmp-simd -fopenmp-version=51 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,ge50,ge51 -fopenmp-simd -fopenmp-version=51 -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized

	// RUN: %clang_cc1 -verify=expected,ge50,ge51,cxx2b -fopenmp -fopenmp-simd -fopenmp-version=51 -x c++ -std=c++2b %s -Wuninitialized			// RUN: %clang_cc1 -verify=expected,ge50,ge51,cxx2b -fopenmp -fopenmp-simd -fopenmp-version=51 -x c++ -std=c++2b %s -Wuninitialized

	void xxx(int argc) {			void xxx(int argc) {
	int x; // expected-note {{initialize the variable 'x' to silence this warning}}			int x; // expected-note {{initialize the variable 'x' to silence this warning}}
	#pragma omp target update to(x)			#pragma omp target update to(x)
	argc = x; // expected-warning {{variable 'x' is uninitialized when used here}}			argc = x; // expected-warning {{variable 'x' is uninitialized when used here}}
	}			}

	static int y;
	#pragma omp declare target(y)

	void yyy() {
	#pragma omp target update to(y) // expected-error {{the host cannot update a declare target variable that is not externally visible.}}
	}

	int __attribute__((visibility("hidden"))) z;			int __attribute__((visibility("hidden"))) z;
				jdoerfert: There is no test to show you can actually write the update now, is there?
				jhuber6: We should probably take the deleted code above and put it in an OpenMP runtime test to make sure it actually works now.
				ssquare08 (author): I have now added a test as suggested.
	#pragma omp declare target(z)			#pragma omp declare target(z)

	void zzz() {			void zzz() {
	#pragma omp target update from(z) // expected-error {{the host cannot update a declare target variable that is not externally visible.}}			#pragma omp target update from(z) // expected-error {{the host cannot update a declare target variable that is not externally visible.}}
	}			}

	void foo() {			void foo() {
	}			}
	(235 lines not shown)

openmp/libomptarget/test/mapping/declare_target_static_var.c

This file was added.

				// RUN: %libomptarget-compile-run-and-check-generic

				#include <stdio.h>

				#pragma omp declare target
				static int y;
				#pragma omp end declare target

				int main(void) {
				y = 2;
				#pragma omp target update to(y)

				#pragma omp target
				{ y += 3; }

				#pragma omp target update from(y)

				// CHECK: Declare target var update successful
				printf("Declare target var update %s\n", (y == 5) ? "successful" : "failed");
				return 0;
				}

Revision Contents

Diff 447901
