This is an archive of the discontinued LLVM Phabricator instance.

Differential D125350

[RFC][Clang] Add a check if the target supports atomicrmw instruction with specific operator and type
Abandoned · Public

Authored by tianshilei1992 on May 10 2022, 8:22 PM.

Details

Reviewers

jdoerfert
ABataev

Summary

Targets differ in their backend support for the atomicrmw instruction. The
front end can always emit the instruction, but the compiler can then crash
during instruction selection in the backend if the target does not support the
requested operator or type. The front end currently has no way to tell whether
an atomicrmw with a specific operator and type is supported. This patch adds a
virtual function hasAtomicrmw to the class TargetInfo, which every target is
expected to implement. With it, the front end can prefer atomicrmw whenever it
is supported, since it performs better than falling back to a CAS loop. At this
PoC stage it is implemented only for NVPTX; if the community is happy with the
change, I'll add it to all the targets.

In the long term, I'll look for a way to expand the instruction when the target
doesn't support it, probably similar to what
llvm/lib/CodeGen/AtomicExpandPass.cpp does.
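
As an illustration of the idea in the summary, here is a minimal, LLVM-free sketch of a capability hook plus the front-end decision it enables. It is not the patch itself: `BinOp`, `ScalarType`, `TargetCaps`, `NVPTXLikeCaps`, `Lowering` and `chooseLowering` are hypothetical stand-ins, the NVPTX-like rules are a reduced copy of the ones in this revision, and the base-class default here conservatively returns false, whereas the patch uses `llvm_unreachable`.

```cpp
// Hypothetical stand-ins for llvm::AtomicRMWInst::BinOp and llvm::Type.
enum class BinOp { Add, FAdd, FSub, Nand };
enum class ScalarType { I32, I64, F32, F64 };

// Hypothetical stand-in for clang::TargetInfo carrying the proposed hook.
struct TargetCaps {
  virtual ~TargetCaps() = default;
  // Assumption of this sketch: an unknown target reports "unsupported",
  // which steers callers toward the CAS-loop fallback.
  virtual bool hasAtomicrmw(BinOp, ScalarType) const { return false; }
};

// Reduced version of the NVPTX rules; SmVersion is e.g. 35 or 60.
struct NVPTXLikeCaps : TargetCaps {
  explicit NVPTXLikeCaps(int SmVersion) : SmVersion(SmVersion) {}

  bool hasAtomicrmw(BinOp Op, ScalarType Ty) const override {
    switch (Op) {
    case BinOp::FSub:
    case BinOp::Nand:
      return false;
    case BinOp::FAdd:
      // float is always accepted; double only from sm_60 onwards.
      return Ty == ScalarType::F32 ||
             (Ty == ScalarType::F64 && SmVersion >= 60);
    case BinOp::Add:
      return Ty == ScalarType::I32 || Ty == ScalarType::I64;
    }
    return false;
  }

  int SmVersion;
};

enum class Lowering { AtomicRMW, CasLoop };

// Front-end decision: prefer atomicrmw, otherwise fall back to a CAS loop.
Lowering chooseLowering(const TargetCaps &T, BinOp Op, ScalarType Ty) {
  return T.hasAtomicrmw(Op, Ty) ? Lowering::AtomicRMW : Lowering::CasLoop;
}
```

With these rules, `chooseLowering` picks the CAS loop for a double FAdd on an sm_35-like target and atomicrmw on an sm_60-like one.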

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tianshilei1992 created this revision. May 10 2022, 8:22 PM

Herald added a project: Restricted Project. May 10 2022, 8:22 PM

Herald added subscribers: mattd, gchakrabarti, asavonic, jholewinski.

tianshilei1992 requested review of this revision. May 10 2022, 8:22 PM

Herald added a project: Restricted Project. May 10 2022, 8:22 PM

Herald added a subscriber: cfe-commits.

Feel free to add more reviewers

tianshilei1992 added inline comments. May 10 2022, 8:24 PM

clang/lib/Basic/Targets/NVPTX.cpp
Line 311: `llvm_unreachable` might be more appropriate here.

tianshilei1992 retitled this revision from "[PoC][Clang] Add a check if the target supports atomicrmw instruction with specific operator and type" to "[RFC][Clang] Add a check if the target supports atomicrmw instruction with specific operator and type". May 10 2022, 8:25 PM

As for the test, here is a simple code snippet:

#pragma omp begin declare target device_type(nohost)
void foo() {
  double x, y;
#pragma omp atomic update
  x += y;
}
#pragma omp end declare target

sm_35 doesn't support FAdd on double, so the front end emits a CAS loop:

define protected void @foo() #0 {
entry:
  %x = alloca double, align 8
  %y = alloca double, align 8
  %atomic-temp = alloca double, align 8
  %0 = load double, ptr %y, align 8
  %atomic-load = load atomic i64, ptr %x monotonic, align 8
  br label %atomic_cont

atomic_cont:                                      ; preds = %atomic_cont, %entry
  %1 = phi i64 [ %atomic-load, %entry ], [ %5, %atomic_cont ]
  %2 = bitcast i64 %1 to double
  %add = fadd double %2, %0
  store double %add, ptr %atomic-temp, align 8
  %3 = load i64, ptr %atomic-temp, align 8
  %4 = cmpxchg ptr %x, i64 %1, i64 %3 monotonic monotonic, align 8
  %5 = extractvalue { i64, i1 } %4, 0
  %6 = extractvalue { i64, i1 } %4, 1
  br i1 %6, label %atomic_exit, label %atomic_cont

atomic_exit:                                      ; preds = %atomic_cont
  ret void
}
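
For readers less familiar with LLVM IR, the loop above corresponds roughly to the following C++ sketch. It is not part of the patch: the function name `atomicAddDoubleCAS` is hypothetical, the double is stored as its 64-bit pattern to mirror the i64 cmpxchg, and `std::memory_order_relaxed` is used as the C++ counterpart of LLVM's monotonic ordering.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Atomically performs X += Y on a double held as raw bits in X.
// Returns the value that was stored before the update.
double atomicAddDoubleCAS(std::atomic<std::uint64_t> &X, double Y) {
  // Corresponds to %atomic-load.
  std::uint64_t OldBits = X.load(std::memory_order_relaxed);
  for (;;) {
    double Old;
    std::memcpy(&Old, &OldBits, sizeof(Old)); // bitcast i64 -> double
    double New = Old + Y;                     // fadd
    std::uint64_t NewBits;
    std::memcpy(&NewBits, &New, sizeof(NewBits)); // round trip via the temp
    // cmpxchg: on failure OldBits is refreshed with the current contents,
    // which plays the role of the phi node in atomic_cont.
    if (X.compare_exchange_strong(OldBits, NewBits,
                                  std::memory_order_relaxed,
                                  std::memory_order_relaxed))
      return Old;
  }
}
```

Every iteration recomputes the sum from the freshly observed value, so a retry caused by a concurrent writer never applies the update to stale data.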

FAdd on double is supported starting from sm_60, so there we get a single atomicrmw:

define protected void @foo() #0 {
entry:
  %x = alloca double, align 8
  %y = alloca double, align 8
  %0 = load double, ptr %y, align 8
  %1 = atomicrmw fadd ptr %x, double %0 monotonic, align 8
  ret void
}

tianshilei1992 edited the summary of this revision. May 10 2022, 8:38 PM

Harbormaster completed remote builds in B163829: Diff 428557. May 10 2022, 8:50 PM

Do it in the backend (BE).

Revision Contents

Path                                      Size
clang/include/clang/Basic/TargetInfo.h    7 lines
clang/lib/Basic/Targets/NVPTX.h           3 lines
clang/lib/Basic/Targets/NVPTX.cpp         40 lines
clang/lib/CodeGen/CGStmtOpenMP.cpp        5 lines

Diff 428557

clang/include/clang/Basic/TargetInfo.h

[25 lines not shown]
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Triple.h"
 #include "llvm/Frontend/OpenMP/OMPGridValues.h"
+#include "llvm/IR/Instructions.h"
 #include "llvm/Support/DataTypes.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/VersionTuple.h"
 #include <cassert>
 #include <string>
 #include <vector>

 namespace llvm {
[709 lines not shown; enclosing context: public:]
   virtual bool hasBuiltinAtomic(uint64_t AtomicSizeInBits,
                                 uint64_t AlignmentInBits) const {
     return AtomicSizeInBits <= AlignmentInBits &&
            AtomicSizeInBits <= getMaxAtomicInlineWidth() &&
            (AtomicSizeInBits <= getCharWidth() ||
             llvm::isPowerOf2_64(AtomicSizeInBits / getCharWidth()));
   }

+  /// Returns true if the given target supports atomicrmw with given operator
+  /// and type.
+  virtual bool hasAtomicrmw(llvm::AtomicRMWInst::BinOp, llvm::Type *) const {
+    llvm_unreachable("hasAtomicrmw not implemented on this target");
+  }
+
   /// Return the maximum vector alignment supported for the given target.
   unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
   /// Return default simd alignment for the given target. Generally, this
   /// value is type-specific, but this alignment can be used for most of the
   /// types for the given target.
   unsigned getSimdDefaultAlign() const { return SimdDefaultAlign; }

   unsigned getMaxOpenCLWorkGroupSize() const { return MaxOpenCLWorkGroupSize; }
[899 lines not shown]

clang/lib/Basic/Targets/NVPTX.h

[11 lines not shown]

 #ifndef LLVM_CLANG_LIB_BASIC_TARGETS_NVPTX_H
 #define LLVM_CLANG_LIB_BASIC_TARGETS_NVPTX_H

 #include "clang/Basic/Cuda.h"
 #include "clang/Basic/TargetInfo.h"
 #include "clang/Basic/TargetOptions.h"
 #include "llvm/ADT/Triple.h"
+#include "llvm/IR/Instructions.h"
 #include "llvm/Support/Compiler.h"

 namespace clang {
 namespace targets {

 static const unsigned NVPTXAddrSpaceMap[] = {
     0, // Default
     1, // opencl_global
[143 lines not shown; enclosing context: CallingConvCheckResult checkCallingConvention(CallingConv CC) const override {]
     // TODO: We should warn if you apply a non-default CC to anything other than
     // a host function.
     if (HostTarget)
       return HostTarget->checkCallingConvention(CC);
     return CCCR_Warning;
   }

   bool hasBitIntType() const override { return true; }
+
+  bool hasAtomicrmw(llvm::AtomicRMWInst::BinOp, llvm::Type *) const override;
 };
 } // namespace targets
 } // namespace clang
 #endif // LLVM_CLANG_LIB_BASIC_TARGETS_NVPTX_H

clang/lib/Basic/Targets/NVPTX.cpp

[267 lines not shown; enclosing context: if (Opts.CUDAIsDevice) {]
     Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
   }
 }

 ArrayRef<Builtin::Info> NVPTXTargetInfo::getTargetBuiltins() const {
   return llvm::makeArrayRef(BuiltinInfo, clang::NVPTX::LastTSBuiltin -
                                              Builtin::FirstTSBuiltin);
 }
+
+bool NVPTXTargetInfo::hasAtomicrmw(llvm::AtomicRMWInst::BinOp Op,
+                                   llvm::Type *Ty) const {
+  switch (Op) {
+  default:
+    llvm_unreachable("bad binop");
+  case llvm::AtomicRMWInst::BinOp::FSub:
+  case llvm::AtomicRMWInst::BinOp::Nand:
+    return false;
+  case llvm::AtomicRMWInst::BinOp::FAdd:
+    assert(Ty->isFloatingPointTy());
+    if (Ty->isFloatTy())
+      return true;
+    if (Ty->isDoubleTy() && GPU >= CudaArch::SM_60)
+      return true;
+    return false;
+  case llvm::AtomicRMWInst::BinOp::Add:
+  case llvm::AtomicRMWInst::BinOp::Sub:
+  case llvm::AtomicRMWInst::BinOp::Max:
+  case llvm::AtomicRMWInst::BinOp::Min:
+  case llvm::AtomicRMWInst::BinOp::UMax:
+  case llvm::AtomicRMWInst::BinOp::UMin:
+  case llvm::AtomicRMWInst::BinOp::Xchg:
+  case llvm::AtomicRMWInst::BinOp::And:
+  case llvm::AtomicRMWInst::BinOp::Or:
+  case llvm::AtomicRMWInst::BinOp::Xor:
+    assert(Ty->isIntegerTy());
+    switch (cast<llvm::IntegerType>(Ty)->getBitWidth()) {
+    case 32:
+      return true;
+    case 64:
+      return true;
+    default:
+      return false;
+    }
+    return false;
[inline comment on the line above, tianshilei1992: `llvm_unreachable` might be more appropriate here.]
+  }
+
+  return false;
+}

clang/lib/CodeGen/CGStmtOpenMP.cpp

[5,919 lines not shown; enclosing context: static std::pair<bool, RValue> emitOMPAtomicRMW(CodeGenFunction &CGF, LValue X,]
   case BO_MulAssign:
   case BO_DivAssign:
   case BO_RemAssign:
   case BO_ShlAssign:
   case BO_ShrAssign:
   case BO_Comma:
     llvm_unreachable("Unsupported atomic update operation");
   }

+  if (!Context.getTargetInfo().hasAtomicrmw(RMWOp,
+                                            X.getAddress(CGF).getElementType()))
+    return std::make_pair(false, RValue::get(nullptr));
+
   llvm::Value *UpdateVal = Update.getScalarVal();
   if (auto *IC = dyn_cast<llvm::ConstantInt>(UpdateVal)) {
     if (IsInteger)
       UpdateVal = CGF.Builder.CreateIntCast(
           IC, X.getAddress(CGF).getElementType(),
           X.getType()->hasSignedIntegerRepresentation());
     else
       UpdateVal = CGF.Builder.CreateCast(llvm::Instruction::CastOps::UIToFP, IC,
[1,762 lines not shown]