This is an archive of the discontinued LLVM Phabricator instance.

domada retitled this revision from [Flang][AMDGPU][OpenMP] Save target features in OpenMP MLIR dialect to [WIP][Flang][AMDGPU][OpenMP] Save target features in OpenMP MLIR dialect. · Mar 16 2023, 5:41 AM

Herald added a subscriber: jplehr. · Mar 16 2023, 5:41 AM

Patch rebased

domada added a child revision: D146612: [Flang][OpenMP][MLIR] Lower OpenMP target attributes. · Mar 22 2023, 2:28 AM

D146612 presents the lowering from MLIR attributes to LLVM IR.

Really nice to see some shared code being elevated out of Clang into LLVM, thanks!

I've only reviewed the Flang driver changes; I will let the OpenMP experts review the remaining bits. All in all this looks good, I've only made some small suggestions.

Thanks for working on this!

flang/lib/Frontend/FrontendActions.cpp
93	This method could be simplified by following https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code. For example:

  std::string CodeGenAction::getAllTargetFeatures() {
    if (!triple.isAMDGPU()) {
      allFeaturesStr = llvm::join(targetOpts.featuresAsWritten.begin(),
                                  targetOpts.featuresAsWritten.end(), ",");
      return allFeaturesStr;
    }

    // The logic for AMDGPU
    // Perhaps add a dedicated hook: getExplicitAndImplicitAMDGPUTargetFeatures()
  }

Btw, this method does different things depending on the triple. Perhaps rename it as something more generic, e.g. `getTargetFeatures`? I think that the current name, `getAllTargetFeatures`, is a bit misleading (i.e. what does "all" mean?).

Also: make it `static` and document it.
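For readers skimming the thread, the early-exit shape suggested above can be sketched in isolation. This is only an illustration, not the code that landed: `joinWith` is a stand-in for `llvm::join`, `getAMDGPUFeaturesPlaceholder` is a hypothetical placeholder for the dedicated AMDGPU hook, and the `isAMDGPU` flag replaces the real `triple.isAMDGPU()` query.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for llvm::join so the sketch has no LLVM dependency.
std::string joinWith(const std::vector<std::string> &items,
                     const std::string &sep) {
  assert(!sep.empty() && "separator must not be empty");
  std::string out;
  for (std::size_t i = 0; i < items.size(); ++i) {
    if (i != 0)
      out += sep;
    out += items[i];
  }
  return out;
}

// Hypothetical placeholder for the dedicated AMDGPU hook. It only prepends a
// made-up implicit feature; a real hook would query the target parser.
std::string getAMDGPUFeaturesPlaceholder(
    const std::vector<std::string> &featuresAsWritten) {
  std::vector<std::string> all = {"+implicit-feature"};
  all.insert(all.end(), featuresAsWritten.begin(), featuresAsWritten.end());
  return joinWith(all, ",");
}

// Early exit: the generic (non-AMDGPU) case returns first, so the
// target-specific logic is not nested inside an else branch.
std::string getTargetFeatures(
    bool isAMDGPU, const std::vector<std::string> &featuresAsWritten) {
  if (!isAMDGPU)
    return joinWith(featuresAsWritten, ",");
  return getAMDGPUFeaturesPlaceholder(featuresAsWritten);
}
```

The point of the pattern is purely structural: whichever branch is shorter returns first, and the longer target-specific logic moves into its own function.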

Harbormaster completed remote builds in B220952: Diff 507269. · Mar 22 2023, 3:06 AM

Applied remarks.
Moved OpenMP changes to https://reviews.llvm.org/D146612 .

domada marked an inline comment as done. · Mar 22 2023, 6:29 AM

domada added inline comments.

flang/lib/Frontend/FrontendActions.cpp
93	Hi, thanks for the review. I applied your remarks. I also moved OpenMP related changes to the child review.

A few more comments, but mostly nits. Btw, is this patch sufficient to generate code for AMDGPU? Or, put differently, what's the level of support atm?

clang/lib/Driver/ToolChains/Flang.cpp
107	Should there be a test for this triple as well?
flang/lib/Frontend/FrontendActions.cpp
134–137	[nit] I would probably flip this as:

  if (triple.isAMDGPU()) {
    // Clang does not append all target features to the clang -cc1 invocation.
    // Some AMDGPU features are passed implicitly by the Clang frontend.
    // That's why we need to extract implicit AMDGPU target features and add
    // them to the target features specified by the user
    return getExplicitAndImplicitAMDGPUTargetFeatures(ci, targetOpts, triple);
  }

I know that I suggested the opposite in my previous comment, but you have simplified this since :) In any case, feel free to ignore.
139–142	[nit] IMHO this is documenting an implementation detail that would be more relevant inside `getExplicitAndImplicitAMDGPUTargetFeatures`. More importantly, you are suggesting that the driver is doing whatever it is doing "because that's what Clang does". Consistency with Clang is important (you could call it out in the commit message) :) However, it would be even nicer to understand the actual rationale behind these implicit features. Any idea? Perhaps there are some clues in git history? Also, perhaps a TODO to share this code with Clang? (to make it even clearer that the frontend driver aims for full consistency with Clang here).

Harbormaster completed remote builds in B221008: Diff 507333. · Mar 22 2023, 8:16 AM

domada marked an inline comment as done. · Mar 23 2023, 4:25 AM

domada added inline comments.

flang/lib/Frontend/FrontendActions.cpp
139–142	I think that the main difference between Clang and Flang is that Flang has no `TargetInfo` class. TargetInfo classes are Clang-specific and they are responsible for parsing/adding default target features. Every target performs initialization in a different way (compare, for example, AArch64 vs AMDGPU target initialization). I don't want to make the TargetInfo class Clang-independent (see discussion: https://discourse.llvm.org/t/rfc-targetinfo-library/64342 ). I also don't want to reimplement the whole TargetInfo class in Flang, because Flang already uses the LLVM TargetMachine class (see: https://llvm.org/doxygen/classllvm_1_1TargetMachine.html and https://github.com/llvm/llvm-project/blob/main/flang/lib/Frontend/FrontendActions.cpp#L614 ), which IMO can play a similar role to Clang's TargetInfo. That's why I decided to implement the `getExplicitAndImplicitAMDGPUTargetFeatures` function, which initializes the default AMDGPU target features.
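As a rough illustration of what "explicit and implicit" means here, the merge step can be sketched without any LLVM types. The function name `mergeTargetFeatures` and the use of `std::map` are assumptions made for this sketch; the patch itself works on an `llvm::StringMap<bool>` filled by the target parser. User-written features of the form `+name` or `-name` override the implicit defaults, and the result is serialized back into a comma-separated string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: start from the implicit (default) feature map, let the
// user-written "+name"/"-name" entries override it, then serialize the result.
// std::map keeps the keys sorted, so the output order is deterministic.
std::string mergeTargetFeatures(std::map<std::string, bool> features,
                                const std::vector<std::string> &userFeatures) {
  for (const std::string &f : userFeatures) {
    assert(f.size() > 1 && (f[0] == '+' || f[0] == '-') &&
           "expected +name or -name");
    features[f.substr(1)] = (f[0] == '+');
  }

  std::string out;
  for (const auto &[name, enabled] : features) {
    if (!out.empty())
      out += ",";
    out += enabled ? "+" : "-";
    out += name;
  }
  return out;
}
```

With implicit defaults `dpp` and `gfx9-insts` enabled and user input `-dpp`, `+wavefrontsize64`, this sketch yields `-dpp,+gfx9-insts,+wavefrontsize64`: the user's choice wins for `dpp`, and the untouched implicit feature is still emitted.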

awarzynski added inline comments. · Mar 26 2023, 10:15 AM

flang/lib/Frontend/FrontendActions.cpp
139–142	Thanks for this comprehensive overview! It would be helpful if this rationale was included in the summary (in the spirit of documenting things for our future selves). So, there isn't anything special about AMDGPU here, is there? We will have to implement similar hooks for other targets at some point too, right? Or perhaps there's some reason to do this for AMDGPU sooner rather than later? I'm not against this change, just want to better understand the wider context.

Rebase & applied review remarks

Harbormaster completed remote builds in B221992: Diff 508618. · Mar 27 2023, 6:14 AM

domada added inline comments. · Mar 27 2023, 6:27 AM

flang/lib/Frontend/FrontendActions.cpp

139–142

Hi,
I modified the comment above to be more informative.

IMO, we need to add similar hooks for other targets. For example:

clang --target=aarch64 t.c -S -emit-llvm -v 
// I see in the logs explicit target features:
// +neon,  +v8a ,  -fmv
// However, generated t.ll contains 4 target features:
"target-features"="+fp-armv8,+neon,+v8a,-fmv"
// It looks like target feature +fp-armv8 is implicit

LGTM about the AMDGPU TargetInfo change.

Do you want to move the AMDGPU changes into AMDGPU.cpp next to AMD.cpp? From the conversation, there seems to be more target specific behaviours.

In D145579#4224157, @tschuett wrote:

Do you want to move the AMDGPU changes into AMDGPU.cpp next to AMD.cpp? From the conversation, there seems to be more target specific behaviours.

I prefer to defer further refactoring to the future, so that downstream branches have time to digest this change first.

Patch rebased

Thanks for the updates, mostly looks good. Just a couple of extra questions about the test coverage.

flang/lib/Frontend/FrontendActions.cpp
106–110
108	Are you able to add a test that will trigger this?
139–142	Thanks for checking!
flang/test/Driver/target-cpu-features.f90
65	Hm, there aren't any "implicit" target features here.

Harbormaster completed remote builds in B222022: Diff 508660. · Mar 27 2023, 8:54 PM

In D145579#4224157, @tschuett wrote:

Do you want to move the AMDGPU changes into AMDGPU.cpp next to AMD.cpp? From the conversation, there seems to be more target specific behaviours.

@tschuett No, I don't plan to refactor Clang's AMDGPUTargetInfo class any further. My current goal is to add function attributes to the LLVM IR for Fortran OpenMP code, and I don't need any more changes in Clang to get there.

I wanted to ask whether you want to put an AMDGPU.cpp and AMD.cpp file in the flang/lib/Frontend directory.

Patch rebased; added a new test that checks for incorrect AMDGPU wavefront-size target features.
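The wavefront-size rule being tested can be sketched as follows. The logic mirrors the checks visible in the removed lines of `AMDGPUTargetInfo::initFeatureMap` in this diff; `chooseWaveSize` is a hypothetical name for this sketch and is not the `llvm::AMDGPU::insertWaveSizeFeature` function added by the patch, whose exact behaviour may differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: reject a feature map that requests both wavefront
// sizes, and pick a default when a CPU is known and neither was requested.
bool chooseWaveSize(bool isWave32Capable, bool isNullCPU,
                    std::map<std::string, bool> &features,
                    std::string &errorMsg) {
  assert(errorMsg.empty() && "errorMsg is expected to start out empty");
  const bool haveWave32 = (isWave32Capable || isNullCPU) &&
                          features.count("wavefrontsize32") != 0;
  const bool haveWave64 = features.count("wavefrontsize64") != 0;

  if (haveWave32 && haveWave64) {
    errorMsg =
        "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive";
    return false;
  }

  // Don't assume any wave size with an unknown subtarget. Otherwise default
  // to wave32 if available, or wave64 if not.
  if (!isNullCPU && !haveWave32 && !haveWave64)
    features.emplace(isWave32Capable ? "wavefrontsize32" : "wavefrontsize64",
                     true);
  return true;
}
```

A caller would report `errorMsg` through its diagnostics engine when the function returns false, which is the path the new driver test is meant to exercise.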

domada marked 2 inline comments as done. · Mar 28 2023, 6:00 AM

domada added inline comments.

flang/lib/Frontend/FrontendActions.cpp
108	Yes, please check the newest patch.
flang/test/Driver/target-cpu-features.f90
65	You will see them in the MLIR code: https://reviews.llvm.org/D146612#change-ciOHRHlq0yvL (file: mlir/test/Target/LLVMIR/openmp-llvm.mlir). Clang also does not list these features on the command line for AMDGPU. The situation is slightly different for, e.g., the ARM target: Clang lists 3 features as -cc1 options, but it attaches 4 target features to the generated LLVM IR:

  clang --target=aarch64 t.c -S -emit-llvm -v
  // I see in the command line 3 explicit target features:
  // +neon, +v8a, -fmv
  // However, generated t.ll contains 4 target features:
  "target-features"="+fp-armv8,+neon,+v8a,-fmv"
  // It looks like target feature +fp-armv8 is implicit

In D145579#4226542, @tschuett wrote:

I wanted to ask whether you want to put an AMDGPU.cpp and AMD.cpp file in the flang/lib/Frontend directory.

@tschuett No, I don't plan to modify flang/lib/Frontend any further.

Thanks for implementing this, LGTM!

This revision is now accepted and ready to land. · Mar 28 2023, 6:21 AM

Harbormaster completed remote builds in B222221: Diff 508977. · Mar 28 2023, 7:02 AM

Closed by commit rGe43247dd329c: [Clang][Flang][AMDGPU] Add support for AMDGPU to Flang driver (authored by domada). · Mar 29 2023, 12:30 AM

This revision was automatically updated to reflect the committed changes.

domada added a commit: rGe43247dd329c: [Clang][Flang][AMDGPU] Add support for AMDGPU to Flang driver.

MaskRay mentioned this in D151590: [Driver] Add ClangFlags::TargetSpecific to simplify err_drv_unsupported_opt_for_target processing. · May 30 2023, 11:18 AM

MaskRay mentioned this in rGbd0aab5a1599: [Frontend] Sort featuresVec for AMDGPU target features. · Jul 22 2023, 3:57 PM

Revision Contents

Path	Size
clang/lib/Basic/Targets/AMDGPU.cpp	179 lines
clang/lib/Driver/ToolChains/CommonArgs.cpp	3 lines
clang/lib/Driver/ToolChains/Flang.cpp	4 lines
flang/include/flang/Frontend/FrontendActions.h	2 lines
flang/lib/Frontend/FrontendActions.cpp	59 lines
flang/test/Driver/target-cpu-features.f90	6 lines
flang/test/Lower/OpenMP/target_cpu_features.f90	16 lines
llvm/include/llvm/TargetParser/TargetParser.h	9 lines
llvm/lib/TargetParser/TargetParser.cpp	209 lines
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td	15 lines
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp	33 lines

Diff 503324

clang/lib/Basic/Targets/AMDGPU.cpp

 ArrayRef<const char *> AMDGPUTargetInfo::getGCCRegNames() const {
   return llvm::ArrayRef(GCCRegNames);
 }

 bool AMDGPUTargetInfo::initFeatureMap(
     llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags, StringRef CPU,
     const std::vector<std::string> &FeatureVec) const {
-  const bool IsNullCPU = CPU.empty();
-  bool IsWave32Capable = false;

   using namespace llvm::AMDGPU;
+  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
-  // XXX - What does the member GPU mean if device name string passed here?
-  if (isAMDGCN(getTriple())) {
-    switch (llvm::AMDGPU::parseArchAMDGCN(CPU)) {
-    case GK_GFX1103:
-    case GK_GFX1102:
-    case GK_GFX1101:
-    case GK_GFX1100:
-      IsWave32Capable = true;
-      Features["ci-insts"] = true;
-      Features["dot5-insts"] = true;
-      Features["dot7-insts"] = true;
-      Features["dot8-insts"] = true;
-      Features["dot9-insts"] = true;
-      Features["dot10-insts"] = true;
-      Features["dl-insts"] = true;
-      Features["16-bit-insts"] = true;
-      Features["dpp"] = true;
-      Features["gfx8-insts"] = true;
-      Features["gfx9-insts"] = true;
-      Features["gfx10-insts"] = true;
-      Features["gfx10-3-insts"] = true;
-      Features["gfx11-insts"] = true;
-      break;
-    case GK_GFX1036:
-    case GK_GFX1035:
-    case GK_GFX1034:
-    case GK_GFX1033:
-    case GK_GFX1032:
-    case GK_GFX1031:
-    case GK_GFX1030:
-      IsWave32Capable = true;
-      Features["ci-insts"] = true;
-      Features["dot1-insts"] = true;
-      Features["dot2-insts"] = true;
-      Features["dot5-insts"] = true;
-      Features["dot6-insts"] = true;
-      Features["dot7-insts"] = true;
-      Features["dot10-insts"] = true;
-      Features["dl-insts"] = true;
-      Features["16-bit-insts"] = true;
-      Features["dpp"] = true;
-      Features["gfx8-insts"] = true;
-      Features["gfx9-insts"] = true;
-      Features["gfx10-insts"] = true;
-      Features["gfx10-3-insts"] = true;
-      Features["s-memrealtime"] = true;
-      Features["s-memtime-inst"] = true;
-      break;
-    case GK_GFX1012:
-    case GK_GFX1011:
-      Features["dot1-insts"] = true;
-      Features["dot2-insts"] = true;
-      Features["dot5-insts"] = true;
-      Features["dot6-insts"] = true;
-      Features["dot7-insts"] = true;
-      Features["dot10-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX1013:
-    case GK_GFX1010:
-      IsWave32Capable = true;
-      Features["dl-insts"] = true;
-      Features["ci-insts"] = true;
-      Features["16-bit-insts"] = true;
-      Features["dpp"] = true;
-      Features["gfx8-insts"] = true;
-      Features["gfx9-insts"] = true;
-      Features["gfx10-insts"] = true;
-      Features["s-memrealtime"] = true;
-      Features["s-memtime-inst"] = true;
-      break;
-    case GK_GFX940:
-      Features["gfx940-insts"] = true;
-      Features["fp8-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX90A:
-      Features["gfx90a-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX908:
-      Features["dot3-insts"] = true;
-      Features["dot4-insts"] = true;
-      Features["dot5-insts"] = true;
-      Features["dot6-insts"] = true;
-      Features["mai-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX906:
-      Features["dl-insts"] = true;
-      Features["dot1-insts"] = true;
-      Features["dot2-insts"] = true;
-      Features["dot7-insts"] = true;
-      Features["dot10-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX90C:
-    case GK_GFX909:
-    case GK_GFX904:
-    case GK_GFX902:
-    case GK_GFX900:
-      Features["gfx9-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX810:
-    case GK_GFX805:
-    case GK_GFX803:
-    case GK_GFX802:
-    case GK_GFX801:
-      Features["gfx8-insts"] = true;
-      Features["16-bit-insts"] = true;
-      Features["dpp"] = true;
-      Features["s-memrealtime"] = true;
-      [[fallthrough]];
-    case GK_GFX705:
-    case GK_GFX704:
-    case GK_GFX703:
-    case GK_GFX702:
-    case GK_GFX701:
-    case GK_GFX700:
-      Features["ci-insts"] = true;
-      [[fallthrough]];
-    case GK_GFX602:
-    case GK_GFX601:
-    case GK_GFX600:
-      Features["s-memtime-inst"] = true;
-      break;
-    case GK_NONE:
-      break;
-    default:
-      llvm_unreachable("Unhandled GPU!");
-    }
-  } else {
-    if (CPU.empty())
-      CPU = "r600";
-
-    switch (llvm::AMDGPU::parseArchR600(CPU)) {
-    case GK_CAYMAN:
-    case GK_CYPRESS:
-    case GK_RV770:
-    case GK_RV670:
-      // TODO: Add fp64 when implemented.
-      break;
-    case GK_TURKS:
-    case GK_CAICOS:
-    case GK_BARTS:
-    case GK_SUMO:
-    case GK_REDWOOD:
-    case GK_JUNIPER:
-    case GK_CEDAR:
-    case GK_RV730:
-    case GK_RV710:
-    case GK_RS880:
-    case GK_R630:
-    case GK_R600:
-      break;
-    default:
-      llvm_unreachable("Unhandled GPU!");
-    }
-  }

   if (!TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec))
     return false;

-  // FIXME: Not diagnosing wavefrontsize32 on wave64 only targets.
-  const bool HaveWave32 =
-      (IsWave32Capable || IsNullCPU) && Features.count("wavefrontsize32");
-  const bool HaveWave64 = Features.count("wavefrontsize64");

   // TODO: Should move this logic into TargetParser
-  if (HaveWave32 && HaveWave64) {
-    Diags.Report(diag::err_invalid_feature_combination)
-        << "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive";
+  std::string ErrorMsg;
+  if (!insertWaveSizeFeature(CPU, getTriple(), Features, ErrorMsg)) {
+    Diags.Report(diag::err_invalid_feature_combination) << ErrorMsg;
     return false;
   }

-  // Don't assume any wavesize with an unknown subtarget.
-  if (!IsNullCPU) {
-    // Default to wave32 if available, or wave64 if not
-    if (!HaveWave32 && !HaveWave64) {
-      StringRef DefaultWaveSizeFeature =
-          IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
-      Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
-    }
-  }

   return true;
 }

 void AMDGPUTargetInfo::fillValidCPUList(
     SmallVectorImpl<StringRef> &Values) const {
   if (isAMDGCN(getTriple()))
     llvm::AMDGPU::fillValidArchListAMDGCN(Values);
   else

clang/lib/Driver/ToolChains/CommonArgs.cpp

     return llvm::StringSwitch<std::string>(GPUName)
         .Cases("rv610", "rv620", "rs780", "rs880")
         .Case("rv740", "rv770")
         .Case("palm", "cedar")
         .Cases("sumo", "sumo2", "sumo")
         .Case("hemlock", "cypress")
         .Case("aruba", "cayman")
         .Default(GPUName.str());
   }
+  if (Arg *A = Args.getLastArg(options::OPT_march_EQ)) {
+    return getProcessorFromTargetID(T, A->getValue()).str();
+  }
   return "";
 }

 static std::string getLanaiTargetCPU(const ArgList &Args) {
   if (Arg *A = Args.getLastArg(options::OPT_mcpu_EQ)) {
     return A->getValue();
   }
   return "";

clang/lib/Driver/ToolChains/Flang.cpp

   if (!CPU.empty()) {
     CmdArgs.push_back("-target-cpu");
     CmdArgs.push_back(Args.MakeArgString(CPU));
   }

   // Add the target features.
   switch (TC.getArch()) {
   default:
     break;
+  case llvm::Triple::r600:
       [awarzynski, inline comment, Not Done: Should there be a test for this triple as well?]
+    [[fallthrough]];
+  case llvm::Triple::amdgcn:
+    [[fallthrough]];
   case llvm::Triple::aarch64:
     [[fallthrough]];
   case llvm::Triple::x86_64:
     getTargetFeatures(D, Triple, Args, CmdArgs, /*ForAs*/ false);
     break;
   }

   // TODO: Add target specific flags, ABI, mtune option etc.

flang/include/flang/Frontend/FrontendActions.h

 class CodeGenAction : public FrontendAction {
   void executeAction() override;
   /// Runs prescan, parsing, sema and lowers to MLIR.
   bool beginSourceFileAction() override;
   /// Sets up LLVM's TargetMachine.
   void setUpTargetMachine();
   /// Runs the optimization (aka middle-end) pipeline on the LLVM module
   /// associated with this action.
   void runOptimizationPipeline(llvm::raw_pwrite_stream &os);
+  /// Produces the string which represents all target features
+  std::string getAllTargetFeatures();

 protected:
   CodeGenAction(BackendActionTy act) : action{act} {};
   /// @name MLIR
   /// {
   std::unique_ptr<mlir::ModuleOp> mlirModule;
   std::unique_ptr<mlir::MLIRContext> mlirCtx;
   /// }

flang/lib/Frontend/FrontendActions.cpp

 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/OffloadBinary.h"
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Passes/PassPlugin.h"
 #include "llvm/Passes/StandardInstrumentations.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Target/TargetMachine.h"
+#include "llvm/TargetParser/TargetParser.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
 #include <memory>

 using namespace Fortran::frontend;

 // Declare plugin extension function declarations.
 #define HANDLE_EXTENSION(Ext) \
   llvm::PassPluginLibraryInfo get##Ext##PluginInfo();

 [...]

 bool PrescanAndSemaDebugAction::beginSourceFileAction() {
   // semantic checks are made to succeed unconditionally to prevent this action
   // from exiting early (i.e. in the presence of semantic errors). We should
   // never do this in actions intended for end-users or otherwise regular
   // compiler workflows!
   return runPrescan() && runParse() && (runSemanticChecks() || true) &&
          (generateRtTypeTables() || true);
 }

std::string CodeGenAction::getAllTargetFeatures() {

awarzynski (inline comment, marked Done):

This method could be simplified by following https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code. For example:

std::string CodeGenAction::getAllTargetFeatures()  {
  if (!triple.isAMDGPU()) {
    allFeaturesStr = llvm::join(targetOpts.featuresAsWritten.begin(),
                                targetOpts.featuresAsWritten.end(), ",");
    return allFeaturesStr;
  }

  // The logic for AMDGPU
  // Perhaps add a dedicated hook: getExplicitAndImplicitAMDGPUTargetFeatures()
}

Btw, this method does different things depending on the triple. Perhaps rename it as something more generic, e.g. getTargetFeatures? I think that the current name, getAllTargetFeatures, is a bit misleading (i.e. what does "all" mean?).

Also:

make it static
document

domada (author, inline comment, marked Done):

Hi,
thanks for the review. I applied your remarks. I also moved OpenMP related changes to the child review.

std::string allFeaturesStr;

CompilerInstance &ci = this->getInstance();

const TargetOptions &targetOpts = ci.getInvocation().getTargetOpts();

const llvm::Triple triple(targetOpts.triple);

// Clang does not append all target features to the clang -cc1 invocation.

// Some AMDGPU features are passed implicitly by the Clang frontend.

// That's why we need to extract implicit AMDGPU target features and add

// them to the target features specified by the user

if (triple.isAMDGPU()) {

llvm::StringRef cpu = targetOpts.cpu;

llvm::StringMap<bool> implicitFeaturesMap;

std::string errorMsg;

// Get the set of implicit target features

llvm::AMDGPU::fillAMDGPUFeatureMap(cpu, triple, implicitFeaturesMap);

awarzynski (inline comment, marked Done):

Are you able to add a test that will trigger this?

domada (author, inline comment, marked Done):

Yes, please check the newest patch.

if (!llvm::AMDGPU::insertWaveSizeFeature(cpu, triple, implicitFeaturesMap,

errorMsg)) {

awarzynski (inline comment with a suggested edit, marked Done):

errorMsg)) {

- llvm::SMDiagnostic err;

- err.print(errorMsg.data(), llvm::errs());

unsigned diagID = ci.getDiagnostics().getCustomDiagID(

- clang::DiagnosticsEngine::Error, "Unsupported feature ID");

- ci.getDiagnostics().Report(diagID);

+ clang::DiagnosticsEngine::Error, "Unsupported feature ID: %0");

+ ci.getDiagnostics().Report(diagID) << errorMsg.data();

return std::string();

awarzynski:

llvm::SMDiagnostic err;

CompilerInstance &ci = this->getInstance();

err.print(errorMsg.data(), llvm::errs());

unsigned diagID = ci.getDiagnostics().getCustomDiagID(

clang::DiagnosticsEngine::Error, "Unsupported feature ID");

ci.getDiagnostics().Report(diagID);

return allFeaturesStr;

}

// Add target features specified by the user

for (auto &userFeature : targetOpts.featuresAsWritten) {

std::string userKeyString = userFeature.substr(1);

implicitFeaturesMap[userKeyString] = (userFeature[0] == '+');

}

llvm::SmallVector<std::string> featuresVec;

for (auto &implicitFeatureItem : implicitFeaturesMap) {

featuresVec.push_back(

(llvm::Twine(implicitFeatureItem.second ? "+" : "-") +

implicitFeatureItem.first().str())

.str());

}

allFeaturesStr = llvm::join(featuresVec, ",");

} else {

allFeaturesStr = llvm::join(targetOpts.featuresAsWritten.begin(),

targetOpts.featuresAsWritten.end(), ",");

}

awarzynski (inline comment, Not Done):

[nit] I would probably flip this as:

if (triple.isAMDGPU()) {
  // Clang does not append all target features to the clang -cc1 invocation.
  // Some AMDGPU features are passed implicitly by the Clang frontend.
  // That's why we need to extract implicit AMDGPU target features and add
  // them to the target features specified by the user
  return getExplicitAndImplicitAMDGPUTargetFeatures(ci, targetOpts, triple);
  }

I know that I suggested the opposite in my previous comment, but you have simplified this since :) In any case, feel free to ignore.

return allFeaturesStr;

}

static void setMLIRDataLayout(mlir::ModuleOp &mlirModule, static void setMLIRDataLayout(mlir::ModuleOp &mlirModule,

const llvm::DataLayout &dl) { const llvm::DataLayout &dl) {

awarzynski (inline comment, Not Done):

[nit] IMHO this is documenting an implementation detail that would be more relevant inside getExplicitAndImplicitAMDGPUTargetFeatures.

More importantly, you are suggesting that the driver is doing whatever it is doing "because that's what Clang does". Consistency with Clang is important (you could call it out in the commit message) :) However, it would be even nicer to understand the actual rationale behind these implicit features. Any idea? Perhaps there are some clues in git history?

Also, perhaps a TODO to share this code with Clang? (to make it even clearer that the frontend driver aims for full consistency with Clang here).

domada (author, inline comment, Not Done):

I think that the main difference between Clang and Flang is the lack of TargetInfo class.

TargetInfo classes are Clang-specific and they are responsible for parsing/adding default target features. Every target performs initialization in a different way (compare, for example, AArch64 vs AMDGPU target initialization).

I don't want to make the TargetInfo class Clang-independent (see discussion: https://discourse.llvm.org/t/rfc-targetinfo-library/64342 ). I also don't want to reimplement the whole TargetInfo class in Flang, because Flang already uses the LLVM TargetMachine class (see: https://llvm.org/doxygen/classllvm_1_1TargetMachine.html and https://github.com/llvm/llvm-project/blob/main/flang/lib/Frontend/FrontendActions.cpp#L614 ), which IMO can play a similar role to Clang's TargetInfo.

That's why I decided to implement getExplicitAndImplicitAMDGPUTargetFeatures function which performs initialization of default AMDGPU target features.

awarzynski (inline comment, Not Done):

Thanks for this comprehensive overview! It would be helpful if this rationale was included in the summary (in the spirit of documenting things for our future selves).

So, there isn't anything special about AMDGPU here, is there? We will have to implement similar hooks for other targets at some point too, right? Or perhaps there's some reason to do this for AMDGPU sooner rather than later?

I'm not against this change, just want to better understand the wider context.

domada (author, inline comment, marked Done):

Hi,
I modified the comment above to be more informative.

IMO, we need to add similar hooks for other targets. For example:

clang --target=aarch64 t.c -S -emit-llvm -v 
// I see in the logs explicit target features:
// +neon,  +v8a ,  -fmv
// However, generated t.ll contains 4 target features:
"target-features"="+fp-armv8,+neon,+v8a,-fmv"
// It looks like target feature +fp-armv8 is implicit

awarzynski (inline comment, Not Done):

Thanks for checking!

   mlir::MLIRContext *context = mlirModule.getContext();
   mlirModule->setAttr(
       mlir::LLVM::LLVMDialect::getDataLayoutAttrName(),
       mlir::StringAttr::get(context, dl.getStringRepresentation()));
   mlir::DataLayoutSpecInterface dlSpec = mlir::translateDataLayout(dl, context);
   mlirModule->setAttr(mlir::DLTIDialect::kDataLayoutAttrName, dlSpec);
 }

   lower::LoweringBridge lb = Fortran::lower::LoweringBridge::create(
       ci.getInvocation().getSemanticsContext().targetCharacteristics(),
       ci.getParsing().allCooked(), ci.getInvocation().getTargetOpts().triple,
       kindMap, ci.getInvocation().getLoweringOpts(),
       ci.getInvocation().getFrontendOpts().envDefaults);

   // Fetch module from lb, so we can set
   mlirModule = std::make_unique<mlir::ModuleOp>(lb.getModule());

+  setUpTargetMachine();
   if (ci.getInvocation().getFrontendOpts().features.IsEnabled(
           Fortran::common::LanguageFeature::OpenMP)) {
     mlir::omp::OpenMPDialect::setIsDevice(
         *mlirModule, ci.getInvocation().getLangOpts().OpenMPIsDevice);
+    mlir::omp::OpenMPDialect::setTargetCpu(*mlirModule, tm->getTargetCPU());
+    mlir::omp::OpenMPDialect::setTargetCpuFeatures(
+        *mlirModule, tm->getTargetFeatureString());
   }

-  setUpTargetMachine();
   const llvm::DataLayout &dl = tm->createDataLayout();
   setMLIRDataLayout(*mlirModule, dl);

   // Create a parse tree and lower it to FIR
   Fortran::parser::Program &parseTree{*ci.getParsing().parseTree()};
   lb.lower(parseTree, ci.getInvocation().getSemanticsContext());

   // run the default passes.

▲ Show 20 Lines • Show All 402 Lines • ▼ Show 20 Lines void CodeGenAction::setUpTargetMachine() {

assert(theTarget && "Failed to create Target"); assert(theTarget && "Failed to create Target");

// Create `TargetMachine` // Create `TargetMachine`

const auto &CGOpts = ci.getInvocation().getCodeGenOpts(); const auto &CGOpts = ci.getInvocation().getCodeGenOpts();

std::optional<llvm::CodeGenOpt::Level> OptLevelOrNone = std::optional<llvm::CodeGenOpt::Level> OptLevelOrNone =

llvm::CodeGenOpt::getLevel(CGOpts.OptimizationLevel); llvm::CodeGenOpt::getLevel(CGOpts.OptimizationLevel);

assert(OptLevelOrNone && "Invalid optimization level!"); assert(OptLevelOrNone && "Invalid optimization level!");

llvm::CodeGenOpt::Level OptLevel = *OptLevelOrNone; llvm::CodeGenOpt::Level OptLevel = *OptLevelOrNone;

std::string featuresStr = llvm::join(targetOpts.featuresAsWritten.begin(), std::string featuresStr = getAllTargetFeatures();

targetOpts.featuresAsWritten.end(), ",");

tm.reset(theTarget->createTargetMachine( tm.reset(theTarget->createTargetMachine(

theTriple, /*CPU=*/targetOpts.cpu, theTriple, /*CPU=*/targetOpts.cpu,

/*Features=*/featuresStr, llvm::TargetOptions(), /*Features=*/featuresStr, llvm::TargetOptions(),

/*Reloc::Model=*/CGOpts.getRelocationModel(), /*Reloc::Model=*/CGOpts.getRelocationModel(),

/*CodeModel::Model=*/std::nullopt, OptLevel)); /*CodeModel::Model=*/std::nullopt, OptLevel));

assert(tm && "Failed to create TargetMachine"); assert(tm && "Failed to create TargetMachine");

} }

[263 lines not shown]

flang/test/Driver/target-cpu-features.f90

	! REQUIRES: aarch64-registered-target, x86-registered-target			! REQUIRES: aarch64-registered-target, x86-registered-target, amdgpu-registered-target

	! Test that -mcpu/march are used and that the -target-cpu and -target-features			! Test that -mcpu/march are used and that the -target-cpu and -target-features
	! are also added to the fc1 command.			! are also added to the fc1 command.

	! RUN: %flang --target=aarch64-linux-gnu -mcpu=cortex-a57 -c %s -### 2>&1 \			! RUN: %flang --target=aarch64-linux-gnu -mcpu=cortex-a57 -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-A57			! RUN: | FileCheck %s -check-prefix=CHECK-A57

	! RUN: %flang --target=aarch64-linux-gnu -mcpu=cortex-a76 -c %s -### 2>&1 \			! RUN: %flang --target=aarch64-linux-gnu -mcpu=cortex-a76 -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-A76			! RUN: | FileCheck %s -check-prefix=CHECK-A76

	! RUN: %flang --target=aarch64-linux-gnu -march=armv9 -c %s -### 2>&1 \			! RUN: %flang --target=aarch64-linux-gnu -march=armv9 -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-ARMV9			! RUN: | FileCheck %s -check-prefix=CHECK-ARMV9

	! Negative test. ARM cpu with x86 target.			! Negative test. ARM cpu with x86 target.
	! RUN: %flang --target=x86_64-linux-gnu -mcpu=cortex-a57 -c %s -### 2>&1 \			! RUN: %flang --target=x86_64-linux-gnu -mcpu=cortex-a57 -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-NO-A57			! RUN: | FileCheck %s -check-prefix=CHECK-NO-A57

	! RUN: %flang --target=x86_64-linux-gnu -march=skylake -c %s -### 2>&1 \			! RUN: %flang --target=x86_64-linux-gnu -march=skylake -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-SKYLAKE			! RUN: | FileCheck %s -check-prefix=CHECK-SKYLAKE

	! RUN: %flang --target=x86_64h-linux-gnu -c %s -### 2>&1 \			! RUN: %flang --target=x86_64h-linux-gnu -c %s -### 2>&1 \
	! RUN: | FileCheck %s -check-prefix=CHECK-X86_64H			! RUN: | FileCheck %s -check-prefix=CHECK-X86_64H

				! RUN: %flang --target=amdgcn-amd-amdhsa -mcpu=gfx908 -c %s -### 2>&1 \
				! RUN: | FileCheck %s -check-prefix=CHECK-AMDGPU

	! Test that invalid cpu and features are ignored.			! Test that invalid cpu and features are ignored.

	! RUN: %flang_fc1 -triple aarch64-linux-gnu -target-cpu supercpu \			! RUN: %flang_fc1 -triple aarch64-linux-gnu -target-cpu supercpu \
	! RUN: -o /dev/null -S %s 2>&1 | FileCheck %s -check-prefix=CHECK-INVALID-CPU			! RUN: -o /dev/null -S %s 2>&1 | FileCheck %s -check-prefix=CHECK-INVALID-CPU

	! RUN: %flang_fc1 -triple aarch64-linux-gnu -target-feature +superspeed \			! RUN: %flang_fc1 -triple aarch64-linux-gnu -target-feature +superspeed \
	! RUN: -o /dev/null -S %s 2>&1 | FileCheck %s -check-prefix=CHECK-INVALID-FEATURE			! RUN: -o /dev/null -S %s 2>&1 | FileCheck %s -check-prefix=CHECK-INVALID-FEATURE
	[14 lines not shown]
	! CHECK-NO-A57-NOT: cortex-a57			! CHECK-NO-A57-NOT: cortex-a57

	! CHECK-SKYLAKE: "-fc1" "-triple" "x86_64-unknown-linux-gnu"			! CHECK-SKYLAKE: "-fc1" "-triple" "x86_64-unknown-linux-gnu"
	! CHECK-SKYLAKE-SAME: "-target-cpu" "skylake"			! CHECK-SKYLAKE-SAME: "-target-cpu" "skylake"

	! CHECK-X86_64H: "-fc1" "-triple" "x86_64h-unknown-linux-gnu"			! CHECK-X86_64H: "-fc1" "-triple" "x86_64h-unknown-linux-gnu"
	! CHECK-X86_64H-SAME: "-target-cpu" "x86-64" "-target-feature" "-rdrnd" "-target-feature" "-aes" "-target-feature" "-pclmul" "-target-feature" "-rtm" "-target-feature" "-fsgsbase"			! CHECK-X86_64H-SAME: "-target-cpu" "x86-64" "-target-feature" "-rdrnd" "-target-feature" "-aes" "-target-feature" "-pclmul" "-target-feature" "-rtm" "-target-feature" "-fsgsbase"

				! CHECK-AMDGPU: "-fc1" "-triple" "amdgcn-amd-amdhsa"
				! CHECK-AMDGPU-SAME: "-target-cpu" "gfx908"
	! CHECK-INVALID-CPU: 'supercpu' is not a recognized processor for this target (ignoring processor)			! CHECK-INVALID-CPU: 'supercpu' is not a recognized processor for this target (ignoring processor)
	! CHECK-INVALID-FEATURE: '+superspeed' is not a recognized feature for this target (ignoring feature)			! CHECK-INVALID-FEATURE: '+superspeed' is not a recognized feature for this target (ignoring feature)
awarzynski (inline comment, not done): Hm, there aren't any "implicit" target features here.

domada (author, inline comment, done): You will see them in the MLIR code: https://reviews.llvm.org/D146612#change-ciOHRHlq0yvL (file: mlir/test/Target/LLVMIR/openmp-llvm.mlir). Clang also does not list these features on the command line for AMDGPU. The situation is slightly different for, e.g., the ARM target: Clang lists 3 options as -cc1 options and attaches 4 target features to the generated LLVM IR:

    clang --target=aarch64 t.c -S -emit-llvm -v
    // I see in the command line 3 explicit target features:
    // +neon, +v8a, -fmv
    // However, generated t.ll contains 4 target features:
    "target-features"="+fp-armv8,+neon,+v8a,-fmv"
    // It looks like target feature +fp-armv8 is implicit
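The difference between explicit and implicit target features raised in this thread can be illustrated with a small, self-contained sketch. It does not use the LLVM API. `fillDefaults` is a hypothetical stand-in that mimics only part of the `[[fallthrough]]` cascade of `AMDGPU::fillAMDGPUFeatureMap` (three GPU names), while `applyExplicit`, `joinFeatures` and `targetFeatures` are assumed helper names that do not appear in the patch. `std::map` replaces `llvm::StringMap`, so the features come out sorted by name, which is not necessarily the order LLVM produces.

```cpp
#include <map>
#include <string>
#include <vector>

using FeatureMap = std::map<std::string, bool>;

// Hypothetical subset of the GPU kinds handled by the real parser.
enum class Gpu { None, Gfx900, Gfx906, Gfx908 };

Gpu parseGpu(const std::string &name) {
  if (name == "gfx908") return Gpu::Gfx908;
  if (name == "gfx906") return Gpu::Gfx906;
  if (name == "gfx900") return Gpu::Gfx900;
  return Gpu::None;
}

// Implicit features: newer GPUs fall through into the cases of older ones,
// so they inherit every feature listed further down.
void fillDefaults(const std::string &gpu, FeatureMap &features) {
  switch (parseGpu(gpu)) {
  case Gpu::Gfx908:
    features["dot3-insts"] = true;
    features["dot4-insts"] = true;
    features["dot5-insts"] = true;
    features["dot6-insts"] = true;
    features["mai-insts"] = true;
    [[fallthrough]];
  case Gpu::Gfx906:
    features["dl-insts"] = true;
    features["dot1-insts"] = true;
    features["dot2-insts"] = true;
    features["dot7-insts"] = true;
    features["dot10-insts"] = true;
    [[fallthrough]];
  case Gpu::Gfx900:
    features["gfx9-insts"] = true;
    // The patch continues the cascade through the gfx8, gfx7 and gfx6 cases;
    // their features are flattened into this case to keep the sketch short.
    features["gfx8-insts"] = true;
    features["16-bit-insts"] = true;
    features["dpp"] = true;
    features["s-memrealtime"] = true;
    features["ci-insts"] = true;
    features["s-memtime-inst"] = true;
    break;
  case Gpu::None:
    break;
  }
}

// Explicit features ("+name" or "-name") override the implicit defaults.
void applyExplicit(const std::vector<std::string> &asWritten,
                   FeatureMap &features) {
  for (const std::string &f : asWritten) {
    if (f.size() < 2 || (f[0] != '+' && f[0] != '-'))
      continue; // Skip malformed entries.
    features[f.substr(1)] = (f[0] == '+');
  }
}

// Serialize the map as a comma-separated "+a,-b" string.
std::string joinFeatures(const FeatureMap &features) {
  std::string out;
  for (const auto &[name, enabled] : features) {
    if (!out.empty())
      out += ',';
    out += enabled ? '+' : '-';
    out += name;
  }
  return out;
}

std::string targetFeatures(const std::string &gpu,
                           const std::vector<std::string> &asWritten) {
  FeatureMap features;
  fillDefaults(gpu, features);
  applyExplicit(asWritten, features);
  return joinFeatures(features);
}
```

Under these assumptions, `targetFeatures("gfx908", {"-dpp"})` returns a string that contains `+mai-insts`, which was never written on the command line, and `-dpp`, where the explicit setting wins over the implicit default.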

flang/test/Lower/OpenMP/target_cpu_features.f90

This file was added.

				!REQUIRES: amdgpu-registered-target
				!RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp %s -o - | FileCheck %s

				!===============================================================================
				! Target_Enter Simple
				!===============================================================================

				!CHECK: omp.target_cpu = "gfx908",
				!CHECK-SAME: omp.target_cpu_features = "+dot3-insts,+dot4-insts,+s-memtime-inst,
				!CHECK-SAME: +16-bit-insts,+s-memrealtime,+dot6-insts,+dl-insts,+wavefrontsize64,
				!CHECK-SAME: +gfx9-insts,+gfx8-insts,+ci-insts,+dot10-insts,+dot7-insts,
				!CHECK-SAME: +dot1-insts,+dot5-insts,+mai-insts,+dpp,+dot2-insts"
				!CHECK-LABEL: func.func @_QPomp_target_simple() {
				subroutine omp_target_simple
				end subroutine omp_target_simple

llvm/include/llvm/TargetParser/TargetParser.h

	//===-- TargetParser - Parser for target features ---------------- C++ --===//			//===-- TargetParser - Parser for target features ---------------- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file implements a target parser to recognise hardware features such as			// This file implements a target parser to recognise hardware features such as
	// FPU/CPU/ARCH names as well as specific support such as HDIV, etc.			// FPU/CPU/ARCH names as well as specific support such as HDIV, etc.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_TARGETPARSER_TARGETPARSER_H			#ifndef LLVM_TARGETPARSER_TARGETPARSER_H
	#define LLVM_TARGETPARSER_TARGETPARSER_H			#define LLVM_TARGETPARSER_TARGETPARSER_H

				#include "llvm/ADT/StringMap.h"
	#include "llvm/ADT/StringRef.h"			#include "llvm/ADT/StringRef.h"

	namespace llvm {			namespace llvm {

	template <typename T> class SmallVectorImpl;			template <typename T> class SmallVectorImpl;
	class Triple;			class Triple;

	// Target specific information in their own namespaces.			// Target specific information in their own namespaces.
	[119 lines not shown]
	unsigned getArchAttrAMDGCN(GPUKind AK);			unsigned getArchAttrAMDGCN(GPUKind AK);
	unsigned getArchAttrR600(GPUKind AK);			unsigned getArchAttrR600(GPUKind AK);

	void fillValidArchListAMDGCN(SmallVectorImpl<StringRef> &Values);			void fillValidArchListAMDGCN(SmallVectorImpl<StringRef> &Values);
	void fillValidArchListR600(SmallVectorImpl<StringRef> &Values);			void fillValidArchListR600(SmallVectorImpl<StringRef> &Values);

	IsaVersion getIsaVersion(StringRef GPU);			IsaVersion getIsaVersion(StringRef GPU);

				/// Fills Features map with default values for given target GPU
				void fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
				StringMap<bool> &Features);

				/// Inserts wave size feature for given GPU into features map
				bool insertWaveSizeFeature(StringRef GPU, const Triple &T,
				StringMap<bool> &Features, std::string &ErrorMsg);

	} // namespace AMDGPU			} // namespace AMDGPU
	} // namespace llvm			} // namespace llvm

	#endif			#endif

llvm/lib/TargetParser/TargetParser.cpp

	[245 lines not shown]
	StringRef AMDGPU::getCanonicalArchName(const Triple &T, StringRef Arch) {			StringRef AMDGPU::getCanonicalArchName(const Triple &T, StringRef Arch) {
	assert(T.isAMDGPU());			assert(T.isAMDGPU());
	auto ProcKind = T.isAMDGCN() ? parseArchAMDGCN(Arch) : parseArchR600(Arch);			auto ProcKind = T.isAMDGCN() ? parseArchAMDGCN(Arch) : parseArchR600(Arch);
	if (ProcKind == GK_NONE)			if (ProcKind == GK_NONE)
	return StringRef();			return StringRef();

	return T.isAMDGCN() ? getArchNameAMDGCN(ProcKind) : getArchNameR600(ProcKind);			return T.isAMDGCN() ? getArchNameAMDGCN(ProcKind) : getArchNameR600(ProcKind);
	}			}

				void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
				StringMap<bool> &Features) {
				// XXX - What does the member GPU mean if device name string passed here?
				if (T.isAMDGCN()) {
				switch (parseArchAMDGCN(GPU)) {
				case GK_GFX1103:
				case GK_GFX1102:
				case GK_GFX1101:
				case GK_GFX1100:
				Features["ci-insts"] = true;
				Features["dot5-insts"] = true;
				Features["dot7-insts"] = true;
				Features["dot8-insts"] = true;
				Features["dot9-insts"] = true;
				Features["dot10-insts"] = true;
				Features["dl-insts"] = true;
				Features["16-bit-insts"] = true;
				Features["dpp"] = true;
				Features["gfx8-insts"] = true;
				Features["gfx9-insts"] = true;
				Features["gfx10-insts"] = true;
				Features["gfx10-3-insts"] = true;
				Features["gfx11-insts"] = true;
				break;
				case GK_GFX1036:
				case GK_GFX1035:
				case GK_GFX1034:
				case GK_GFX1033:
				case GK_GFX1032:
				case GK_GFX1031:
				case GK_GFX1030:
				Features["ci-insts"] = true;
				Features["dot1-insts"] = true;
				Features["dot2-insts"] = true;
				Features["dot5-insts"] = true;
				Features["dot6-insts"] = true;
				Features["dot7-insts"] = true;
				Features["dot10-insts"] = true;
				Features["dl-insts"] = true;
				Features["16-bit-insts"] = true;
				Features["dpp"] = true;
				Features["gfx8-insts"] = true;
				Features["gfx9-insts"] = true;
				Features["gfx10-insts"] = true;
				Features["gfx10-3-insts"] = true;
				Features["s-memrealtime"] = true;
				Features["s-memtime-inst"] = true;
				break;
				case GK_GFX1012:
				case GK_GFX1011:
				Features["dot1-insts"] = true;
				Features["dot2-insts"] = true;
				Features["dot5-insts"] = true;
				Features["dot6-insts"] = true;
				Features["dot7-insts"] = true;
				Features["dot10-insts"] = true;
				[[fallthrough]];
				case GK_GFX1013:
				case GK_GFX1010:
				Features["dl-insts"] = true;
				Features["ci-insts"] = true;
				Features["16-bit-insts"] = true;
				Features["dpp"] = true;
				Features["gfx8-insts"] = true;
				Features["gfx9-insts"] = true;
				Features["gfx10-insts"] = true;
				Features["s-memrealtime"] = true;
				Features["s-memtime-inst"] = true;
				break;
				case GK_GFX940:
				Features["gfx940-insts"] = true;
				Features["fp8-insts"] = true;
				[[fallthrough]];
				case GK_GFX90A:
				Features["gfx90a-insts"] = true;
				[[fallthrough]];
				case GK_GFX908:
				Features["dot3-insts"] = true;
				Features["dot4-insts"] = true;
				Features["dot5-insts"] = true;
				Features["dot6-insts"] = true;
				Features["mai-insts"] = true;
				[[fallthrough]];
				case GK_GFX906:
				Features["dl-insts"] = true;
				Features["dot1-insts"] = true;
				Features["dot2-insts"] = true;
				Features["dot7-insts"] = true;
				Features["dot10-insts"] = true;
				[[fallthrough]];
				case GK_GFX90C:
				case GK_GFX909:
				case GK_GFX904:
				case GK_GFX902:
				case GK_GFX900:
				Features["gfx9-insts"] = true;
				[[fallthrough]];
				case GK_GFX810:
				case GK_GFX805:
				case GK_GFX803:
				case GK_GFX802:
				case GK_GFX801:
				Features["gfx8-insts"] = true;
				Features["16-bit-insts"] = true;
				Features["dpp"] = true;
				Features["s-memrealtime"] = true;
				[[fallthrough]];
				case GK_GFX705:
				case GK_GFX704:
				case GK_GFX703:
				case GK_GFX702:
				case GK_GFX701:
				case GK_GFX700:
				Features["ci-insts"] = true;
				[[fallthrough]];
				case GK_GFX602:
				case GK_GFX601:
				case GK_GFX600:
				Features["s-memtime-inst"] = true;
				break;
				case GK_NONE:
				break;
				default:
				llvm_unreachable("Unhandled GPU!");
				}
				} else {
				if (GPU.empty())
				GPU = "r600";

				switch (llvm::AMDGPU::parseArchR600(GPU)) {
				case GK_CAYMAN:
				case GK_CYPRESS:
				case GK_RV770:
				case GK_RV670:
				// TODO: Add fp64 when implemented.
				break;
				case GK_TURKS:
				case GK_CAICOS:
				case GK_BARTS:
				case GK_SUMO:
				case GK_REDWOOD:
				case GK_JUNIPER:
				case GK_CEDAR:
				case GK_RV730:
				case GK_RV710:
				case GK_RS880:
				case GK_R630:
				case GK_R600:
				break;
				default:
				llvm_unreachable("Unhandled GPU!");
				}
				}
				}

				static bool isWave32Capable(StringRef GPU, const Triple &T) {
				bool IsWave32Capable = false;
				// XXX - What does the member GPU mean if device name string passed here?
				if (T.isAMDGCN()) {
				switch (parseArchAMDGCN(GPU)) {
				case GK_GFX1103:
				case GK_GFX1102:
				case GK_GFX1101:
				case GK_GFX1100:
				case GK_GFX1036:
				case GK_GFX1035:
				case GK_GFX1034:
				case GK_GFX1033:
				case GK_GFX1032:
				case GK_GFX1031:
				case GK_GFX1030:
				case GK_GFX1012:
				case GK_GFX1011:
				case GK_GFX1013:
				case GK_GFX1010:
				IsWave32Capable = true;
				break;
				default:
				break;
				}
				}
				return IsWave32Capable;
				}

				bool AMDGPU::insertWaveSizeFeature(StringRef GPU, const Triple &T,
				StringMap<bool> &Features,
				std::string &ErrorMsg) {
				bool IsWave32Capable = isWave32Capable(GPU, T);
				const bool IsNullGPU = GPU.empty();
				// FIXME: Not diagnosing wavefrontsize32 on wave64 only targets.
				const bool HaveWave32 =
				(IsWave32Capable || IsNullGPU) && Features.count("wavefrontsize32");
				const bool HaveWave64 = Features.count("wavefrontsize64");
				if (HaveWave32 && HaveWave64) {
				ErrorMsg = "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive";
				return false;
				}
				// Don't assume any wavesize with an unknown subtarget.
				if (!IsNullGPU) {
				// Default to wave32 if available, or wave64 if not
				if (!HaveWave32 && !HaveWave64) {
				StringRef DefaultWaveSizeFeature =
				IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
				Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
				}
				}
				return true;
				}
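The wave-size defaulting in `insertWaveSizeFeature` can be tried outside LLVM with the following minimal sketch. It is a simplified mirror, not the patch's code: `insertWaveSize` is a hypothetical name, `std::map` stands in for `llvm::StringMap`, and the wave32 capability is passed in as a flag instead of being derived from the GPU name by `isWave32Capable`.

```cpp
#include <map>
#include <string>

using FeatureMap = std::map<std::string, bool>;

// Returns false and sets errorMsg when both wave sizes are requested.
// Otherwise inserts a default wave size for a known GPU if none was given.
bool insertWaveSize(const std::string &gpu, bool wave32Capable,
                    FeatureMap &features, std::string &errorMsg) {
  const bool isNullGpu = gpu.empty();
  // As in the patch, only the presence of the key is checked, not its value.
  const bool haveWave32 =
      (wave32Capable || isNullGpu) && features.count("wavefrontsize32") != 0;
  const bool haveWave64 = features.count("wavefrontsize64") != 0;
  if (haveWave32 && haveWave64) {
    errorMsg = "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive";
    return false;
  }
  // Do not assume any wave size for an unknown (empty) subtarget.
  if (!isNullGpu && !haveWave32 && !haveWave64)
    features.emplace(wave32Capable ? "wavefrontsize32" : "wavefrontsize64",
                     true);
  return true;
}
```

For a GPU that is not wave32-capable, such as gfx908, the sketch inserts `wavefrontsize64`, which agrees with the `+wavefrontsize64` entry expected by flang/test/Lower/OpenMP/target_cpu_features.f90.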

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

[30 lines not shown]

def OpenMP_Dialect : Dialect {

let extraClassDeclaration = [{		let extraClassDeclaration = [{
// Set the omp.is_device attribute on the module with the specified boolean		// Set the omp.is_device attribute on the module with the specified boolean
static void setIsDevice(Operation* module, bool isDevice);		static void setIsDevice(Operation* module, bool isDevice);

// Return the value of the omp.is_device attribute stored in the module if it		// Return the value of the omp.is_device attribute stored in the module if it
// exists, otherwise return false by default		// exists, otherwise return false by default
static bool getIsDevice(Operation* module);		static bool getIsDevice(Operation* module);

		// Set the omp.target_cpu attribute on the module with the specified string
		static void setTargetCpu(Operation* module, StringRef cpu);

		// Return the value of the omp.target_cpu attribute stored in the module if it
		// exists, otherwise return empty by default
		static std::string getTargetCpu(Operation* module);

		// Set the omp.target_cpu_features attribute on the module with
		// the specified string
		static void setTargetCpuFeatures(Operation* module, StringRef cpuFeatures);

		// Return the value of the omp.target_cpu_features attribute stored in
		// the module if it exists, otherwise return empty by default
		static std::string getTargetCpuFeatures(Operation* module);
}];		}];
}		}

// OmpCommon requires definition of OpenACC_Dialect.		// OmpCommon requires definition of OpenACC_Dialect.
include "mlir/Dialect/OpenMP/OmpCommon.td"		include "mlir/Dialect/OpenMP/OmpCommon.td"

class OpenMP_Op<string mnemonic, list<Trait> traits = []> :		class OpenMP_Op<string mnemonic, list<Trait> traits = []> :
Op<OpenMP_Dialect, mnemonic, traits>;		Op<OpenMP_Dialect, mnemonic, traits>;
[1,560 lines not shown]

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp

	[1,431 lines not shown]
	// exists, otherwise return false by default			// exists, otherwise return false by default
	bool OpenMPDialect::getIsDevice(Operation* module) {			bool OpenMPDialect::getIsDevice(Operation* module) {
	if (Attribute isDevice = module->getAttr("omp.is_device"))			if (Attribute isDevice = module->getAttr("omp.is_device"))
	if (isDevice.isa<mlir::BoolAttr>())			if (isDevice.isa<mlir::BoolAttr>())
	return isDevice.dyn_cast<BoolAttr>().getValue();			return isDevice.dyn_cast<BoolAttr>().getValue();
	return false;			return false;
	}			}

				// Set the omp.target_cpu attribute on the module with the specified string
				void OpenMPDialect::setTargetCpu(Operation *module, llvm::StringRef cpu) {
				module->setAttr(mlir::StringAttr::get(module->getContext(),
				llvm::Twine{"omp.target_cpu"}),
				mlir::StringAttr::get(module->getContext(), cpu));
				}

				// Return the value of the omp.target_cpu attribute stored in the module if it
				// exists, otherwise return empty by default
				std::string OpenMPDialect::getTargetCpu(Operation *module) {
				if (Attribute targetCpu = module->getAttr("omp.target_cpu"))
				if (targetCpu.isa<mlir::StringAttr>())
				return targetCpu.dyn_cast<StringAttr>().getValue().str();
				return llvm::Twine{""}.str();
				}

				// Set the omp.target_cpu_features attribute on the module with
				// the specified string
				void OpenMPDialect::setTargetCpuFeatures(Operation *module,
				llvm::StringRef cpuFeatures) {
				module->setAttr(mlir::StringAttr::get(module->getContext(),
				llvm::Twine{"omp.target_cpu_features"}),
				mlir::StringAttr::get(module->getContext(), cpuFeatures));
				}

				// Return the value of the omp.target_cpu_features attribute stored in the
				// module if it exists, otherwise return empty by default
				std::string OpenMPDialect::getTargetCpuFeatures(Operation *module) {
				if (Attribute targetCpu = module->getAttr("omp.target_cpu_features"))
				if (targetCpu.isa<mlir::StringAttr>())
				return targetCpu.dyn_cast<StringAttr>().getValue().str();
				return llvm::Twine{""}.str();
				}
	#define GET_ATTRDEF_CLASSES			#define GET_ATTRDEF_CLASSES
	#include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.cpp.inc"			#include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.cpp.inc"

	#define GET_OP_CLASSES			#define GET_OP_CLASSES
	#include "mlir/Dialect/OpenMP/OpenMPOps.cpp.inc"			#include "mlir/Dialect/OpenMP/OpenMPOps.cpp.inc"
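`getTargetCpuFeatures` returns the stored features as a single comma-separated string, so a consumer that needs individual features has to split it again. The following is a minimal sketch of such a split, assuming the `+name`/`-name` format seen in the test above. `parseFeatureString` is a hypothetical helper, not part of this patch or of MLIR.

```cpp
#include <map>
#include <sstream>
#include <string>

// Splits "+a,-b" into {a: true, b: false}.
// Entries that lack a leading '+' or '-' are skipped.
std::map<std::string, bool> parseFeatureString(const std::string &features) {
  std::map<std::string, bool> result;
  std::istringstream stream(features);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
      continue;
    result[item.substr(1)] = (item[0] == '+');
  }
  return result;
}
```

An empty string, which is what the getter returns when the attribute is absent, yields an empty map.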


[Clang][Flang][AMDGPU] Add support for AMDGPU to Flang driver
Closed · Public

Diff 503324

clang/lib/Basic/Targets/AMDGPU.cpp

clang/lib/Driver/ToolChains/CommonArgs.cpp

clang/lib/Driver/ToolChains/Flang.cpp

flang/include/flang/Frontend/FrontendActions.h

flang/lib/Frontend/FrontendActions.cpp

flang/test/Driver/target-cpu-features.f90

flang/test/Lower/OpenMP/target_cpu_features.f90

llvm/include/llvm/TargetParser/TargetParser.h

llvm/lib/TargetParser/TargetParser.cpp

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp