This is an archive of the discontinued LLVM Phabricator instance.

[flang] Add AMDGPU target in flang
ClosedPublic

Authored by jsjodin on Feb 1 2023, 12:40 PM.


Details

Reviewers

awarzynski
clementval
jdoerfert
kiranktp
gregrodgers
domada
TIFitis
sscalpone
kiranchandramohan
peixin
psoni2628

Commits

rG08749a9137a5: [flang] Add AMDGPU target in flang

Summary

This is the first patch of several that will enable generating code for AMD GPUs. It adds the AMDGPU target so it can be used with the --target and -mcpu options.

Event Timeline

jsjodin created this revision. Feb 1 2023, 12:40 PM

Herald added a project: Restricted Project. Feb 1 2023, 12:40 PM

Herald added subscribers: kosarev, sunshaoce, mehdi_amini and 4 others.

jsjodin requested review of this revision. Feb 1 2023, 12:40 PM

Herald added a subscriber: wdng. Feb 1 2023, 12:40 PM

Harbormaster completed remote builds in B211292: Diff 494037. Feb 1 2023, 1:12 PM

It will be helpful if you can share a godbolt link with the types that you see for complex argument and return type like in https://godbolt.org/z/x73xT3r83. I see some difference from what you have given here. Not sure whether it is because of an incorrect invocation.

Please add tests as well (refer to previous commits https://reviews.llvm.org/D136547).

I would have expected something ala:

flang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.f90

https://openmp.org/wp-content/uploads/SC18-BoothTalks-Ozen.pdf

gfx902 is not a cpu.

In D143102#4097753, @kiranchandramohan wrote:

It will be helpful if you can share a godbolt link with the types that you see for complex argument and return type like in https://godbolt.org/z/x73xT3r83. I see some difference from what you have given here. Not sure whether it is because of an incorrect invocation.

Please add tests as well (refer to previous commits https://reviews.llvm.org/D136547).

Yes, the code for complex is probably premature. It might be better to remove it and have stubs with TODO("add support for complex") instead. We cannot generate code for amdgcn right now because a lot of pieces are missing (function attributes, metadata, address spaces). Rather than creating one huge patch, it is better to create small ones and incrementally build up the capability until we have something that works. Complex is pretty far down the list of priorities right now.

In D143102#4097773, @tschuett wrote:
I would have expected something ala:
flang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.f90
https://openmp.org/wp-content/uploads/SC18-BoothTalks-Ozen.pdf

gfx902 is not a cpu.

This is not for OpenMP; it only allows a file to be compiled for the AMDGPU target. You can try the same command in clang:
clang --target=amdgcn-amd-amdhsa -mcpu=gfx902 -c hello.c -###

"/home/jsjodin/git/trunk/llvm-project/build/bin/clang-17" "-cc1" "-triple" "amdgcn-amd-amdhsa" "-emit-obj" "-mrelax-all" "-disable-free" "-clear-ast-before-backend" "-main-file-name" "hello.c" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/opencl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_isa_version_902.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx902" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/jsjodin/git/trunk/llvm-project/build/lib/clang/17" "-fdebug-compilation-dir=/home/jsjodin/git/trunk/llvm-project/flang" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-o" "hello.o" "-x" "c" "/home/jsjodin/test/fortran/hello.c"

As you can see, clang generates many more options here, but -triple and -target-cpu are both present.
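The check described here (that -triple and -target-cpu reach the frontend invocation) can be sketched in C++. This is only an illustration and not part of the patch: splitQuotedArgs and valueAfter are hypothetical helper names, and the parsing assumes every argument is wrapped in plain double quotes with no escaped quotes, as in the -### output above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one `-###` line into its double-quoted
// arguments. Assumes no escaped quotes inside an argument.
std::vector<std::string> splitQuotedArgs(const std::string &line) {
  std::vector<std::string> args;
  std::size_t pos = 0;
  while ((pos = line.find('"', pos)) != std::string::npos) {
    const std::size_t end = line.find('"', pos + 1);
    if (end == std::string::npos)
      break; // unterminated quote: ignore the remainder
    args.push_back(line.substr(pos + 1, end - pos - 1));
    pos = end + 1;
  }
  return args;
}

// Hypothetical helper: return the argument that follows `flag`, or an
// empty string when the flag is absent or is the last argument.
std::string valueAfter(const std::vector<std::string> &args,
                       const std::string &flag) {
  assert(!flag.empty() && "flag name must not be empty");
  for (std::size_t i = 0; i + 1 < args.size(); ++i)
    if (args[i] == flag)
      return args[i + 1];
  return std::string();
}
```

Applied to the clang output above, valueAfter(args, "-triple") would be expected to yield amdgcn-amd-amdhsa and valueAfter(args, "-target-cpu") to yield gfx902.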

Remove complex arg/return type code.

Harbormaster completed remote builds in B211469: Diff 494278. Feb 2 2023, 6:21 AM

Would it be possible to add a test for the defaultWidth? Like in flang/test/Fir/target-rewrite-boxchar.fir?

arsenm added a subscriber: arsenm. Feb 3 2023, 10:38 AM

arsenm added inline comments.

flang/lib/Optimizer/CodeGen/Target.cpp
Line 476: default width of what?

In D143102#4103090, @kiranchandramohan wrote:

Would it be possible to add a test for the defaultWidth? Like in flang/test/Fir/target-rewrite-boxchar.fir?

Yes, I will update the patch with the added test.

jsjodin added inline comments. Feb 7 2023, 7:26 AM

flang/lib/Optimizer/CodeGen/Target.cpp
Line 476: It is the default width of the index type for a string. I assumed that the offsets were 64 bits by default.

Add index width test.

LGTM. Please wait for a day in case there are comments from others.

This revision is now accepted and ready to land. Feb 7 2023, 7:30 AM

arsenm added inline comments. Feb 7 2023, 7:32 AM

flang/lib/Optimizer/CodeGen/Target.cpp
Line 476: Could use a better name or at least documentation

Harbormaster completed remote builds in B212393: Diff 495533. Feb 7 2023, 8:22 AM

jsjodin added inline comments. Feb 7 2023, 8:27 AM

flang/lib/Optimizer/CodeGen/Target.cpp
Line 476: Yes, I agree that the name is not very descriptive but it is already part of the code and this is just an override. I can add a comment.
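To illustrate what "just an override" can mean for a constant such as defaultWidth, here is a minimal, self-contained C++ sketch. It assumes the generic base reads the constant from the derived class through CRTP, which the declaration GenericTarget<TargetAMDGPU> in the patch suggests. Every type and function name below is a hypothetical stand-in, not the flang class, and the 32-bit target exists only for contrast.

```cpp
#include <cassert>

// Hypothetical stand-in for the abstract target interface.
struct CodeGenSpecificsSketch {
  virtual ~CodeGenSpecificsSketch() = default;
  // Bit width of the integer used to index (or measure) a character string.
  virtual int stringIndexWidth() const = 0;
};

// CRTP base: the shared code reads the constant declared by the derived
// class S, so a new target only has to declare its own value.
template <typename S>
struct GenericTargetSketch : CodeGenSpecificsSketch {
  int stringIndexWidth() const override { return S::defaultWidth; }
};

// Hypothetical 32-bit target, present only for contrast.
struct Target32Sketch : GenericTargetSketch<Target32Sketch> {
  static constexpr int defaultWidth = 32;
};

// Mirrors the shape of the AMDGPU target in the patch: 64-bit indices.
struct TargetAMDGPUSketch : GenericTargetSketch<TargetAMDGPUSketch> {
  static constexpr int defaultWidth = 64;
};

// Query the width through the base interface, as generic code would.
inline int queryWidth(const CodeGenSpecificsSketch &target) {
  const int width = target.stringIndexWidth();
  assert((width == 32 || width == 64) && "only widths modelled in this sketch");
  return width;
}
```

With this layout the constant's meaning is documented once, on the base interface, which is one way to address the naming concern without renaming the existing member.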

Closed by commit rG08749a9137a5: [flang] Add AMDGPU target in flang (authored by jsjodin). Feb 7 2023, 11:59 AM

This revision was automatically updated to reflect the committed changes.

jsjodin added a commit: rG08749a9137a5: [flang] Add AMDGPU target in flang.

Herald added a project: Restricted Project. Feb 7 2023, 11:59 AM

Revision Contents

Path                                         Size
flang/lib/Optimizer/CodeGen/Target.cpp       29 lines
flang/test/Driver/target-gpu-features.f90    10 lines

Diff 494278

flang/lib/Optimizer/CodeGen/Target.cpp

[459 unchanged lines hidden; context: complexReturnType(mlir::Location loc, mlir::Type eleTy) const override {]
     } else {
       TODO(loc, "complex for this precision");
     }
     return marshal;
   }
 };
 } // namespace

+//===----------------------------------------------------------------------===//
+// AMDGPU linux target specifics.
+//===----------------------------------------------------------------------===//
+
+namespace {
+struct TargetAMDGPU : public GenericTarget<TargetAMDGPU> {
+  using GenericTarget::GenericTarget;
+
+  static constexpr int defaultWidth = 64;
+
+  CodeGenSpecifics::Marshalling
+  complexArgumentType(mlir::Location loc, mlir::Type eleTy) const override {
+    CodeGenSpecifics::Marshalling marshal;
+    TODO(loc, "handle complex argument types");
+    return marshal;
+  }
+
+  CodeGenSpecifics::Marshalling
+  complexReturnType(mlir::Location loc, mlir::Type eleTy) const override {
+    CodeGenSpecifics::Marshalling marshal;
+    TODO(loc, "handle complex return types");
+    return marshal;
+  }
+};
+} // namespace
+
 // Instantiate the overloaded target instance based on the triple value.
 // TODO: Add other targets to this file as needed.
 std::unique_ptr<fir::CodeGenSpecifics>
 fir::CodeGenSpecifics::get(mlir::MLIRContext *ctx, llvm::Triple &&trp,
                            KindMapping &&kindMap) {
   switch (trp.getArch()) {
   default:
     break;
[16 unchanged lines hidden; context: case llvm::Triple::ArchType::sparc:]
     return std::make_unique<TargetSparc>(ctx, std::move(trp),
                                          std::move(kindMap));
   case llvm::Triple::ArchType::sparcv9:
     return std::make_unique<TargetSparcV9>(ctx, std::move(trp),
                                            std::move(kindMap));
   case llvm::Triple::ArchType::riscv64:
     return std::make_unique<TargetRISCV64>(ctx, std::move(trp),
                                            std::move(kindMap));
+  case llvm::Triple::ArchType::amdgcn:
+    return std::make_unique<TargetAMDGPU>(ctx, std::move(trp),
+                                          std::move(kindMap));
   }
   TODO(mlir::UnknownLoc::get(ctx), "target not implemented");
 }
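The dispatch pattern in fir::CodeGenSpecifics::get can be shown in isolation. The following is a simplified sketch under stated assumptions: ArchSketch, TargetSketch, getTargetSketch and targetNameFor are hypothetical names, the constructors take no arguments (the real ones receive the context, triple and kind mapping), and an exception stands in for the TODO(...) call that the original reaches for unsupported targets.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for llvm::Triple::ArchType (only three values).
enum class ArchSketch { riscv64, amdgcn, other };

struct TargetSketch {
  virtual ~TargetSketch() = default;
  virtual std::string name() const = 0;
};

struct RISCV64Sketch : TargetSketch {
  std::string name() const override { return "riscv64"; }
};

struct AMDGPUSketch : TargetSketch {
  std::string name() const override { return "amdgcn"; }
};

// Pick the target implementation from the architecture. Supporting a new
// target means adding one `case`; anything else falls through to the error.
std::unique_ptr<TargetSketch> getTargetSketch(ArchSketch arch) {
  switch (arch) {
  default:
    break;
  case ArchSketch::riscv64:
    return std::make_unique<RISCV64Sketch>();
  case ArchSketch::amdgcn:
    return std::make_unique<AMDGPUSketch>();
  }
  // Stand-in for the "target not implemented" TODO in the original.
  throw std::runtime_error("target not implemented");
}

// Convenience wrapper used to show the factory in action.
std::string targetNameFor(ArchSketch arch) {
  const std::unique_ptr<TargetSketch> target = getTargetSketch(arch);
  assert(target && "factory never returns null in this sketch");
  return target->name();
}
```

Keeping the default label first and the error after the switch, as the original does, means an unhandled architecture always reaches a single reporting point.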

flang/test/Driver/target-gpu-features.f90

This file was added.

! REQUIRES: amdgpu-registered-target

! Test that -mcpu are used and that the -target-cpu and -target-features
! are also added to the fc1 command.

! RUN: %flang --target=amdgcn-amd-amdhsa -mcpu=gfx902 -c %s -### 2>&1 \
! RUN: | FileCheck %s -check-prefix=CHECK-AMDGCN

! CHECK-AMDGCN: "-fc1" "-triple" "amdgcn-amd-amdhsa"
! CHECK-AMDGCN-SAME: "-target-cpu" "gfx902"