This is an archive of the discontinued LLVM Phabricator instance.

[flang] Add AMDGPU target in flang
ClosedPublic

Authored by jsjodin on Feb 1 2023, 12:40 PM.

Details

Summary

This is the first patch of several that will enable generating code for AMD GPUs. It adds the AMDGPU target so it can be used with the --target and -mcpu options.

Diff Detail

Event Timeline

jsjodin created this revision.Feb 1 2023, 12:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2023, 12:40 PM
jsjodin requested review of this revision.Feb 1 2023, 12:40 PM

It will be helpful if you can share a godbolt link with the types that you see for complex argument and return type like in https://godbolt.org/z/x73xT3r83. I see some difference from what you have given here. Not sure whether it is because of an incorrect invocation.

Please add tests as well (refer to previous commits https://reviews.llvm.org/D136547).

I would have expected something ala:

flang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.f90

https://openmp.org/wp-content/uploads/SC18-BoothTalks-Ozen.pdf

gfx902 is not a cpu.

jsjodin added a comment.EditedFeb 2 2023, 5:04 AM

It will be helpful if you can share a godbolt link with the types that you see for complex argument and return type like in https://godbolt.org/z/x73xT3r83. I see some difference from what you have given here. Not sure whether it is because of an incorrect invocation.

Please add tests as well (refer to previous commits https://reviews.llvm.org/D136547).

Yes, the code for complex is probably premature. It might be better to remove it and have stubs with TODO("adds support for complex ") instead. We cannot generate code for amdgcn right now because there are a lot of missing pieces (function attributes, metadata, address spaces) and without creating huge patch it is better to create small ones and incrementally build up the capability until we have something that works. Complex is pretty far down the list of priorities right now.

I would have expected something ala:

flang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.f90

https://openmp.org/wp-content/uploads/SC18-BoothTalks-Ozen.pdf

gfx902 is not a cpu.

This is not for OpenMP, this is just to allow compilation of a file using the AMDGPU target. You can try the same command in clang:
clang --target=amdgcn-amd-amdhsa -mcpu=gfx902 -c hello.c -###

"/home/jsjodin/git/trunk/llvm-project/build/bin/clang-17" "-cc1" "-triple" "amdgcn-amd-amdhsa" "-emit-obj" "-mrelax-all" "-disable-free" "-clear-ast-before-backend" "-main-file-name" "hello.c" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/opencl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_isa_version_902.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx902" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/jsjodin/git/trunk/llvm-project/build/lib/clang/17" "-fdebug-compilation-dir=/home/jsjodin/git/trunk/llvm-project/flang" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-o" "hello.o" "-x" "c" "/home/jsjodin/test/fortran/hello.c"

As you can see there are lots more options being generated here, but the -triple and -target-cpu are there.

jsjodin updated this revision to Diff 494278.Feb 2 2023, 5:52 AM

Remove complex arg/return type code.

Would it be possible to add a test for the defaultWidth? Like in flang/test/Fir/target-rewrite-boxchar.fir?

arsenm added a subscriber: arsenm.Feb 3 2023, 10:38 AM
arsenm added inline comments.
flang/lib/Optimizer/CodeGen/Target.cpp
476

default width of what?

Would it be possible to add a test for the defaultWidth? Like in flang/test/Fir/target-rewrite-boxchar.fir?

Yes, I will update the patch with the added test.

jsjodin added inline comments.Feb 7 2023, 7:26 AM
flang/lib/Optimizer/CodeGen/Target.cpp
476

It is the default width of the index type for a string. I assumed that the offsets were 64-bits by default.

jsjodin updated this revision to Diff 495533.Feb 7 2023, 7:27 AM

Add index width test.

kiranchandramohan accepted this revision.Feb 7 2023, 7:30 AM

LGTM. Please wait for a day in case there are comments from others.

This revision is now accepted and ready to land.Feb 7 2023, 7:30 AM
arsenm added inline comments.Feb 7 2023, 7:32 AM
flang/lib/Optimizer/CodeGen/Target.cpp
476

Could use a better name or at least documentation

jsjodin added inline comments.Feb 7 2023, 8:27 AM
flang/lib/Optimizer/CodeGen/Target.cpp
476

Yes, I agree that the name is not very descriptive but it is already part of the code and this is just an override. I can add a comment.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 11:59 AM