This is the first patch of several that will enable generating code for AMD GPUs. It adds the AMDGPU target so it can be used with the --target and -mcpu options.
Details
Diff Detail
Event Timeline
It will be helpful if you can share a godbolt link with the types that you see for complex argument and return type like in https://godbolt.org/z/x73xT3r83. I see some difference from what you have given here. Not sure whether it is because of an incorrect invocation.
Please add tests as well (refer to previous commits https://reviews.llvm.org/D136547).
I would have expected something ala:
flang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.f90
https://openmp.org/wp-content/uploads/SC18-BoothTalks-Ozen.pdf
gfx902 is not a cpu.
Yes, the code for complex is probably premature. It might be better to remove it and have stubs with TODO("adds support for complex ") instead. We cannot generate code for amdgcn right now because there are a lot of missing pieces (function attributes, metadata, address spaces) and without creating huge patch it is better to create small ones and incrementally build up the capability until we have something that works. Complex is pretty far down the list of priorities right now.
This is not for OpenMP, this is just to allow compilation of a file using the AMDGPU target. You can try the same command in clang:
clang --target=amdgcn-amd-amdhsa -mcpu=gfx902 -c hello.c -###
"/home/jsjodin/git/trunk/llvm-project/build/bin/clang-17" "-cc1" "-triple" "amdgcn-amd-amdhsa" "-emit-obj" "-mrelax-all" "-disable-free" "-clear-ast-before-backend" "-main-file-name" "hello.c" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/opencl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_isa_version_902.bc" "-mlink-builtin-bitcode" "/opt/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx902" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/jsjodin/git/trunk/llvm-project/build/lib/clang/17" "-fdebug-compilation-dir=/home/jsjodin/git/trunk/llvm-project/flang" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-o" "hello.o" "-x" "c" "/home/jsjodin/test/fortran/hello.c"
As you can see there are lots more options being generated here, but the -triple and -target-cpu are there.
Would it be possible to add a test for the defaultWidth? Like in flang/test/Fir/target-rewrite-boxchar.fir?
flang/lib/Optimizer/CodeGen/Target.cpp | ||
---|---|---|
476 | default width of what? |
flang/lib/Optimizer/CodeGen/Target.cpp | ||
---|---|---|
476 | It is the default width of the index type for a string. I assumed that the offsets were 64-bits by default. |
flang/lib/Optimizer/CodeGen/Target.cpp | ||
---|---|---|
476 | Could use a better name or at least documentation |
flang/lib/Optimizer/CodeGen/Target.cpp | ||
---|---|---|
476 | Yes, I agree that the name is not very descriptive but it is already part of the code and this is just an override. I can add a comment. |
default width of what?