This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] OpenMPOpt Support for Globalization Remarks
ClosedPublic

Authored by jhuber6 on Sep 24 2020, 9:54 AM.

Details

Summary

This patch adds support for printing analysis remarks about data globalization on the GPU. Globalization occurs when data is shared between threads in a GPU context and must be pushed to global or shared memory.
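To illustrate the kind of code the summary describes, here is a minimal sketch. It is not the test file from this revision, and the names read_through_ref, bar's body, and run_on_device are hypothetical. It assumes that a local variable whose address escapes inside device code is the sort of data Clang may globalize when targeting a GPU; whether the remark actually fires depends on the Clang version and optimization pipeline.

```cpp
// Hypothetical sketch, not the test file from this revision.
// Builds with or without OpenMP; without -fopenmp the pragmas are ignored.

#pragma omp declare target
// Taking the argument by reference forces the caller's local to have an address.
int read_through_ref(int &value) { return value; }

int bar() {
  int local = 42;                 // address escapes via the call below
  return read_through_ref(local); // candidate for globalization in device code
}
#pragma omp end declare target

int run_on_device() {
  int result = 0;
#pragma omp target parallel map(tofrom : result)
  {
    // Every thread stores the same value; the atomic write avoids a data race.
#pragma omp atomic write
    result = bar();
  }
  return result;
}
```

Compiling a file like this with the clang++ invocation quoted later in this thread is one way to check whether the globalization remark is reported for bar.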

Event Timeline

jhuber6 created this revision. Sep 24 2020, 9:54 AM
jhuber6 requested review of this revision. Sep 24 2020, 9:54 AM

Test?

I tested it just using the example file to see if it works. I tried just calling opt on it but it was giving me some dumb error so I figured I'd just get it set up first.

clang++ -fopenmp clang/tests/OpenMP/declare_target_codegen_globalization.cpp -Rpass=openmp-opt -Rpass-analysis=openmp-opt -S -emit-llvm -fopenmp-targets=nvptx64-nvidia-cuda -O3
jdoerfert added inline comments. Sep 24 2020, 10:18 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
708

Can't we do foreachuse?

I think the problem is calling opt on the file since it's combined with the nvptx IR. Is there some opt command line option that does that correctly?

jhuber6 updated this revision to Diff 294111. Sep 24 2020, 10:43 AM

Adding a test case and changing the analysis to use forEachUse. The nvptx file doesn't have line number information even after compiling with debugging symbols, so the remark just says "unknown." When you get the remarks from clang, it seems to print the remark more times than necessary.

declare_target_codegen_globalization.cpp:7:5: remark: Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [-Rpass-analysis=openmp-opt]
int bar() {
    ^
remark: Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [-Rpass-analysis=openmp-opt]
remark: Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [-Rpass-analysis=openmp-opt]
jhuber6 updated this revision to Diff 294151. Sep 24 2020, 1:38 PM

Manually adding the debug info for the stack push, which Clang does not currently generate. A real solution will require modifying Clang's code generation.

jdoerfert accepted this revision. Sep 24 2020, 2:59 PM

LGTM

llvm/test/Transforms/OpenMP/globalization_remarks.ll
63 (On Diff #294151)

I think you want !31 but I guess it doesn't really matter.

This revision is now accepted and ready to land. Sep 24 2020, 2:59 PM
This revision was landed with ongoing or failed builds. Sep 24 2020, 3:23 PM
This revision was automatically updated to reflect the committed changes.