Changed the documentation of amdgpu_flat_work_group_size under AMD GPU Attributes, which suggested that the attribute is an optimization hint. As suggested in bug https://bugs.llvm.org/show_bug.cgi?id=42989, it should be made mandatory.
Hi @aaron.ballman. Thanks for your feedback! I am an Outreachy applicant and totally new to this project. I am currently trying to understand the code base, so I thought I would update the documentation in the meantime. Later on, we can change the default max work-group size as you suggested. What do you say: should we directly change the default max work-group size and not the documentation?
> So thought to update the documentation meanwhile. Later on we can change the default max working group size with your suggestion. What do you say, should we directly change the default max working group size and not the documentation?
I'm not an AMD person and so I'm not certain I'm the *best* person to answer this, but my feeling is that this is a case where the implementation should be updated rather than the docs. Otherwise, we're effectively encouraging users to churn their code (add the attribute to places they didn't use it before) with the intention of undoing that in the future. However, I'm hoping someone more familiar with AMDGPU can pipe up with their opinions. @arsenm?
You're updating this with outdated information. In general, functions should be conservatively correct by default with no attribute specified. This was broken at one point in the past. The default assumed workgroup size is now 1024, but for OpenCL, Clang will always default to a max of 256.
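To make the defaults described above concrete, here is a minimal sketch in C of how the assumed maximum could be chosen. It is an illustrative model only, not Clang's or LLVM's actual code: the function name, the constants' names, and the "0 means no attribute" convention are all hypothetical. Only the values 1024 and 256 come from the comment above.

```c
#include <stdbool.h>

/* Illustrative model only; not taken from Clang or LLVM sources.
 * The two values are the ones stated in the review comment. */
enum {
    DEFAULT_MAX_FLAT_WG_SIZE = 1024,       /* assumed when nothing is specified */
    OPENCL_DEFAULT_MAX_FLAT_WG_SIZE = 256  /* Clang's default max for OpenCL */
};

/* Hypothetical helper: returns the maximum flat work-group size the compiler
 * would assume for a kernel.
 *
 * attr_max is the <max> argument of
 *   __attribute__((amdgpu_flat_work_group_size(<min>, <max>)))
 * or 0 if the kernel carries no such attribute (a convention made up here). */
static unsigned assumed_max_flat_wg_size(bool is_opencl, unsigned attr_max) {
    if (attr_max != 0)
        return attr_max; /* an explicit attribute overrides the default */
    return is_opencl ? OPENCL_DEFAULT_MAX_FLAT_WG_SIZE
                     : DEFAULT_MAX_FLAT_WG_SIZE;
}
```

Under this model, a kernel with no attribute is still compiled for a conservative upper bound, which is why the attribute would not need to be added everywhere just to get correct code.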