Changed the documentation of amdgpu_flat_work_group_size under AMD GPU Attributes, which previously suggested that the attribute is an optimization hint. As suggested in the bug https://bugs.llvm.org/show_bug.cgi?id=42989, it should be made mandatory.
Repository: rG LLVM Github Monorepo
Event Timeline
Minor wordsmithing on the documentation changes, but more importantly: why is the correct fix to change the documentation, as opposed to changing the default max working group size?
clang/include/clang/Basic/AttrDocs.td | ||
---|---|---|
2244–2247 |
Hi @aaron.ballman. Thanks for your feedback! I am an Outreachy applicant and totally new to this project. I am currently trying to understand the code base, so I thought I would update the documentation in the meantime. Later on, we can change the default max working group size as you suggest. What do you say: should we change the default max working group size directly and leave the documentation as it is?
Welcome!
> So I thought I would update the documentation in the meantime. Later on, we can change the default max working group size as you suggest. What do you say: should we change the default max working group size directly and leave the documentation as it is?
I'm not an AMD person and so I'm not certain I'm the *best* person to answer this, but my feeling is that this is a case where the implementation should be updated rather than the docs. Otherwise, we're effectively encouraging users to churn their code (add the attribute to places they didn't use it before) with the intention of undoing that in the future. However, I'm hoping someone more familiar with AMDGPU can pipe up with their opinions. @arsenm?
clang/include/clang/Basic/AttrDocs.td | ||
---|---|---|
2244–2247 | You're updating this with outdated information. In general, functions should be conservatively correct by default with no attribute specified. This was broken at one point in the past. The default assumed workgroup size is now 1024, but for OpenCL, Clang will always default to a max of 256. |
I am really not an ideal reviewer for this patch :-) I don't know anything about AMDGPU.
clang/include/clang/Basic/AttrDocs.td | ||
---|---|---|
2244–2247 | Oh, thanks for your feedback. I will update it. |
Closing this issue because the default workgroup size is 1024 now, so no changes are required.
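For readers skimming the thread, the defaults described in the inline comment above can be summarised with a small model. This is only a sketch in plain C, not Clang or LLVM code: the struct, the constants, and the function name `assumed_max_flat_wg_size` are hypothetical names chosen for illustration. It also assumes that an explicit amdgpu_flat_work_group_size(min, max) on a kernel takes precedence over either default.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an explicit
 * __attribute__((amdgpu_flat_work_group_size(min, max))) on a kernel.
 * This is an illustrative type, not something Clang defines. */
typedef struct {
    unsigned min;
    unsigned max;
} FlatWorkGroupSize;

/* Values taken from the review discussion: 1024 is the general default,
 * and 256 is the maximum Clang assumes for OpenCL. */
enum {
    DEFAULT_MAX_FLAT_WG_SIZE = 1024,
    OPENCL_MAX_FLAT_WG_SIZE = 256
};

/* Returns the maximum flat work-group size this model assumes for a kernel.
 * Pass attr == NULL when the kernel carries no explicit attribute.
 * Assumption: an explicit attribute overrides both defaults. */
unsigned assumed_max_flat_wg_size(bool is_opencl, const FlatWorkGroupSize *attr) {
    if (attr != NULL) {
        return attr->max;
    }
    return is_opencl ? OPENCL_MAX_FLAT_WG_SIZE : DEFAULT_MAX_FLAT_WG_SIZE;
}
```

Under this model, a kernel with no attribute is already conservatively correct, which is why the thread concludes that neither the documentation nor user code needs to change.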