When packet size equals packet align and is power of 2, transform
read_pipe* and write_pipe* to specialized library function.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Please use AMDGPULibFunc instead of manual function name manipulations.
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
598 ↗ | (On Diff #111514) | "llvm::" prefix is not needed. |
616 ↗ | (On Diff #111514) | You should not use getOrInsertFunction directly, but use AMDGPULibCalls::getFunction(). |
645 ↗ | (On Diff #111514) | You should use AMDGPULibFunc and parser/mangler used across the whole this source. |
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
598 ↗ | (On Diff #111514) | will remove |
616 ↗ | (On Diff #111514) | whether in pre-link can be checked by whether the function is declaration. __read_pipe_* and __write_pipe_* are unmangled functions. AMDGPULibFunc::getFunction relies on argument information encoded in mangled function names, which does not work for unmangled functions. |
645 ↗ | (On Diff #111514) | will refactor this. |
You still fail to use getFunction and thus also fail to check prelink.
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
67 ↗ | (On Diff #111912) | It is better hide these details from the client. AMDGPULibFunc should care about all of it and FuncInfo shall be sufficient structure to describe a function. Simplifier should not care about details. |
616 ↗ | (On Diff #111514) | You are still using M->getOrInsertFunction() directly. |
lib/Target/AMDGPU/AMDGPULibFunc.cpp | ||
987 ↗ | (On Diff #111912) | I believe we can go without bruteforce search loop. |
lib/Target/AMDGPU/AMDGPULibFunc.h | ||
240 ↗ | (On Diff #111912) | There is already EI_READ_PIPE above. Is it mangled? Then why these are unmangled? |
366 ↗ | (On Diff #111912) | StringRef designed to be passed by value, here and below. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
693 ↗ | (On Diff #111912) | Run tests through opt -instnamer. |
Clang does not emit mangled functions for pipe builtin functions. Instead, it emits unmangled functions e.g. __read_pipe_*. These functions are not overloaded and will be implemented by device library.
The original implementation of AMDGPULibFunc assumes all library functions have mangled name. It extracts function argument information based on the mangled function name and use it to implement getOrInsertFunction.
For unmangled library functions, those functions regarding mangling or function parameter info are either irrelevant or useless. Those functions are needed for mangled functions because the mangling requires such information.
Therefore I subclass AMDGPULibFunc as AMDGPUMangledLibFunc and AMDGPUUnmangledLibFunc and move all the stuff needed for mangling to AMDGPUMangledLibFunc.
For AMDGPUUnmangledLibFunc, since there is no argument type information from mangling, AMDGPULibFunc::getOrInsertFunction cannot be implemented. There is no special trick about identifying whether it is prelink or not in the original implementation. Just checks whether the function is declaration or not. I see no point of implementing a special getOrInsertFunction for AMDGPUUnmangledLibFunc.
What about EI_READ_PIPE? Is it mangled? Shall it be removed?
Whatever you subclass all the changes shall be contained within AMDGPULibFunc and not in its clients. If some of the arguments on FuncInfo are unused in this case that is fine. You can also check the former implementation from HSAIL which was also handling EDG style magling (unmagled in therms you are using here).
All in all that is only legal to do it on prelink, and that is exactly what getFunction() does.
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
67 ↗ | (On Diff #111912) | The handling of mangled lib function requires function argument information, which is not available for unmangled lib function. The transformation of unmangled lib function does not require function argument information. There is no point of implementing many of the member functions of mangled lib functions. |
lib/Target/AMDGPU/AMDGPULibFunc.cpp | ||
987 ↗ | (On Diff #111912) | will add a map. |
lib/Target/AMDGPU/AMDGPULibFunc.h | ||
240 ↗ | (On Diff #111912) | clang does not emit mangled read_pipe functions. Will remove EI_READ_PIPE. |
366 ↗ | (On Diff #111912) | This comes from the original implementation. Will fix. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
693 ↗ | (On Diff #111912) | will fix. |
For pipe functions, clang currently does not do any mangling at all. It is not like the EDG style mangling.
As I said, getFunction() requires function argument information, which is not available for unmangled functions. To implement getFunction() requires providing argument information for unmangled functions, which is time consuming and also useless.
Run it without -amdgpu-prelink. It will fail to link. It will also fail to build library.
The device library has not implemented these functions yet. I think that's why it fails to link.
I will investigate why it fails to build library.
For mangled lib functions, my patch does not change how they are handled. They still go through getFunction.
For unmangled lib functions, I only transform them if they are declarations. In post-linking pass, they are already linked and are not declarations, therefore they stay unchanged.
I am wondering why there will be link failure.
Library build works before link, but you do not check that prelink transfirmations allowed.
library contains definition of unmangled functions, since they are not declaration, the pass will not change them.
Splitting AMDGPULibFunc in two classes looks a huge overkill. How about modifying AMDGPULibFunc::parse so it could accept unmangled names and just return an enum id for the function (using some fast lookup approach)? Type info for such functions can be left unpopulated and supposed to be handled by the client (as in fold_read_write_pipe).
The unmangled lib function is different from mangled function in the way how the function names and type information are handled. I have found a way to reuse the interface mangle, getOrInsertFunction, getFunction, and getFunctionType. However to achieve that we really need to have different classes for mangled and unmangled lib function and take advantages of some virtual functions. Having different classes for unmangled and mangled lib functions also have a cleaner design where all name mangling stuff are kept in where they belong.
Ok, unmangled part looks different indeed. If the issue with pre-link checking is solved this patch is ok with me.
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
616 ↗ | (On Diff #111514) | I see you are now checking for the declaration vs definition, but it is still only legal on pre-link and that is not checked. |
lib/Target/AMDGPU/AMDGPULibFunc.cpp | ||
987 ↗ | (On Diff #111912) | Not done. |
lib/Target/AMDGPU/AMDGPULibFunc.h | ||
366 ↗ | (On Diff #111912) | Not done. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
693 ↗ | (On Diff #111912) | Not done. |
Revised by Stas' comments. Use AMDGPULibFunc::getOrInsertFunction to create function.
lib/Target/AMDGPU/AMDGPULibFunc.h | ||
---|---|---|
366 ↗ | (On Diff #111912) | Initially (non-const) StringRef& was introduced intentionally to have mangledName with stripped name on return. |
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
636 ↗ | (On Diff #113673) | The point of using it is to check its result and bail if null. Also no modifications shall be done before this check. |
67 ↗ | (On Diff #111912) | The whole point of moving logic to distinguish between mangled and unmangled into the parser is to have no massive changes in the client like this and to let client to not bother differentiating on every other line. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
693 ↗ | (On Diff #111912) | I still see it. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
---|---|---|
693 ↗ | (On Diff #111912) | I've add -instnamer to the RUN line. Do you mean I should get the original .ll through opt -instnamer so that the .ll contains named instructions? |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
---|---|---|
693 ↗ | (On Diff #111912) | Yes, there shall be no numbered variables in the test as it has to be easily editable. |
test/CodeGen/AMDGPU/simplify-libcalls.ll | ||
---|---|---|
1 ↗ | (On Diff #113906) | -instanamer is not needed here. |
lib/Target/AMDGPU/AMDGPULibCalls.cpp | ||
---|---|---|
612 ↗ | (On Diff #113906) | Please move the actual IR change below "return false" statement. |