Similar to D105787, this patch tries to fold __kmpc_parallel_level if possible.
Note that __kmpc_parallel_level doesn't take activeness into consideration:
based on the current deviceRTLs, its return value can be 0, 1, 2, etc., rather
than 0, 129, 130, etc., which would also encode activeness.
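To make the two value ranges concrete, here is a minimal sketch of the encoding, assuming the activeness marker is a single flag of value 128 (inferred from 129 = 128 + 1 and 130 = 128 + 2). The names `kActiveFlag`, `encodeLevel`, and `depthOnly` are hypothetical and are not taken from deviceRTLs.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: an "active" parallel region is marked by adding 128 to the
// nesting depth. This is inferred from the 129/130 values in the summary,
// not copied from the runtime sources.
constexpr std::uint16_t kActiveFlag = 128; // hypothetical name

// Encoded level: the nesting depth, plus the flag when the region is active.
constexpr std::uint16_t encodeLevel(std::uint16_t depth, bool active) {
  return static_cast<std::uint16_t>(active ? (depth | kActiveFlag) : depth);
}

// What a query that ignores activeness reports: the nesting depth only.
constexpr std::uint16_t depthOnly(std::uint16_t encoded) {
  return static_cast<std::uint16_t>(encoded & (kActiveFlag - 1));
}

// Small usage example covering the values mentioned in the summary.
inline void checkEncodingExamples() {
  assert(encodeLevel(1, true) == 129);
  assert(encodeLevel(2, true) == 130);
  assert(depthOnly(129) == 1);
  assert(depthOnly(2) == 2);
}
```

Under this assumption, a fold only has to produce the plain depth (0, 1, 2, ...), which is why activeness does not need to be tracked for this call.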
Repository: rG LLVM Github Monorepo
Event Timeline
Add noinline to the microtask function so that we don't accidentally do the wrong thing. I'll put up a patch that removes those noinline attributes in the module pass.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
3408 | Seems unnecessary because this variable is only written in the function call but I can do it. |
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
3820 | I changed this a little bit. If CallerKernelInfoAA.ParallelLevels is empty, we can only tell we currently cannot fold it, but it doesn't mean it cannot be folded in the future. If the size is more than 1, yeah, definitely it is pessimistic. |
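The three cases described in this comment can be sketched as follows. This is a simplified illustration rather than the code in OpenMPOpt.cpp: `FoldState`, `FoldResult`, and `tryFoldParallelLevel` are hypothetical names, and a plain `std::set<int>` stands in for `CallerKernelInfoAA.ParallelLevels`. That a single known level folds to that value is an assumption implied by the comment, not stated in it.

```cpp
#include <cassert>
#include <optional>
#include <set>

// Hypothetical three-way outcome of a fold attempt.
enum class FoldState {
  Unknown,    // nothing known yet; a later iteration may still fold
  Folded,     // exactly one possible level, so the call folds to a constant
  Pessimistic // several possible levels; folding is given up
};

struct FoldResult {
  FoldState state;
  std::optional<int> value; // set only when state == FoldState::Folded
};

// `levels` stands in for the set of parallel levels collected from callers.
inline FoldResult tryFoldParallelLevel(const std::set<int> &levels) {
  if (levels.empty())
    return {FoldState::Unknown, std::nullopt}; // cannot fold *yet*
  if (levels.size() > 1)
    return {FoldState::Pessimistic, std::nullopt}; // conflicting levels
  return {FoldState::Folded, *levels.begin()}; // assumed single-level case
}
```

Keeping the empty case distinct from the pessimistic one matters in a fixpoint analysis: an empty set only means no information has arrived so far, so the state should stay open rather than be fixed prematurely.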
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
3846 | Not nullptr, but it should continue to be none. |
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
3846 | Aha, I recall it. Yeah, will do it. |
openmp/libomptarget/deviceRTLs/interface.h | ||
---|---|---|
444 | Isn't there a NOINLINE definition in target_impl.h? |
These changes are covered by D106149 and will eventually be dropped from this patch.