Const/local/shared address spaces are all < 4GB and we can always use
32-bit pointers to access them. This has substantial performance impact
on kernels that uses shared memory for intermediary results.
The feature is disabled by default.
Paths
| Differential D46147
[NVPTX] Added a feature to use short pointers for const/local/shared AS. ClosedPublic Authored by tra on Apr 26 2018, 3:18 PM.
Details Summary Const/local/shared address spaces are all < 4GB and we can always use The feature is disabled by default.
Diff Detail
Event TimelineHerald added subscribers: bixia, hiraditya, sanjoy, jholewinski. · View Herald TranscriptApr 26 2018, 3:18 PM Comment Actions Ben, are you comfortable reviewing this?
This revision is now accepted and ready to land.Apr 27 2018, 2:00 PM Comment Actions NVPTX wasn't already doing this? As this is a module property, a subtarget feature can't be used for this. For example what happens for a global value? Comment Actions
No, NVPTX is always using 64-bit pointers for all address spaces and that's one of the common situations where it gives NVCC an edge. It's not clear what concerns you about global values. Could you elaborate or give me an example? Comment Actions
The subtarget is a function level property. The pointer size / datalayout is a module property and can't be different between functions. If you have multiple functions with different pointer sizes in a module (or no functions at all), then what pointer size ends up used for a global? The global itself has a size, and any ConstantExpr pointers in its initializer will also need to have a size. Comment Actions
I think I see. So, someone could add "target-attributes" to a function and that may potentially override the features. It does not harm us in this case as we only set datalayout once when it's called during creation NVPTXTargetMachine so only the features supplied on command line will be affected. That said, I do see your point that subtarget is not the right place for this option. After a bit of digging I've found a curious comment in LLVMTargetMachine, so this approach is both used in other places and considered problemetic. // FIXME: Having an MCSubtargetInfo on the target machine is a hack due // to some backends having subtarget feature dependent module level // code generation. This is similar to the hack in the AsmPrinter for // module level assembly etc. STI = TheTarget.createMCSubtargetInfo(getTargetTriple().str(), getTargetCPU(), getTargetFeatureString()); One way out of this is to add a separate command-line option to control this. Is there a better way to do it? Comment Actions Moved control of the feature from subtarget feature to --nvptx-short-ptr command-line option. Comment Actions Would it be possible to get rid of the flag entirely and make this controlled only by the data layout? That's defined on the module level. Comment Actions
Yes, ultimately the switch is just changing the datalayout. If this is just a short term thing for migration I think a global flag is fine. If you want to be able to swap these properly long term, this is a different arch name / TargetMachine. Comment Actions
I'm still not quite sure how clang's setting of data layout in the module plays with LLVM's. When they are out of sync, LLVM throws an error which suggests that we currently can't just have arbitrary datalayout specified by the module, it must also match LLVM's idea of the data layout used by particular back-end. At least, that's my understanding how things currently work. I might be wrong. In any case, I do want the short pointers to eventually become the default. However, it's a somewhat risky change, so I need an escape hatch to be able to test the changes without breaking everyone. I know of at least one issue in the split-GEP pass (will send a patch soon). There are likely more. Comment Actions
You need to repeat the datalayout in clang as the target expects Closed by commit rL331941: [NVPTX] Added a feature to use short pointers for const/local/shared AS. (authored by tra). · Explain WhyMay 9 2018, 4:49 PM This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 146033 llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h
llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp
llvm/trunk/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/trunk/lib/Target/NVPTX/NVPTXSubtarget.h
llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.h
llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp
llvm/trunk/test/CodeGen/NVPTX/addrspacecast.ll
llvm/trunk/test/CodeGen/NVPTX/ld-addrspace.ll
llvm/trunk/test/CodeGen/NVPTX/st-addrspace.ll
|