Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll)
produces code with complex control flow, which hurts later optimizations. Since
NVPTX doesn't have vector registers in LLVM's sense
(NVPTXTTI::getRegisterBitWidth(true) == 32), for now we declare that it has no
vector registers, which effectively disables loop vectorization.
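As a rough illustration of the mechanism described above, here is a minimal, self-contained sketch. It is a toy model, not the code in this patch: `ToyTargetInfo` and `shouldAttemptLoopVectorization` are hypothetical names, and only the shape of the `getNumberOfRegisters(bool Vector)` query is meant to resemble the TTI hook. The assumption is that the vectorizer bails out early when a target reports zero vector registers.

```cpp
// Hypothetical stand-in for a target's cost-model hooks. The real
// TargetTransformInfo interface is much larger; this only models one query.
struct ToyTargetInfo {
  unsigned ScalarRegs; // number of scalar registers the target reports
  unsigned VectorRegs; // number of vector registers the target reports

  // Same shape as a "how many registers of this kind" query:
  // Vector == true asks about vector registers, false about scalar ones.
  unsigned getNumberOfRegisters(bool Vector) const {
    return Vector ? VectorRegs : ScalarRegs;
  }
};

// Hypothetical gate: skip loop vectorization entirely when the target
// claims to have no vector registers.
bool shouldAttemptLoopVectorization(const ToyTargetInfo &TTI) {
  return TTI.getNumberOfRegisters(/*Vector=*/true) != 0;
}
```

Under this model, a target constructed as `ToyTargetInfo{1, 0}` is never offered to the vectorizer, while `ToyTargetInfo{16, 16}` is. Reporting zero vector registers is a blunt switch: it turns the pass off for the whole target rather than tuning its cost model.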
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Comment Actions
Justin, I wonder why NVPTX doesn't leverage vector instructions (such as
vadd) at all. Running llc on
fadd <2 x float> %a, %b
gives me two add.f32 instructions instead of vadd.f32 or the like.
Jingyue
Comment Actions
The short answer is that ptxas doesn't handle vector registers very well. It may be good to revisit this, but ptxas currently prefers scalar ops.
Last time I looked into this, the implementation cost greatly outweighed any potential benefits. At the SASS level, we don't have vector fp ops anyway.