Differential D44179
[AMDGPU] Widened vector length for global/constant address space.
Closed. Public. Authored by FarhanaAleen on Mar 6 2018, 4:11 PM.
Details
Summary
The GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache. The LoadStoreVectorizer should take advantage of this wider vector length and pack up to 16 dword or 8 quadword elements into a single access.
This is a re-submission.
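The 16/8 element counts in the summary follow from dividing a 512-bit access (16 dwords) by the element width. The sketch below only illustrates that arithmetic and is not the code from this revision: the `AddrSpace` enum, the `maxLoadBits` and `maxElements` helpers, and the 128-bit fallback for other address spaces are all hypothetical names and values chosen for the example.

```cpp
// Hypothetical address-space tags for illustration; not LLVM's real numbering.
enum class AddrSpace { Global, Constant, Private, Local };

// Assumed widest access, in bits, that a vectorizer may form per address space.
// 512 bits corresponds to the 16 consecutive dwords named in the summary.
// The 128-bit fallback is an assumption made for this sketch only.
constexpr unsigned maxLoadBits(AddrSpace AS) {
  return (AS == AddrSpace::Global || AS == AddrSpace::Constant) ? 512u : 128u;
}

// Number of elements of ElemBits bits each that fit into the widest access.
constexpr unsigned maxElements(AddrSpace AS, unsigned ElemBits) {
  return maxLoadBits(AS) / ElemBits;
}

// 32-bit dwords: 512 / 32 = 16 elements; 64-bit quadwords: 512 / 64 = 8.
static_assert(maxElements(AddrSpace::Constant, 32) == 16, "16 dwords");
static_assert(maxElements(AddrSpace::Global, 64) == 8, "8 quadwords");
```

Under these assumptions, a run of 16 adjacent `float` loads from the constant address space could be packed into one 16-element vector load, while a run of `double` loads would top out at 8 elements.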
Event Timeline
Mar 6 2018, 4:11 PM: Herald added subscribers: t-tye, tpr, dstuttard and 5 others.
Mar 6 2018, 4:15 PM: This revision is now accepted and ready to land.
Mar 7 2018, 9:11 AM: Closed by commit rL326910: [AMDGPU] Increased vector length for global/constant loads. (authored by faaleen).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 137411
llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
llvm/trunk/test/CodeGen/AMDGPU/load-constant-f32.ll
llvm/trunk/test/CodeGen/AMDGPU/load-constant-f64.ll
llvm/trunk/test/CodeGen/AMDGPU/waitcnt-looptest.ll