The GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache. The LoadStoreVectorizer should take advantage of this wider vector length and pack up to 16 dwords or 8 quadwords into a single access.
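The 16/8 figures follow from the element widths: 16 dwords span 512 bits, which also holds 8 quadwords. A minimal sketch of that arithmetic, assuming a 32-bit dword and a 64-bit quadword; the names `MaxScalarLoadBits` and `maxElementsPerLoad` are hypothetical and do not appear in the patch:

```cpp
#include <cassert>

// Assumption: the widest scalar load considered here covers 16 dwords.
constexpr unsigned DwordBits = 32;
constexpr unsigned MaxScalarLoadBits = 16 * DwordBits; // 512 bits

// Hypothetical helper: how many elements of the given width fit into
// one load of MaxScalarLoadBits.
inline unsigned maxElementsPerLoad(unsigned ElementBits) {
  assert(ElementBits != 0 && MaxScalarLoadBits % ElementBits == 0);
  return MaxScalarLoadBits / ElementBits;
}
```

With these assumptions, `maxElementsPerLoad(32)` gives 16 and `maxElementsPerLoad(64)` gives 8, matching the counts in the summary.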
Details
Diff Detail
- Repository: rL LLVM
Event Timeline
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
---|---|---|
262 ↗ | (On Diff #134126) | Does AMDGPU only support gfx6 (SI) and above? I thought Northern Islands was supported by the R600 backend. |
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h | ||
---|---|---|
121 ↗ | (On Diff #134126) | I did not see where in this patch these new functions are being used. |
Does AMDGPU only support gfx6 (SI) and above? I thought Northern Islands was supported by the R600 backend.
AMDGPU contains the general target information for all GPUs; the target-specific implementation is provided by the individual backends, such as the R600 and SI backends.
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h | ||
---|---|---|
121 ↗ | (On Diff #134126) | These are not new functions; the default definitions can be found in TargetTransformInfoImpl.h. The LoadStoreVectorizer was using the default implementations of these functions for AMDGPU because AMDGPU had no implementation of its own. unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize, unsigned ChainSizeInBytes, VectorType *VecTy) const { return VF; } unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize, unsigned ChainSizeInBytes, VectorType *VecTy) const { return VF; } |
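To illustrate the point of the reply above, here is a rough standalone sketch of how a target-specific hook could differ from the default that simply returns VF. It is not the code from this patch and does not use LLVM headers: `DefaultHooks` and `CappedHooks` are hypothetical stand-ins, the `VectorType *VecTy` parameter is dropped, `LoadSize` is assumed to be the element width in bits, and the 512-bit cap is an assumption taken from the 16-dword figure in the summary.

```cpp
#include <cassert>

// Assumed cap: 16 dwords of 32 bits each.
constexpr unsigned MaxVectorBits = 512;

// Stand-in for the default behaviour quoted above: the requested
// vector factor is returned unchanged.
struct DefaultHooks {
  unsigned getLoadVectorFactor(unsigned VF, unsigned /*LoadSize*/,
                               unsigned /*ChainSizeInBytes*/) const {
    return VF;
  }
};

// Hypothetical target override: clamp the factor so that the resulting
// vector does not exceed MaxVectorBits.
struct CappedHooks {
  unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
                               unsigned /*ChainSizeInBytes*/) const {
    assert(LoadSize != 0 && "element width must be non-zero");
    if (VF * LoadSize > MaxVectorBits)
      return MaxVectorBits / LoadSize; // largest factor that still fits
    return VF;
  }
};
```

Under these assumptions, a request for 32 dword elements is left at 32 by `DefaultHooks` but reduced to 16 by `CappedHooks`, while a request for 4 quadwords passes through unchanged in both.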
Please fix the tests; otherwise this looks good.
test/CodeGen/AMDGPU/load-constant-f32.ll | ||
---|---|---|
12 ↗ | (On Diff #134126) | Run the tests through opt -instnamer. There should be no numbered (unnamed) values. |