Avoids stack access.
Also handle the extract-high-element pattern from truncate + shift
to avoid a couple of test regressions.
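To illustrate the pattern mentioned above, here is a minimal sketch in plain C++ (not LLVM code) of why a truncate of a right shift by 16 is the same operation as extracting the high element of a packed pair of 16-bit values. The helper names `packV2I16`, `truncOfShift`, and `extractElt` are hypothetical and exist only for this example, and the layout (element 0 in the low half) is an assumption of the sketch.

```cpp
#include <cstdint>

// Hypothetical helper: pack two 16-bit elements into one 32-bit value,
// with element 0 in the low half and element 1 in the high half (assumed layout).
constexpr uint32_t packV2I16(uint16_t lo, uint16_t hi) {
  return static_cast<uint32_t>(lo) | (static_cast<uint32_t>(hi) << 16);
}

// The "truncate + shift" form: shift right by 16, then truncate to 16 bits.
constexpr uint16_t truncOfShift(uint32_t packed) {
  return static_cast<uint16_t>(packed >> 16);
}

// The "extract element" form: select element idx (0 or 1) of the packed pair.
constexpr uint16_t extractElt(uint32_t packed, unsigned idx) {
  return static_cast<uint16_t>(packed >> (16u * idx));
}

// Compile-time check that both forms agree for the high element.
static_assert(truncOfShift(packV2I16(0x1234, 0xABCD)) ==
                  extractElt(packV2I16(0x1234, 0xABCD), 1),
              "truncate(x >> 16) matches extracting element 1");
```

Because the two forms compute the same value, a lowering that recognizes one can reuse whatever it already does for the other, which is presumably how the test regressions are avoided.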
Differential D46828
AMDGPU: Custom lower v4i16/v4f16 vector operations
Closed · Public · Authored by arsenm on May 14 2018, 4:08 AM.
Event Timeline
Herald added subscribers: t-tye, tpr, dstuttard and 3 others. · May 14 2018, 4:08 AM
This revision is now accepted and ready to land. · May 15 2018, 1:27 PM
Revision Contents
Diff 146905
lib/Target/AMDGPU/AMDGPUISelLowering.h
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
lib/Target/AMDGPU/SIISelLowering.h
lib/Target/AMDGPU/SIISelLowering.cpp
test/CodeGen/AMDGPU/extload-align.ll
test/CodeGen/AMDGPU/extract_vector_elt-f16.ll
test/CodeGen/AMDGPU/extract_vector_elt-i16.ll
test/CodeGen/AMDGPU/insert_vector_elt.ll
test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
test/CodeGen/AMDGPU/min.ll
Need to bail out if the vector size is not the expected 64 bits.