Differential D56843
[X86][SSE] Add selective commutation support for insertps (PR40340)
Closed, Public. Authored by RKSimon on Jan 17 2019, 3:48 AM.
Summary
When we insert one "inline" element (the source and destination indices match) and zero two of the other elements, we can safely commute the insertps source inputs to improve memory folding.
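To illustrate why the commutation is safe, here is a minimal standalone sketch; it is not the code from this patch. It assumes the standard insertps immediate layout (bits 7:6 select the source element, bits 5:4 the destination element, bits 3:0 are the zero mask) and models only the register-register form. The names `Vec4`, `insertps`, and `commutedInsertpsImm` are hypothetical helpers introduced for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

using Vec4 = std::array<float, 4>;

// Reference model of insertps with two register sources:
// copy a, write b[srcIdx] into lane dstIdx, then apply the zero mask.
Vec4 insertps(const Vec4 &a, const Vec4 &b, uint8_t imm) {
  unsigned zmask = imm & 0xF;
  unsigned dstIdx = (imm >> 4) & 3;
  unsigned srcIdx = (imm >> 6) & 3;
  Vec4 r = a;
  r[dstIdx] = b[srcIdx];
  for (unsigned i = 0; i < 4; ++i)
    if (zmask & (1u << i))
      r[i] = 0.0f;
  return r;
}

// Returns the immediate to use after swapping the two sources, or nullopt
// if this immediate does not match the commutable pattern.
std::optional<uint8_t> commutedInsertpsImm(uint8_t imm) {
  unsigned zmask = imm & 0xF;
  unsigned dstIdx = (imm >> 4) & 3;
  unsigned srcIdx = (imm >> 6) & 3;

  unsigned zeroed = 0;
  for (unsigned i = 0; i < 4; ++i)
    zeroed += (zmask >> i) & 1u;

  // The insertion must be "inline", must not be overwritten by the zero
  // mask, and exactly two lanes must be zeroed.
  if (srcIdx != dstIdx || (zmask & (1u << dstIdx)) || zeroed != 2)
    return std::nullopt;

  // Exactly one lane is neither zeroed nor inserted; it comes from the
  // first source. After the swap, that lane becomes the inserted one.
  unsigned keep = ~(zmask | (1u << dstIdx)) & 0xF;
  unsigned altIdx = 0;
  while (altIdx < 4 && !(keep & (1u << altIdx)))
    ++altIdx;
  assert(altIdx < 4 && "expected exactly one surviving lane");

  return static_cast<uint8_t>((altIdx << 6) | (altIdx << 4) | zmask);
}
```

For example, with immediate 0x5C (insert lane 1 inline, zero lanes 2 and 3) the only lane still taken from the first source is lane 0, so swapping the sources and using 0x0C yields the same vector. Once the operands can be swapped, either source is a candidate for the operand position that accepts a memory load.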
Event Timeline
Jan 21 2019, 12:01 PM: This revision is now accepted and ready to land.
Jan 22 2019, 4:18 AM: Closed by commit rL351807: [X86][SSE] Add selective commutation support for insertps (PR40340) (authored by RKSimon). This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 182885
llvm/trunk/lib/Target/X86/X86InstrAVX512.td
llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
llvm/trunk/lib/Target/X86/X86InstrSSE.td
llvm/trunk/test/CodeGen/X86/insertps-combine.ll