Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid a second run
of MachineLICM, MachineCSE and SIFoldOperands.
Also add handling to preserve the original src modifiers.
Differential D33860
[AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Closed, Public. Authored by rampitec on Jun 3 2017, 12:02 AM.
Event Timeline
Herald added subscribers: t-tye, tpr, dstuttard and 4 others. (Jun 3 2017, 12:03 AM)
Comment: This is a good change. I wanted to propose it myself :)
This revision is now accepted and ready to land. (Jun 3 2017, 10:24 AM)
Closed by commit rL304665: [AMDGPU] Untangle SDWA pass from SIShrinkInstructions (authored by rampitec). (Jun 3 2017, 10:40 AM)
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 101309
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
lib/Target/AMDGPU/SIPeepholeSDWA.cpp
test/CodeGen/AMDGPU/add.v2i16.ll
test/CodeGen/AMDGPU/ashr.v2i16.ll
test/CodeGen/AMDGPU/fabs.f16.ll
test/CodeGen/AMDGPU/fadd.f16.ll
test/CodeGen/AMDGPU/fcanonicalize.f16.ll
test/CodeGen/AMDGPU/fmul.f16.ll
test/CodeGen/AMDGPU/fneg-fabs.f16.ll
test/CodeGen/AMDGPU/fneg.f16.ll
test/CodeGen/AMDGPU/fptosi.f16.ll
test/CodeGen/AMDGPU/fptoui.f16.ll
test/CodeGen/AMDGPU/fsub.f16.ll
test/CodeGen/AMDGPU/immv216.ll
test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
test/CodeGen/AMDGPU/llvm.fmuladd.f16.ll
test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
test/CodeGen/AMDGPU/llvm.minnum.f16.ll
test/CodeGen/AMDGPU/scratch-simple.ll
test/CodeGen/AMDGPU/sdwa-peephole.ll
test/CodeGen/AMDGPU/shl.v2i16.ll
test/CodeGen/AMDGPU/sminmax.v2i16.ll
test/CodeGen/AMDGPU/sub.v2i16.ll
test/CodeGen/AMDGPU/v_mac_f16.ll
Why do you use XOR here?
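The code this inline comment refers to is not shown on this page. As background only, here is a minimal sketch of one plausible reason XOR can appear when source modifiers are preserved: negation is a toggle, so negating an already negated operand cancels out, while abs is idempotent. The names `MOD_NEG`, `MOD_ABS` and `combineSrcMods` are hypothetical and not taken from the revision. The sketch also assumes that abs is applied before neg when an operand is evaluated.

```cpp
#include <cstdint>

// Hypothetical modifier bits; the real encoding in the backend may differ.
constexpr uint32_t MOD_NEG = 1u << 0;
constexpr uint32_t MOD_ABS = 1u << 1;

// Combine the modifiers already present on an operand (Orig) with an
// outer abs and/or neg that is being folded into it.
// Assumption: the operand is evaluated as neg?(abs?(x)), abs first.
uint32_t combineSrcMods(uint32_t Orig, bool Abs, bool Neg) {
  uint32_t Mods = Orig;
  if (Abs) {
    // abs(-x) == abs(x): an outer abs discards any inner negation,
    // and setting ABS twice changes nothing, so OR is enough.
    Mods = (Mods | MOD_ABS) & ~MOD_NEG;
  }
  if (Neg) {
    // neg(neg(x)) == x: an outer neg flips the NEG bit, hence XOR.
    // Using OR here would silently lose a double negation.
    Mods ^= MOD_NEG;
  }
  return Mods;
}
```

With these assumptions, folding a neg into an operand that already carries `MOD_NEG` yields no modifiers at all, whereas an OR-based merge would leave the operand negated.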