Page MenuHomePhabricator

[PowerPC] Remove UseVSXReg
ClosedPublic

Authored by Yi-Hong.Lyu on Feb 26 2019, 10:00 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

Yi-Hong.Lyu created this revision.Feb 26 2019, 10:00 AM
Yi-Hong.Lyu retitled this revision from [NFC][PPC] Remove UseVSXReg to [NFC][PowerPC] Remove UseVSXReg.Feb 26 2019, 10:02 AM
jsji added a comment.Feb 26 2019, 3:35 PM

I like the idea of trying to use register class to avoid using UseVSXReg flag.

However, are we should this will work?

Mixing FRC and VSRC may cause unexpected failures, or at least causing confusion,

eg: If we add -ppc-asm-full-reg-names, then lxsibzx 0, 0, 3 in vsx-partword-int-loads-and-stores.ll in the patch will show up like:

lxsibzx f0, 0, r3
xxspltw vs34, f0, 1

Why we use f0 to VSX instruction lxsibzx?
This is confusing and may cause problem!
Although f0 is mapped to vs0, but they are still different: f0 is only half of vs0.

BTW: You should not mark this as NFC , you have changed register classes, there ARE functional changes!

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)

Are we sure we don't need mapping any more?

llvm/lib/Target/PowerPC/PPCInstrInfo.h
430 ↗(On Diff #188402)

Can we add default:, even if it is unreachable?

llvm/lib/Target/PowerPC/PPCInstrVSX.td
896 ↗(On Diff #188402)

Why we want to use vsfrc instead of vfrc? I think allowing F8RC may cause problem.

Yi-Hong.Lyu marked 3 inline comments as done.Mar 3 2019, 2:26 AM

I think using f0 here is OK. Take another instruction lxsdx for example:

$ echo '0x7c 0xe5 0xfc 0x98' | ./build/release/bin/llvm-mc --disassemble -triple powerpc64-unknown-linux-gnu -mcpu=pwr7 -ppc-asm-full-reg-names
        .text
        lxsdx f7, r5, r31

According to the spec, lxsdx will output to VSX[XT]. It uses vsfrc since it only uses first dword (VSX[XT].dword[1] is undefined). That's our intention to print f7 instead of vs7 here. Go back the this case, lxsibzx has the same situation as lxsdx. Besides, XXSPLTWs only uses in the pattern xxspltw XT,XB, 1[1] so it is well-defined to use f0 here (f0.word[1] = vs0.word[1]).

[1]

yihongl@kuat:/data/yihongl/dev/src/llvm/lib/Target/PowerPC$ grep -r XXSPLTWs
P9InstrResources.td:    XXSPLTWs,
PPCMIPeephole.cpp:          (MyOpcode == PPC::XXSPLTW && DefOpcode == PPC::XXSPLTWs) ||
PPCInstrVSX.td:  def XXSPLTWs : XX2Form_2<60, 164,
PPCInstrVSX.td:           (v4i32 (XXSPLTWs (LXSIBZX xoaddr:$src), 1))>;
PPCInstrVSX.td:            (v4i32 (XXSPLTWs (VEXTSB2Ws (LXSIBZX xoaddr:$src)), 1))>;
PPCInstrVSX.td:            (v4i32 (XXSPLTWs (LXSIHZX xoaddr:$src), 1))>;
PPCInstrVSX.td:            (v4i32 (XXSPLTWs (VEXTSH2Ws (LXSIHZX xoaddr:$src)), 1))>;
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)

I believe so. I apply the following patch and there is no failure in LIT/LNT tests. It seems MO.getReg() is not changed at all after invoking getRegNumForOperand. In contrast, if I keep it, for some inline assembly test case (e.g., CodeGen/Generic/2007-04-08-MultipleFrameIndices.ll), Desc.OpInfo would be null and I need additional logic to handle this case even the invocation is useless.

$ git diff
diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
index f4ef95c..aeb9c23 100644
--- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -165,7 +165,7 @@ void PPCAsmPrinter::printOperand(const MachineInstr *MI, unsigned OpNo,
   switch (MO.getType()) {
   case MachineOperand::MO_Register: {
     unsigned Reg = PPCInstrInfo::getRegNumForOperand(MI->getDesc(),
-                                                     MO.getReg(), OpNo);
+                                                     MO.getReg(), OpNo, true);

     const char *RegName = PPCInstPrinter::getRegisterName(Reg);

diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.h b/llvm/lib/Target/PowerPC/PPCInstrInfo.h
index ee16d43..ebd9024 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.h
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.h
@@ -427,13 +427,17 @@ public:
   /// to instructions that use both Altivec and VSX numbering (for different
   /// operands).
   static unsigned getRegNumForOperand(const MCInstrDesc &Desc, unsigned Reg,
-                                      unsigned OpNo) {
+                                      unsigned OpNo, bool equal = false) {
+    unsigned oldReg = Reg;
     if (Desc.TSFlags & PPCII::UseVSXReg) {
       if (isVRRegister(Reg))
         Reg = PPC::VSX32 + (Reg - PPC::V0);
       else if (isVFRegister(Reg))
         Reg = PPC::VSX32 + (Reg - PPC::VF0);
     }
+    if (equal) {
+      assert(oldReg == Reg && "Reg is altered");
+    }
     return Reg;
   }
 };
llvm/lib/Target/PowerPC/PPCInstrInfo.h
430 ↗(On Diff #188402)

default: is not unreachable because regClass might be VRRCRegClassID, VFRCRegClassID ... etc

llvm/lib/Target/PowerPC/PPCInstrVSX.td
896 ↗(On Diff #188402)

I think we can use vsfrc here since xxspltw is able to address whole 64 VSR registers. Besides, XXSPLTWs only uses second word of VSR register so there is no problem to use F8RC.

Yi-Hong.Lyu marked 3 inline comments as done.Mar 3 2019, 2:29 AM

By the way, this patch passes Bootstrap, SPEC 2006, SPEC 2017, llvm-on-power/Benchmarks/ tests on power8/power9

jsji added a comment.EditedMar 3 2019, 9:04 PM

Thanks for addressing comments, however, I am still not convinced that this is safe.

I am also curious how this will pass machine verification, when we check register classes?
Can you please try to run machine verification to all LIT/LNT test to see whether there is any register class mismatch due to this? Thanks.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)
In contrast, if I keep it, for some inline assembly test case (e.g., CodeGen/Generic/2007-04-08-MultipleFrameIndices.ll), Desc.OpInfo would be null and I need additional logic to handle this case even the invocation is useless.

Do we know why it works before but fail now?

167 ↗(On Diff #188402)
I apply the following patch and there is no failure in LIT/LNT tests.

Interesting. To double confirm, you applied the patch against ToT (without your proposed change of td files in this review), right? I am curious why we add this logic before?

No failure in LIT/LNT tests does NOT prove that this is safe to me.
Can we get some testcase that are supposed to run into this logic (eg: some MI with UseVSXReg flags set) and check the 'Reg` values returned here? And compare to the Reg values with your proposed change?

llvm/lib/Target/PowerPC/PPCInstrInfo.h
430 ↗(On Diff #188402)

then can we add a default: fall through with comments?

llvm/lib/Target/PowerPC/PPCInstrVSX.td
896 ↗(On Diff #188402)

Besides, XXSPLTWs only uses second word of VSR register so there is no problem to use F8RC.

Looks like to me that current patterns only use 1 for $UIM does not lead to conclusion that $UIM can only be 1?
What if I write a new pattern to use XXSPLTWs, ... 3?

Yi-Hong.Lyu retitled this revision from [NFC][PowerPC] Remove UseVSXReg to [PowerPC] Remove UseVSXReg.Mar 4 2019, 9:10 AM
Yi-Hong.Lyu marked 6 inline comments as done.Mar 4 2019, 11:52 AM

I export LLVM_VERIFY_MACHINEINSTRS=1 and run the LIT/LNT tests with my patch and it pass either.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)

For CodeGen/Generic/2007-04-08-MultipleFrameIndices.ll , %A = call i32 asm sideeffect "inline asm $0 $2 $3 $4", "=r,0,i,m,m"( i32 0, i32 1, i8** %foo, i32** %bar ), MI->getOpcode() is INLINEASM and Desc.OpInfo would be null. In original version, we don't use Desc.OpInfo at all.

167 ↗(On Diff #188402)

Yes, I apply the patch mentioned above to the master (without my proposed change of td files in this review) and there is no failure.

'lxsdx 32, 5, 31' is the test case supposed to run into this logic (the instruction has UseVSXReg flag set). Here is the Reg values for the three operand:

echo 'lxsdx 32, 5, 31' | ./build/release/bin/llvm-mc --assemble -triple powerpc64-unknown-linux-gnu -mcpu=pwr7 test.s > /dev/null
oldReg: 183
Reg: 247
oldReg: 284
Reg: 284
oldReg: 310
Reg: 310

Both the master branch and the branch with my proposed change have the same result (I don't duplicate the output since they are the same).

llvm/lib/Target/PowerPC/PPCInstrVSX.td
896 ↗(On Diff #188402)

The original used vfrc is RegisterOperand<VFRC>, VFRC consist of VF registers which are sub-register (i.e., first 64bit) of VSX32-VSX63. They don't work for XXSPLTWs, ... 3 neither.

Yi-Hong.Lyu marked 4 inline comments as done.Mar 4 2019, 12:57 PM
Yi-Hong.Lyu added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)

@nemanjai Could you give some insight that why we need invoke getRegNumForOperand() here? I cannot come up with a test case that use the mapping in that function.

Add default in the switch to address Jinsong's comment

Yi-Hong.Lyu marked an inline comment as done.Mar 5 2019, 1:21 PM
Yi-Hong.Lyu marked an inline comment as done.Mar 18 2019, 9:32 PM
Yi-Hong.Lyu added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
167 ↗(On Diff #188402)

After investigation, a comment is added here https://reviews.llvm.org/rL355848

I think I addressed all the comments above. This is a gentle ping ...

jsji accepted this revision.Mar 21 2019, 11:38 AM

Thought we are waiting for @nemanjai 's insight about line 167... But LGTM given that you have run full test.
Thanks for improving this.

This revision is now accepted and ready to land.Mar 21 2019, 11:38 AM
Closed by commit rL357028: [PowerPC] Remove UseVSXReg (authored by stefanp, committed by ). · Explain WhyMar 26 2019, 1:27 PM
This revision was automatically updated to reflect the committed changes.