Details
Diff Detail
Event Timeline
I like the idea of trying to use register class to avoid using UseVSXReg flag.
However, are we should this will work?
Mixing FRC and VSRC may cause unexpected failures, or at least causing confusion,
eg: If we add -ppc-asm-full-reg-names, then lxsibzx 0, 0, 3 in vsx-partword-int-loads-and-stores.ll in the patch will show up like:
lxsibzx f0, 0, r3 xxspltw vs34, f0, 1
Why we use f0 to VSX instruction lxsibzx?
This is confusing and may cause problem!
Although f0 is mapped to vs0, but they are still different: f0 is only half of vs0.
BTW: You should not mark this as NFC , you have changed register classes, there ARE functional changes!
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
167 | Are we sure we don't need mapping any more? | |
llvm/lib/Target/PowerPC/PPCInstrInfo.h | ||
430 | Can we add default:, even if it is unreachable? | |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
896 | Why we want to use vsfrc instead of vfrc? I think allowing F8RC may cause problem. |
I think using f0 here is OK. Take another instruction lxsdx for example:
$ echo '0x7c 0xe5 0xfc 0x98' | ./build/release/bin/llvm-mc --disassemble -triple powerpc64-unknown-linux-gnu -mcpu=pwr7 -ppc-asm-full-reg-names .text lxsdx f7, r5, r31
According to the spec, lxsdx will output to VSX[XT]. It uses vsfrc since it only uses first dword (VSX[XT].dword[1] is undefined). That's our intention to print f7 instead of vs7 here. Go back the this case, lxsibzx has the same situation as lxsdx. Besides, XXSPLTWs only uses in the pattern xxspltw XT,XB, 1[1] so it is well-defined to use f0 here (f0.word[1] = vs0.word[1]).
[1]
yihongl@kuat:/data/yihongl/dev/src/llvm/lib/Target/PowerPC$ grep -r XXSPLTWs P9InstrResources.td: XXSPLTWs, PPCMIPeephole.cpp: (MyOpcode == PPC::XXSPLTW && DefOpcode == PPC::XXSPLTWs) || PPCInstrVSX.td: def XXSPLTWs : XX2Form_2<60, 164, PPCInstrVSX.td: (v4i32 (XXSPLTWs (LXSIBZX xoaddr:$src), 1))>; PPCInstrVSX.td: (v4i32 (XXSPLTWs (VEXTSB2Ws (LXSIBZX xoaddr:$src)), 1))>; PPCInstrVSX.td: (v4i32 (XXSPLTWs (LXSIHZX xoaddr:$src), 1))>; PPCInstrVSX.td: (v4i32 (XXSPLTWs (VEXTSH2Ws (LXSIHZX xoaddr:$src)), 1))>;
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
167 | I believe so. I apply the following patch and there is no failure in LIT/LNT tests. It seems MO.getReg() is not changed at all after invoking getRegNumForOperand. In contrast, if I keep it, for some inline assembly test case (e.g., CodeGen/Generic/2007-04-08-MultipleFrameIndices.ll), Desc.OpInfo would be null and I need additional logic to handle this case even the invocation is useless. $ git diff diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp index f4ef95c..aeb9c23 100644 --- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp +++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp @@ -165,7 +165,7 @@ void PPCAsmPrinter::printOperand(const MachineInstr *MI, unsigned OpNo, switch (MO.getType()) { case MachineOperand::MO_Register: { unsigned Reg = PPCInstrInfo::getRegNumForOperand(MI->getDesc(), - MO.getReg(), OpNo); + MO.getReg(), OpNo, true); const char *RegName = PPCInstPrinter::getRegisterName(Reg); diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.h b/llvm/lib/Target/PowerPC/PPCInstrInfo.h index ee16d43..ebd9024 100644 --- a/llvm/lib/Target/PowerPC/PPCInstrInfo.h +++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.h @@ -427,13 +427,17 @@ public: /// to instructions that use both Altivec and VSX numbering (for different /// operands). static unsigned getRegNumForOperand(const MCInstrDesc &Desc, unsigned Reg, - unsigned OpNo) { + unsigned OpNo, bool equal = false) { + unsigned oldReg = Reg; if (Desc.TSFlags & PPCII::UseVSXReg) { if (isVRRegister(Reg)) Reg = PPC::VSX32 + (Reg - PPC::V0); else if (isVFRegister(Reg)) Reg = PPC::VSX32 + (Reg - PPC::VF0); } + if (equal) { + assert(oldReg == Reg && "Reg is altered"); + } return Reg; } }; | |
llvm/lib/Target/PowerPC/PPCInstrInfo.h | ||
430 | default: is not unreachable because regClass might be VRRCRegClassID, VFRCRegClassID ... etc | |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
896 | I think we can use vsfrc here since xxspltw is able to address whole 64 VSR registers. Besides, XXSPLTWs only uses second word of VSR register so there is no problem to use F8RC. |
By the way, this patch passes Bootstrap, SPEC 2006, SPEC 2017, llvm-on-power/Benchmarks/ tests on power8/power9
Thanks for addressing comments, however, I am still not convinced that this is safe.
I am also curious how this will pass machine verification, when we check register classes?
Can you please try to run machine verification to all LIT/LNT test to see whether there is any register class mismatch due to this? Thanks.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
167 |
Do we know why it works before but fail now? | |
167 |
Interesting. To double confirm, you applied the patch against ToT (without your proposed change of td files in this review), right? I am curious why we add this logic before? No failure in LIT/LNT tests does NOT prove that this is safe to me. | |
llvm/lib/Target/PowerPC/PPCInstrInfo.h | ||
430 | then can we add a default: fall through with comments? | |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
896 |
Looks like to me that current patterns only use 1 for $UIM does not lead to conclusion that $UIM can only be 1? |
I export LLVM_VERIFY_MACHINEINSTRS=1 and run the LIT/LNT tests with my patch and it pass either.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
167 | For CodeGen/Generic/2007-04-08-MultipleFrameIndices.ll , %A = call i32 asm sideeffect "inline asm $0 $2 $3 $4", "=r,0,i,m,m"( i32 0, i32 1, i8** %foo, i32** %bar ), MI->getOpcode() is INLINEASM and Desc.OpInfo would be null. In original version, we don't use Desc.OpInfo at all. | |
167 | Yes, I apply the patch mentioned above to the master (without my proposed change of td files in this review) and there is no failure. 'lxsdx 32, 5, 31' is the test case supposed to run into this logic (the instruction has UseVSXReg flag set). Here is the Reg values for the three operand: echo 'lxsdx 32, 5, 31' | ./build/release/bin/llvm-mc --assemble -triple powerpc64-unknown-linux-gnu -mcpu=pwr7 test.s > /dev/null oldReg: 183 Reg: 247 oldReg: 284 Reg: 284 oldReg: 310 Reg: 310 Both the master branch and the branch with my proposed change have the same result (I don't duplicate the output since they are the same). | |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
896 | The original used vfrc is RegisterOperand<VFRC>, VFRC consist of VF registers which are sub-register (i.e., first 64bit) of VSX32-VSX63. They don't work for XXSPLTWs, ... 3 neither. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
167 | After investigation, a comment is added here https://reviews.llvm.org/rL355848 |
Thought we are waiting for @nemanjai 's insight about line 167... But LGTM given that you have run full test.
Thanks for improving this.
@nemanjai Could you give some insight that why we need invoke getRegNumForOperand() here? I cannot come up with a test case that use the mapping in that function.