Purpose of patch is to add byte replicate recognition for "vmov" and "vmvn" instructions. It is kind of compatibility fix for GCC. Since GCC generates .s files with illegal vmov and vmvn byte-replicator instructions (e. g. vmov.i32 d0, #abababab).
Current version introduces new AsmOperand classes for vmov.i16, vmov.i32, vmvn.i16 and vmvn.i32 instructions, that detects bytes replication and encodes them with proper way (actually it converts them into vmov.i8).
Thanks for reviews!
Details
- Reviewers
rengolin t.p.northover • rafael
Diff Detail
Event Timeline
Hi Stepan,
Apart from the comments, looks good to me.
Thanks,
--renato
lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
2294 | Typos: compatibility, "even if they are illegal", gcc-as -> gnu-as. |
Thanks, Renato! I'll try to find way to do it from TableGen in Monday. If there is no way in TableGen, we could accept this fix for a while then.
Straight after :vararg directive, I could start work on improving tablegen if needed.
-Cheers!
Hi Renato,
Please take a look at this version of fix.
Previous version could potentially give us wrong encoding. Since we change opcode in place where nobody expects.
New one fits existing style. And guarantees proper encoding.
Thanks!
I'm not convinced about this one. I don't think 0xffffffff is such a special case. If we're going to do this compatibility alias (*sigh*), we've probably got to handle things like "vmov.i32 dD, 0xabababab" as well (and goodness knows what else).
Cheers.
Tim.
Tim, actually you right )
I have just checked. "vmov.i32 d0, 0xabababab" is legal for GAS. And "vmvn.i32 d0, 0xabababab" as well...
(*sigh*).
Though recognition algorithm looks simple to me: "all bytes are equal".
I could fix it...
Are you sure that this is the only special case? It seems to me that gas
accepts almost anything and try to be super smart.
Cheers,
Renato
Hello Renato. We did do our small research with Tim.
GAS behaves like he reads instruction bit width (i16 or i32). And then checks whether these bits contains repeated bytes.
So, for example: "vmov.i16 d0, 0xabab" is legal. "vmob.i16 d0, 0xabcd) - not.
This logic could fit pretty nice in "isNEONi32splatOnes" and friends. Of course it should be renamed to somewhat "isNEONi32splatSameBytes"
Right, I was afraid gas would recognise garbage and create a string of
instructions. Seems not to be the case. Thanks for checking.
Renato
Please, review updated fix.
- It implements byte replication for VMVN and VMOV.
So "vmov.i16 d0, #0xabab" - becomes legal.
As result MC fullfills d0 with #0xab bytes.
Zero byte #0 is exception. It passed "as-is".
- It forbids byte replication for VORR and VBIC. We can enable it. But, interesting fact, it forbidden in GCC. + It is illegal. So, the less workarounds the better IMHO.
- Few more reasons.
In current llvm-mc state (before fix) 16-bit cases are *legal*, but llvm-mc implicitly converts
"vmov.i16 d0, #0xabab"
into
"vmov.i16 q2, #0xab00"
That seems to be wrong.
And for i32: #0xabababab --> #0xab000000
All the same with "vorr" and "vbic": bytes zeroizing at the tail.
The reasoning looks good to me, though I can't review the code right now. If Tim is ok with it, I'm good.
Currently I'm working on combination AsmPseudoInst + PseudoExpand solution. It could be really horrible :-)
Renato,
Thanks!
Well... below is my little confession.
It is really not easy to decide whether we should have this workaround. And whether it should be implemented that way.
Though I'm 99% sure now, that it is only possible way.
Actually I have spent about 3-4 hours one day, and 5-6 hours another day, and all the day today, looking in TableGen internals. All just to find how to implement something like :
def : NEONInstAlias<"instrA $Vm", (InstrB DPR:$Vd, ZeroizeImm AllOnesImm:$Vm)>;
IMHO, ideal solution would be like below:
def nImmByteReplicateXForm: SDNodeXForm<imm, [{ unsigned Value = 0xff & N->getZExtValue(); Value |= 0xe00; return CurDAG->getTargetConstant(Value, MVT::i32); }]>; def nImmVMOVI32ByteReplicate : Operand<i32>, PatLeaf<(imm), [{ unsigned Imm = N->getZExtValue(); if (!Imm) return false; // Don't bother with zero. unsigned char B = Imm & 0xff; for (unsigned i = 1; i < NumBytes; ++i) { Imm >>= 8; if ((Imm & 0xff) != B) return false; } return true; }], ImmByteReplicateXForm> { let PrintMethod = "printNEONModImmOperand"; let ParserMatchClass = nImmVMOVI32AsmOperand; } def : NEONInstAlias<"vmov${p}.i32 $Vd, $Vm", (VMOVv8i8 DPR:$Vd, nImmVMOVI32ByteReplicate:$Vm, pred:$p)>;
But no. It doesn't work :-(
Inlined C++ code fragments actually are PatFrag's predicate code, and SDNodeXForm's XFormFunction. We need to much assembler instructions with specific imm values. There are two possible ways: NEONAsmPseudo and NEONInstAlias. But these classes has nothing common with DAG patterns and SDNodeXForms.
Actually I wanted to implement whole chain:
<asm> --> <dag-transform> --> <aliasing>
just in .td files.
But there is not such support. And we had to implement new class hierarchy from fundamental TableGen classes.
Even operand classes like imm0_4095 have ARMAsmParser handlers.
Also deviation from common style isn't that good. So I've looked at rest of similar checks. Every checks like that (detection of some specific imm values, for example) are implemented as C++ code in ARMAsmParser.cpp, as set of handlers for AsmOperandClass.
So I did the same: I have just implemented ways of how to recognize immediates we accept, and then how to encode them.
So here we are. I understand circumstances, and ready to wait as long as we need.
Thanks )
Simplified previous version:
- For nImmVMOVI32AsmOperandByteReplicate and friends, has been customized PredicateMethod and RenderMethod. That allowed to reduce code size in ARMAsmParser.
lib/Target/ARM/ARMInstrNEON.td | ||
---|---|---|
5347 | "assembly generator", no? In any case, it is not really relevant in the assembly is generated or user written. Just say that we (and gas) support this extension. The comment is also out of date since the code now supports 0xabab for example. | |
lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
1640 | please clang-format the patch. |
Hi Rafael,
Thanks for review!
I have fixed comments and passed patch through clang format as you asked.
lib/Target/ARM/ARMInstrNEON.td | ||
---|---|---|
5348 | Yeah. I see my typo: instrustions -> instructions. |
You don't have a test showing an error with vmov or vmvn (vmov 0xfffffffe for example).
I think the patch is OK with the extra test, but since this is ARM I will let Tim or Renato do the final review.
"assembly generator", no?
In any case, it is not really relevant in the assembly is generated or user written. Just say that we (and gas) support this extension.
The comment is also out of date since the code now supports 0xabab for example.