This is an archive of the discontinued LLVM Phabricator instance.

PR18921, vmov part
ClosedPublic

Authored by dyatkovskiy on Apr 8 2014, 4:25 AM.

Details

Summary

Purpose of patch is to add byte replicate recognition for "vmov" and "vmvn" instructions. It is kind of compatibility fix for GCC. Since GCC generates .s files with illegal vmov and vmvn byte-replicator instructions (e. g. vmov.i32 d0, #abababab).
Current version introduces new AsmOperand classes for vmov.i16, vmov.i32, vmvn.i16 and vmvn.i32 instructions, that detects bytes replication and encodes them with proper way (actually it converts them into vmov.i8).
Thanks for reviews!

Diff Detail

Event Timeline

rengolin accepted this revision.Apr 9 2014, 7:15 AM

Hi Stepan,

Apart from the comments, looks good to me.

Thanks,
--renato

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
2294

Typos: compatibility, "even if they are illegal", gcc-as -> gnu-as.

Thanks, Renato! I'll try to find way to do it from TableGen in Monday. If there is no way in TableGen, we could accept this fix for a while then.
Straight after :vararg directive, I could start work on improving tablegen if needed.
-Cheers!

dyatkovskiy updated this revision to Unknown Object (????).Apr 15 2014, 4:36 AM

Hi Renato,
Please take a look at this version of fix.
Previous version could potentially give us wrong encoding. Since we change opcode in place where nobody expects.
New one fits existing style. And guarantees proper encoding.
Thanks!

I'm not convinced about this one. I don't think 0xffffffff is such a special case. If we're going to do this compatibility alias (*sigh*), we've probably got to handle things like "vmov.i32 dD, 0xabababab" as well (and goodness knows what else).

Cheers.

Tim.

Tim, actually you right )
I have just checked. "vmov.i32 d0, 0xabababab" is legal for GAS. And "vmvn.i32 d0, 0xabababab" as well...
(*sigh*).
Though recognition algorithm looks simple to me: "all bytes are equal".
I could fix it...

Are you sure that this is the only special case? It seems to me that gas
accepts almost anything and try to be super smart.

Cheers,
Renato

Hello Renato. We did do our small research with Tim.
GAS behaves like he reads instruction bit width (i16 or i32). And then checks whether these bits contains repeated bytes.
So, for example: "vmov.i16 d0, 0xabab" is legal. "vmob.i16 d0, 0xabcd) - not.

This logic could fit pretty nice in "isNEONi32splatOnes" and friends. Of course it should be renamed to somewhat "isNEONi32splatSameBytes"

Right, I was afraid gas would recognise garbage and create a string of
instructions. Seems not to be the case. Thanks for checking.

Renato

dyatkovskiy updated this revision to Unknown Object (????).Apr 16 2014, 1:04 AM

Please, review updated fix.

  1. It implements byte replication for VMVN and VMOV.

So "vmov.i16 d0, #0xabab" - becomes legal.
As result MC fullfills d0 with #0xab bytes.
Zero byte #0 is exception. It passed "as-is".

  1. It forbids byte replication for VORR and VBIC. We can enable it. But, interesting fact, it forbidden in GCC. + It is illegal. So, the less workarounds the better IMHO.
  1. Few more reasons.

In current llvm-mc state (before fix) 16-bit cases are *legal*, but llvm-mc implicitly converts
"vmov.i16 d0, #0xabab"
into
"vmov.i16 q2, #0xab00"
That seems to be wrong.

And for i32: #0xabababab --> #0xab000000

All the same with "vorr" and "vbic": bytes zeroizing at the tail.

The reasoning looks good to me, though I can't review the code right now. If Tim is ok with it, I'm good.

Currently I'm working on combination AsmPseudoInst + PseudoExpand solution. It could be really horrible :-)

Renato,
Thanks!

 Well... below is my little confession.

It is really not easy to decide whether we should have this workaround. And whether it should be implemented that way.
Though I'm 99% sure now, that it is only possible way.

Actually I have spent about 3-4 hours one day, and 5-6 hours another day, and all the day today, looking in TableGen internals. All just to find how to implement something like :

def : NEONInstAlias<"instrA $Vm", (InstrB DPR:$Vd, ZeroizeImm AllOnesImm:$Vm)>;

IMHO, ideal solution would be like below:

def nImmByteReplicateXForm: SDNodeXForm<imm, [{
  unsigned Value = 0xff & N->getZExtValue();
  Value |= 0xe00;
  return CurDAG->getTargetConstant(Value, MVT::i32);
}]>;
def nImmVMOVI32ByteReplicate : Operand<i32>, PatLeaf<(imm), [{
    unsigned Imm = N->getZExtValue();
    if (!Imm)
      return false; // Don't bother with zero.
    unsigned char B = Imm & 0xff;
    for (unsigned i = 1; i < NumBytes; ++i) {
      Imm >>= 8;
      if ((Imm & 0xff) != B)
        return false;
    }
    return true;
  }], ImmByteReplicateXForm> {
  let PrintMethod = "printNEONModImmOperand";
  let ParserMatchClass = nImmVMOVI32AsmOperand;
}
def : NEONInstAlias<"vmov${p}.i32 $Vd, $Vm",
                    (VMOVv8i8 DPR:$Vd, nImmVMOVI32ByteReplicate:$Vm, pred:$p)>;

But no. It doesn't work :-(
Inlined C++ code fragments actually are PatFrag's predicate code, and SDNodeXForm's XFormFunction. We need to much assembler instructions with specific imm values. There are two possible ways: NEONAsmPseudo and NEONInstAlias. But these classes has nothing common with DAG patterns and SDNodeXForms.

Actually I wanted to implement whole chain:

<asm> --> <dag-transform> --> <aliasing>

just in .td files.
But there is not such support. And we had to implement new class hierarchy from fundamental TableGen classes.

Even operand classes like imm0_4095 have ARMAsmParser handlers.

Also deviation from common style isn't that good. So I've looked at rest of similar checks. Every checks like that (detection of some specific imm values, for example) are implemented as C++ code in ARMAsmParser.cpp, as set of handlers for AsmOperandClass.

So I did the same: I have just implemented ways of how to recognize immediates we accept, and then how to encode them.

So here we are. I understand circumstances, and ready to wait as long as we need.

Thanks )

dyatkovskiy updated this revision to Unknown Object (????).Apr 19 2014, 12:28 AM

Simplified previous version:

  • For nImmVMOVI32AsmOperandByteReplicate and friends, has been customized PredicateMethod and RenderMethod. That allowed to reduce code size in ARMAsmParser.
rafael added inline comments.Apr 21 2014, 7:56 AM
lib/Target/ARM/ARMInstrNEON.td
5347

"assembly generator", no?

In any case, it is not really relevant in the assembly is generated or user written. Just say that we (and gas) support this extension.

The comment is also out of date since the code now supports 0xabab for example.

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
1640

please clang-format the patch.

dyatkovskiy updated this revision to Unknown Object (????).Apr 21 2014, 8:39 AM

Hi Rafael,
Thanks for review!
I have fixed comments and passed patch through clang format as you asked.

dyatkovskiy added inline comments.Apr 21 2014, 8:42 AM
lib/Target/ARM/ARMInstrNEON.td
5348

Yeah. I see my typo: instrustions -> instructions.

You don't have a test showing an error with vmov or vmvn (vmov 0xfffffffe for example).

I think the patch is OK with the extra test, but since this is ARM I will let Tim or Renato do the final review.

dyatkovskiy updated this revision to Unknown Object (????).Apr 21 2014, 10:59 PM

Added test for illegal cases.

Fine by me. Renato, Tim, what do you guys think?

Hi Rafael,

Looks good to me too.

cheers,
--renato

dyatkovskiy closed this revision.Apr 24 2014, 12:24 AM

r207080
Thanks for reviews!