This is an archive of the discontinued LLVM Phabricator instance.

[MC][ARM] add implicit immediate form for ldrsbt/ldrht/ldrsht
AcceptedPublic

Authored by falstaff84 on Mar 1 2020, 2:43 PM.

Details

Summary

Add pseudo instructions for ldrsbt/ldrht/ldrsht with implicit immediate
and add fall back C++ code to transform the instruction to the
equivalent LDRSBTi/LDRHTi/LDRSHTi form.

This is similar to how it has been done in commit
fb3950ec6312dfa4317d8cbf83a1db4aae7428ce

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=45070

Diff Detail

Event Timeline

falstaff84 created this revision.Mar 1 2020, 2:43 PM

It seems a multiclass AI3ldrT is used (and only used) to define ldrsbt/ldrht/ldrsht . Maybe we can add one more class in it to handle the implicit immediate? This way we don't need to change ARMAsmParser.cpp. I tried your test cases with a local LLVM patch as below, and seemed it worked, although I had to remove #-0 from the encoding, e.g. @ CHECK: ldrsht r5, [r6], @ encoding: [0xf0,0x50,0x76,0xe0]" instead of "@ CHECK: ldrsht r5, [r6], #-0 @ encoding: [0xf0,0x50,0x76,0xe0]"

  • a/llvm/lib/Target/ARM/ARMInstrInfo.td

+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -2939,6 +2939,16 @@ multiclass AI3ldrT<bits<4> op, string opc> {

  let Inst{11-8} = offset{7-4};
  let Inst{3-0} = offset{3-0};
}

+ def i0 : AI3ldstidxT<op, 1, (outs GPR:$Rt, GPR:$base_wb),
+ (ins addr_offset_none:$addr),
+ IndexModePost, LdMiscFrm, IIC_iLoad_bh_ru, opc,
+ "\t$Rt, $addr", "$addr.base = $base_wb", []> {
+ bits<9> offset;
+ let Inst{23} = 0;
+ let Inst{22} = 1;
+ let Inst{11-8} = 0;
+ let Inst{3-0} = 0;
+ }

def r : AI3ldstidxT<op, 1, (outs GPRnopc:$Rt, GPRnopc:$base_wb),
                    (ins addr_offset_none:$addr, postidx_reg:$Rm),
                    IndexModePost, LdMiscFrm, IIC_iLoad_bh_ru, opc,

This indeed looks much nicer. I was somehow under the assumption that this was not an option since we moved away from using separate instructions for {ld,st}r{,b}t:
https://github.com/llvm/llvm-project/commit/fb3950ec6312dfa4317d8cbf83a1db4aae7428ce

That said, if using separate instructions is acceptable, I am all for it.

The fact that this approach sets the unsigned bit is actually a good thing as it aligns with the behavior of GNU binutils.

Adding some more reviewers. I'm not aware of any hard and fast rule as to when to use a new instruction or an alias, I think it is a case by case decision, although people more familiar with tablegen may know otherwise.
I'd prefer a solution where the disassembly of ldrsht r5, [r6] is either ldrsht r5, [r6] or ldrsht r5, [r6], #0. ldrsht r5, [r6], #-0 is legal but not really canonical, maybe always set the U bit (23) to make the immediate positive.

Generally, we prefer to use an alias unless there's some reason we need logic written in C++. For example, some "instructions" don't translate to a single instruction. And some instructions have complex logic for which encoding to use.

In this case, ARMAsmPseudo seems like overkill; an InstAlias should be sufficient.

falstaff84 added a comment.EditedMar 2 2020, 2:44 PM

@efriedma I tried the InstAlias approach at first, but it did not work for reasons I don't exactly recall.

Later, I also found this hint in commit fb3950ec6312dfa4317d8cbf83a1db4aae7428ce (back from 2014):

An InstAlias is insufficient in this case as the necessary due to the need to add a new additional operand for the implicit zero.

So maybe this really does not work? Do you happen to know if that is still true?

I spent a couple minutes experimenting; the following "works":

def : InstAlias<"ldrsbt${q} $Rt, $Rn2, $Rn", (LDRSBTi GPR:$Rt, GPR:$Rn2, addr_offset_none:$Rn, 256, pred:$q)>, Requires<[IsARM]>;

The immediate doesn't pose a problem, as far as I can tell.

The problem, of course, is that this syntax isn't right. And there isn't any way to make it right. For instructions, we do some magic with "Constraints" to infer the second output register, but that currently doesn't work for aliases.

So I guess this approach is fine. (Of course, there's still the #0 vs #-0 issue.)

jcai19 added a comment.EditedMar 2 2020, 4:38 PM

I spent a couple minutes experimenting; the following "works":

def : InstAlias<"ldrsbt${q} $Rt, $Rn2, $Rn", (LDRSBTi GPR:$Rt, GPR:$Rn2, addr_offset_none:$Rn, 256, pred:$q)>, Requires<[IsARM]>;

The immediate doesn't pose a problem, as far as I can tell.

The problem, of course, is that this syntax isn't right. And there isn't any way to make it right. For instructions, we do some magic with "Constraints" to infer the second output register, but that currently doesn't work for aliases.

So I guess this approach is fine. (Of course, there's still the #0 vs #-0 issue.)

Thank you for the clarification! How about adding a new class in the multiclass AI3ldrT for implicit immediate as I proposed above? Out of curiosity, do we prefer adding pseudo instructions to adding instruction classes in general for this kind of situations? Thanks.

How we choose to represent instructions is a case-by-case thing; often, the way the instruction manuals are written is sort of arbitrary anyway. Ideally, we want to represent instructions in some way that's intuitive for someone who has read the manual. That usually means one "instruction" corresponds to one instruction/encoding in the instruction manual. But sometimes the manual's description makes the code harder to understand, so we do something different. (For example, the ARM "pop" instruction is represented as LDMIA_UPD internally.)

One restriction we do want to maintain, though, is that there should be a unique, correct disassembly for every instruction. Otherwise the behavior of the autogenerated decoder is, at best, difficult to follow. So if we want LDRSBTi0, we would need to change the definition of LDRSBTi so it doesn't overlap.

We could split LDRSBTi into two instructions: one for the zero-offset case, and one for the non-zero offset cast. But that isn't the way the manual is written, and we would need C++ code to throw away the immediate in "ldrsbt r0, [r0], #0". So it's not really an improvement, IMO.

You could save a few lines of TableGen by moving the definition of the pseudoinstructions into the multiclass "AI3ldrT".

jcai19 added a comment.Mar 2 2020, 5:15 PM

Thank you for the clarification.

jcai19 added a comment.EditedMar 3 2020, 12:02 PM

You can move the new ARMAsmPseudo into AI3ldrT as @efriedma suggested as follows,

iff --git a/llvm/lib/Target/ARM/ARMInstrInfo.td b/llvm/lib/Target/ARM/ARMInstrInfo.td
index c9fc8333da8..ac078901d4d 100644

  • a/llvm/lib/Target/ARM/ARMInstrInfo.td

+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -2951,6 +2951,8 @@ multiclass AI3ldrT<bits<4> op, string opc> {

  let Inst{3-0} = Rm{3-0};
  let DecoderMethod = "DecodeLDR";
}

+ def ii : ARMAsmPseudo<!strconcat(opc, "${p} $Rt, $addr"), (ins addr_offset_none:$addr, pred:$p),
+ (outs GPR:$Rt)>;
}

defm LDRSBT : AI3ldrT<0b1101, "ldrsbt">;
diff --git a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
index 992b8755384..6f4525887f3 100644

  • a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp

+++ b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -8241,6 +8241,26 @@ bool ARMAsmParser::processInstruction(MCInst &Inst,

  Inst = TmpInst;
  return true;
}

+ // Alias for 'ldr{sb,h,sh}t Rt, [Rn] {, #imm}' for ommitted immediate.
+ case ARM::LDRSBTii:
+ case ARM::LDRHTii:
+ case ARM::LDRSHTii: {
+ MCInst TmpInst;
+
+ if (Inst.getOpcode() == ARM::LDRSBTii)
+ TmpInst.setOpcode(ARM::LDRSBTi);
+ else if (Inst.getOpcode() == ARM::LDRHTii)
+ TmpInst.setOpcode(ARM::LDRHTi);
+ else if (Inst.getOpcode() == ARM::LDRSHTii)
+ TmpInst.setOpcode(ARM::LDRSHTi);
+ TmpInst.addOperand(Inst.getOperand(0));
+ TmpInst.addOperand(Inst.getOperand(1));
+ TmpInst.addOperand(Inst.getOperand(1));
+ TmpInst.addOperand(MCOperand::createImm(0));
+ TmpInst.addOperand(Inst.getOperand(2));
+ Inst = TmpInst;
+ return true;
+ }

// Alias for alternate form of 'str{,b}t Rt, [Rn], #imm' instruction.
case ARM::STRT_POST:
case ARM::STRBT_POST: {

This however does not solve the #-0 issue. I tried to follow @psmith 's suggestion let Inst{23}=1 but ended up with an error: error: Value 'Inst' unknown!.

The "-0" issue is because of the way the immediate is encoded. Try MCOperand::createImm(256).

jcai19 added a comment.Mar 3 2020, 1:55 PM

I see. Yes it worked. Thank you!

falstaff84 updated this revision to Diff 249249.Mar 9 2020, 5:09 PM

@efriedma, @jcai19, thanks for feedback, updated the patch accordingly. This now assembles and disassembles to the same encoding as binutils does (#0).

efriedma accepted this revision.Mar 9 2020, 6:57 PM

LGTM

This revision is now accepted and ready to land.Mar 9 2020, 6:57 PM

Hi @falstaff84 , do you need help committing this?

@nickdesaulniers I actually tired to get commit access myself but haven't heard back from Chris yet. So yes would be good if somebody could commit this already.

@falstaff84 Can you email me your github id and svn id (if you had one), and I will get you access.

Thank you for the fix! FYI it's recommended to include your differential link in the commit message https://llvm.org/docs/Phabricator.html#committing-a-change. That should also automatically close it when the change is upstreamed.

When I merge changes, I do:

$ git checkout master
$ git pull
$ arc patch D75428
$ git rebase master
$ ninja checkall # rebuild and rerun tests
$ git checkout master
$ git pull # if there's update, go back to D75428 and rebase
$ git merge -
$ git push

pulling in a reviewed patch via arc patch formats the commit message with all the information about who reviewed it, the links, etc.

Thanks for the hint! Did pretty much all of that, but used git am instead of arc, thought it does not really make a difference. Will use arc next time!

I only use arc for:

  1. pushing changes to phabricator via arc diff. When I make changes in response to code review, I git commit -a --amend, or squash multiple patches all into one, then run arc.
  2. fetching changes from phabricator via arc patch D<12345> to either commit, or help test.