This is an archive of the discontinued LLVM Phabricator instance.

[Codegen][ARM] Add addressing modes from masked loads and stores
ClosedPublic

Authored by dmgreen on Nov 13 2019, 6:16 AM.

Details

Summary

MVE has a basic symmetry between it's normal loads/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do.

To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores the same as normal loads/stores. This involves:

  • Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase.
  • Extending the IndexedModeActions from 8bits to 16bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads.
  • The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way.
  • The ARM backend is then adjusted to make use of these indexed masked loads/stores.
  • The X86 backend is adjusted to hopefully be no functional changes.

Diff Detail

Event Timeline

dmgreen created this revision.Nov 13 2019, 6:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 13 2019, 6:16 AM

Nice one. Haven't looked at everything yet, but some first nits inlined.

llvm/include/llvm/CodeGen/SelectionDAGNodes.h
2293

This should be ISD::MLOAD?

llvm/include/llvm/CodeGen/TargetLowering.h
2067

abort -> about?

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13670–13671

While you're at it, to make things a bit more consistent: isLoad -> IsLoad ?

13946–13947

Same here?

13948–13949

Is this big if-statement exactly the same as in DAGCombiner::CombineToPreIndexedLoadStore, except the ISD nodes? Can this be a helper function?

samparker added inline comments.Nov 13 2019, 7:29 AM
llvm/include/llvm/CodeGen/TargetLowering.h
1145

Is there a way that we can avoid these magic hex values?

dmgreen marked 2 inline comments as done.Nov 14 2019, 1:50 AM
dmgreen added inline comments.
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
2293

Yeah. Nice spot. You can tell how many times this function gets called directly.

llvm/include/llvm/CodeGen/TargetLowering.h
1145

I'm not sure. We are trying to pull 4 bits out of an 16bit value, so the hex seems to fit perfectly!

dmgreen updated this revision to Diff 229247.Nov 14 2019, 1:59 AM
SjoerdMeijer added inline comments.Nov 14 2019, 5:43 AM
llvm/include/llvm/CodeGen/TargetLowering.h
1145

Agreed, but how about we do that inside a few simple getter/setter helper functions? I guess that would improve readability here a bit.

dmgreen marked an inline comment as done.Nov 14 2019, 1:51 PM
dmgreen added inline comments.
llvm/include/llvm/CodeGen/TargetLowering.h
1145

I think this is the getter helper function? I would prefer to keep it consistent with everything else in this file, which seem to just use shifts and Ands.

I could change them to (LegalizeAction)((IndexedModeActions[Ty][IdxMode] >> 12) & 0xf); if that would be better?

samparker added inline comments.Nov 15 2019, 1:54 AM
llvm/include/llvm/CodeGen/TargetLowering.h
1145

I don't mind adds and shifts, but what is twelve and why four bits? Can't we at least some enums for readability?

dmgreen updated this revision to Diff 229894.Nov 18 2019, 12:07 PM

I would prefer that we kept this code consistent with the rest of the file. It's quite internal to lowering, and I'd prefer it to be explicit and declarative about what it's doing.

I've tried to update it a little and added extra comments from elsewhere in the file.

craig.topper added inline comments.Nov 18 2019, 1:25 PM
llvm/lib/Target/X86/X86InstrFragmentsSIMD.td
1127

This looks to be unused?

dmgreen updated this revision to Diff 229938.Nov 18 2019, 3:21 PM

Oh yeah. Thanks for taking a look.

Removed the unnecessary SDTX86MaskedLoad and moved SDTX86MaskedStore up to the other SDTypeProfiles.

Thanks for the changes. I am happy with this patch.
I could do a suggestion about readability, for me this would greatly improve that (a diff on top of this diff) because this has all the bit twiddling in one place, with all the patterns easy to see, and then you don't need to read any of that in the rest of the code:

diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 36e7509..ef7730c 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -1150,8 +1150,6 @@ public:
        getIndexedLoadAction(IdxMode, VT.getSimpleVT()) == Custom);
   }
 
   /// Return how the indexed store should be treated: either it is legal, needs
   /// to be promoted to a larger size, needs to be expanded to some other code
   /// sequence, or the target has a custom expander for it.
@@ -1171,15 +1169,59 @@ public:
        getIndexedStoreAction(IdxMode, VT.getSimpleVT()) == Custom);
   }
 
+  // The Action is a 16 bit value in the table, and we encode the action for
+  // the different instructions in the follow upper/lower parts of these 2
+  // bytes:
+  //
+  //          top         bottom
+  //    15            |           0 bit
+  //    --------------------------|
+  //    |  ML  |  MS  |  L  |  S  |
+  //    --------------------------|
+  //
+  // where:
+  //   ML = indexed masked load
+  //   MS = indexed masked store
+  //   L  = indexed load
+  //   S  = indexed store
+  //
+  void writeIndexedModeActionLowerBottomByte(unsigned Idx, unsigned IdxMode,
+                                             uint16_t Action) {
+    IndexedModeActions[Idx][IdxMode] &= ~0xf;
+    IndexedModeActions[Idx][IdxMode] |= Action;
+  }
+  void writeIndexedModeActionUpperBottomByte(unsigned Idx, unsigned IdxMode,
+                                             uint16_t Action) {
+    IndexedModeActions[Idx][IdxMode] &= ~(0xf << 4);
+    IndexedModeActions[Idx][IdxMode] |= Action << 4;
+  }
+  void writeIndexedModeActionLowerTopByte(unsigned Idx, unsigned IdxMode,
+                                          uint16_t Action) {
+    IndexedModeActions[Idx][IdxMode] &= ~(0xf << 8);
+    IndexedModeActions[Idx][IdxMode] |= Action << 8;
+  }
+  void writeIndexedModeActionUpperTopByte(unsigned Idx, unsigned IdxMode,
+                                          uint16_t Action) {
+    IndexedModeActions[Idx][IdxMode] &= ~(0xf << 12);
+    IndexedModeActions[Idx][IdxMode] |= Action << 12;
+  }
+  LegalizeAction getIndexedModeLowerTopByte(unsigned Ty, unsigned IdxMode) const {
+    return (LegalizeAction)((IndexedModeActions[Ty][IdxMode] >> 8) & 0xf);
+  }
+  LegalizeAction getIndexedModeUpperTopByte(unsigned Ty, unsigned IdxMode) const {
+    return (LegalizeAction)((IndexedModeActions[Ty][IdxMode] >> 12) & 0xf);
+  }
+  /*
+     TODO: the others
+  */
+
   /// Return how the indexed load should be treated: either it is legal, needs
   /// to be promoted to a larger size, needs to be expanded to some other code
   /// sequence, or the target has a custom expander for it.
   LegalizeAction getIndexedMaskedLoadAction(unsigned IdxMode, MVT VT) const {
     assert(IdxMode < ISD::LAST_INDEXED_MODE && VT.isValid() &&
            "Table isn't big enough!");
-    unsigned Ty = (unsigned)VT.SimpleTy;
-    // Masked Load action are kept in the upper half of the top byte.
-    return (LegalizeAction)((IndexedModeActions[Ty][IdxMode] >> 12) & 0xf);
+    return getIndexedModeUpperTopByte(VT.SimpleTy, IdxMode);
   }
 
   /// Return true if the specified indexed load is legal on this target.
@@ -1195,9 +1237,7 @@ public:
   LegalizeAction getIndexedMaskedStoreAction(unsigned IdxMode, MVT VT) const {
     assert(IdxMode < ISD::LAST_INDEXED_MODE && VT.isValid() &&
            "Table isn't big enough!");
-    unsigned Ty = (unsigned)VT.SimpleTy;
-    // Masked Store action are kept in the lower half of the top byte.
-    return (LegalizeAction)((IndexedModeActions[Ty][IdxMode] >> 8) & 0xf);
+    return getIndexedModeLowerTopByte(VT.SimpleTy, IdxMode);
   }
 
   /// Return true if the specified indexed load is legal on this target.
@@ -2099,9 +2139,7 @@ protected:
     assert(VT.isValid() && IdxMode < ISD::LAST_INDEXED_MODE &&
            (unsigned)Action < 0xf && "Table isn't big enough!");
     // Load action are kept in the upper half of the bottom byte.
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] &= ~(0xf << 4);
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] |= ((uint16_t)Action)
-                                                          << 4;
+    writeIndexedModeActionUpperBottomByte(VT.SimpleTy, IdxMode, Action);
   }
 
   /// Indicate that the specified indexed store does or does not work with the
@@ -2113,9 +2151,7 @@ protected:
                              LegalizeAction Action) {
     assert(VT.isValid() && IdxMode < ISD::LAST_INDEXED_MODE &&
            (unsigned)Action < 0xf && "Table isn't big enough!");
-    // Store action are kept in the lower half of the bottom byte.
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] &= ~0xf;
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] |= ((uint16_t)Action);
+    writeIndexedModeActionLowerBottomByte(VT.SimpleTy, IdxMode, Action);
   }
 
   /// Indicate that the specified indexed masked load does or does not work with
@@ -2127,10 +2163,7 @@ protected:
                                   LegalizeAction Action) {
     assert(VT.isValid() && IdxMode < ISD::LAST_INDEXED_MODE &&
            (unsigned)Action < 0xf && "Table isn't big enough!");
-    // Masked Load action are kept in the upper half of the top byte.
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] &= ~(0xf << 12);
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] |= ((uint16_t)Action)
-                                                          << 12;
+    writeIndexedModeActionUpperTopByte(VT.SimpleTy, IdxMode, Action);
   }
 
   /// Indicate that the specified indexed masked store does or does not work
@@ -2142,10 +2175,7 @@ protected:
                                    LegalizeAction Action) {
     assert(VT.isValid() && IdxMode < ISD::LAST_INDEXED_MODE &&
            (unsigned)Action < 0xf && "Table isn't big enough!");
-    // Masked Store action are kept in the lower half of the top byte.
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] &= ~(0xf << 8);
-    IndexedModeActions[(unsigned)VT.SimpleTy][IdxMode] |= ((uint16_t)Action)
-                                                          << 8;
+    writeIndexedModeActionLowerTopByte(VT.SimpleTy, IdxMode, Action);
   }
 
   /// Indicate that the specified condition code is or isn't supported on the

Alright, Awesome.

I had written something similar, but different in what parts were altered. I didn't like it in the end though, I felt like it ended up being harder to read and maintain because you just end up spreading the information that you need all over the file. I may not be the best judge of such things though, I often favour a declarative style over one that is more elegant, and am aware that that opinion isn't always popular. It's worth remembering that the aim is for simplicity and maintainability over something that is prettier (although they are often related).

But if you are having trouble reading this in its current form, I will try and make some modifications to it. Watch this space.

I will say that there is lots of other code in this patch that could do with extra eyes though! You've already found one issue, and it's worth making sure we are not focusing on the wrong thing, missing the forest for the trees. There's lots of other code in here that could do with a bit of extra scrutiny.

It would be good to see this rebased since the tail predication changes went in.

llvm/lib/Target/ARM/ARMISelLowering.cpp
385

Why not v4i32 and floats too?

15208

Maybe add a comment here for why we have this restriction?

llvm/lib/Target/ARM/ARMInstrMVE.td
5337

I don't think we shouldn't be restricting the base to a T1 register.

dmgreen updated this revision to Diff 230426.Nov 21 2019, 4:48 AM
dmgreen marked 5 inline comments as done.

Prettify and address comments.

llvm/lib/Target/ARM/ARMISelLowering.cpp
385

This is "extending masked post-inc stores", so is only the extended types that will be extended. The others are above.

We might well want to "zero extend" fp16s into a wider register at some point, especially if we are converting them to floats, but thats not a job for here.

llvm/lib/Target/ARM/ARMInstrMVE.td
5337

Ooof, the double negatives! This uses the same as the MVE_vector_offset_store_typed, which I think is OK for "normal" loads/stores. It's the extending loads/stores below that might be the problem (and don't really look right to me).

I'll make it the same a non-masked for the moment, and try to fixup what doesn't look right in another commit.

samparker accepted this revision.Nov 21 2019, 5:03 AM

Cheers. LGTM

llvm/lib/Target/ARM/ARMISelLowering.cpp
385

Face palm. Yeah ok, we can cross fp16 if/when we need it.

llvm/lib/Target/ARM/ARMInstrMVE.td
5337

Haha, my bad. Okay.

This revision is now accepted and ready to land.Nov 21 2019, 5:03 AM
This revision was automatically updated to reflect the committed changes.