This is an archive of the discontinued LLVM Phabricator instance.

[AVR] Optimize 32-bit shift: move bytes around
ClosedPublic

Authored by aykevl on Dec 22 2022, 11:16 AM.

Details

Summary

This patch optimizes 32-bit constant shifts by renaming registers. This
is very effective as the compiler would otherwise need to do a lot of
single bit shift instructions. Instead, the registers are renamed at the
SSA level which means the register allocator will insert the necessary
mov instructions.

Unfortunately, the register allocator will insert some unnecessary movs
with the current code. This will be fixed in a later patch.

Diff Detail

Event Timeline

aykevl created this revision.Dec 22 2022, 11:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 22 2022, 11:16 AM
Herald added subscribers: Jim, hiraditya. · View Herald Transcript
aykevl requested review of this revision.Dec 22 2022, 11:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 22 2022, 11:16 AM
benshi001 added inline comments.Dec 23 2022, 3:19 AM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1887

I am a bit confused about here,

std::pair(Out, 0) is first put to the tail, then is dropped by drop_back(1) ?

In the first round of this loop, Regs comes in with size==4, then in the next round, Regs only keeps 3 elements before the moves?

1917

I am also confused by here as your did for the left shift, the std::pair(ShrExtendReg, 0) is put at the head of Regs, and then dropped by drop_front(1) ? Is it necessary to create a temporary copy of Regs ?

benshi001 added inline comments.Dec 23 2022, 5:41 AM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1887

If the last element is dropped, is the above Regs[Regs.size() - 1] = std::pair(Out, 0); redundant ?

1917

If Regs.drop_front(1);, is the above Regs[0] = std::pair(ShrExtendReg, 0); redundant ?

aykevl added inline comments.Dec 25 2022, 5:37 AM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1887

std::pair(Out, 0) is first put to the tail, then is dropped by drop_back(1) ?

Correct. This is a MutableArrayRef and is modified in-place (it directly modifies the Registers array inside insertWideShift). For example, when shifting left by 8 the least significant register in the Registers array (Registers[3]) is zeroed. The Regs = Regs.drop_back(1) line only modifies the _view_ on the underlying array. That means that in the next iteration of the loop, Regs will only have a length of 3 (while Registers still has a length of four).

See my next comment, where it is hopefully more clear.

1917

If Regs.drop_front(1);, is the above Regs[0] = std::pair(ShrExtendReg, 0); redundant ?

No. When shifting right by 8, Regs[0] = std::pair(ShrExtendReg, 0); directly modifies Registers[0]. After Regs = Regs.drop_front(1), Regs[0] refers to Registers[1], Regs[1] refers to Registers[2], and Regs[2] refers to Registers[3]. This means that if you logically shift right by 16, in the first iteration Regs[0] will zero Registers[0] but in the second iteration Regs[0] will zero Registers[1].

Regs is simply a view on Registers. It is not a separate array.
(This is very similar to slices in Go, which is the language I often use).

benshi001 added inline comments.Dec 25 2022, 7:31 PM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1866–1869

Can this be const DebugLoc dl ?

1917

I see. Your way is good !

benshi001 added inline comments.Dec 25 2022, 7:49 PM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1873

These two variables are only used inside while (ShiftAmt >= 8), so how about moving them down to the head of the loop ?

benshi001 added inline comments.Dec 25 2022, 8:25 PM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1898

I think this if (ShrExtendReg == 0) can be moved out of current loop, something like

Register ShrExtendReg = 0;
if (ShiftAmt >= 8) {
  auto &MSBReg = Regs[0];
  ShrExtendReg = MRI.createVirtualRegister(&AVR::GPR8RegClass);
  if (ArithmeticShift) {
    ...
  } else {
    ...
  }
}
while (ShiftAmt >= 8) {
  Regs[0] = std::pair(ShrExtendReg, 0);
  Regs = Regs.drop_front(1);
  ShiftAmt -= 8;
}

Actually

  1. we need not check if (ShrExtendReg == 0) time and time again.
  2. we need not expose auto &MSBReg = Regs[0] which is only used locally.
benshi001 added inline comments.Dec 25 2022, 8:29 PM
llvm/lib/Target/AVR/AVRISelLowering.cpp
1883

Do we need to do AVR::COPY for each iteration ? Can it be a unique Register Out shared by all iterations, and even shared with lshr ?

benshi001 requested changes to this revision.Dec 25 2022, 8:36 PM
This revision now requires changes to proceed.Dec 25 2022, 8:36 PM
benshi001 added inline comments.Dec 25 2022, 11:16 PM
llvm/test/CodeGen/AVR/shift32.ll
131

How about adding a lshr_i32_9 and see what will happen ?

219

How about adding a ashr_i32_17 and see what will happen ?

aykevl updated this revision to Diff 485817.Jan 1 2023, 1:27 PM

Applied all review feedback and rebased the patch.

llvm/lib/Target/AVR/AVRISelLowering.cpp
1866–1869

This is part of D140569.

1873

Fixed.

1883

Interesting idea! I've tried it, and it seems to work just as well.

1898

Fixed.

aykevl updated this revision to Diff 485819.Jan 1 2023, 1:32 PM
  • rebase (somehow the pre-merge checks failed because of an error?)
benshi001 accepted this revision.Jan 1 2023, 6:54 PM
benshi001 added inline comments.
llvm/lib/Target/AVR/AVRISelLowering.cpp
1878

This can be for (; ShiftLeft && ShiftAmt >= 8; ShiftAmt -= 8) as you did in the next loop.

You can change that while committing.

This revision is now accepted and ready to land.Jan 1 2023, 6:54 PM
This revision was automatically updated to reflect the committed changes.