This is an archive of the discontinued LLVM Phabricator instance.

[AVR] Expand large shifts early in IR
ClosedPublic

Authored by aykevl on Feb 14 2021, 4:10 PM.

Details

Summary

This patch makes sure shift instructions such as this one:

%result = shl i32 %n, %amount

are expanded just before the IR to SelectionDAG conversion to a loop so that calls to non-existing library functions such as __ashlsi3 are avoided. The generated code is currently pretty bad but there's a lot of room for improvement: the shift itself can be done in just four instructions.


I have tested this patch locally with my set of compiler-rt based tests and all tests that previously passed still pass. The difference is that there is no __ashlsi3, __ashlsi3, or __lshrsi3 call anymore in the code.

Diff Detail

Event Timeline

aykevl created this revision.Feb 14 2021, 4:10 PM
aykevl requested review of this revision.Feb 14 2021, 4:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2021, 4:10 PM
aykevl updated this revision to Diff 323648.Feb 14 2021, 4:11 PM
aykevl edited the summary of this revision. (Show Details)

Oops I uploaded the wrong patch.

benshi001 added inline comments.
llvm/lib/Target/AVR/AVRShiftExpand.cpp
52

can this line be simpilfied to

for (Instruction &I : instructions(F))

as in Transforms/IPO/Attributor.h ?

54–63

Is it better to combine all these conditions to only one if statement?

67–70

Is this line needed?

the following for-statement won't run if ShiftInsts.size() == 0

And we just return ShiftInsts.size() > 0 at the end.

92

Is it possible to generate a check about ShiftAmount>32, like that

  cmp ShiftAmount, 32
  brlt _labloop
  mov Rdst, 0
  ret
_labloop:
  the shift loop

However avr-gcc does not generate the check against 32.

llvm/test/CodeGen/AVR/shift-expand.ll
11

How about try to generate instruction serial with llvm/utils/update_llc_test_checks.py ? It already support AVR.

aykevl added inline comments.Feb 20 2021, 4:06 PM
llvm/lib/Target/AVR/AVRShiftExpand.cpp
52

Thank you! Yes, that looks a lot better.

54–63

I think keeping them separate is better for readability. Now every condition has a comment explaining why the condition needs to be checked.

67–70

I've modified the code accordingly. Either seems fine by me.

92

Shifts larger or equal to the bit width (>=32) are a poison value according to the LangRef. Therefore, in practice they should not occur. I believe they're undefined behavior according to the C standard.

llvm/test/CodeGen/AVR/shift-expand.ll
11

Good idea, I'll use update_test_checks.py.

Note that the output is IR, not AVR assembly. This is a pure IR level pass.

aykevl updated this revision to Diff 325255.Feb 20 2021, 4:08 PM
  • fix lint checks (hopefully)
  • simplify pass a bit, with suggestions from @benshi001
  • use update_test_checks.py for the test

I am not sure such a specific pass is needed. Why __ashlsi3 is not called in other backends? Is there a config flag/option to prevent calling __ashlsi3 ?

I am not sure such a specific pass is needed. Why __ashlsi3 is not called in other backends? Is there a config flag/option to prevent calling __ashlsi3 ?

Because most instruction sets do support 32-bit shifts but AVR does not. For example, it appears that the MSP430 has the same problem: https://reviews.llvm.org/D78663#2215170.

I've investigated whether there are any other options to this but I couldn't come up with any. The builtin calls are created inside SelectionDAG which converts non-constant shifts to library calls. Therefore, this pass converts non-constant shifts to constant shifts in a loop to match avr-gcc so that the resulting code does not contain any 32-bit non-constant shifts.

I am not sure such a specific pass is needed. Why __ashlsi3 is not called in other backends? Is there a config flag/option to prevent calling __ashlsi3 ?

Because most instruction sets do support 32-bit shifts but AVR does not. For example, it appears that the MSP430 has the same problem: https://reviews.llvm.org/D78663#2215170.

I've investigated whether there are any other options to this but I couldn't come up with any. The builtin calls are created inside SelectionDAG which converts non-constant shifts to library calls. Therefore, this pass converts non-constant shifts to constant shifts in a loop to match avr-gcc so that the resulting code does not contain any 32-bit non-constant shifts.

I see and I am OK with this solution. What about Dylan's opinion?

aykevl updated this revision to Diff 327763.Mar 3 2021, 6:11 AM
  • fix lint warnings
aykevl updated this revision to Diff 327932.Mar 3 2021, 2:51 PM
  • fixes so that shl i32 undef, undef doesn't crash
dylanmckay accepted this revision.Jun 28 2021, 5:19 AM

This looks good to me, and, it is especially nice to remove the dependency on the nonexistent runtime lib functions which has been a big issue. Nice work

This revision is now accepted and ready to land.Jun 28 2021, 5:19 AM
This revision was landed with ongoing or failed builds.Jul 24 2021, 5:04 AM
This revision was automatically updated to reflect the committed changes.