Page MenuHomePhabricator

[compiler-rt][builtins] Add several helper functions for AVR
ClosedPublic

Authored by benshi001 on Apr 6 2022, 3:50 AM.

Details

Summary
__mulqi3: int8 multiplication
__mulhi3: int16 multiplication
_exit   : golobal terminator

Diff Detail

Event Timeline

benshi001 created this revision.Apr 6 2022, 3:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2022, 3:50 AM
benshi001 requested review of this revision.Apr 6 2022, 3:50 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 6 2022, 3:50 AM
Herald added subscribers: Restricted Project, cfe-commits, jacquesguan, aheejin. · View Herald Transcript
benshi001 added a comment.EditedApr 6 2022, 3:51 AM

Build libclang_rt.builtins-avr.a for target AVR, which contains math functions, such as mulsi3, divmodsi4, mulsf3, ...

benshi001 retitled this revision from [compiler-rt][AVR] Add initial support of target AVR to [compiler-rt][CMake] Add initial support of target AVR.Apr 6 2022, 11:03 PM
aykevl added a comment.EditedApr 7 2022, 6:26 AM

What about chip families other than avr2 (the default)? Some have a different PC pointer size (1-3 bytes, therefore affecting the ABI), some have different instructions, etc.

benshi001 added a comment.EditedApr 7 2022, 7:05 AM

What about chip families other than avr2 (the default)? Some have a different PC pointer size (1-3 bytes, therefore affecting the ABI), some have different instructions, etc.

There are two options CMAKE_CXX_FLAGS and CMAKE_C_FLAGS can be specified when doing cmake against compiler-rt, we can specify -mmcu=avrX via them.

Since libgcc-avr has unique source code but built/distributed into several different libgcc.a files with different instruction set, I expect compiler-rt would be done in the way.

And of cource this patch just is an initial support, we still need much more patches to make it fully functional.

Initial support for a new target has a high standard. It needs to show decent support, otherwise it may just invite fixups which may in the end waste review resources.
Showing that these files buildable is far from enough. You'd need to show that the run-time tests actually work.

MaskRay added inline comments.Apr 11 2022, 8:46 PM
clang/lib/Basic/Targets/AVR.cpp
380 ↗(On Diff #420772)

The clang preprocessor changes are unrelated to compiler-rt and should be contribute separated with tests.

benshi001 marked an inline comment as done.Apr 11 2022, 9:15 PM
benshi001 added inline comments.
clang/lib/Basic/Targets/AVR.cpp
380 ↗(On Diff #420772)

I have created another patch as you suggested, thanks.

https://reviews.llvm.org/D123567

benshi001 marked an inline comment as done.Apr 11 2022, 9:32 PM

Initial support for a new target has a high standard. It needs to show decent support, otherwise it may just invite fixups which may in the end waste review resources.
Showing that these files buildable is far from enough. You'd need to show that the run-time tests actually work.

Could you please tell me more about show that the run-time tests actually work ? AVR is 8-bit MCU with any OS, so I just want some math functions in compiler-rt/lib/builtins, to avoid linking to libgcc.a.

I have checked that there are tests at compiler-rt/test/builtins/Unit/arm, and I guess them to be run on a ARM host. So I have to make a new directory compiler-rt/test/builtins/Unit/avr with some test cases, and promiss that they can run on an AVR board? (https://www.arduino.cc/en/hardware#boards-1)

benshi001 updated this revision to Diff 423805.Apr 19 2022, 8:54 PM
benshi001 edited the summary of this revision. (Show Details)Apr 19 2022, 9:34 PM
benshi001 edited the summary of this revision. (Show Details)
benshi001 added a comment.EditedApr 19 2022, 9:44 PM

I added three functions

__mulqi3: int8 multiplication
__mulhi3: int16 multiplication
_exit   : golobal terminator

All of them are contained in libgcc and are necessary.

Some key points,

  1. The functions are written in the minimal avrtiny instruction set, so they should be compatible for all devices.
  1. We build them to higher/upper AVR ELF files by specifying CMAKE_CXX_FLAGS and CMAKE_C_FLAGS when doing cmake.
  1. I have refered to libgcc, but not simply copy. Actually my implementation is shorter than libgcc's, but may cost more cycles. However this is something about tradeoff.
  1. I have tested against avr hardware multiplier, the results are all the same on ArduinoNano (atmega328).
  1. My test program locates at https://github.com/benshi001/avr-test/blob/main/mul_int16_int8.ino, I only added 300 pairs of random numbers as test cases, due to the limitation of my device (atmega328 has only 2KB SRAM, more test cases will lead to too large data section). Acutally I have tested 900 pairs.
benshi001 retitled this revision from [compiler-rt][CMake] Add initial support of target AVR to [compiler-rt][builtins] Add several helper functions for AVR.Apr 19 2022, 9:48 PM
MaskRay accepted this revision.Apr 30 2022, 1:01 PM
MaskRay added inline comments.
compiler-rt/cmake/Modules/CompilerRTUtils.cmake
172

Keep the list in alphabetical order. Move avr to the beginning. Ignore some entries which are unordered.

218

Keep the list in alphabetical order. Move avr to the beginning. Ignore some entries which are unordered.

compiler-rt/cmake/builtin-config-ix.cmake
56

Keep the list in alphabetical order. Move avr to the beginning. Ignore some entries which are unordered.

compiler-rt/lib/builtins/CMakeLists.txt
678

Keep the *_SOURCES in alphabetical order. Move avr to the beginning. Ignore some entries which are unordered.

This revision is now accepted and ready to land.Apr 30 2022, 1:01 PM
benshi001 marked an inline comment as done.May 1 2022, 5:22 PM
benshi001 added inline comments.
compiler-rt/lib/builtins/CMakeLists.txt
678

I will fix all your concerns about "alphabetical order" when committing.

This revision was automatically updated to reflect the committed changes.
benshi001 marked an inline comment as done.
aykevl added a comment.EditedMay 5 2022, 4:01 PM

@benshi001 I have been looking through the GCC code and I think avr-gcc also has a special calling convention for many other functions, including __mulqi3 and __mulhi3.

Source:

  1. I think this is where the ABI is specified in the compiler: https://github.com/gcc-mirror/gcc/blob/releases/gcc-5.4.0/gcc/config/avr/avr.md#L1543-L1549 You can see that it multiplies R24 with R22 and stores the result in R24, and clobbers R22 in the process. But no other registers.
  2. In this code sample, avr-gcc doesn't save char c (r20) across the __mulqi3 and __mulhi3 calls, which is normally call-clobbered.

Therefore, I think we need to be a bit more careful with defining these AVR builtins and check the ABI in avr-gcc first.
Also, we can make use of this and optimize the AVR backend more (you can see that the Clang generated code is much worse than avr-gcc in the above examples).

@benshi001 I have been looking through the GCC code and I think avr-gcc also has a special calling convention for many other functions, including __mulqi3 and __mulhi3.

Source:

  1. I think this is where the ABI is specified in the compiler: https://github.com/gcc-mirror/gcc/blob/releases/gcc-5.4.0/gcc/config/avr/avr.md#L1543-L1549 You can see that it multiplies R24 with R22 and stores the result in R24, and clobbers R22 in the process. But no other registers.
  2. In this code sample, avr-gcc doesn't save char c (r20) across the __mulqi3 and __mulhi3 calls, which is normally call-clobbered.

Therefore, I think we need to be a bit more careful with defining these AVR builtins and check the ABI in avr-gcc first.
Also, we can make use of this and optimize the AVR backend more (you can see that the Clang generated code is much worse than avr-gcc in the above examples).

Let us discuss that at https://github.com/llvm/llvm-project/issues/55279