This is an archive of the discontinued LLVM Phabricator instance.

[AVR][clang] Pass the address of the data section to the linker for ATmega328
ClosedPublic

Authored by dylanmckay on Aug 26 2020, 8:28 AM.

Details

Summary

This patch modifies the Clang AVR toolchain so that it always passes
the '-Tdata=0x800100' to the linker for ATmega328 devices. This matches
AVR-GCC behaviour, and also corresponds to the address of the start of
the data section in data space according to the ATmega328 datasheet.

Without this, clang does not produce a valid ATmega328 binary.

When targeting all non-ATmega328 chips, a warning will be emitted due to
the fact that proper handling for the chips data section address is
not yet implemented.

I've held off adding other microcontrollers for now, mostly because the
AVR toolchain logic is smeared across LLVM core TableGen files, and two Clang
libraries. The 'family detection' logic is also only implemented for
ATmega328 at the moment, for similar reasons.

In the future, I aim to write an RFC to llvm-dev to find a better way
for LLVM to expose target-specific details such as these to compiler
frontends.

Diff Detail

Event Timeline

dylanmckay created this revision.Aug 26 2020, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2020, 8:28 AM
Herald added a subscriber: Jim. · View Herald Transcript
dylanmckay requested review of this revision.Aug 26 2020, 8:28 AM

Hey @aykevl, could you please confirm that this patch looks okay?

Looks reasonable to me, although I can't really comment on the contents of this as I'm not very familiar with this code.

I would like to see just a single MCU table that contains all the information (including start addresses). Maybe it can even be generated from ATDF files (http://packs.download.atmel.com/), at least the first time. However, this is a good start (not least because now it warns when producing invalid binaries).

clang/lib/Driver/ToolChains/AVR.cpp
40

I don't think the LLVM coding style says something about this, but coming from Go I'm more used to capitalized abbreviations (MCU, GetMCUSectionAddressData).

However, this is just a superficial thing, feel free to ignore.

benshi001 added a subscriber: benshi001.EditedSep 28 2020, 8:36 PM

I think this patch is OK to be committed.

And I hope

  1. Temporarily using an array for more devices, before a solution of getting info from the tblgen.
  1. Distinghuish the arguments for "-L" and "-m". It is correct for atmega328 that avr-ld needs ("-L/usr/lib/avr/lib/avr5" ... "-latmega328p" "-mavr5") But for some special devices, such as attiny24, the following options is needed by avr-ld ("-L/usr/lib/avr/lib/avr25/tiny-stack" ... "-lattiny24" "-mavr25") , due to the avr-lib's file organization.

I am glad to maintain an array before a formal tblgen solution, which includes device family (-m),
lib file sub path (-L), data segment address (-T)

clang/lib/Driver/ToolChains/AVR.cpp
40

I prefer to "Mcu"

dylanmckay added inline comments.Oct 28 2020, 9:42 AM
clang/lib/Driver/ToolChains/AVR.cpp
40

I will change, thanks for the suggestions. FWIW, Rust prefers the Mcu variant IIRC which is where I adapted this pattern

Regarding TableGen; I would like to send an RFC to llvm-dev to come up with a proper API to expose backend-specific device-specific information and constants to LLVM frontends, as I can imagine that many backends could stand to benefit. I note that there is in general a lot of duplication in individual frontends defining information LLVM already knows about (off the top of my head, inline assembly constraints and registers is one of them but may not easily fit into the paradigm I have in mind).

This revision was not accepted when it landed; it landed in state Needs Review.Oct 28 2020, 10:35 AM
This revision was automatically updated to reflect the committed changes.