This is an archive of the discontinued LLVM Phabricator instance.

[AVR] Use correct relocation for function pointers in globals
AbandonedPublic

Authored by aykevl on Jun 25 2020, 2:56 AM.

Details

Summary

Before this patch, function pointers would be emitted with the R_AVR_16
relocation. This relocation is correct for global addresses but not for
function addresses. Function pointers must use the relocation
R_AVR_16_PM.

This patch hopefully fixes this. I'm not entirely sure whether the patch
is entirely correct, but it does fix a number of cases (see the test
file).

Diff Detail

Event Timeline

aykevl created this revision.Jun 25 2020, 2:56 AM

There is a test failure in llvm/test/MC/AVR/relocations-abs.s. With this change the LLVM behavior would start to differ from the avr-gcc behavior. But actually, the avr-gcc behavior might be a bug.

This is the affected code:

bar:
    .short 1f
1:

This is considered a function (not data) by the assembler, which you can see if you run nm:

$ avr-nm relocations-abs.o
00000000 t bar

I think all function pointers in globals should be using the R_AVR_16_PM relocation, no matter whether it's created in assembly or IR.

aykevl updated this revision to Diff 273289.Jun 25 2020, 3:40 AM
  • fixed MC test failure

I could perhaps add a test to relocations-abs.s that .type foo,@object creates a relocation of type R_AVR_16?

shepmaster added inline comments.Jun 26 2020, 6:35 AM
llvm/lib/Target/AVR/MCTargetDesc/AVRELFObjectWriter.cpp
74

Running this patch with my original code, it still doesn't work. If I print out symbol.getType(), it's 2, which seems to be STT_FUNC:

enum {
  STT_NOTYPE = 0,     // Symbol's type is not specified
  STT_OBJECT = 1,     // Symbol is a data object (variable, array, etc.)
  STT_FUNC = 2,       // Symbol is executable code (function, etc.)

I changed it to include both:

-      if (symbol.getType() == ELF::STT_NOTYPE)
+      auto type = symbol.getType();
+      if (type == ELF::STT_NOTYPE || type == ELF::STT_FUNC)

And this allowed my code to work.

I experimented a bit more but I really can't find a way to tell the difference between undefined functions and undefined globals. Assuming one or the other will still be incorrect in some cases.

Maybe the situation can be improved from the current situation by only creating the R_AVR_16_PM relocation for defined functions? That part is easy, and fixes this bug in at least some cases (while not making it worse in other cases). Fixing this bug for undefined functions may require changes somewhere else in LLVM, and might be a bit too much for me.

This patch doesn't look right. Maybe the following is helpful? https://godbolt.org/z/_JCX5d

@efriedma yeah I know, it's not right in the current state. See my previous comment. Do you have any idea how it is possible to get the object type (function or data) when determining the relocation type, when the referenced symbol is external?

Do you have any idea how it is possible to get the object type (function or data) when determining the relocation type, when the referenced symbol is external?

The way avr-gcc works, the compiler is responsible for figuring this out, not the assembler. A code pointer should be emitted as .word gs(func) i.e. VK_AVR_GS.

aykevl abandoned this revision.Jul 25 2020, 9:16 AM

The way avr-gcc works, the compiler is responsible for figuring this out, not the assembler. A code pointer should be emitted as .word gs(func) i.e. VK_AVR_GS.

Thank you for this!
In that case, I believe this patch is invalid and this feature should be implemented in a different way. However, I honestly have no idea where to even look to get this fixed.

I will abandon this now. The change may be useful for some people but can't be integrated into LLVM as-is.

This patch has been superseded by D87631, which should actually be correct (unlike this patch).