This is an archive of the discontinued LLVM Phabricator instance.

[yaml2obj][obj2yaml] - Add support for SHT_ARM_EXIDX section.
ClosedPublic

Authored by grimar on Sep 24 2020, 6:49 AM.

Details

Summary

This adds support for SHT_ARM_EXIDX sections to the obj2yaml/yaml2obj tools.

SHT_ARM_EXIDX is an ARM-specific index table filled with entries.
Each entry consists of two 4-byte values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)

Note: I am not sure if we should respect endianness for these words.
This patch does, but it seems that everywhere else in LLVM they are written/read as LE.
At the same time, GNU readelf -u behaves differently for a section that is described as:

- Name:    .ARM.exidx
  Type:    SHT_ARM_EXIDX
  Content: "00000000000000010000000001000000"

For a little endian object it dumps:

> readelf -u test.o

Unwind section '.ARM.exidx' at offset 0x40 contains 1 entry:

0x0: @0x1000004
readelf: Warning: Could not locate .ARM.extab section containing 0x1000004.

0x8: 0x1 [cantunwind]

And for a big endian:

> readelf -u test.o

Unwind section '.ARM.exidx' at offset 0x40 contains 1 entry:

0x0: 0x1 [cantunwind]

0x8: @0x100000c
readelf: Warning: Could not locate .ARM.extab section containing 0x100000c.

I also haven't found anything in the specification saying that these values must always be little-endian.
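For illustration only (this is not part of the patch), the two readelf dumps above can be reproduced with a small Python sketch. It assumes the section is loaded at address 0, as in a relocatable object, and that the second word of an entry is either EXIDX_CANTUNWIND, an inline entry (bit 31 set), or a 31-bit place-relative offset into .ARM.extab. The helper names `prel31` and `dump_exidx` are hypothetical.

```python
import struct

EXIDX_CANTUNWIND = 0x1


def prel31(word, place):
    """Resolve a 31-bit place-relative offset stored in `word` at address `place`."""
    offset = word & 0x7FFFFFFF
    if offset & 0x40000000:  # sign-extend from bit 30
        offset -= 0x80000000
    return (place + offset) & 0xFFFFFFFF


def dump_exidx(content_hex, little_endian):
    """Decode index table entries; assumes the section address is 0."""
    data = bytes.fromhex(content_hex)
    fmt = "<II" if little_endian else ">II"
    lines = []
    for pos in range(0, len(data), 8):
        first, second = struct.unpack_from(fmt, data, pos)
        function = prel31(first, pos)
        if second == EXIDX_CANTUNWIND:
            lines.append(f"{function:#x}: 0x1 [cantunwind]")
        elif second & 0x80000000:
            lines.append(f"{function:#x}: inline {second:#x}")
        else:
            # The second word lives 4 bytes after the start of the entry.
            lines.append(f"{function:#x}: @{prel31(second, pos + 4):#x}")
    return lines


content = "00000000000000010000000001000000"
print(dump_exidx(content, little_endian=True))
print(dump_exidx(content, little_endian=False))
```

The same 16 bytes yield a different CANTUNWIND entry depending on the byte order used to read the words, which matches the behaviour of readelf described above.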

Event Timeline

grimar created this revision. Sep 24 2020, 6:49 AM
grimar requested review of this revision. Sep 24 2020, 6:49 AM
grimar retitled this revision from [yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section. to [yaml2obj][obj2yaml] - Add support for SHT_ARM_EXIDX section..
grimar planned changes to this revision. Sep 24 2020, 6:57 AM
grimar updated this revision to Diff 294047. Sep 24 2020, 7:05 AM
  • Rebased and updated the llvm-readobj/ELF/ARM/unwind-non-relocatable.test test case: ContentArray can currently only be used for regular sections, so with this patch yaml2obj failed to parse its YAML description.

I'll try and take a look tomorrow.

In general for endianness .ARM.exidx is counted as data, and data takes the endianness of the object. In LLD I only supported little-endian, as the story for instructions is complicated in big-endian. In a big-endian relocatable object both instructions and data are big-endian. In a big-endian executable or shared library, instructions are little-endian but data is big-endian, so the linker needs to endian-reverse instructions.

jhenderson added inline comments. Sep 25 2020, 12:54 AM
llvm/lib/ObjectYAML/ELFYAML.cpp
1322

I think you should drop the EM_ARM requirement here. It doesn't really add anything, and makes it more restrictive on what one can use this code for. Imagine I have a downstream target that is ARM-based - I might have a different EM_* value, but still want to be able to use SHT_ARM_EXIDX, and therefore I'd want to be able to write tests using this format.

llvm/test/tools/obj2yaml/ELF/arm-exidx-section.yaml
4
24–27

I might be missing something, but why are the entries in a different order here? I wouldn't expect endianness to have any impact outside individual values.

44
llvm/test/tools/yaml2obj/ELF/arm-exidx-section.yaml
52
119
125
grimar updated this revision to Diff 294246. Sep 25 2020, 1:33 AM
grimar marked 6 inline comments as done.
grimar edited the summary of this revision.
  • Addressed review comments.
llvm/lib/ObjectYAML/ELFYAML.cpp
1322

The problem is that SHT_ARM_EXIDX shares its value with SHT_X86_64_UNWIND (0x70000001U). We might have other machine-specific conflicts, e.g. SHT_ARM_ATTRIBUTES vs SHT_MSP430_ATTRIBUTES vs SHT_RISCV_ATTRIBUTES (0x70000003U).
For your case, I believe the best solution is to have a private patch that expands this condition.
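To make the conflict concrete, here is a minimal Python sketch (not the code in ELFYAML.cpp) of why the section type alone is not enough and the machine has to be part of the lookup. The table and the function `section_type_name` are hypothetical; the EM_* and SHT_* values are the standard ELF ones.

```python
# Standard e_machine values from the ELF specification.
EM_ARM, EM_X86_64, EM_MSP430, EM_RISCV = 40, 62, 105, 243

# Processor-specific section types overlap, so the key is (machine, sh_type).
SECTION_TYPE_NAMES = {
    (EM_ARM, 0x70000001): "SHT_ARM_EXIDX",
    (EM_X86_64, 0x70000001): "SHT_X86_64_UNWIND",
    (EM_ARM, 0x70000003): "SHT_ARM_ATTRIBUTES",
    (EM_MSP430, 0x70000003): "SHT_MSP430_ATTRIBUTES",
    (EM_RISCV, 0x70000003): "SHT_RISCV_ATTRIBUTES",
}


def section_type_name(machine, sh_type):
    """Return the machine-specific name, or the raw hex value if unknown."""
    return SECTION_TYPE_NAMES.get((machine, sh_type), f"{sh_type:#x}")


print(section_type_name(EM_ARM, 0x70000001))     # SHT_ARM_EXIDX
print(section_type_name(EM_X86_64, 0x70000001))  # SHT_X86_64_UNWIND
```

A downstream target with its own EM_* value would fall through to the raw hex value here, which is the situation the private patch suggested above would address.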

llvm/test/tools/obj2yaml/ELF/arm-exidx-section.yaml
24–27

They are not in a different order. See, the input data is:

## 4 words: <arbitrary>, EXIDX_CANTUNWIND in big-endian,
##          <arbitrary> and EXIDX_CANTUNWIND in little-endian.
    Content: "AABBCCDD00000001EEFF889901000000"

We have: <AABBCCDD, 00000001>, <EEFF8899, 01000000>

0x01000000 for a little-endian object is just a value, while for a big-endian object it is EXIDX_CANTUNWIND (0x1).
The same happens for 0x00000001.
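As a quick check of the explanation above, the following Python sketch reads the test's Content bytes with both byte orders and reports which entry has EXIDX_CANTUNWIND as its second word. The helper names `entries` and `cantunwind_flags` are hypothetical and only serve this illustration.

```python
import struct

EXIDX_CANTUNWIND = 0x1
CONTENT = bytes.fromhex("AABBCCDD00000001EEFF889901000000")


def entries(data, byteorder):
    """Split the section into (first word, second word) pairs."""
    prefix = "<" if byteorder == "little" else ">"
    return [struct.unpack_from(prefix + "II", data, pos)
            for pos in range(0, len(data), 8)]


def cantunwind_flags(data, byteorder):
    """True for each entry whose second word equals EXIDX_CANTUNWIND."""
    return [value == EXIDX_CANTUNWIND for _, value in entries(data, byteorder)]


for order in ("big", "little"):
    print(order, cantunwind_flags(CONTENT, order))
```

The entries stay in the same position in both cases; only the entry that is interpreted as EXIDX_CANTUNWIND changes.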

This revision is now accepted and ready to land. Sep 25 2020, 5:47 AM

What's here looks good to me too.

Just to check I understand it, if for a relocatable object I wanted to model the R_ARM_PREL31 relocation on the first entry of the table, and optionally on the second entry for a .ARM.extab reference then I'd need to add the PREL relocations myself? A possible future enhancement could do this automatically. For example all the relocations would be with respect to the section symbol of the sh_link section so the R_ARM_PREL31 relocations could be added automatically.
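For readers unfamiliar with R_ARM_PREL31, here is a rough Python sketch of what applying such a relocation to an index table word amounts to. It is a simplification: `target` stands for S + A, the Thumb bit is ignored, and the function names `encode_prel31` and `decode_prel31` are hypothetical rather than anything in yaml2obj or LLD.

```python
def encode_prel31(target, place, existing_word=0):
    """Write the low 31 bits of (target - place), preserving bit 31 of the word."""
    offset = (target - place) & 0x7FFFFFFF
    return (existing_word & 0x80000000) | offset


def decode_prel31(word, place):
    """Recover the target address from a word stored at address `place`."""
    offset = word & 0x7FFFFFFF
    if offset & 0x40000000:  # sign-extend from bit 30
        offset -= 0x80000000
    return (place + offset) & 0xFFFFFFFF


# A function at 0x1000 referenced from an index table word at 0x2000.
word = encode_prel31(0x1000, 0x2000)
print(hex(word), hex(decode_prel31(word, 0x2000)))
```

In a YAML description these fixups are not applied automatically, so the words and the relocations that describe them have to be written out explicitly.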

grimar added a comment. Edited Sep 28 2020, 12:28 AM

Just to check I understand it, if for a relocatable object I wanted to model the R_ARM_PREL31 relocation on the first entry of the table, and optionally on the second entry for a .ARM.extab reference then I'd need to add the PREL relocations myself?

Yes.

A possible future enhancement could do this automatically. For example all the relocations would be with respect to the section symbol of the sh_link section so the R_ARM_PREL31 relocations could be added automatically.

We probably shouldn't. yaml2obj currently does just a few minor things automatically, e.g. it links a relocation section to .symtab by default. But we never do anything major, like adding sections with relocations.
To add relocations automatically we would need to create a relocation section and add relocations to it. I see the following problems with doing this implicitly:

  1. yaml2obj is often used for creating broken inputs. We should have a way to create any layout we want, e.g. without relocations, or with relocations of types other than R_ARM_PREL31. It would also probably be confusing if an object contained more sections and relocations than were explicitly described in the YAML.
  2. After applying obj2yaml to an object, we usually expect to get the same YAML that we gave to yaml2obj as input.

If we want to add functionality that adds R_ARM_PREL31 relocations automatically, it could perhaps be a separate command line option, or maybe an explicit YAML flag.
E.g. we currently support the following syntax:

SectionHeaderTable:
  NoHeaders: true

This tells yaml2obj not to emit the section header table, which is created by default. Perhaps we could add a similar flag, like "EmitRelocations". Honestly, it feels a bit excessive to me though:
I don't expect to see many relocations in tests, and I guess it is not that hard to add them manually, which keeps the code simple. Having explicit tests might also be better: we probably can't assume that everyone is familiar enough with yaml2obj to know about an "auto R_ARM_PREL31 relocations" feature.