This is an archive of the discontinued LLVM Phabricator instance.

[yaml2obj] ELF Relocations Support
ClosedPublic

Authored by atanasyan on Apr 6 2014, 2:34 AM.

Details

Summary

The patch implements support for both relocation record formats: Elf_Rel and Elf_Rela. It is possible to define relocation against symbol only.
Relocations against sections will be implemented later. Now yaml2obj recognizes X86_64, MIPS and Hexagon relocation types.

Example of relocation section specification:

Sections:
- Name: .text
  Type: SHT_PROGBITS
  Content: "0000000000000000"
  AddressAlign: 16
  Flags: [SHF_ALLOC]

- !Relocations
  Name: .rel.text
  Type: SHT_REL
  Info: .text
  AddressAlign: 4
  Relocations:
    - Offset: 0x1
      Symbol: glob1
      Type: R_MIPS_32
    - Offset: 0x2
      Symbol: glob2
      Type: R_MIPS_CALL16

Diff Detail

Event Timeline

shankarke requested changes to this revision.Apr 7 2014, 12:40 PM
shankarke added inline comments.
lib/Object/ELFYAML.cpp
339–379

will this switch become too long ? also I think some of the relocation values might overlap too.

atanasyan added inline comments.Apr 7 2014, 12:55 PM
lib/Object/ELFYAML.cpp
339–379
  1. Yes, the switch is long. But I cannot figure out how and why do we need to leave only subset of X86_64 relocations here.
  2. All these relocation types are unique constants. But in general llvm::yaml::Input::matchEnumScalar and llvm::yaml::Output::matchEnumScalar methods accept overlapped constants. In case of a YAML file reading that allow to use different names for the same constant. In case of YAML file writing, we write out the first name matched to the numerical code.
shankarke accepted this revision.Apr 7 2014, 1:19 PM
shankarke added inline comments.
lib/Object/ELFYAML.cpp
339–379

should the architecture be specified as a namespace too like how its with lld ?

atanasyan added inline comments.Apr 7 2014, 1:39 PM
lib/Object/ELFYAML.cpp
339–379

Ah, now I probably see the point. This code is not ready for writing yaml file indeed. To make this routine better we need to check current target architecture and select an appropriate set of relocation type constants. Unfortunately now we cannot test this approach because the obj2yaml tool does not handle ELF files. That is why I keep this function simple.

Anyway I will look at this function once again. Maybe I will be able to add some target checks and cover them by tests even now.

silvas accepted this revision.Apr 7 2014, 4:29 PM

LGTM.

The duplication between Elf_Rel and Elf_Rela is a bit disturbing but I can't think of a significantly better way to do it. The best I can think of is to skip putting Elf_Rel{,a} in a std::vector (and calling writeArrayData) and directly output to the OS in buildRelocationArray (which would have to be renamed); that would let you push the distinction between Rel and Rela way down to the inner loop, but I'm not sure if that would end up being simpler.

atanasyan updated this revision to Unknown Object (????).Apr 8 2014, 2:12 AM
  1. Reduced code duplication between Elf_Rel and Elf_Rela relocation handling.
  2. ScalarEnumerationTraits<ELFYAML::ELF_REL>::enumeration() now accepts only relocation type names which are valid for the current architecture.
silvas added a comment.Apr 8 2014, 2:44 PM

Nice job reducing the Rel/Rela duplication!

On a second look I have a some small questions about the tag handling.

lib/Object/ELFYAML.cpp
582–591

Is there a reason that on one side there is "!Relocation" and on the other "!Relocations" (plural)?

test/Object/yaml2obj-elf-rel.yaml
41

Is this per-relocation "!Relocation" tag needed? Doesn't the "!Relocations" on the section cover this?

tools/yaml2obj/yaml2elf.cpp
245

Stray semicolon?

267–268

This is actually unreachable, right? If we get here then someone has extended the ELFYAML::Section hierarchy without adding a case here which I think is a programming error.

atanasyan added inline comments.Apr 8 2014, 9:40 PM
lib/Object/ELFYAML.cpp
582–591

Good catch. It is just a typo. The section's type name should be "!Relocations" in both cases.

test/Object/yaml2obj-elf-rel.yaml
41

The "!Relocation" tag is optional. I will remove it from the test case to make it more clear.

tools/yaml2obj/yaml2elf.cpp
267–268

Agreed. The llvm_unreachable is better here.

atanasyan updated this revision to Unknown Object (????).Apr 8 2014, 9:42 PM

Fixed some typos. User level error message replaced by llvm_unreachable call.

silvas added inline comments.Apr 9 2014, 11:06 AM
lib/Object/ELFYAML.cpp
579–580

Now that I think about it, since in the future the ELFYAML::Section hierarchy is going to be growing, I think it would make sense to standardize on a consistent tag name scheme. It just bothers me that the generic "!Section" is being used for ContentSection, while "!Relocations" for RelocationSection. Can we standardize on having the tag be the name of the class?

atanasyan added inline comments.Apr 9 2014, 11:11 AM
lib/Object/ELFYAML.cpp
579–580

Agreed. But I am not sure that ContentSection is a good name for tag. What do you think?

We could just have the name as SectionContent or Content to be short.

silvas added inline comments.Apr 9 2014, 11:35 AM
lib/Object/ELFYAML.cpp
579–580

Now that I think more about it, I think that the tag system is just overcomplicating things. I think it would make a lot more sense and be simpler to just autodetect which ELFYAML::Section subclass to create based on which keys are present in the section entry.

i.e., the deserialization deserializes Content: *and* Relocations: (and any future additions) as optional, then chooses which subclass to instantiate based on what is present. Of course, it issues an error if there is an illegal combination of keys.

The idea is that Content:, Relocations:, etc. (any future ones) would be just sugar for filling in the Content: in a more convenient/readable way; this approach makes the whole process easier to understand from a user's perspective.

I think the goal of the tags (making it a bit more "strongly typed" on disk) is admirable, but at the end of the day we are still duck typing (from a tag perspective) between Rel and Rela (and probably a bunch of other things). It's also just plain redundant.

shankarke added inline comments.Apr 9 2014, 12:10 PM
lib/Object/ELFYAML.cpp
579–580

This may be over-engineering yaml2obj IMHO.

silvas added inline comments.Apr 9 2014, 2:42 PM
lib/Object/ELFYAML.cpp
579–580

It's not clear what your comment was aimed at (my last paragraph or the approach I suggested?).

atanasyan added inline comments.Apr 9 2014, 3:04 PM
lib/Object/ELFYAML.cpp
579–580

I see the point but I think now we do not have enough information to make a correct design decision. There are only two different types of section and it is rather easy (I hope) to check what entries are in the Section block and select an appropriate binary representation. But suppose that in the future we will have more mandatory and optional Section entries. I am not sure that it will be always possible to select a binary representation in that case.

Let's define two types of section:

SectionFoo
  MegaContent (mandatory field)
  UselessContent (optional field)

SectionBar
  MegaContent (mandatory field)
  NotSoUselessContent (optional field)

It's difficult to recognize a section type in the following YAML file:

- Name: .section_name
  MegaContent: "bla-bla-bla"

If we define much more section type later or, on the contrary, understand that only RawContentSection and RelocationSection satisfy our requirements, we can combine two approaches:

  1. Tag defines section type unambiguously.
  2. If Tag is omitted, we can try to deduce section type from the set of declared fields.

Now I would like to keep the solution simple. Next we need to implement obj2yaml conversion for ELF binaries. This might give us additional info to select persistent design.

atanasyan added inline comments.Apr 9 2014, 3:30 PM
lib/Object/ELFYAML.cpp
579–580

From the other side, I will try to implement the approach suggested by you tomorrow. Let's take a look on the result and compare solutions.

atanasyan updated this revision to Unknown Object (????).Apr 10 2014, 7:04 AM

I still think that a user should specify a section type explicitly. One more argument is an error diagnostics. If a user explicitly defines a section type but forget some mandatory fields we can show a clear error message. If we need to deduce a section type and a user missed some fields, we have to show unclear "Unknown section type" error message. Try to quickly find why the section .stub has unknown type in the example below:

Sections:
- Name: .text
  Type: SHT_PROGBITS
  Content: "0000000000000000"
  AddressAlign: 16
  Flags: [SHF_ALLOC]
- Name: .stub
  Type: SHT_PROGBITS
  AddressAlign: 16
  Flags: [SHF_ALLOC]
- Name: .eh_frame
  Type: SHT_PROGBITS
  Content: "0000000000000000"
  AddressAlign: 16
  Flags: [SHF_ALLOC]

So I suggest one more solution. Let's recognize a type of section using Type field. A section with SHT_REL or SHT_RELA type is a relocation section, otherwise it is a raw content section. If or when we have more section types we will be able to invent another solution. The only shortcoming of such approach is impossibility to define a relocation section with arbitrary (for example invalid) content but I think it is a minor problem.

Excellent points, Simon, I agree with all of them and you've done a very good analysis. Let's get this patch committed. I especially like the use of the Type: field as the discriminator; it leads to a very simple implementation.

I also think you're absolutely right about obj2yaml; the requirements of obj2yaml will probably influence the final design more than anything.

atanasyan closed this revision.Apr 10 2014, 9:25 PM

Thanks for review.

Closed by commit rL206017 (authored by @atanasyan)