This is an archive of the discontinued LLVM Phabricator instance.

Create MutableObjectWriter class
Needs ReviewPublic

Authored by abrachet on Aug 18 2019, 3:58 PM.

Diff Detail

Event Timeline

abrachet created this revision.Aug 18 2019, 3:58 PM
abrachet marked an inline comment as done.Aug 18 2019, 8:37 PM
abrachet added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.h
1167

I'm not sure if I'm using StringTableBuilder incorrectly. An assertion failed saying that the string was not in the table. I looked into it with lldb and manually added the string it said was not there, it still didn't work.

I've made a number of first-pass comments for you.

llvm/tools/llvm-objcopy/ELF/Object.h
1098–1100

Don't need to add this now, but some TODOs to add are:

  1. What to do if the string table index is a removed section.
  2. What to do if the e_shstrndx value is either SHN_XINDEX.
  3. What to do if the e_shstrndx value is 0.
1104

This shouldn't be an assert. What if the e_shstrndx value points at a non-string table?

1140

This seems like an unnecessary copy here. Why can't it wait until you actually need to write the .shstrtab to the buffer?

1142–1151

This block of code is very similar to the above block. Consider sharing it.

1167

I can't see from the code what is going wrong. Check to make sure that the strings are actually added to the StringTableBuilder when you call add, and take a look at the return values from that. I doubt they'll be much different from what getOffset eventually will return.

Also, setting the section name isn't part of laying out a section, so this feels like it's being done in the wrong palce.

1175

What about segments that are nested in other segments? This code doesn't handle that.

1181

What if a program header covers the ELF header and program header table? This code won't work for that.

1190–1191

I think you've got this alignment in the wrong place. I'm pretty sure it should be before the offset is set...

1201–1203

It would be easier to see what's going on with a traditional for loop here.

1250

What about data within segments not covered by sections? This code doesn't handle that.

abrachet updated this revision to Diff 217073.Aug 25 2019, 9:23 PM

Put option to use writer behind flag. Other large changes

abrachet updated this revision to Diff 217074.Aug 25 2019, 9:24 PM

Use right diff

abrachet updated this revision to Diff 217235.Aug 26 2019, 1:43 PM

Fixed UB bug. Still doesn't write correctly, llvm-readelf says the shstrtab is not null terminated, presumably the other sections are written poorly too.

Overall, the design looks pretty good now. I think it is clear to me what is happening. I do think you need to write significantly more tests though. Possibly the way to do it is to add a case to the existing tests for basic copies using the switch, and show that the output is identical. If it isn't, you need to understand why there are differences and whether they are important.

llvm/test/tools/llvm-objcopy/ELF/mutable-object-no-change.test
1 ↗(On Diff #217235)

coppied -> copied

Copied under what circumstance? Perhaps "is properly copied when using the mutable object interface, and no switches are specified".

3–12 ↗(On Diff #217235)

It's more normal to put the YAML at the end of the test.

I think your test case needs to test program headers and symbols too, so this needs some of each. Perhaps a separate test case would be appropriate to test segment layout for nested segments.

16 ↗(On Diff #217235)

Delete the extra blank line.

llvm/tools/llvm-objcopy/CommonOpts.td
87 ↗(On Diff #217235)

ObjectFile -> Object

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
818 ↗(On Diff #217235)

/// -> //

Maybe make this comment more explanatory too.

827 ↗(On Diff #217235)

I'm hesitant about this not taking an r-value reference or copy, because you std::move it further down the stack. This feels like it might be at risk of surprising a user of the interface later on.

llvm/tools/llvm-objcopy/ELF/Object.cpp
2313 ↗(On Diff #217235)

Probably worth a using for the typename ELFT::Phdr type, maybe even in the class definition.

2332 ↗(On Diff #217235)

Shouldn't this be Obj.getHeader().e_phnum * sizeof(Phdr)?

2337 ↗(On Diff #217235)

Maybe a stable_sort would be a good plan here?

2354 ↗(On Diff #217235)

What if the parent segment hasn't been laid out yet? I believe that you could theoretically have the parent segment after the child segment, in the ordered vector, if the alignment of the parent is greater than an otherwise identical child.

2369 ↗(On Diff #217235)

Why?

(I know it's the null section, but it's a good idea to be clear about it)

2385 ↗(On Diff #217235)

Huh? Do you mean to be using sh_addralign here?

2476 ↗(On Diff #217235)

Nit: delete the blank line

abrachet updated this revision to Diff 217775.Aug 28 2019, 10:12 PM

Addressed review comments

abrachet marked 13 inline comments as done.Aug 28 2019, 10:34 PM

I didn't get to all comments yet, and I also haven't added more tests because currently the object file isn't being create properly. Once I am making a valid ELF file I will do more rigorous testing.

Right now llvm-readelf complains that that the .shstrtab is not null terminated, so probably a layout issue of sections. I've been looking into it and couldn't find the problem yet. My section offsets are slightly lower than how llvm-objcopy currently does layout. Does anyone have any ideas why that is?

Currently this patch creates sections at offsets 0, 38, 38, 50, 51 and llvm-objcopy does 0, 40, 40, 58, 59 (both with no arguments)

llvm/test/tools/llvm-objcopy/ELF/mutable-object-no-change.test
3–12 ↗(On Diff #217235)

Oops forgot to do the first part, I'll do that in a later diff. Haven't done second part see my out of line comment.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
827 ↗(On Diff #217235)

There is no copy constructor for ObjectFiles so I could make one for MutableELFObject, I can't move In or *O the compiler says. I don't really know the intricacies of when you can and cannot move an object, probably I could move the OwningObjectFile though, this would require larger changes to this method. Do you have any thoughts?

llvm/tools/llvm-objcopy/ELF/Object.cpp
2332 ↗(On Diff #217235)

Good catch :)

2337 ↗(On Diff #217235)

I actually just learned about this today in class and was thinking it was just what I needed :)

2354 ↗(On Diff #217235)

I haven't gotten around to this yet because I have bigger problems that I want to address before potentially adding more bugs.

I can do this the quick way which would be having an std::set<size_t> and whenever *ParentIndex is larger than Index I layout that segment and then add it to the set and check every iteration of the loop if Index is in there. Seems ugly though. I could also a bool laidOut to MutableELFSegment<ELFT> because its ok to make every segment mutable because there are so few of them I believe.

2385 ↗(On Diff #217235)

Oh yes thanks!

2476 ↗(On Diff #217235)

Are you sure? There was previously a blank line between template class ELFWriter<ELF32BE>; and the end of namesapce elf. I think it looks better too.

I didn't get to all comments yet, and I also haven't added more tests because currently the object file isn't being create properly. Once I am making a valid ELF file I will do more rigorous testing.

Right now llvm-readelf complains that that the .shstrtab is not null terminated, so probably a layout issue of sections. I've been looking into it and couldn't find the problem yet. My section offsets are slightly lower than how llvm-objcopy currently does layout. Does anyone have any ideas why that is?

No idea myself off the top of my head, but the best thing to do is probably just to manually look at the produced object in a hex viewer, and see what looks wrong (consider section sizes, the section offset, and whether the right section is marked as the section header string table).

Currently this patch creates sections at offsets 0, 38, 38, 50, 51 and llvm-objcopy does 0, 40, 40, 58, 59 (both with no arguments)

I'd try manually laying out the ELF by hand, and see what your calculations come to, then compare them to what happens when you run the tool with a debugger. Where do the offsets start differing, and what causes that? Obvious suggestion might be problems with alignment, but I don't know.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
827 ↗(On Diff #217235)

I don't really think we want a copy, thinking about it more. We probably do want an r-value reference though. I'd change the signatures and call sites as follows, assuming it works (not had the time to verify them myself):

return elf::executeObjcopyWithMutableObject(Config, std::move(*ELFBinary), Out)
Error executeObjcopyWithMutableObject(const CopyConfig &Config,
                                      object::ELFObjectFileBase &&In,
                                      Buffer &Out) {
  ...
  return handleArgsAndWrite(std::move(*O), Out)
}
Error handleArgsAndWrite(ELFObjectFile<ELFT> &&Obj, Buffer &Out)
llvm/tools/llvm-objcopy/ELF/Object.cpp
2327–2332 ↗(On Diff #217775)

Adding the segments like this is a good idea taken from llvm-objcopy's existing handling (even if I do say so myself, since I remember suggesting this originally!), but where do you prevent them actually being written out?

2385 ↗(On Diff #217775)

clang-format?

2458 ↗(On Diff #217775)

I might be being blind, but where do you write the program header table?

2354 ↗(On Diff #217235)

I don't think you need to add a member to MutableELFSegment. You could have a local vector/map of bool though, so you can do IsLaidOut[Index] and just check and update that value at appropriate times.

2369 ↗(On Diff #217235)

Thinking about it from a strict correctness point of view, I think you just want to skip all sections with type SHT_NULL. From the gABI:

SHT_NULL - This value marks the section header as inactive; it does not have an associated section. Other members of the section header have undefined values.

2476 ↗(On Diff #217235)

My mistake. I misread the old diff. Please re-add!

llvm/tools/llvm-objcopy/ELF/Object.h
356–359

Canonical names are Elf_Ehdr, Elf_Phdr etc.