This is an archive of the discontinued LLVM Phabricator instance.

Differential D64281

[Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 1]
Accepted, Public

Authored by abrachet on Jul 6 2019, 1:18 AM.

Details

Reviewers

jhenderson
rupprecht
Bigcheese
jakehehrlich
lebedev.ri

Summary

This initial patch adds a MutableELFObject class for doing mutations on sections of an ELFObjectFile.

Event Timeline

There are a very large number of changes, so older changes are hidden.

In D64281#1600516, @labath wrote:

Sorry to barge in here, but I couldn't resist spreading the knowledge of more advanced googletest features. EXPECT_EQ is perfectly safe to use with StringRefs as it just defers to their operator==, and (as I mentioned in the other review), we have better facilities for checking the state of Expected and Error values.

Not at all, thanks for the comments. I really appreciate them!

llvm/include/llvm/Object/MutableObject.h
17 ↗	(On Diff #211671)	Not sure what to do here. I think eventually I will have a MutableObject base class, and it would seem silly to make a MutableRange.h. What do you think I should do?
21 ↗	(On Diff #211671)	FWIW, I think the MMUs on X86 and AArch64 both only support pointer sizes of 48 bits or similar, in either case less than 64, so pointers cannot be this large. I'm not sure this would be safe for 32-bit architectures though, but we have PointerIntPair in ADT, so this pattern is used in LLVM. I would also add that moving the bool out effectively doubles the struct's size, because presumably the compiler will word-align it, but reading those members is now faster. In either case it is crazy to worry about something like this at this stage! So I have done away with the bit field.
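
For readers unfamiliar with the pattern: PointerIntPair, mentioned above, packs its integer into the low bits that pointer alignment leaves free rather than into unused high bits. The following is a minimal standalone sketch of that idea. `TaggedPtr` and `roundTrips` are hypothetical names for illustration and are not LLVM's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's PointerIntPair: packs one bool into the
// lowest bit of a pointer whose pointee alignment is at least 2.
template <typename T> class TaggedPtr {
  static_assert(alignof(T) >= 2, "need one spare low bit");
  std::uintptr_t Bits = 0;

public:
  TaggedPtr(T *Ptr, bool Flag)
      : Bits(reinterpret_cast<std::uintptr_t>(Ptr) | (Flag ? 1u : 0u)) {
    // An aligned pointer always has a zero low bit, so nothing is lost.
    assert((reinterpret_cast<std::uintptr_t>(Ptr) & 1u) == 0);
  }

  // Mask the flag bit off again to recover the original pointer.
  T *getPointer() const {
    return reinterpret_cast<T *>(Bits & ~static_cast<std::uintptr_t>(1));
  }
  bool getFlag() const { return (Bits & 1u) != 0; }
};

// Returns true if both the pointer and the flag survive the round trip.
inline bool roundTrips(bool Flag) {
  static std::uint64_t Value = 42;
  TaggedPtr<std::uint64_t> P(&Value, Flag);
  return P.getPointer() == &Value && P.getFlag() == Flag;
}
```

The sketch keeps the struct at one word, which is the size saving the bit field was after, at the cost of a mask on every access.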

abrachet marked 11 inline comments as done. Jul 25 2019, 5:06 PM

This comment was removed by abrachet.

labath added inline comments. Jul 26 2019, 12:28 AM

llvm/include/llvm/Object/MutableObject.h
46 ↗	(On Diff #211857)	I guess this assert isn't needed now that you store the full uintptr_t?
21 ↗	(On Diff #211671)	Note that PointerIntPair uses the low bits of the pointer to store data, which is a generally safe (depending on your reading of the C++ standard) thing to do for aligned values. 64-bit pointers are indeed smaller, but for instance there's an arm64 pointer authentication extension, which uses those extra bits to store pointer "signatures". And yeah, there are no free bits on 32-bit pointers. :)
llvm/unittests/Object/MutableELFObjectTest.cpp
167–171	using `llvm::zip(ObjFile.sections(), MutableObject.sections())` is a slightly cleaner way to handle parallel iteration (but don't forget to assert that the range sizes are the same). With something like `EXPECT_THAT(MutableObject.sections(), testing::ContainerEq(ObjFile.sections()))` this check would be a one-liner, but it would require a bit more plumbing to make sure the elements are comparable, so it may not be worth it if this is just a one-off thing...
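
The reminder about range sizes can be illustrated without any LLVM headers. Below is a rough sketch in plain C++ of a lock-step comparison that checks the sizes first. `sameNames` is a hypothetical stand-in for the test's section comparison, and the vectors of strings stand in for the two section ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for comparing two section lists by name.
inline bool sameNames(const std::vector<std::string> &A,
                      const std::vector<std::string> &B) {
  // Check the sizes first: a lock-step walk that stops at the shorter
  // range would otherwise never notice a missing trailing element.
  if (A.size() != B.size())
    return false;
  auto ItB = B.begin();
  for (auto ItA = A.begin(), EndA = A.end(); ItA != EndA; ++ItA, ++ItB)
    if (*ItA != *ItB)
      return false;
  return true;
}
```

In a unit test the size check would normally be an assertion of its own, so that a mismatch is reported separately from a differing element.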

grimar added inline comments. Jul 26 2019, 1:54 AM

llvm/include/llvm/Object/MutableELFObject.h
28	Since all members are `public`, should this be a `struct`?
llvm/unittests/Object/MutableELFObjectTest.cpp
22	I'd suggest adding a short description comment before each unit test (about what the test is intended to do), just like we often do in regular test files.

jhenderson added inline comments. Jul 26 2019, 5:59 AM

llvm/include/llvm/Object/ELFObjectFile.h
794	I'm not sure I understand why you've made this change from the previous version? I don't think it gives anything.
llvm/include/llvm/Object/MutableObject.h
17 ↗	(On Diff #211671)	I'd probably put it in MutableELFObject.h for now and move it out in a later change when needed.
llvm/unittests/Object/MutableELFObjectTest.cpp
44	This still has the unnecessary trailing return type.
95	You need an ASSERT here that there are the correct number of sections.

Addressed review comments.

abrachet marked 9 inline comments as done. Jul 26 2019, 8:35 AM

abrachet added inline comments.

llvm/include/llvm/Object/ELFObjectFile.h
794	Because I didn't know you could do ELFObjectFile<ELFT>::isBerkeleyText() to not call the overridden method! Oops.
llvm/include/llvm/Object/MutableELFObject.h
28	For some reason I thought I remembered the style guide saying only POD types should be struct and others should be class regardless of access modifiers.
llvm/unittests/Object/MutableELFObjectTest.cpp
167–171	Thanks, I never knew about zip!
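
The qualified-call trick mentioned for `isBerkeleyText()` is ordinary C++: naming the class in the call suppresses virtual dispatch. A small sketch with toy classes follows. `Base`, `Derived` and the method names are illustrative and not taken from the patch.

```cpp
#include <cassert>
#include <string>

// Toy classes; the names are illustrative and not taken from the patch.
struct Base {
  virtual ~Base() = default;
  virtual std::string name() const { return "base"; }
  // Qualifying the call with the class name disables virtual dispatch.
  std::string baseName() const { return Base::name(); }
  // An unqualified call picks the most derived override.
  std::string dynamicName() const { return name(); }
};

struct Derived : Base {
  std::string name() const override { return "derived"; }
};
```

Called on a `Derived` object, `baseName()` still runs `Base::name()`, while `dynamicName()` reaches the override.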

abrachet mentioned this in D65255: [yaml2obj] Move core yaml2obj code into lib and include for use in unit tests. Jul 26 2019, 1:58 PM

abrachet added a child revision: D65367: [Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 2]. Jul 27 2019, 7:24 AM

jhenderson added inline comments. Jul 29 2019, 8:25 AM

llvm/include/llvm/Object/ELFObjectFile.h
804	Are you sure you want to be calling the base-class method though? Isn't that going to go wrong if Sec is a modified or added section?
llvm/include/llvm/Object/MutableELFObject.h
21	I've been thinking about how this class is used by its clients. To me, it doesn't seem great that clients have to know whether a given member is new or not (i.e. whether it is stored in NewValues etc). I think it's okay to provide makeMutable to get a mutable reference to one of the original members, but I also feel like getNew should not be in the public interface. Instead, you should provide a function that can take a key or something to look things up with, possibly a MappingType, and return the corresponding item as a const variable.
25	New -> IsNew. This is a) easier to read, and b) going to play nicer with the upcoming variable name changes.
llvm/unittests/Object/MutableELFObjectTest.cpp
22	sections -> section's
141	You probably also need to show that if makeMutable has been called, calls to the various functions in the interface reference the mutable version. Currently, you only do this for the section name.

abrachet marked 5 inline comments as done. Jul 29 2019, 9:40 AM

abrachet added inline comments.

llvm/include/llvm/Object/ELFObjectFile.h
804	When these functions are called, they get called with a DataRef pointing to the Elf_Shdr whether they are new or not. I was segfaulting here before because in `ELFObjectFile` the DataRefs point to the Elf_Shdr but in `MutableELFObject` they are indexes into the `MutableRange`. I couldn't find a way for the DataRefs to be interpreted the same way between the two classes. This isn't ideal though.
llvm/include/llvm/Object/MutableELFObject.h
21	I had something like this originally I think, and I agree that right now it isn't very ergonomic. The problem is that the original and new entries are of completely different types. I haven't found a great way to do this. What I don't do right now, but might be better, is require that the new type, T, have a conversion operator. But no matter what, there needs to be a public method to get a pointer to the new type. This would require that MutableRange also have a template parameter for the original type it is pointing to (unless the conversion was to uintptr_t, I suppose).

Addressed review comments

abrachet marked 4 inline comments as done. Jul 29 2019, 10:42 PM

rupprecht added inline comments. Jul 30 2019, 2:19 PM

llvm/include/llvm/Object/MutableELFObject.h
27	nit: it's simpler if all the params (Ptr/IsNew) are consistently in the same order (for class members, constructor arg order, and member initializers)
60–61	These should have an assertion (in debug mode) that `Index` is in range
67	`Mapping.Ptr` is already a `uintptr_t`, is the `reinterpret_cast` necessary?
90	`std::memcpy`
111–112	Is `toDataRef(reinterpret_cast<uintptr_t>(...))` redundant? (Converting from pointer -> uintptr_t -> pointer)
135	I think this is a potentially-invalid reference, as `B` is `std::move`-d for the previous member. It looks like this works because the move constructor for `ELFObjectFile` is implemented as a copy constructor, although that is unusual (a typical implementation would swap() members):
  template <class ELFT>
  ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other)
      : ELFObjectFile(Other.Data, Other.EF, Other.DotDynSymSec,
                      Other.DotSymtabSec, Other.ShndxTable) {}
I'm curious if you can test this theory by "fixing" the move constructor to be something like:
  template <class ELFT>
  ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other) {
    using std::swap;
    swap(Data, Other.Data);
    swap(EF, Other.EF);
    swap(DotDynSymSec, Other.DotDynSymSec);
    swap(DotSymtabSec, Other.DotSymtabSec);
    swap(ShndxTable, Other.ShndxTable);
  }
(no need to check it in though). I think you can just use `section_begin()/section_end()` directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later)
136	What's being captured by `[&]` here?
llvm/unittests/Object/MutableELFObjectTest.cpp
48	This should check the error code too
49	Can you avoid the copy here, and just return `StringRef`?
109	nit: `ZeroData` is kind of misleading since this is not `0`, but rather an ASCII `'0'`, i.e. this is {48, 48, 48, 48}. At any rate, it doesn't matter that it's "zero", any random data works. Testing against something non-zero is probably healthier for tests too. Maybe just rename it to something else and use different bytes (e.g. to test byte ordering)
169–170	This macro seems small enough that it should just be typed out
191–192	You can put both `Iter` and `End` in the for header as described here: http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop
  for (auto Iter = MutableObject.section_begin(), End = MutableObject.section_end(); Iter != End; ++Iter)
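
To make the move-constructor point from the comment on line 135 concrete, here is a small sketch with two toy types. Both are hypothetical and only model the behaviour being discussed: one "moves" by copying, so the source stays intact, and the other steals the contents as a conventional move constructor would.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy type whose move constructor just copies, like the ELFObjectFile
// constructor quoted above: the source object is left untouched.
struct CopiesOnMove {
  std::string Data;
  explicit CopiesOnMove(std::string D) : Data(std::move(D)) {}
  CopiesOnMove(CopiesOnMove &&Other) : Data(Other.Data) {}
};

// Toy type with a conventional move constructor that steals the contents.
struct StealsOnMove {
  std::string Data;
  explicit StealsOnMove(std::string D) : Data(std::move(D)) {}
  StealsOnMove(StealsOnMove &&Other) : Data(std::move(Other.Data)) {}
};
```

Code that keeps reading from a moved-from `CopiesOnMove` happens to work, but the same code against `StealsOnMove` would see a valid but unspecified string. That is why relying on the copying behaviour is fragile.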

(taking this out-of-line due to length)

The problem is that the original and new entries are of completely different types.

What bit is this referring to? The MutableRange is for wrapping a container (vector) of Ts, and the new values are stored as Ts.

I don't want to need to know whether I'm dealing with a new or original member at any point outside the class itself. Indeed, if I can hide the MappingType completely from the public interface, that's even better. If I'm looking up an item, I want to have an index/key into the range, which transparently works on fetching the new or old member depending on where it is in the range. That will require some plumbing to know which container (the underlying one or the "new" one) the index is referring to.

What about the following?

template <typename T>
class MutableRange {
  ArrayRef<T> Originals;
  // Maps an index to its modified value, or None while it is unmodified.
  std::map<size_t, Optional<T>> Mapping;
  using value_type = typename std::map<size_t, Optional<T>>::value_type;

public:
  MutableRange(ArrayRef<T> Container) : Originals(Container) {
    for (size_t I = 0, Size = Container.size(); I != Size; ++I)
      Mapping.insert(std::make_pair(I, None));
  }

  // New members always carry their own value.
  void push_back(T t) {
    Mapping.insert(value_type(Mapping.size(), std::move(t)));
  }

  const T &operator[](size_t Index) const {
    auto It = Mapping.find(Index);
    assert(It != Mapping.end() && "Index out of range");
    return It->second ? *It->second : Originals[Index];
  }

  // Copies the original into the map on first use, then returns the copy.
  T &getMutable(size_t Index) {
    auto It = Mapping.find(Index);
    assert(It != Mapping.end() && "Index out of range");
    if (!It->second)
      It->second = Originals[Index];
    return *It->second;
  }

  void remove(T t) {
    size_t Removed = 0;
    const size_t Size = Mapping.size();
    for (size_t I = 0; I != Size; ++I) {
      const T &Value = Mapping[I] ? *Mapping[I] : Originals[I];
      if (Value == t) {
        ++Removed;
        continue;
      }
      // Shift survivors down. None means "read Originals at this index", so
      // a shifted entry has to carry its value with it.
      if (Removed != 0)
        Mapping[I - Removed] = Value;
    }
    // Drop the now-unused keys at the end.
    for (size_t I = Size - Removed; I != Size; ++I)
      Mapping.erase(I);
  }
};

I think this satisfies the requirements of the container: it provides access to the underlying container member, until it gets modified, even if members before it have been removed. It also provides a way to get a modifiable version. You could even use a different key instead of size_t e.g. a pointer, by keeping a record of the mapping between pointer and index, though that is a little trickier. If you use a pointer, you could replace the Optional with a std::unique_ptr. In turn, that might be enough to avoid co-opting the DataRefImpl behaviour so that the base class behaviour will just work (because the pointer will point to a real section, just the real section is stored in this map).

Does this work?

llvm/include/llvm/Object/ELFObjectFile.h
804	Oh, I realised I was mistaken as to which class we are in. I really don't like this change here at all. The base class shouldn't need to know that it has to call the base class version of the method. It should just work. Effectively the design here now means the base class has to know that there's a subclass that might change the meaning of some of its functions under-the-hood, and guard against that. The solution might be to make getSection a virtual method and override it in your sub-class. What do you think of that?
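
The suggestion above (one virtual lookup that the subclass overrides, with everything else in the base class left alone) might look roughly like the sketch below. This is a toy model under stated assumptions: `Header`, `Reader` and `MutableReader` are hypothetical names and the real `getSection` signature is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy model; Header, Reader and MutableReader are illustrative names.
struct Header {
  unsigned Size;
};

class Reader {
  std::vector<Header> Originals;

protected:
  // The one customisation point: how an index maps to a header.
  virtual const Header &getHeader(std::size_t Index) const {
    return Originals.at(Index);
  }

public:
  explicit Reader(std::vector<Header> H) : Originals(std::move(H)) {}
  virtual ~Reader() = default;
  // Written once in the base class, unaware of any subclass.
  unsigned getSize(std::size_t Index) const { return getHeader(Index).Size; }
};

class MutableReader : public Reader {
  std::map<std::size_t, Header> Modified; // Only the headers that changed.

protected:
  const Header &getHeader(std::size_t Index) const override {
    auto It = Modified.find(Index);
    return It != Modified.end() ? It->second : Reader::getHeader(Index);
  }

public:
  using Reader::Reader;
  void setSize(std::size_t Index, unsigned Size) {
    // Copy the original on first modification, then edit the copy.
    auto It = Modified.emplace(Index, Reader::getHeader(Index)).first;
    It->second.Size = Size;
  }
};
```

With this shape, `Reader::getSize` never needs to know that a subclass exists. It picks up modified headers through the single overridden hook.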

(won't quote because of length but I'll tag you so you know I am responding) @jhenderson
That does work, and was what I was thinking of before. Also I quite like keeping pointers so it just works with the methods in the base class. But I'm still not sure what type T should be. If we use sections as an example, when changing the name it isn't possible to describe the change with a plain Elf_Shdr (or at least not feasible: I suppose sh_name could be a pointer into the process's memory rather than an offset into the string table for modified sections). Right now, the original type is just an Elf_Shdr, or rather the mapping type points to one, but for a modified section, it indexes into the vector where to find the T, which is a wrapper around an Elf_Shdr but, for example, for its name it has a StringRef so it can be easily changed. This is how llvm-objcopy does it, but it copies every single section whether or not they will ever be modified. I agree that it is ugly for the users of the class to ever have to deal with the MappingType; it isn't ideal, but I don't know how to do this unless we copy every single one like is done in llvm-objcopy, and at that point I would suggest the Section type just have an Elf_Shdr * to the original and just keep a vector of Sections.

Something I played around with before was having a template parameter for the original type and then having methods which return OrigType & and require that T have a conversion function to `OrigType`. This would be cleaner for the user but also avoid copying every original type into the wrapper type T. This is all moot if you don't think that copying is that big of a bottleneck; @MaskRay mentioned this. After all, this is how it's done for sections in llvm-objcopy. But I am still shooting for being able to use this class with symbols too, so there may be more concern about this there.

Addressed review comments.

Fixed silly mistake

Again responding to your comment @jhenderson but not quoting due to length.

I'll try to explain MutableRange further and why certain decisions were made; on rereading my previous response, I don't think I explained it well. I fear the writing is on the wall for the current implementation, because if I need to explain it, it isn't written very well and won't be maintainable!

Again for convenience I'll talk about sections. When it is initially constructed, the std::vector<T> NewValues has a size of 0, and the std::vector<MappingType> Mappings has a size equal to the number of sections in the original file. The MappingTypes will look like this: {.IsNew = false, .Ptr = <pointer into the file>}, but on modification that mapping looks like {.IsNew = true, .Ptr = <index into NewValues>}. This means that for unmodified sections there is an overhead of sizeof(MappingType), not sizeof(T), so doing modifications on only some sections should be faster, or at least there is less copying and it uses less memory.

Of course there's no arguing that it leads to a pretty ugly solution (one that obviously causes a lot of confusion). To get to the Elf_Shdr* from a new section there is a lot of indirection: DataRef => Mapping => NewValues => Elf_Shdr. And then for an unmodified one it is DataRef => Mapping => Elf_Shdr.

I still think that modified sections cannot be just an Elf_Shdr and there isn't a clean way to hold either an Elf_Shdr or a MutableELFSection. I had played around with a union, which didn't work because MutableELFSection had a non-trivial copy constructor, but this still requires that the users of MutableRange know to distinguish between the original type and the modified type. I will also say that although it is ugly, and it would be ideal if what you call the plumbing was inside of MutableRange, that plumbing exists in MutableELFObject, but it isn't exposed outside of the class. MutableELFObject abstracts this and works the same way as its predecessors.
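
The mapping scheme described above can be modelled in a few lines. This is a hypothetical sketch, not the patch's MutableRange: `OrigSection` stands in for an Elf_Shdr inside the file, `NewSection` for the richer wrapper type T, and `SectionTable` for the range itself. The original vector must outlive the table, since unmodified entries only store its addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of the scheme described above.
struct OrigSection { // Stands in for an Elf_Shdr inside the file.
  const char *Name;
};
struct NewSection { // Stands in for the richer wrapper type T.
  std::string Name;
};

struct MappingType {
  std::uintptr_t Ptr; // Address of an original, or index into NewValues.
  bool IsNew;
};

class SectionTable {
  std::vector<MappingType> Mappings;
  std::vector<NewSection> NewValues;

public:
  explicit SectionTable(const std::vector<OrigSection> &File) {
    // Unmodified entries cost one MappingType each; nothing is copied.
    for (const OrigSection &S : File)
      Mappings.push_back({reinterpret_cast<std::uintptr_t>(&S), false});
  }

  // Copies the original into NewValues on first request only. The returned
  // reference is invalidated if a later call grows NewValues.
  NewSection &makeMutable(std::size_t Index) {
    MappingType &M = Mappings.at(Index);
    if (!M.IsNew) {
      const auto *Orig = reinterpret_cast<const OrigSection *>(M.Ptr);
      NewValues.push_back({Orig->Name});
      M = {static_cast<std::uintptr_t>(NewValues.size() - 1), true};
    }
    return NewValues[M.Ptr];
  }

  std::string getName(std::size_t Index) const {
    const MappingType &M = Mappings.at(Index);
    if (M.IsNew)
      return NewValues[M.Ptr].Name;
    return reinterpret_cast<const OrigSection *>(M.Ptr)->Name;
  }

  std::size_t numCopied() const { return NewValues.size(); }
};
```

The two branches in `getName` are the two indirection chains listed above: through `NewValues` for modified entries, and straight to the original otherwise.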

llvm/include/llvm/Object/ELFObjectFile.h
804	"What do you think of that?" I think that saves so much boilerplate code! I can't believe I didn't see this earlier, thank you. Also, I remember @jakehehrlich saying that how I was previously doing it was not good; this is much better now. I now only have to override `moveSectionNext`, `getSectionName`, `getSectionIndex` and `getSectionContents`.
llvm/include/llvm/Object/MutableELFObject.h
135	I couldn't test whether it breaks with your example of swapping members, because `ELFObjectFile` has no default constructor, so all I could use were Other's members to initialize, and then the swap is just swapping two equal values. "I think you can just use section_begin()/section_end() directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later)" This didn't work for me; my program never exited (I gave it ~10 seconds, assuming a segfault was coming). I am still using `B.section_begin()/end()`. I also tried `ELFObjectFile<ELFT>::section_begin()`, and that compiled but caused a crash. It tried to create the OwningArrayRef with nonsense data:
  `OwningArrayRef(this=0x0000000106006280, Size=72057594037927936)`
  ObjectTests(66699,0x10c97d5c0) malloc: can't allocate region
  * mach_vm_map(size=72057594037927936) failed (error code=3)
  ObjectTests(66699,0x10c97d5c0) malloc: * set a breakpoint in malloc_error_break to debug
  libc++abi.dylib: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
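
One possible explanation for the behaviour reported above, offered as an assumption rather than a diagnosis of the patch: in C++, a virtual call made from a derived class's member initializers already resolves to the derived override, because the base subobject is complete by then. Only calls made while the base constructor itself is running see the base version. The toy classes below are illustrative.

```cpp
#include <cassert>
#include <string>

struct Base {
  std::string SeenInBaseCtor;
  // While Base is being constructed, the derived part does not exist yet,
  // so this call resolves to Base::which().
  Base() : SeenInBaseCtor(which()) {}
  virtual ~Base() = default;
  virtual std::string which() const { return "Base"; }
};

struct Derived : Base {
  std::string SeenInMemberInit;
  // Bases are already constructed here, so the call resolves to
  // Derived::which(), even though later members are not yet initialised.
  Derived() : SeenInMemberInit(which()) {}
  std::string which() const override { return "Derived"; }
};
```

If the derived override reads members that have not been initialised yet, the result is garbage. That would be consistent with the nonsense size in the log, though the thread does not confirm the cause.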

lebedev.ri resigned from this revision. Aug 1 2019, 3:11 PM

Complete overhaul of how MutableRange works. It is now called MutableTable which seemed more fitting. It has a much cleaner interface for MutableELFObject while maintaining low overhead.

Fixed the reason for needing to make a member mutable

Since you're introducing a new class in a public header, you should write doxygen comments for at least the public/protected interface, and quite possibly any private bits too. I think if you are careful with those comments too, it will help explain the design of the classes.

I've not looked at the unit tests today. I'll do that tomorrow. The interface looks a lot cleaner than it did when I last looked at this. Hopefully adding comments will make it much clearer what's going on. If you make MutableTable a sub-class, then my concerns about the leaky implementation (i.e. needing to decide between fetching the section properties from the table or the base version) are significantly lessened.

llvm/include/llvm/Object/MutableELFObject.h
20	Perhaps this should be nested inside MutableELFObject for now, since it is effectively an implementation detail? Obviously, it would move to MutableObject if that class ever exists.
56	For what do you envisage this function being useful?
62	How about having this return a reference, rather than a pointer? That would seem to be a more natural fit.
71	What is this function useful for? Returning an Optional would be a nicer interface than returning a nullptr, IMO.
92	Does this function need to exist as a free function, or can it be a private method of MutableELFObject?
122	Do you need this to be a friend, or can you just pass in another argument or two (if needed) into the constructor of MutableELFSection?
128	protected? Does this class have any subclasses?
144	Use `explicit` here. Should this be an r-value reference? What do you think?
161–162	You're going to need to do something with the Error inside Expected here. This function should probably return an Expected too, so that you can pass that error up to a higher level to be handled.
169	What is this function for?

Addressed review comments

In D64281#1617212, @jhenderson wrote:

Since you're introducing a new class in a public header, you should write doxygen comments for at least the public/protected interface, and quite possibly any private bits too. I think if you are careful with those comments too, it will help explain the design of the classes.

I've not looked at the unit tests today. I'll do that tomorrow. The interface looks a lot cleaner than it did when I last looked at this. Hopefully adding comments will make it much clearer what's going on. If you make MutableTable a sub-class, then my concerns about the leaky implementation (i.e. needing to decide between fetching the section properties from the table or the base version) are significantly lessened.

I'm still working on the comments for the private interface although they are coming. I do have just the one for the one public method not inherited from ObjectFile.

llvm/include/llvm/Object/MutableELFObject.h
56	This was referencing `getOriginal` before. In symbols for example, the top half of the `DataRef` is the section index of the symbol table. In that case I use `getOriginal` to see its sh_type. It wouldn't make sense to use the updated section list in this case I think.
71	This was referencing `getIfNew` previously. It's for methods which could not use just the OrigType to do their job properly. For example, `getSectionName` returns the `Name` member when the section has been modified because there is no way to describe the change to `ELFObjectFile` through its `sh_name` field. `Optional` does not work with references so far as I can tell. Is this correct? A copy would be too costly for `MutableELFSections` which have an `OwningArrayRef` and `std::string`.
144	Sounds fine to me. In the unit tests I move the ELFObjectFile but then continue to use it because as @rupprecht pointed out previously the move constructor is just a copy constructor. Is it safe to do this or should I explicitly copy?
169	It advances the `section_iterator`. `MutableELFObject` uses a different iterator to that of `ELFObjectFile`, which I believe would have a `moveSectionNext` that did something like `Sec.p += sizeof(Elf_Shdr)`, as those just point directly to the section header.

Realized from the code example in the doxygen comment that the interface was very unergonomic, so I added a better overload to the public methods which takes a SectionRef, so they can be used in for-each loops.

jhenderson added inline comments. Aug 7 2019, 8:50 AM

llvm/include/llvm/Object/MutableELFObject.h
71	You're right about `Optional`, sorry for the noise. `getSectionName` currently calls `getConstIfNew`, so this function doesn't need to exist, right?
118	Does this need to exist? If so, it needs unit-testing.
122	Does this need to exist? If so, it needs unit-testing.
144	"Is it safe to do this or should I explicitly copy?" This doesn't sound safe to me: you need to copy or simply not use the object after it's been moved. a) I wouldn't be surprised if this is technically undefined behaviour and the compiler makes assumptions about the state of the instance after the move. b) It's fragile to future changes.
169	At the moment, it doesn't look like it's used, so it shouldn't exist. Re-add it when it is needed. However, perhaps a safer thing to do is define your own iterator class, as @jakehehrlich said, that does this internally.
llvm/unittests/Object/MutableELFObjectTest.cpp
22–23	I'd rephrase the second half of this sentence as "methods work the same in both ELFObjectFile and MutableELFObject".
44	Isn't the solution here to just make a second local variable that's a pointer to an ELFObjectFile constructed from MutableObject? `ELFObjectFile *ElfObject = &MutableObject;`
47	ObjFileSecCount?
49	MutObjSecCount?
51	This should be ASSERT_EQ, since the rest of the test assumes there's the same number.
79	IIRC, this will still abort if there is an error, since the error itself hasn't been handled. The correct thing to do would be to a) check there's no Error, and then b) call consumeError().
115	You need to assert somewhere that NumSections is as expected.
126	You need to show the iterator is still valid after this, by checking std::distance.
193	I'm pretty sure you don't need the UL suffix here. Same below.

Aargh, you pushed your latest update whilst I was doing my last review, so I can't see what are the latest changes!

added a better overload to public methods which take a SectionRef so they can be used in for-each loops.

What is this bit referring to?

jhenderson added inline comments. Aug 7 2019, 8:55 AM

llvm/include/llvm/Object/MutableELFObject.h
160	Probably this should be handleError(MutSecOrErr.takeError());

Addressed review comments

abrachet marked 12 inline comments as done. Aug 7 2019, 9:59 AM

abrachet added inline comments.

llvm/include/llvm/Object/MutableELFObject.h
118	Yes it is used and thus tested implicitly. How would I go about testing the private interface? I could make it protected and then inherit from it in the unit test, but that doesn't sound great. FWIW this is almost identical to the `getHeader` method in `ELFFile`. https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Object/ELF.h#L108
122	Yes. It is used for every public method on `SectionRef` that gets tested in the unit test. Is this enough or should I test it directly as well?
144	I did an explicit copy /I think/ in the unit test now. I have an `ELFObjectFile *` and then dereference it and assign to an `ObjectFile &`. Does this make a copy or does the compiler omit it and just assign the address?
169	This was referring to `moveSectionNext()` before. It does get used; it is an override of the function that `content_iterator::operator++()` calls. It must be overridden because `ELFObjectFile::moveSectionNext()` iterates over sections not with their index but their mapped address. This uses an index into `MutableTable`.
llvm/unittests/Object/MutableELFObjectTest.cpp
44	I don't think so. Then both will be `MutableELFObject`s and the same virtual methods will be called for each, I believe. I think testing the methods on an `ELFObjectFile` and a `MutableELFObject` tests that the overrides in `MutableELFObject` behave the same as the methods in `ELFObjectFile` when the sections haven't been changed.
79	Thanks!
115	I have `ASSERT_EQ(NumSections, 7);` it ended up being 7 sections because of additional sections that `yaml2obj` added. Is this what you meant?
193	I don't actually need the L it turns out. Without making it unsigned I get this warning `comparison of integers of different signs` on clang.

In D64281#1619189, @jhenderson wrote:

Aargh, you pushed your latest update whilst I was doing my last review, so I can't see what are the latest changes!

added a better overload to public methods which take a SectionRef so they can be used in for-each loops.

What is this bit referring to?

I added getMutableSection(SectionRef) and changed getMutableSection(section_iterator Sec) to just call that with a *Sec. Maybe I should remove the section_iterator overload though?

labath added inline comments. Aug 7 2019, 11:50 AM

llvm/include/llvm/Object/MutableELFObject.h
144	Kind of the whole point of references is that they allow you to "refer" to other objects without making copies of them. If you want to make a copy, drop the `&` and assign it to an `ObjectFile` (or `ELFObjectFile`, or something)
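
A minimal illustration of the point above about references, using a toy `Object` type that is purely hypothetical: binding a reference creates an alias, while dropping the `&` makes an independent copy.

```cpp
#include <cassert>

struct Object { // Illustrative stand-in for an object file class.
  int Value;
};

// Binding a reference creates an alias: writes through it change the
// original object.
inline int viaReference(Object &Orig) {
  Object &Alias = Orig;
  Alias.Value = 2;
  return Orig.Value;
}

// Dropping the `&` makes an independent copy, so the original is untouched.
inline int viaCopy(Object &Orig) {
  Object Copy = Orig;
  Copy.Value = 3;
  return Orig.Value;
}
```

This only works for types that are copyable. As noted elsewhere in the thread, ObjectFile has no copy constructor, which is why the unit test constructs a second object from the same YAML instead.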

Create a new ObjectFile in unit test

abrachet marked 2 inline comments as done. Aug 7 2019, 1:44 PM

abrachet added inline comments.

llvm/include/llvm/Object/MutableELFObject.h
144	Oh yeah. What am I talking about!

Added doxygen comments to private interface. NFC

jhenderson added inline comments. Aug 8 2019, 6:42 AM

llvm/include/llvm/Object/MutableELFObject.h
21	Comment?
44	How about a comment for MutableELFObject?
87	"regardless if it" sounds off to me. How about "regardless of whether it"
118	Sorry, missed that getHeader was private. Being able to modify the header contents is a required thing however, so you might want to consider how this could be done (possibly a different change though).
122	I have no idea which method this was referring to any more...
141	overrie -> override
142	wether -> whether
169	Sorry, I missed that `moveSectionNext()` is an override.
llvm/unittests/Object/MutableELFObjectTest.cpp
55	Rather than reusing the same variable, create a new variable here. Then don't bother with the ObjFile variable above. It doesn't give us anything.
89	Don't use `auto` here.
115	Yes, that's right.
142	EXPECT_EQ -> ASSERT_EQ to avoid dereferencing an invalid iterator later on. I think I misread the code last time, as I don't know why I wanted this check here. However, you do show that the std::distance from section_begin to section_end is still correct after you called getMutableSection, but not until after you've done the iteration, meaning that your iterator could be invalid part way through your calls to compareSectionName. Move the later std::distance check up to here instead.
193	Sounds like clang isn't doing a good enough job here - it's a spurious warning, because 2 is a positive integer literal so can clearly be handled properly in this comparison. Check elsewhere, but I thought it was more common in the code base to use lower-case for literal suffixes when they're needed.

abrachet marked 2 inline comments as done. Aug 8 2019, 6:52 AM

abrachet added inline comments.

llvm/include/llvm/Object/MutableELFObject.h
122	Oops, I think I responded to this before adding the comments, so I forgot to add what the comment was originally referring to. It was `getSection(DataRefImpl)`. It's the function that I made virtual from `ELFObjectFile`. All the tests that I have now, like `FirstSec->getSize()`, test this method.

Addressed review comments

abrachet marked 17 inline comments as done. Aug 8 2019, 11:28 PM

abrachet added inline comments.

llvm/include/llvm/Object/MutableELFObject.h
21	Not really sure what to say. Is that enough, do you think?
44	Same as above, not quite sure what to say.
llvm/unittests/Object/MutableELFObjectTest.cpp
55	I think that it does give something. The `ObjectFile`s are made from the same YAML, so I'm testing that the public methods return the same values. It's a whole thing because ObjectFile has no copy constructor. Also we don't want to use the ObjectFile after moving it. I also still think that doing something like this
  MutableELFObject<ELF64LE> MutableObject(std::move(MutELFObjFile));
  ELFObjectFile<ELF64LE> *ElfObject = &MutableObject;
  // test the interface
will never produce any differences, so it doesn't test that MutableELFObject has the same behavior as ELFObjectFile on an unmodified object file.
142	Is this what you mean? And then remove the one on 156? Sorry wasn't quite sure.
193	FWIW I think there's a function template instantiation somewhere in these macros, and so it's doing `unsigned_expr == signed_expr`, not `unsigned_expr == 2`.
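
The theory above about the sign-compare warning can be sketched as follows. `isEqual` is a simplified stand-in for a comparison helper behind a test macro. Its shape is an assumption for illustration and not googletest's actual code, and `hasTwo` is likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a comparison helper behind a test macro; this is
// an assumption about the macro's shape, not googletest's actual code.
template <typename A, typename B> bool isEqual(const A &Lhs, const B &Rhs) {
  // With A = size_t and B = int, a sign-compare warning may be issued
  // here, because the compiler only sees two typed references.
  return Lhs == Rhs;
}

inline bool hasTwo(const std::vector<int> &V) {
  // Passing `2` would deduce B = int; inside isEqual the compiler no longer
  // knows it was a literal. An unsigned literal keeps both sides unsigned.
  return isEqual(V.size(), 2u);
}
```

Written directly as `V.size() == 2`, the compiler can see the literal is non-negative, which is why the warning only appears once the comparison goes through a template.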

jhenderson added inline comments. Aug 9 2019, 8:43 AM

llvm/include/llvm/Object/MutableELFObject.h
21	Why would you use this class? What is the overall aim here? It probably wants to say something like it is used by MutableELFObject to provide a mutable version of an ELFObjectFile Section or something to that effect.
44	I think this comment needs to explain why this class exists. Essentially it should say why we can't use ELFObjectFile directly for our purposes.
llvm/unittests/Object/MutableELFObjectTest.cpp
55	I think you misunderstood. Having `MutELFObjFile` is fine, as is constructing two objects from the YAML. `ObjFile` doesn't need to exist however. You can use `ELFObjFile` directly now that you're not reusing the local variable for the creation of `MutableObject`, so the two variables you operate on are `ELFObjFile` (created directly from YAML) and `MutableObject` (created indirectly from YAML via MutELFObjFile).
142	Sorry, I obviously wasn't clear enough, and I realised something else whilst writing this comment. At the time of writing, the std::distance at line 127 (checks the section count before mutating), and the one at line 142 (checks it after mutating) are the ones that should exist. The one at line 155 is unnecessary, since you check it at line 142. The one at line 140 is necessary because it shows that the Iter variable is still valid after getting a mutable section.

Addressed review comments

abrachet marked 6 inline comments as done. Aug 9 2019, 9:26 PM

Aside from a comment change, this looks good to me. I'd wait for others to approve too though, and I wouldn't necessarily land it even then until some later patches are ready too, although I'm not sure about that.

llvm/include/llvm/Object/MutableELFObject.h
46	I'd change this slightly: ... an ELFObjectFile. The ELFObjectFile class only provides ...

This revision is now accepted and ready to land. Aug 12 2019, 9:02 AM

rupprecht added inline comments. Aug 12 2019, 2:27 PM

llvm/include/llvm/Object/MutableELFObject.h
24	You should be able to use the more common: using Elf_Shdr = typename ELFT::Shdr;
82–85	"OriginalValues" is overloaded, so it's not clear which one is being referenced in the constructor. A common hacky practice in LLVM is to name the constructor arg "TheOriginalValues" or something -- that's fine to use if you can't come up with any other different name.
135–136	ditto
llvm/unittests/Object/MutableELFObjectTest.cpp
89–90	Is it expected that an error here should not cause the test to fail? I think you want an assertion that Expect doesn't have an error instead.

Addressed review comments

abrachet marked 5 inline comments as done. Aug 12 2019, 4:33 PM

Latest changes LGTM.

rupprecht accepted this revision. Aug 15 2019, 11:16 AM

MaskRay added inline comments. Aug 19 2019, 10:13 PM

llvm/include/llvm/Object/MutableELFObject.h
85	`llvm::transform(OriginalValues, ...)`
114	I think such `Out of bounds` assertions are probably not very necessary. They will certainly segfault if the user does have out-of-bounds accesses.
117	Mappings[Index] = MappingType(NewValues.size(), MappingType::New); NewValues.emplace_back(std::forward<Args>(Arguments)...);
151	typo: overridden `the section_iterators in MutableELFObject` -> `MutableELFObject::section_begin` ? (Just name it explicitly)
185	`for (SectionRef Sec : MutObj.sections())` SectionRef has just 2 words. It is usually passed by value.
188	if (auto MutSecOrErr = MutObj.getMutableSection(Sec)) then; else handleError(...);
190	typo: sh_addralign
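
The suggestion for line 117 is easier to follow with the whole copy-on-write idea in view. What follows is a minimal, self-contained sketch under stated assumptions: `CowTable`, `Header` and `RichSection` are hypothetical names, it uses `std::vector` instead of `ArrayRef`, and it is not the class from the patch. Unmodified entries stay in the original storage and cost only a mapping entry. The first call to `makeMutable` for an index appends a richer value and redirects the mapping to it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified copy-on-write table (not the patch's MutableTable).
template <typename OrigType, typename NewType> class CowTable {
  struct Mapping {
    std::size_t Index; // Into Originals or NewValues, depending on IsNew.
    bool IsNew;
  };

  const std::vector<OrigType> &Originals; // Never modified.
  std::vector<NewType> NewValues;         // Holds only modified entries.
  std::vector<Mapping> Mappings;

public:
  explicit CowTable(const std::vector<OrigType> &TheOriginals)
      : Originals(TheOriginals) {
    for (std::size_t I = 0; I != Originals.size(); ++I)
      Mappings.push_back({I, false});
  }

  // Read access works the same for original and modified entries. NewType
  // must provide `operator const OrigType &() const`.
  const OrigType &operator[](std::size_t Index) const {
    assert(Index < Mappings.size() && "Out of bounds");
    const Mapping &M = Mappings[Index];
    if (M.IsNew)
      return static_cast<const OrigType &>(NewValues[M.Index]);
    return Originals[M.Index];
  }

  // On first use for an index, construct a NewType from Arguments and point
  // the mapping at it. The returned reference is invalidated by a later
  // makeMutable call that grows NewValues.
  template <typename... Args>
  NewType &makeMutable(std::size_t Index, Args &&...Arguments) {
    assert(Index < Mappings.size() && "Out of bounds");
    if (!Mappings[Index].IsNew) {
      Mappings[Index] = Mapping{NewValues.size(), true};
      NewValues.emplace_back(std::forward<Args>(Arguments)...);
    }
    return NewValues[Mappings[Index].Index];
  }
};

// Hypothetical entry types used only to exercise the table.
struct Header {
  unsigned Size;
};

struct RichSection {
  Header Hdr;
  std::string Name;
  RichSection(const Header &H, std::string N) : Hdr(H), Name(std::move(N)) {}
  operator const Header &() const { return Hdr; }
};
```

A caller would construct `CowTable<Header, RichSection>` over the original headers, read through `operator[]`, and call `makeMutable(Index, OriginalHeader, "name")` only for the entries it intends to change. The original storage is left untouched.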

MaskRay mentioned this in D65633: [Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 3]. Aug 19 2019, 10:42 PM

MaskRay mentioned this in D66063: [Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 5]. Aug 19 2019, 10:58 PM

MaskRay mentioned this in D66070: [Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 6]. Aug 19 2019, 11:03 PM

jhenderson added inline comments. Aug 20 2019, 2:34 AM

llvm/include/llvm/Object/MutableELFObject.h
114	Won't that entirely depend on how the memory is laid out, compiler-generated checks etc? A segmentation fault hardly provides any useful information, whereas at least the assertion shows what's going wrong in builds with assertions enabled...

MaskRay added inline comments. Aug 20 2019, 2:48 AM

llvm/include/llvm/Object/MutableELFObject.h
114	I think if the user of this library makes an out-of-bounds access, the program will very quickly segfault with a clear stack trace (symbolization can happen with a symbol table, even if debug info is absent) that the problem is related to the use of MutableELFObject. There are lots of ways to detect errors: -D_GLIBCXX_DEBUG / asan / ... We can add some assert() in some tricky places (they also serve as documentation of some invariants) but here I think it is probably not very necessary.

rupprecht added inline comments. Aug 20 2019, 12:49 PM

llvm/include/llvm/Object/MutableELFObject.h
114	"I think if the user of this library makes an out-of-bounds access, the program will very quickly segfault with a clear stack trace." Having dealt with many prod crashes in the past, I strongly disagree. C++ optimizations can often make the crash happen far away from the actual bug. I think it's much more common to run tools in debug mode than in asan, so it's useful to have these.

FYI, you'll have to rebase this whole patch series after rL368826 due to Section::getName() changing signature

Addressed review comments

In D64281#1638215, @rupprecht wrote:

FYI, you'll have to rebase this whole patch series after rL368826 due to Section::getName() changing signature

I was safe for this patch it seems but not others, thanks for the heads up.

llvm/include/llvm/Object/MutableELFObject.h
114	I haven't removed the assertions in this diff. I think worst case it looks ugly and I agree that they aren't super necessary but they have helped me track bugs down before so I think the benefit outweighs the cost.

Use size_t for indexes not uint64_t

Removed superfluous overloads from this patch not later ones

Latest round of changes LGTM.

I've been thinking about this: is size_t safe to use? Some methods from inherited classes use uint64_t. On my machine it's fine, because even unsigned long and unsigned long long have the same width and sign, but on a 32-bit machine I think that assigning to a size_t (which is guaranteed to be word size, I think) from a uint64_t might produce a warning. I don't think there are any examples in this patch, but this seemed like the best place to ask.

abrachet marked an inline comment as done. Aug 21 2019, 6:36 PM

abrachet added inline comments.

llvm/include/llvm/Object/MutableELFObject.h
147	Actually, here is an example, `MutableTable::operator [](size_t)` being called with `DataRefImpl::p` which is `uint64_t`.

In D64281#1640474, @abrachet wrote:

I've been thinking about this: is size_t safe to use? Some methods from inherited classes use uint64_t. On my machine it's fine, because even unsigned long and unsigned long long have the same width and sign, but on a 32-bit machine I think that assigning to a size_t (which is guaranteed to be word size, I think) from a uint64_t might produce a warning. I don't think there are any examples in this patch, but this seemed like the best place to ask.

You should use size_t and consider casting for cases using uint64_t. In a 32-bit build, size_t is usually a 32-bit number (i.e. unsigned int), so theoretically, the compiler could produce warnings for truncation issues when converting from a uint64_t. Since the main thing these indexes are used for is for accessing a vector, the type should match the argument type of std::vector<T>::operator[], which is (usually) a size_t.
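
One way to follow this advice is to funnel every conversion through a single checked cast. The sketch below is illustrative only: `toIndex` and `elementAt` are hypothetical helpers, not functions from the patch or from LLVM. The explicit `static_cast` silences truncation warnings on 32-bit builds, and the assertion documents the assumption that the value really fits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: convert a 64-bit value (for example one stored in a
// uint64_t handle field) into a container index. On a 32-bit build size_t is
// narrower than uint64_t, so the assertion guards the explicit cast.
inline std::size_t toIndex(std::uint64_t Value) {
  assert(Value <= std::numeric_limits<std::size_t>::max() &&
         "index does not fit in size_t");
  return static_cast<std::size_t>(Value);
}

// The index type now matches std::vector<T>::operator[].
inline int elementAt(const std::vector<int> &Values, std::uint64_t RawIndex) {
  std::size_t Index = toIndex(RawIndex);
  assert(Index < Values.size() && "Out of bounds");
  return Values[Index];
}
```

Keeping the cast in one place also gives a single spot to search for when auditing `uint64_t` to `size_t` conversions.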

In D64281#1640799, @jhenderson wrote:

In D64281#1640474, @abrachet wrote:

I've been thinking about this: is size_t safe to use? Some methods from inherited classes use uint64_t. On my machine it's fine, because even unsigned long and unsigned long long have the same width and sign, but on a 32-bit machine I think that assigning to a size_t (which is guaranteed to be word size, I think) from a uint64_t might produce a warning. I don't think there are any examples in this patch, but this seemed like the best place to ask.

You should use size_t and consider casting for cases using uint64_t. In a 32-bit build, size_t is usually a 32-bit number (i.e. unsigned int), so theoretically, the compiler could produce warnings for truncation issues when converting from a uint64_t. Since the main thing these indexes are used for is for accessing a vector, the type should match the argument type of std::vector<T>::operator[], which is (usually) a size_t.

@MaskRay In the past I remember you finding usages of certain types (or similar) in the code base. Is there some kind of tool you use that could find every place where a function that takes a size_t is called with a uint64_t? Do you use clang-query for these kinds of things?

Revision Contents

Path	Size
llvm/include/llvm/Object/ELFObjectFile.h	2 lines
llvm/include/llvm/Object/MutableELFObject.h	231 lines
llvm/unittests/Object/CMakeLists.txt	2 lines
llvm/unittests/Object/MutableELFObjectTest.cpp	210 lines

Diff 216305

llvm/include/llvm/Object/ELFObjectFile.h

(395 lines not shown)
  public:

    const Elf_Sym *getSymbol(DataRefImpl Sym) const {
      auto Ret = EF.template getEntry<Elf_Sym>(Sym.d.a, Sym.d.b);
      if (!Ret)
        report_fatal_error(errorToErrorCode(Ret.takeError()).message());
      return *Ret;
    }

-   const Elf_Shdr *getSection(DataRefImpl Sec) const {
+   virtual const Elf_Shdr *getSection(DataRefImpl Sec) const {
      return reinterpret_cast<const Elf_Shdr *>(Sec.p);
    }

    basic_symbol_iterator symbol_begin() const override;
    basic_symbol_iterator symbol_end() const override;

    elf_symbol_iterator dynamic_symbol_begin() const;
    elf_symbol_iterator dynamic_symbol_end() const;
(373 lines not shown)
  }

  template <class ELFT>
  bool ELFObjectFile<ELFT>::isSectionVirtual(DataRefImpl Sec) const {
    return getSection(Sec)->sh_type == ELF::SHT_NOBITS;
  }

  template <class ELFT>
  bool ELFObjectFile<ELFT>::isBerkeleyText(DataRefImpl Sec) const {
jhenderson [Done]: I'm not sure I understand why you've made this change from the previous version? I don't think it gives anything.
abrachet (Author) [Done]: Because I didn't know you could do ELFObjectFile<ELFT>::isBerkeleyText() to not call the overridden method! Oops.
    return getSection(Sec)->sh_flags & ELF::SHF_ALLOC &&
           (getSection(Sec)->sh_flags & ELF::SHF_EXECINSTR ||
            !(getSection(Sec)->sh_flags & ELF::SHF_WRITE));
  }

  template <class ELFT>
  bool ELFObjectFile<ELFT>::isBerkeleyData(DataRefImpl Sec) const {
    const Elf_Shdr *EShdr = getSection(Sec);
    return !isBerkeleyText(Sec) && EShdr->sh_type != ELF::SHT_NOBITS &&
           EShdr->sh_flags & ELF::SHF_ALLOC;
jhenderson [Done]: Are you sure you want to be calling the base-class method though? Isn't that going to go wrong if Sec is a modified or added section?
abrachet (Author) [Done]: When these functions are called, they get called with a DataRef pointing to the Elf_Shdr whether they are new or not. I was segfaulting here before because in `ELFObjectFile` the DataRefs point to the Elf_Shdr, but in `MutableELFObject` they are indexes into the `MutableRange`. I couldn't find a way for the DataRefs to be interpreted the same way between the two classes. This isn't ideal though.
jhenderson [Done]: Oh, I realised I was mistaken as to which class we are in. I really don't like this change here at all. The base class shouldn't need to know that it has to call the base class version of the method. It should just work. Effectively the design here now means the base class has to know that there's a subclass that might change the meaning of some of its functions under-the-hood, and guard against that. The solution might be to make getSection a virtual method and override it in your sub-class. What do you think of that?
abrachet (Author) [Done]: "What do you think of that?" I think that saves so much boilerplate code! Can't believe I didn't see this earlier, thank you. Also, I remember @jakehehrlich saying that how I was previously doing it was not good; this is much better now. I now only have to override `moveSectionNext`, `getSectionName`, `getSectionIndex` and `getSectionContents`.
  }

  template <class ELFT>
  relocation_iterator
  ELFObjectFile<ELFT>::section_rel_begin(DataRefImpl Sec) const {
    DataRefImpl RelData;
    auto SectionsOrErr = EF.sections();
    if (!SectionsOrErr)
(395 lines not shown)
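
The design settled on in the inline discussion above (make `getSection` virtual and override it in the subclass) can be sketched in isolation. Everything below is a simplified, hypothetical model: `SectionHeader`, `Handle`, `ReadOnlyObject` and `MutableObject` stand in for `Elf_Shdr`, `DataRefImpl`, `ELFObjectFile` and `MutableELFObject`, and none of it is the patch's code. The base class treats the handle as a pointer, the subclass treats it as an index, and helpers written once in the base class pick up the right interpretation through the virtual call.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for Elf_Shdr and DataRefImpl.
struct SectionHeader {
  std::uint64_t Flags;
};
struct Handle {
  std::uintptr_t P;
};

constexpr std::uint64_t AllocFlag = 0x2; // Mirrors the value of SHF_ALLOC.

class ReadOnlyObject {
public:
  virtual ~ReadOnlyObject() = default;

  // Base interpretation: the handle stores a pointer to the header.
  virtual const SectionHeader *getSection(Handle H) const {
    return reinterpret_cast<const SectionHeader *>(H.P);
  }

  // Written once against getSection. It needs no knowledge of subclasses.
  bool isAlloc(Handle H) const {
    return (getSection(H)->Flags & AllocFlag) != 0;
  }
};

class MutableObject : public ReadOnlyObject {
  std::vector<SectionHeader> Sections;

public:
  explicit MutableObject(std::vector<SectionHeader> TheSections)
      : Sections(std::move(TheSections)) {}

  // Subclass interpretation: the handle stores an index into Sections.
  const SectionHeader *getSection(Handle H) const override {
    assert(H.P < Sections.size() && "Out of bounds");
    return &Sections[H.P];
  }
};
```

With this shape, a query such as `isAlloc` behaves correctly for both classes without the base class having to call a specific version of the method.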

llvm/include/llvm/Object/MutableELFObject.h

This file was added.

				//===-- MutableELFObject.h --------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_OBJECT_MUTABLEELFOBJECT_H
				#define LLVM_OBJECT_MUTABLEELFOBJECT_H

				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/Object/ELFObjectFile.h"

				jhenderson [Done]: These should be prefixed with "llvm/Object/" (see e.g. ELFObjectFile.h for this sort of example).
				namespace llvm {
				namespace object {

				template <typename ELFT> class MutableELFObject;

				/// This class is used by MutableELFObject to provide a mutable version of an
				jhenderson [Done]: Perhaps this should be nested inside MutableELFObject for now, since it is effectively an implementation detail? Obviously, it would move to MutableObject if that class ever exists.
				/// ELFObjectFile Section.
				jhenderson [Done]: I've been thinking about how this class is used by its clients. To me, it doesn't seem great that clients have to know whether a given member is new or not (i.e. whether it is stored in NewValues etc). I think it's okay to provide makeMutable to get a mutable reference to one of the original members, but I also feel like getNew should not be in the public interface. Instead, you should provide a function that can take a key or something to look things up with, possibly a MappingType, and return the corresponding item as a const variable.
				abrachet (Author) [Done]: I had something like this originally I think, and I agree that right now it isn't very ergonomic. The problem is that the original and new entries are of completely different types. I haven't found a great way to do this. What I don't do right now, but might be better, is require that the new type, T, have a conversion operator. But no matter what, there needs to be a public method to get a pointer to the new type. This would require that MutableRange also have a template parameter for the original type it is pointing to (unless the conversion was to uintptr_t, I suppose).
				jhenderson [Done]: Comment?
				abrachet (Author) [Done]: Not really sure what to say, is that enough you think?
				jhenderson [Done]: Why would you use this class? What is the overall aim here? It probably wants to say something like it is used by MutableELFObject to provide a mutable version of an ELFObjectFile Section or something to that effect.
				template <typename ELFT> struct MutableELFSection {
				jhenderson [Done]: Num -> Ptr?
				jhenderson [Done]: In nearly every case I can see, toDataRef is called with a uintptr_t, so do you need the template? What case doesn't work in that case?
				abrachet (Author) [Done]: The `MutableRange<MutableELFSection<ELFT>>::getNew` method returns a pointer to a `MutableELFSection<ELFT>` which is used for getName and getContents which need a pointer to the new section.
				jhenderson [Done]: I might be being blind, but I don't see any reference to getName or getContents in this patch that have anything to do with this function? I assume you meant getSectionName and getSectionContents... How about you do the cast to uintptr_t outside this function? That would seem like the logical choice instead of creating a template with a dubious cast in it.
				using Elf_Shdr = typename ELFT::Shdr;

				rupprecht [Done]: You should be able to use the more common: `using Elf_Shdr = typename ELFT::Shdr;`
				Elf_Shdr Header;
				jhenderson [Done]: New -> IsNew. This is a) easier to read, and b) going to play nicer with the upcoming variable name changes.
				std::string Name;
				OwningArrayRef<uint8_t> Data;
				rupprecht [Done]: nit: it's simpler if all the params (Ptr/IsNew) are consistently in the same order (for class members, constructor arg order, and member initializers)

				grimar [Done]: Since all members are `public`, should this be a `struct`?
				abrachet (Author) [Done]: For some reason I thought I remembered the style guide saying only POD types should be struct and others should be class regardless of access modifiers.
				MutableELFSection(const Elf_Shdr &Header, StringRef Name, void *Base)
				: Header(Header), Name(Name),
				Data(OwningArrayRef<uint8_t>(Header.sh_size)) {
				std::memcpy(Data.data(),
				reinterpret_cast<const uint8_t *>(Base) + Header.sh_offset,
				Header.sh_size);
				}
				jhenderson [Done]: I think a more normal-looking signature would simply be `MutableELFSection(uint8_t *Data, const MutableELFObject<ELF> ObjFile)`
				abrachet (Author) [Done]: I'm not sure I understand why it should be a `uint8_t *`; `ELFObjectFile::getSection` takes a `DataRef`.
				jhenderson [Done]: Right, sorry. I incorrectly assumed that DataRefImpl would store a pointer, since it essentially is a pointer to some data.

				void setData(ArrayRef<uint8_t> Ref) {
				Data = OwningArrayRef<uint8_t>(Ref);
				Header.sh_size = Data.size();
				}
				jhenderson [Done]: setData would be a more appropriate name.

				operator const Elf_Shdr &() const { return Header; }
				};

				jhenderson [Done]: How about a comment for MutableELFObject?
				abrachet (Author) [Done]: Same as above, not quite sure what to say.
				jhenderson [Done]: I think this comment needs to explain why this class exists. Essentially it should say why we can't use ELFObjectFile directly for our purposes.
				/// This class is used for doing mutations on an ELFObjectFile. The
				jhenderson [Done]: Why does this need to exist?
				abrachet (Author) [Done]: This was referring to the friend declaration before changes. Because the constructor uses the base method which is protected.
				/// ELFObjectFile class only provides an interface for reading object files.
				jhenderson [Done]: I'm not sure this gives us anything significant. Just use the type directly (it's not like it's used widely).
				jhenderson [Done]: I'd change this slightly: ... an ELFObjectFile. The ELFObjectFile class only provides ...
				template <typename ELFT> class MutableELFObject : public ELFObjectFile<ELFT> {
				/// This class is used for a 'copy on write' effect with tables in an ELF
				jhenderson [Done]: Do you anticipate this being used in other future functions within this class? If not, move it inside the function, to reduce its effect to where it's actually used.
				abrachet (Author) [Done]: Yes, all of the getSectionX methods will need this type. Those haven't been implemented in this patch because I didn't want to overload this patch. But calling those methods for an updated section is undefined behavior. I'm not sure if that means it needs to be in this patch or if it makes sense to move it to part 2.
				jhenderson [Done]: I'd be inclined to add all the getSectionX/isSectionX methods in this patch. It gives you something tangible to test too, so clearly demonstrates that the class works. Best thing to do is probably work on the principle that users of this class should be able to use this safely. Alternatively, just return "not supported" errors in all the ones you don't want to support for now. We should avoid leaving cases where things will be confusing to anybody who tries to pick this up and use it after the first commit.
				/// object file.
				///
				/// The table holds entries of two different but related types. OrigType
				jhenderson [Done]: No need for `inline` with class members.
				/// represents the object type in the file originally while NewType is the
				jhenderson [Done]: This is another use of auto that should probably change. You might want to typedef the MappingType from earlier.
				/// type to create on modification of the entry. It is required that
				jhenderson [Done]: Probably shouldn't use auto here.
				abrachet (Author) [Done]: The typename is really long, would you be open to changing it back to auto?
				jhenderson [Done]: No. LLVM follows the rule of "only use auto when it is clear from the context what the type is". See the Coding Standards.
				/// NewType have an `operator const OrigType &()`.
				///
				jhenderson [Done]: How about `NewSec`?
				/// The table keeps a list of mappings, these mappings can have one of two
				jhenderson [Done]: For what do you envisage this function being useful?
				abrachet (Author) [Done]: This was referencing `getOriginal` before. In symbols for example, the top half of the `DataRef` is the section index of the symbol table. In that case I use `getOriginal` to see its sh_type. It wouldn't make sense to use the updated section list in this case I think.
				/// states either Original or New. In the case of Original the index
				/// associated with the mapping is into the original table in the file. For
				/// a New mapping, the index is into the NewValues vector. This design allows
				/// fewer copies to be made than there would otherwise need to be, entries
				/// with no modifications never get copied and the only overhead for those is
				rupprecht [Done]: These should have an assertion (in debug mode) that `Index` is in range
				/// an index. Entries which get modified can have richer types during program
				jhenderson [Done]: How about having this return a reference, rather than a pointer? That would seem to be a more natural fit.
				/// execution than are allowed by the object file standard.
				template <typename OrigType, typename NewType> class MutableTable {
				struct MappingType {
				enum MappedType { Original, New };

				rupprecht [Done]: `Mapping.Ptr` is already a `uintptr_t`, is the `reinterpret_cast` necessary?
				size_t Index;
				MappedType Type;

				MappingType(size_t Index, MappedType Type) : Index(Index), Type(Type) {}
				jhenderson [Done]: What is this function useful for? Returning an Optional would be a nicer interface than returning a nullptr, IMO.
				abrachet (Author) [Done]: This was referencing `getIfNew` previously. It's for methods which could not use just the OrigType to do their job properly. For example, `getSectionName` returns the `Name` member when the section has been modified because there is no way to describe the change to `ELFObjectFile` through its `sh_name` field. `Optional` does not work with references so far as I can tell. Is this correct? A copy would be too costly for `MutableELFSections` which have an `OwningArrayRef` and `std::string`.
				jhenderson [Done]: You're right about `Optional`, sorry for the noise. `getSectionName` currently calls `getConstIfNew`, so this function doesn't need to exist, right?

				operator size_t() const { return Index; }
				jhenderson [Done]: More too much auto.
				};
				jhenderson [Done]: `unsigned long` isn't the right size to contain a pointer on some systems, e.g. 64-bit Windows. I think I'd prefer you to go back to the old approach of taking uintptr_t.
				abrachet (Author) [Done]: Compiler was getting upset because the `0` literal is treated as an int so the toDataRef was instantiated with [T = int], `reinterpret_cast` can't cast to a wider type, and `static_cast` can't cast a pointer to non pointer type. Now I am just using a C-style cast. I haven't seen anything in the style guide explicitly disallowing their use but I also haven't seen them too often either. Is it ok to use, or should I do `enable_if` specialization for integral and pointer types? It's on line 23 of this file.

				ArrayRef<OrigType> OriginalValues;
				jhenderson [Done]: How about `getMutableSection`?
				std::vector<NewType> NewValues;
				std::vector<MappingType> Mappings;

				public:
				explicit MutableTable(ArrayRef<OrigType> TheOriginalValues)
				: OriginalValues(TheOriginalValues) {
				size_t Count = 0;
				llvm::transform(OriginalValues, std::back_inserter(Mappings),
				[&Count](const OrigType &) {
				rupprecht [Done]: "OriginalValues" is overloaded, so it's not clear which one is being referenced in the constructor. A common hacky practice in LLVM is to name the constructor arg "TheOriginalValues" or something -- that's fine to use if you can't come up with any other different name.
				MaskRay [Done]: `llvm::transform(OriginalValues, ...)`
				return MappingType(Count++, MappingType::Original);
				});
				jhenderson (Done): "regardless if it" sounds off to me. How about "regardless of whether it"?
				}

				/// Get the OrigType at index Index regardless of whether it is an OrigType
				jakehehrlich (Not Done): It seems like you're trying to implement an iterator here. Why not make a mutable section iterator? `getMutableSection` would become dereference and this would become increment.
				rupprecht (Done): `std::memcpy`
				/// or NewType. In the case that Mappings[Index].Type == New, call NewTypes
				/// operator OrigType to make the proper conversion.
				jhenderson (Done): Does this function need to exist as a free function, or can it be a private method of MutableELFObject?
				const OrigType &operator[](size_t Index) const {
				assert(Index < Mappings.size() && "Out of bounds");
				if (Mappings[Index].Type == MappingType::New)
				return static_cast<const OrigType &>(NewValues[Mappings[Index]]);
				return OriginalValues[Mappings[Index]];
				}

				/// Get the OrigType at index Index. This method ignores any changes made
				/// and always returns the OrigType from its original state at its original
				/// index.
				const OrigType &getOriginal(size_t Index) const {
				assert(Index < OriginalValues.size() && "Out of bounds");
				jakehehrlich (Not Done): Are we overriding these getFoo methods? They all seem really bad. Just make one method that returns a mutable section and let the user get the data they need from there.
				abrachet (Author, Done): It's so sections can be mutated and the ObjectFile exposes those sections and their changes from the iterator returned by ObjectFile::sections(). But of course there is a cost to that: the SectionRefs from this mutable class are 2 levels of indirection away from the sections, not 1 like its superclass, which I think is where your hesitation is? I personally think it is a big advantage to use ObjectFile because it already exists, has been widely tested, and is an interface many (all?) tools already interact with. Of course, no tools do any mutations and then subsequently create an ObjectFile from that. I am not opposed to having mutated sections get their own iterator, but I'm also not convinced it would be better. Preserving the order of sections or inserting sections is very easy this way, for example.
				return OriginalValues[Index];
				}

				jhenderson (Done): This pattern is repeated everywhere. Should it be factored out into a separate function?
				/// If the entry at index Index has already been made mutable, this returns
				/// a reference to that. Otherwise, this replaces the current entry at the
				/// specified index with a NewType constructed with Arguments.
				template <typename... Args>
				NewType &makeMutable(size_t Index, Args &&... Arguments) {
				rupprecht (Done): Is `toDataRef(reinterpret_cast<uintptr_t>(...))` redundant? (Converting from pointer -> uintptr_t -> pointer)
				assert(Index < Mappings.size() && "Out of bounds");
				if (Mappings[Index].Type == MappingType::New)
				MaskRay (Not Done): I think such `Out of bounds` assertions are probably not very necessary. They will certainly segfault if the user does have out-of-bounds accesses.
				jhenderson (Not Done): Won't that entirely depend on how the memory is laid out, compiler-generated checks etc? A segmentation fault hardly provides any useful information, whereas at least the assertion shows what's going wrong in builds with assertions enabled...
				MaskRay (Not Done): I think if the user of this library makes an out-of-bounds access, the program will very quickly segfault with a clear stack trace (symbolization can happen with a symbol table, even if debug info is absent) that the problem is related to the use of MutableELFObject. There are lots of ways to detect errors: -D_GLIBCXX_DEBUG / asan / ... We can add some assert() in some tricky places (they also serve as documentation of some invariants) but here I think it is probably not very necessary.
				rupprecht (Not Done):
				> I think if the user of this library makes an out-of-bounds access, the program will very quickly segfault with a clear stack trace
				Having dealt with many prod crashes in the past, I strongly disagree. C++ optimizations can often make the crash happen far away from the actual bug. I think it's much more common to run tools in debug mode than in asan, so it's useful to have these.
				abrachet (Author, Done): I haven't removed the assertions in this diff. I think worst case it looks ugly, and I agree that they aren't super necessary, but they have helped me track bugs down before, so I think the benefit outweighs the cost.
				return NewValues[Mappings[Index]];
				Mappings[Index] = MappingType(NewValues.size(), MappingType::New);
				NewValues.emplace_back(std::forward<Args>(Arguments)...);
				MaskRay (Done): `Mappings[Index] = MappingType(NewValues.size(), MappingType::New);` `NewValues.emplace_back(std::forward<Args>(Arguments)...);`
				return NewValues.back();
				jhenderson (Done): Does this need to exist? If so, it needs unit-testing.
				abrachet (Author, Done): Yes, it is used and thus tested implicitly. How would I go about testing the private interface? I could make it protected and then inherit from it in the unit test, but that doesn't sound great. FWIW this is almost identical to the `getHeader` method in `ELFFile`. https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Object/ELF.h#L108
				jhenderson (Done): Sorry, missed that getHeader was private. Being able to modify the header contents is a required thing however, so you might want to consider how this could be done (possibly a different change though).
				}

				/// Returns a pointer to the NewType if the entry at the specified index
				/// has had makeMutable called on it. Otherwise this method returns nullptr.
				jhenderson (Done): Do you need this to be a friend, or can you just pass in another argument or two (if needed) into the constructor of MutableELFSection?
				jhenderson (Done): Does this need to exist? If so, it needs unit-testing.
				abrachet (Author, Done): Yes. It is used for every public method on `SectionRef` that gets tested in the unit test. Is this enough or should I test it directly as well?
				jhenderson (Done): I have no idea which method this was referring to any more...
				abrachet (Author, Done): Oops, I think I responded to this before adding the comments, so I forgot to add what the comment was originally referring to. It was `getSection(DataRefImpl)`. It's the function that I made virtual from `ELFObjectFile`. All the tests that I have now, like `FirstSec->getSize()`, test this method.
				const NewType *getConstIfNew(size_t Index) const {
				assert(Index < Mappings.size() && "Out of bounds");
				return Mappings[Index].Type == MappingType::New
				? &NewValues[Mappings[Index]]
				: nullptr;
				}
				jhenderson (Done): protected? Does this class have any subclasses?

				/// Return the number of elements in the table.
				size_t size() const { return Mappings.size(); }
				};

				using Elf_Shdr = typename ELFT::Shdr;
				using Elf_Ehdr = typename ELFT::Ehdr;
				rupprecht (Not Done): I think this is a potentially-invalid reference, as `B` is `std::move`-d for the previous member. It looks like this works because the move constructor for `ELFObjectFile` is implemented as a copy constructor, although that is unusual (a typical implementation would swap() members):
				    template <class ELFT>
				    ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other)
				        : ELFObjectFile(Other.Data, Other.EF, Other.DotDynSymSec,
				                        Other.DotSymtabSec, Other.ShndxTable) {}
				I'm curious if you can test this theory by "fixing" the move constructor to be something like:
				    template <class ELFT>
				    ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other) {
				      using std::swap;
				      swap(Data, Other.Data);
				      swap(EF, Other.EF);
				      swap(DotDynSymSec, Other.DotDynSymSec);
				      swap(DotSymtabSec, Other.DotSymtabSec);
				      swap(ShndxTable, Other.ShndxTable);
				    }
				(no need to check it in though)
				I think you can just use `section_begin()/section_end()` directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later).
				abrachet (Author, Done): I couldn't test to see if it breaks with your example of swapping members because `ELFObjectFile` has no default constructor, so all I could use were Other's members to initialize, and then the swap is just swapping two equal values.
				> I think you can just use section_begin()/section_end() directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later)
				This didn't work for me; my program never exited (I gave it ~10 seconds assuming a segfault was coming). I am still using `B.section_begin()/end()`. I also tried `ELFObjectFile<ELFT>::section_begin()` and that compiled but caused a crash. It tried to create the OwningArrayRef with nonsense data:
				    OwningArrayRef(this=0x0000000106006280, Size=72057594037927936)
				    ObjectTests(66699,0x10c97d5c0) malloc: can't allocate region
				    * mach_vm_map(size=72057594037927936) failed (error code=3)
				    ObjectTests(66699,0x10c97d5c0) malloc: * set a breakpoint in malloc_error_break to debug
				    libc++abi.dylib: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc

				rupprecht (Done): What's being captured by `[&]` here?
				rupprecht (Done): ditto
				MutableTable<Elf_Shdr, MutableELFSection<ELFT>> Sections;

				const Elf_Ehdr &getHeader() const {
				return *reinterpret_cast<const Elf_Ehdr *>(this->base());
				}
				jhenderson (Done): overrie -> override

				jhenderson (Done): wether -> whether
				/// Many getSection* methods in ELFObjectFile use getSection to get the
				/// header associated with that section. This override returns a valid
				jhenderson (Done): Use `explicit` here. Should this be an r-value reference? What do you think?
				abrachet (Author, Done): Sounds fine to me. In the unit tests I move the ELFObjectFile but then continue to use it because, as @rupprecht pointed out previously, the move constructor is just a copy constructor. Is it safe to do this or should I explicitly copy?
				jhenderson (Done):
				> Is it safe to do this or should I explicitly copy?
				This doesn't sound safe to me - you need to copy or just simply not use the object after it's been moved. a) I wouldn't be surprised if this is technically undefined behaviour and the compiler makes assumptions about the state of the instance after the move. b) It's fragile to future changes.
				abrachet (Author, Done): I did an explicit copy /I think/ in the unit test now. I have an `ELFObjectFile *` and then dereference it and assign to an `ObjectFile &`. Does this make a copy, or does the compiler omit it and just assign the address?
				labath (Done): Kind of the whole point of references is that they allow you to "refer" to other objects without making copies of them. If you want to make a copy, drop the `&` and assign it to an `ObjectFile` (or `ELFObjectFile`, or something).
				abrachet (Author, Done): Oh yeah. What am I talking about!
				/// section header whether the section has been modified or not.
				const Elf_Shdr *getSection(DataRefImpl Sec) const override {
				return &Sections[Sec.p];
				abrachet (Author, Done): Actually, here is an example: `MutableTable::operator [](size_t)` being called with `DataRefImpl::p`, which is `uint64_t`.
				}

				/// moveSectionNext must be overridden because MutableELFObject::section_begin
				/// works differently than in ELFObjectFile. In this class, sections are
				MaskRay (Done): typo: overridden. `the section_iterators in MutableELFObject` -> `MutableELFObject::section_begin`? (Just name it explicitly)
				/// iterated with their index, not address in the file, which allows use with
				/// MutableTable.
				void moveSectionNext(DataRefImpl &Sec) const override;
				Expected<StringRef> getSectionName(DataRefImpl Sec) const override;
				uint64_t getSectionIndex(DataRefImpl Sec) const override;
				Expected<ArrayRef<uint8_t>>
				getSectionContents(DataRefImpl Sec) const override;

				static DataRefImpl toDataRef(uintptr_t Ptr) {
				jhenderson (Done): Probably this should be `handleError(MutSecOrErr.takeError());`
				DataRefImpl Ref;
				Ref.p = Ptr;
				jhenderson (Done): You're going to need to do something with the Error inside Expected here. This function should probably return an Expected too, so that you can pass that error up to a higher level to be handled.
				return Ref;
				}

				public:
				explicit MutableELFObject(ELFObjectFile<ELFT> &&B)
				: ELFObjectFile<ELFT>(std::move(B)),
				Sections(ArrayRef<Elf_Shdr>(reinterpret_cast<const Elf_Shdr *>(
				jhenderson (Done): What is this function for?
				abrachet (Author, Done): It advances the `section_iterator`. `MutableELFObject` uses a different iterator to that of `ELFObjectFile`, which I believe would have a `moveSectionNext` that did something like `Sec.p += sizeof(Elf_Shdr)`, as those just point directly to the section header.
				jhenderson (Done): At the moment, it doesn't look like it's used, so it shouldn't exist. Re-add it when it is needed. However, perhaps a safer thing to do is define your own iterator class that does this internally, as @jakehehrlich said.
				abrachet (Author, Done): This was referring to `moveSectionNext()` before. It does get used; it is an override of the function that `content_iterator::operator ++()` calls. It must be overridden because `ELFObjectFile::moveSectionNext()` iterates over sections not with their index but with their mapped address. This uses an index into `MutableTable`.
				jhenderson (Done): Sorry, I missed that `moveSectionNext()` is an override.
				this->base() + getHeader().e_shoff),
				getHeader().e_shnum)) {}

				section_iterator section_begin() const override {
				return section_iterator(SectionRef(toDataRef(0), this));
				}

				section_iterator section_end() const override {
				return section_iterator(SectionRef(toDataRef(Sections.size()), this));
				}

				/// Returns a mutable reference to the section pointed to by Sec. A possible
				/// usage to change all sections with alignment of 0 to 1 could be:
				/// @code{.cpp}
				/// for (const SectionRef Sec : MutObj.sections())
				/// if (!Sec.getAlignment()) {
				MaskRay (Done): `for (SectionRef Sec : MutObj.sections())`. SectionRef has just 2 words. It is usually passed by value.
				/// if (auto MutSecOrErr = MutObj.getMutableSection(Sec))
				/// then;
				/// else
				MaskRay (Done): `if (auto MutSecOrErr = MutObj.getMutableSection(Sec)) then; else handleError(...);`
				/// handleError(...);
				/// MutSecOrErr->Header.sh_addralign = 1;
				MaskRay (Done): typo: sh_addralign
				/// }
				/// @endcode
				Expected<MutableELFSection<ELFT> &> getMutableSection(SectionRef Sec) {
				const Elf_Shdr_Impl<ELFT> &Header = Sections[Sec.getRawDataRefImpl().p];
				Expected<StringRef> Name = getSectionName(Sec.getRawDataRefImpl());
				if (!Name)
				return Name.takeError();
				return Sections.makeMutable(Sec.getRawDataRefImpl().p, Header, *Name, this);
				}
				};

				template <typename ELFT>
				void MutableELFObject<ELFT>::moveSectionNext(DataRefImpl &Sec) const {
				++Sec.p;
				}

				template <typename ELFT>
				uint64_t MutableELFObject<ELFT>::getSectionIndex(DataRefImpl Sec) const {
				return Sec.p;
				}

				template <typename ELFT>
				Expected<StringRef>
				MutableELFObject<ELFT>::getSectionName(DataRefImpl Sec) const {
				if (const MutableELFSection<ELFT> *SecOrNull = Sections.getConstIfNew(Sec.p))
				return SecOrNull->Name;
				return ELFObjectFile<ELFT>::getSectionName(Sec);
				}

				template <typename ELFT>
				Expected<ArrayRef<uint8_t>>
				MutableELFObject<ELFT>::getSectionContents(DataRefImpl Sec) const {
				if (const MutableELFSection<ELFT> *SecOrNull = Sections.getConstIfNew(Sec.p))
				return ArrayRef<uint8_t>(SecOrNull->Data.data(), SecOrNull->Header.sh_size);
				return ELFObjectFile<ELFT>::getSectionContents(Sec);
				}

				} // namespace object
				} // namespace llvm

				#endif // LLVM_OBJECT_MUTABLEELFOBJECT_H
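
As a reading aid for the header above, the index-remapping idea behind `MutableTable` can be sketched without any LLVM dependencies. This is a simplified illustration under stated assumptions, not the patch's code: `CowTable`, `Header`, `MutableEntry`, and `getIfNew` are hypothetical names, `std::vector` stands in for `ArrayRef`, and a plain `bool` replaces `MappingType`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a read-only on-disk record (e.g. a section header).
struct Header {
  unsigned Align;
};

// Hypothetical stand-in for the mutable copy; it converts back to Header so
// that readers of the table never need to know which kind of entry they hit.
struct MutableEntry {
  Header Hdr;
  std::string Name;

  MutableEntry(const Header &H, std::string N) : Hdr(H), Name(std::move(N)) {}

  operator const Header &() const { return Hdr; }
};

// Assumed simplification of the MutableTable idea: every slot maps either to
// an untouched original or to a lazily created mutable copy.
template <typename OrigT, typename NewT> class CowTable {
  struct Mapping {
    std::size_t Index; // Position in Originals or in News.
    bool IsNew;        // True once the slot has been made mutable.
  };

  const std::vector<OrigT> &Originals; // Never modified by this class.
  std::vector<NewT> News;
  std::vector<Mapping> Mappings;

public:
  explicit CowTable(const std::vector<OrigT> &TheOriginals)
      : Originals(TheOriginals) {
    for (std::size_t I = 0; I < Originals.size(); ++I)
      Mappings.push_back(Mapping{I, false});
  }

  // Read access: returns the original, or the header of the mutable copy.
  const OrigT &operator[](std::size_t Index) const {
    assert(Index < Mappings.size() && "Out of bounds");
    const Mapping &M = Mappings[Index];
    if (M.IsNew)
      return static_cast<const OrigT &>(News[M.Index]);
    return Originals[M.Index];
  }

  // Creates the mutable copy on first use; later calls return the same entry
  // and ignore their arguments.
  template <typename... Args>
  NewT &makeMutable(std::size_t Index, Args &&...Arguments) {
    assert(Index < Mappings.size() && "Out of bounds");
    if (Mappings[Index].IsNew)
      return News[Mappings[Index].Index];
    Mappings[Index] = Mapping{News.size(), true};
    News.emplace_back(std::forward<Args>(Arguments)...);
    return News.back();
  }

  // Returns the mutable copy if one exists, otherwise nullptr.
  const NewT *getIfNew(std::size_t Index) const {
    assert(Index < Mappings.size() && "Out of bounds");
    return Mappings[Index].IsNew ? &News[Mappings[Index].Index] : nullptr;
  }

  std::size_t size() const { return Mappings.size(); }
};
```

In this sketch, calling `makeMutable(1, Orig[1], ".text")` and then changing `Hdr.Align` on the result makes `Table[1]` report the new alignment while the backing `Orig[1]` stays untouched. One caveat of the simplification: references returned by `makeMutable` or `operator[]` for mutable entries can be invalidated when a later `makeMutable` call grows the `News` vector.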

llvm/unittests/Object/CMakeLists.txt

	  set(LLVM_LINK_COMPONENTS
	    BinaryFormat
	    Object
	+   ObjectYAML
	    )

	  add_llvm_unittest(ObjectTests
	    MinidumpTest.cpp
	+   MutableELFObjectTest.cpp
	    SymbolSizeTest.cpp
	    SymbolicFileTest.cpp
	    )

	  target_link_libraries(ObjectTests PRIVATE LLVMTestingSupport)

llvm/unittests/Object/MutableELFObjectTest.cpp

This file was added.

				//===- MutableELFObjectTest.cpp -------------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Object/MutableELFObject.h"
				#include "llvm/ADT/SmallString.h"
				#include "llvm/Object/ObjectFile.h"
				#include "llvm/ObjectYAML/yaml2obj.h"
				#include "llvm/Support/Error.h"
				#include "llvm/Support/YAMLTraits.h"
				#include "llvm/Testing/Support/Error.h"
				#include "gtest/gtest.h"

				using namespace llvm;
				using namespace object;
				using namespace yaml;

				// Test that when no modifications have been made SectionRef's methods work
				grimar (Done): I'd suggest adding a short description comment before each unit test (about what this test is intended to do), just like we often do in regular test files.
				jhenderson (Done): sections -> section's
				// the same in both ELFObjectFile and MutableELFObject.
				jhenderson (Done): I'd rephrase the second half of this sentence as "methods work the same in both ELFObjectFile and MutableELFObject".
				TEST(MutableELFObject, NoChange) {
				StringRef Yaml = R"(
				--- !ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_REL
				Machine: EM_X86_64
				Sections:
				- Name: .text
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Content: "DEADBEEF")";

				SmallString<0> Storage;
				raw_svector_ostream OS(Storage);
				yaml::Input Input(Yaml);
				ASSERT_THAT_ERROR(convertYAML(Input, OS), Succeeded());

				Expected<std::unique_ptr<ObjectFile>> ErrOrObj =
				ObjectFile::createObjectFile(MemoryBufferRef(OS.str(), "YamlObject"));
				jhenderson (Done): This still has the unnecessary trailing return type.
				jhenderson (Done): Isn't the solution here to just make a second local variable that's a pointer to an ELFObjectFile constructed from MutableObject? `ELFObjectFile *ElfObject = &MutableObject;`
				abrachet (Author, Done): I don't think so. Then both will be `MutableELFObject`s and the same virtual methods will be called for each, I believe. I think testing the methods on an `ELFObjectFile` and a `MutableELFObject` tests that the overrides in `MutableELFObject` behave the same as the methods in `ELFObjectFile` when the sections haven't been changed.
				ASSERT_THAT_EXPECTED(ErrOrObj, Succeeded());
				auto *ELFObjFile = dyn_cast<ELFObjectFile<ELF64LE>>(ErrOrObj->get());
				ASSERT_TRUE(ELFObjFile);
				jhenderson (Done): ObjFileSecCount?

				rupprecht (Done): This should check the error code too.
				// Create a new ObjectFile from the same yaml. There is no copy constructor.
				rupprecht (Done): Can you avoid the copy here, and just return `StringRef`?
				jhenderson (Done): MutObjSecCount?
				auto NewErrOrObj =
				ObjectFile::createObjectFile(MemoryBufferRef(OS.str(), "YamlObject"));
				jhenderson (Done): This should be ASSERT_EQ, since the rest of the test assumes there's the same number.
				ASSERT_THAT_EXPECTED(NewErrOrObj, Succeeded());
				auto *MutELFObjFile = dyn_cast<ELFObjectFile<ELF64LE>>(NewErrOrObj->get());
				ASSERT_TRUE(MutELFObjFile);
				MutableELFObject<ELF64LE> MutableObject(std::move(*MutELFObjFile));
				jhenderson (Done): Rather than reusing the same variable, create a new variable here. Then don't bother with the ObjFile variable above. It doesn't give us anything.
				abrachet (Author, Done): I think that it does give something. The `ObjectFile`s are made from the same yaml, so I'm testing that the public methods return the same values. It's a whole thing because ObjectFile has no copy constructor. Also, we don't want to use the ObjectFile after moving it. I also still think that doing something like this
				    MutableELFObject<ELF64LE> MutableObject(std::move(*MutELFObjFile));
				    ELFObjectFile<ELF64LE> *ElfObject = &MutableObject;
				    // test the interface
				will never produce any differences, so it doesn't test that MutableELFObject has the same behavior as ELFObjectFile on an unmodified object file.
				jhenderson (Done): I think you misunderstood. Having `MutELFObjFile` is fine, as is constructing two objects from the YAML. `ObjFile` doesn't need to exist however. You can use `ELFObjFile` directly now that you're not reusing the local variable for the creation of `MutableObject`, so the two variables you operate on are `ELFObjFile` (created directly from YAML) and `MutableObject` (created indirectly from YAML via MutELFObjFile).

				ptrdiff_t ObjFileSecCount =
				std::distance(ELFObjFile->section_begin(), ELFObjFile->section_end());
				ptrdiff_t MutObjSecCount =
				std::distance(MutableObject.section_begin(), MutableObject.section_end());
				ASSERT_EQ(ObjFileSecCount, MutObjSecCount);

				auto TestSections = [](SectionRef ObjFile, SectionRef MutObj) {
				EXPECT_EQ(ObjFile.getAddress(), MutObj.getAddress());
				EXPECT_EQ(ObjFile.getAlignment(), MutObj.getAlignment());
				EXPECT_EQ(ObjFile.getIndex(), MutObj.getIndex());
				EXPECT_EQ(ObjFile.getSize(), MutObj.getSize());
				EXPECT_EQ(ObjFile.isBerkeleyData(), MutObj.isBerkeleyData());
				EXPECT_EQ(ObjFile.isBerkeleyText(), MutObj.isBerkeleyText());
				EXPECT_EQ(ObjFile.isBitcode(), MutObj.isBitcode());
				EXPECT_EQ(ObjFile.isBSS(), MutObj.isBSS());
				EXPECT_EQ(ObjFile.isCompressed(), MutObj.isCompressed());
				EXPECT_EQ(ObjFile.isData(), MutObj.isData());
				EXPECT_EQ(ObjFile.isStripped(), MutObj.isStripped());
				EXPECT_EQ(ObjFile.isText(), MutObj.isText());
				EXPECT_EQ(ObjFile.isVirtual(), MutObj.isVirtual());
				};

				for (const auto &Tuple :
				jhenderson (Done): IIRC, this will still abort if there is an error, since the error itself hasn't been handled. The correct thing to do would be to a) check there's no Error, and then b) call consumeError().
				abrachet (Author, Done): Thanks!
				zip(MutableObject.sections(), ELFObjFile->sections()))
				TestSections(std::get<0>(Tuple), std::get<1>(Tuple));

				// Copy every section header but make no changes. SectionRefs now point to
				// section headers outside of the file's mapping.
				for (const SectionRef Sec : MutableObject.sections())
				ASSERT_THAT_EXPECTED(MutableObject.getMutableSection(Sec), Succeeded());

				for (const auto &Tuple :
				zip(MutableObject.sections(), ELFObjFile->sections()))
				jhenderson (Done): Don't use `auto` here.
				TestSections(std::get<0>(Tuple), std::get<1>(Tuple));
				rupprecht (Done): Is it expected that an error here should not cause the test to fail? I think you want an assertion that Expected doesn't have an error instead.
				}

				// Change a section's name and test that SectionRef::getName() returns the new
				// name.
				TEST(MutableELFObject, ChangeSectionName) {
				jhenderson (Done): You need an ASSERT here that there are the correct number of sections.
				SmallString<0> Storage;
				Expected<std::unique_ptr<ObjectFile>> ErrOrObj = yaml2ObjectFile(Storage, R"(
				--- !ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_REL
				Machine: EM_X86_64
				Sections:
				- Name: .sec0
				Type: SHT_PROGBITS
				- Name: .sec1
				Type: SHT_PROGBITS
				- Name: .sec2
				rupprecht (Done): nit: `ZeroData` is kind of misleading since this is not `0`, but rather an ASCII `'0'`, i.e. this is {48, 48, 48, 48}. At any rate, it doesn't matter that it's "zero"; any random data works. Testing against something non-zero is probably healthier for tests too. Maybe just rename it to something else and use different bytes (e.g. to test byte ordering).
				Type: SHT_PROGBITS)");

				ASSERT_THAT_EXPECTED(ErrOrObj, Succeeded());
				auto *ELFObjFile = dyn_cast<ELFObjectFile<ELF64LE>>(ErrOrObj->get());
				ASSERT_TRUE(ELFObjFile);
				MutableELFObject<ELF64LE> MutableObject(std::move(*ELFObjFile));
				jhenderson: You need to assert somewhere that NumSections is as expected.
				abrachet (author): I have `ASSERT_EQ(NumSections, 7);`. It ended up being 7 sections because of additional sections that `yaml2obj` added. Is this what you meant?
				jhenderson: Yes, that's right.

				auto compareSectionName = [](section_iterator Iter, const char *Name) {
				auto NameOrErr = Iter->getName();
				ASSERT_THAT_EXPECTED(NameOrErr, Succeeded());
				EXPECT_EQ(*NameOrErr, Name);
				};

				ptrdiff_t NumSections =
				std::distance(MutableObject.section_begin(), MutableObject.section_end());
				ASSERT_EQ(NumSections, 7);

				jhenderson: You need to show the iterator is still valid after this, by checking std::distance.
				auto Iter = MutableObject.section_begin();
				compareSectionName(Iter, nullptr);

				compareSectionName(++Iter, ".sec0");
				compareSectionName(++Iter, ".sec1");
				compareSectionName(++Iter, ".sec2");

				Iter = MutableObject.section_begin();
				std::advance(Iter, 2);
				auto MutSectionOrErr = MutableObject.getMutableSection(*Iter);
				ASSERT_EQ(std::distance(MutableObject.section_begin(), Iter), 2);
				ASSERT_EQ(
				std::distance(MutableObject.section_begin(), MutableObject.section_end()),
				7);
				ASSERT_THAT_EXPECTED(MutSectionOrErr, Succeeded());
				jhenderson: You probably also need to show that if makeMutable has been called, calls to the various functions in the interface reference the mutable version. Currently, you only do this for the section name.
				MutSectionOrErr->Name = ".new_name";
				jhenderson: EXPECT_EQ -> ASSERT_EQ to avoid dereferencing an invalid iterator later on. I think I misread the code last time, as I don't know why I wanted this check here. However, you do show that the std::distance from section_begin to section_end is still correct after you called getMutableSection, but not until after you've done the iteration, meaning that your iterator could be invalid part way through your calls to compareSectionName. Move the later std::distance check up to here instead.
				abrachet (author): Is this what you mean? And then remove the one on 156? Sorry, I wasn't quite sure.
				jhenderson: Sorry, I obviously wasn't clear enough, and I realised something else whilst writing this comment. At the time of writing, the std::distance at line 127 (checks the section count before mutating), and the one at line 142 (checks it after mutating) are the ones that should exist. The one at line 155 is unnecessary, since you check it at line 142. The one at line 140 is necessary because it shows that the Iter variable is still valid after getting a mutable section.

				Iter = MutableObject.section_begin();
				compareSectionName(Iter, nullptr);
				compareSectionName(++Iter, ".sec0");
				compareSectionName(++Iter, ".new_name");
				compareSectionName(++Iter, ".sec2");
				}
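
rupprecht's note about `ZeroData` comes down to two facts: the character `'0'` is the byte 48 in ASCII, and a payload of identical bytes cannot reveal a byte-ordering mistake. The following standalone sketch illustrates both, assuming an ASCII execution character set. The names `AsciiZeros`, `DistinctBytes`, `readLE32` and `readBE32` are hypothetical and do not appear in the patch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical test payloads; neither name comes from the patch.
const std::array<std::uint8_t, 4> AsciiZeros = {'0', '0', '0', '0'};
const std::array<std::uint8_t, 4> DistinctBytes = {0xDE, 0xAD, 0xBE, 0xEF};

// Interprets four bytes as a little-endian 32-bit value.
inline std::uint32_t readLE32(const std::array<std::uint8_t, 4> &B) {
  return std::uint32_t(B[0]) | std::uint32_t(B[1]) << 8 |
         std::uint32_t(B[2]) << 16 | std::uint32_t(B[3]) << 24;
}

// Interprets four bytes as a big-endian 32-bit value.
inline std::uint32_t readBE32(const std::array<std::uint8_t, 4> &B) {
  return std::uint32_t(B[0]) << 24 | std::uint32_t(B[1]) << 16 |
         std::uint32_t(B[2]) << 8 | std::uint32_t(B[3]);
}
```

With `AsciiZeros`, both readers return the same value, so swapping them goes unnoticed. With `DistinctBytes` they disagree, which is why distinct bytes make a stronger test payload.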

				// Test MutableELFSection::setData().
				TEST(MutableELFObject, ChangeSectionContents) {
				SmallString<0> Storage;
				Expected<std::unique_ptr<ObjectFile>> ErrOrObj = yaml2ObjectFile(Storage, R"(
				--- !ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_REL
				Machine: EM_X86_64
				Sections:
				- Name: .text
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Content: "DEADBEEF")");

				ASSERT_THAT_EXPECTED(ErrOrObj, Succeeded());
				auto *ELFObjFile = dyn_cast<ELFObjectFile<ELF64LE>>(ErrOrObj->get());
				ASSERT_TRUE(ELFObjFile);
				MutableELFObject<ELF64LE> MutableObject(std::move(*ELFObjFile));
				rupprecht: This macro seems small enough that it should just be typed out.

				labath: Using `llvm::zip(ObjFile.sections(), MutableObject.sections())` is a bit cleaner way to handle parallel iteration (but don't forget to assert that the range sizes are the same). With something like `EXPECT_THAT(MutableObject.sections(), testing::ContainerEq(ObjFile.sections()))` this check would be a one-liner, but it would require a bit more plumbing to make sure the elements are comparable, so it may not be worth it if this is just a one-off thing...
				abrachet (author): Thanks, never knew about zip!
				ptrdiff_t NumSections =
				std::distance(MutableObject.section_begin(), MutableObject.section_end());

				auto FirstSec = ++MutableObject.section_begin();
				Expected<StringRef> Contents = FirstSec->getContents();
				ASSERT_THAT_EXPECTED(Contents, Succeeded());

				EXPECT_EQ(*Contents, "\xDE\xAD\xBE\xEF");
				EXPECT_EQ(FirstSec->getSize(), Contents->size());

				ArrayRef<uint8_t> Data{'1', '2', '3', '4'};

				auto MutSecOrErr = MutableObject.getMutableSection(*FirstSec);
				ASSERT_THAT_EXPECTED(MutSecOrErr, Succeeded());
				MutSecOrErr->setData(Data);

				FirstSec = ++MutableObject.section_begin();
				Contents = FirstSec->getContents();
				ASSERT_THAT_EXPECTED(Contents, Succeeded());
				EXPECT_EQ(*Contents, StringRef(reinterpret_cast<const char *>(Data.data()),
				Data.size()));
				rupprecht: You can put both `Iter` and `End` in the for header as described here: http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop `for (auto Iter = MutableObject.section_begin(), End = MutableObject.section_end(); Iter != End; ++Iter)`

				jhenderson: I'm pretty sure you don't need the UL suffix here. Same below.
				abrachet (author): I don't actually need the L, it turns out. Without making it unsigned I get this warning on clang: `comparison of integers of different signs`.
				jhenderson: Sounds like clang isn't doing a good enough job here. It's a spurious warning, because 2 is a positive integer literal so can clearly be handled properly in this comparison. Check elsewhere, but I thought it was more common in the code base to use lower-case for literal suffixes when they're needed.
				abrachet (author): FWIW I think there's a function template instantiation somewhere in these macros, and so it's doing `unsigned_expr == signed_expr`, not `unsigned_expr == 2`.
				MutSecOrErr->Header.sh_size = 2;
				Contents = FirstSec->getContents();
				ASSERT_THAT_EXPECTED(Contents, Succeeded());
				EXPECT_EQ(*Contents,
				StringRef(reinterpret_cast<const char *>(Data.data()), 2));

				// Check that getSize properly uses the header's sh_size value.
				EXPECT_EQ(FirstSec->getSize(), 2u);

				// Check that Contents has size 2 because header's sh_size was changed.
				EXPECT_EQ(Contents->size(), 2u);

				// Make sure a section wasn't added.
				ptrdiff_t NewNumSections =
				std::distance(MutableObject.section_begin(), MutableObject.section_end());
				EXPECT_EQ(NewNumSections, NumSections);
				}
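
abrachet's explanation of the sign-comparison warning can be illustrated without googletest. Once a literal is passed through a function template, the callee sees only a value of the deduced type (`int` for `2`, `unsigned` for `2u`), so the compiler can no longer tell that the value is a non-negative constant. The sketch below assumes the comparison macros forward their arguments to a templated helper; `mixesSignedness` is a hypothetical stand-in for that helper, not googletest's real implementation.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a templated comparison helper. It reports whether
// the two deduced argument types differ in signedness, which is the condition
// under which a sign-compare warning would fire inside the template.
template <typename LHS, typename RHS>
constexpr bool mixesSignedness(const LHS &, const RHS &) {
  return std::is_signed<LHS>::value != std::is_signed<RHS>::value;
}

// The literal's type is fixed before template argument deduction happens.
static_assert(std::is_same<decltype(2), int>::value, "plain literal is int");
static_assert(std::is_same<decltype(2u), unsigned>::value,
              "u-suffixed literal is unsigned");
```

This matches the final form of the test, where `2u` is compared against the unsigned results of `getSize()` and `size()`.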

Revision Contents

Diff 216305

llvm/include/llvm/Object/ELFObjectFile.h

llvm/include/llvm/Object/MutableELFObject.h

llvm/unittests/Object/CMakeLists.txt

llvm/unittests/Object/MutableELFObjectTest.cpp