This is an archive of the discontinued LLVM Phabricator instance.

[Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 2]
AcceptedPublic

Authored by abrachet on Jul 26 2019, 10:57 PM.

Details

Summary

Added support for symbols to MutableELFObject.

Diff Detail

Event Timeline

abrachet created this revision.Jul 26 2019, 10:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2019, 10:57 PM
abrachet updated this revision to Diff 212073.Jul 27 2019, 2:46 PM
abrachet edited the summary of this revision. (Show Details)

Added unit tests and increased functionality.

abrachet updated this revision to Diff 212122.Jul 28 2019, 2:55 PM

Made segment support similar to sections from ObjectFile for more continuity.

lebedev.ri resigned from this revision.Jul 29 2019, 1:54 PM
abrachet updated this revision to Diff 213793.Aug 6 2019, 10:18 PM
abrachet edited the summary of this revision. (Show Details)
grimar added a comment.Aug 7 2019, 2:13 AM

Few comments/suggestions are below.

llvm/include/llvm/Object/MutableELFObject.h
213

Mappings.erase(Mappings.begin() + Index)?

Perhaps also worth adding an assert like was done above.

218
Mappings.push_back({NewValues.size() - 1, MappingType::New});
243

You should probably at least add an assert checking ShType is one of the symbol table types.

250

return {};
(Its a bit confusing to see None which is normally used for returning the empty Optional<>)

253

It could be findSectionIndex, since you nowere check that ShType is a symbol table.

270

What about implementing const version using non-const? I.e. call getWhichTable from getWhichTable const.

297

I think when you have a symbol iterator, you usually know if it is a regular or dynamic
symbol. I am not sure it worth to allow mixing them. I.e. I am thinking about making 2 different functions,
like removeSymbol and removeDynamicSymbol.

I am not feeling strong here, though it seems to be less error prone.

abrachet updated this revision to Diff 214014.Aug 7 2019, 3:08 PM

Addressed review comments

abrachet marked 7 inline comments as done.Aug 7 2019, 3:15 PM

Thanks for the comments @grimar!

llvm/include/llvm/Object/MutableELFObject.h
218

I'll do you one better, emplace_back :)

270

How do I do this without const_casting this? Is that how you do it? This is very useful because this duplicating methods like this has been bugging me recently. Thanks!

297

I have added these, I don't feel strongly either. FWIW, all of the getSymbol* methods in ELFObjectFile work on either symbol type so I duplicated that here. I'll keep this as not done so others who have an opinion stronger than ours can chime in.

grimar added inline comments.Aug 8 2019, 1:12 AM
llvm/include/llvm/Object/MutableELFObject.h
270

I think there is no way to do this without const_cast, but this is a normal/common way.

I don't think you want to be modifying the number of dynamic symbols at all. That would require you to make other changes to loadable code, which would impact addresses, and effectively require you to completely relink the program.

You may however still need to modify the contents of the existing dynamic symbols (specifically you may need to be able to update their section indexes).

llvm/include/llvm/Object/MutableELFObject.h
255

How about findSectionOfType?

270

It doesn't look like you use the non-const version anywhere, so it can be deleted, right?

abrachet updated this revision to Diff 214300.Aug 8 2019, 10:33 PM
abrachet marked an inline comment as done.

Added remove to MutableTable
Added getMutableSymbol to MutableELFObject
Added unit tests for older changes

abrachet updated this revision to Diff 214301.Aug 8 2019, 10:43 PM

Remove removeDynSymbol from MutableELFObject

abrachet marked 4 inline comments as done.Aug 8 2019, 10:44 PM

I don't think you want to be modifying the number of dynamic symbols at all. That would require you to make other changes to loadable code, which would impact addresses, and effectively require you to completely relink the program.

You may however still need to modify the contents of the existing dynamic symbols (specifically you may need to be able to update their section indexes).

Ok I can just not add an addDynSymbol and removeDynSymbol methods then.

Not had time to look at everything in this patch, but it feels like you don't want removal functionality to be added in this patch at the same time as basic symbol functionality. That should be a different patch, and can then include both symbol and section removal. What do you think?

By the way, the overall approach with removal seems sound.

llvm/include/llvm/Object/MutableELFObject.h
210–211

This comment doesn't document what the function does. It talks about class implementation details, which doesn't really make sense.

212

Assert that the Index hasn't already been removed.

213

"and only the equivalent to push_back, add." sounds like an unfinished sentence to me.

213–214

I'm inclined to think that getNextIndex shouldn't provide you the first index under any circumstance. You have getFirstNonRemoved instead.

215

No need for doxygen-style comments in the middle of a function.

Your opening comment should however explain what happens when the last index is passed in/reached by the loop.

216

This comment should refer to the parameter e.g. "Adds \param New to the back of the list."

216

I'd call this lastIndex, as this doesn't return an iterator.

element, this -> element. This

217

Don't use "New" as a variable name, since it won't work with the upcoming change in variable naming rules.

222

Comment?

230

How about getFirstIndex, to match getNextIndex?

Also this function needs a comment.

231

I'd put the I == end() clause in here, rather than in the body of the loop and just return I instead of the llvm_unreachable statement.

abrachet updated this revision to Diff 214504.Aug 9 2019, 10:57 PM

Moved remove and add out of this patch

abrachet marked 11 inline comments as done.Aug 9 2019, 11:50 PM
jhenderson added inline comments.Aug 12 2019, 9:29 AM
llvm/include/llvm/Object/MutableELFObject.h
212

be, entries -> be: entries

215

Does this belong to a later patch?

243

I'd be inclined to just call this getSymbolTable.

288–291

This change seems to be unrelated?

Avoid refactors of code which don't have anything to do with the current change, and put them in their own change.

319–321

This overload seems a bit superfluous to me.

324

Call this getMutableDynamicSymbol

333–335

This overload seems a bit superfluous to me.

496

Why is this looping over only the original sections?

llvm/unittests/Object/MutableELFObjectTest.cpp
171

Looks to me like you need a checking first of all counting the number of symbols. Also, why do you need to construct a new symbol_iterator, rather than using the output of ++MutableObject.symbol_begin() directly?

178

FirstSym and SecondSym?

Also, same comment as above.

186

"no change public methods" what are those? :)

I think you need to rewrite this comment.

216

I think the same comments made to patch 1 apply here too, regarding the std::move-ing of ELFObjFile making this reference invalid.

225

I think less confusing names might be FromBase and FromMutable or something like that. My first reaction was "why is a file a SymbolRef?"

253

This loop needs a comment explaining its purpose and the consumeError call.

263

Is this comment correct?

284

Same comments as earlier.

290

This seems a bit redundant...

Same goes for the other changes below. You're basically testing that C++ works...

I think you want to be fetching the symbol again from one or both of the object interfaces (base or sub-class) and showing that the value has been updated.

302

Don't abbreviate names unnecessarily: use "BasicDynamicSymbol"

322

Same comments as above.

328

My gut reaction was "where are the equivalent dynamic symbol tests for the earlier regular symbol cases?" Perhaps worth a comment to that effect somewhere, perhaps making it clear that the earlier cases test common code.

rupprecht added inline comments.Aug 12 2019, 3:02 PM
llvm/include/llvm/Object/MutableELFObject.h
158
using Elf_Sym = typename ELFT::Sym;
llvm/unittests/Object/MutableELFObjectTest.cpp
294–298

What happens to the string table in this case?

The names for symbols in the symbol table is backed by the string table, so if you change the name from "test" to "new_name", the string table needs to be rewritten for that, but it doesn't looks like there's a corresponding MutableTable StringTable to back that change.

abrachet updated this revision to Diff 214755.Aug 12 2019, 8:44 PM

Addressed review comments

abrachet marked 23 inline comments as done.Aug 12 2019, 9:02 PM
abrachet added inline comments.
llvm/include/llvm/Object/MutableELFObject.h
215

No it is needed. I explain this in an answer to your question here

496

Because ELFObjectFile::getSymbol() (and also directly many other symbol related methods) needs the DataRef to have an index to the original section, and this is how to differentiate between dynamic and regular symbols. I will think more about this, but this was the least disruptive way to inherit from ELFObjectFile I believe. It's worth noting that this is a private method.

llvm/unittests/Object/MutableELFObjectTest.cpp
171

SymbolicFile::symbol_begin() returns a basic_symbol_iterator pointing to BasicSymbolRefs whose only meaningful method for testing is getFlags(), but symbol_iterator points to SymbolRefs which has getValue and getName.

290

Oops! They are more meaningful now :)

294–298

Nothing happens for now. The idea, and this is the same as objcopy::elf::Object, is that it gets updated only when doing final layout.

jhenderson added inline comments.Aug 13 2019, 8:18 AM
llvm/unittests/Object/MutableELFObjectTest.cpp
171

Okay. Please write this as:

symbol_iterator FirstSym(++MutableObject.symbol_begin());

189–191

I'm not sure you need this here?

194

Perhaps "in SymbolRef's public methods beteween..." to avoid using "between" twice.

257

ElfObjFile might be invalid. You will need to make another copy of the input, like in patch 1.

abrachet updated this revision to Diff 215231.Aug 14 2019, 1:38 PM
abrachet marked 2 inline comments as done.

Addressed review comments

abrachet marked 8 inline comments as done.Aug 14 2019, 1:40 PM
jhenderson added inline comments.Aug 15 2019, 4:02 AM
llvm/unittests/Object/MutableELFObjectTest.cpp
286

You've made a change to address my comment for this style above. Please make the same change throughout. It wasn't a local issue for my earlier point :)

abrachet updated this revision to Diff 215415.Aug 15 2019, 9:25 AM

Addressed comments

abrachet marked 2 inline comments as done.Aug 15 2019, 9:25 AM
abrachet added inline comments.
llvm/unittests/Object/MutableELFObjectTest.cpp
286

Sorry about that, silly mistake.

rupprecht accepted this revision.Aug 15 2019, 11:46 AM

I don't see anything major left, but please wait for James to take another look

llvm/include/llvm/Object/MutableELFObject.h
503–516

I don't think these methods have anything to do with MutableELFObject, so they should moved Object/SymbolicFile.h so others can use it. (A FIXME comment is fine for now).

llvm/unittests/Object/MutableELFObjectTest.cpp
294–298

I suppose it depends on how the writer/finalizer for MutableElfObject looks like -- a naive implementation that just iterates over sections and writes out the contents will lose the renamed values, because those will be privately owned by MutableELFSymbol, and not reflected in the (non-mutated) string table.

This revision is now accepted and ready to land.Aug 15 2019, 11:46 AM
MaskRay accepted this revision.Aug 19 2019, 10:35 PM
MaskRay added inline comments.
llvm/include/llvm/Object/MutableELFObject.h
497

uint32_t I = 0, E = Sections.originalSize(); I != E; ++I)

https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop

Not a strong opinion though.

abrachet updated this revision to Diff 216307.Aug 20 2019, 6:14 PM
abrachet marked an inline comment as done.

Addressed review comments

abrachet marked 2 inline comments as done.Aug 20 2019, 6:15 PM
abrachet added inline comments.
llvm/include/llvm/Object/MutableELFObject.h
497

Good catch! Thanks