This initial patch adds a MutableELFObject class for doing mutations on sections of an ELFObjectFile.
Details
Diff Detail
Event Timeline
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
160 | Probably this should be handleError(MutSecOrErr.takeError()); |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
118 | Yes it is used and thus tested implicitly. How would I go about testing the private interface? I could make it protected and then inherit from it in the unit test, but that doesn't sound great. FWIW this is almost identical to the getHeader method in ELFFile. https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Object/ELF.h#L108 | |
122 | Yes. It is used for every public method on SectionRef that gets tested in the unit test. Is this enough or should I test it directly as well? | |
144 | I did an explicit copy /I think/ in the unit test now. I have an ELFObjectFile * and then deference it and assign to an ObjectFile &. Does this make a copy or does the compiler omit it and just assign the address? | |
169 | This was referring to moveSectionNext() before. It does get used, it is an overload of the function that content_iterator::operator ++() calls. It must be overriden because ELFObjectFile::moveSectionNext() iterates over sections not with their index but their mapped address. This uses index into MutableTable. | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
44 | I don't think so. Then both will be MutableELFObject's and the same virtual methods will be called for each, I believe. I think testing the methods on an ELFObjectFile and MutableELFObject test that overloads for MutableELFObject behave the same as the methods in ELFObjectFile when the sections haven't been changed. | |
79 | Thanks! | |
115 | I have ASSERT_EQ(NumSections, 7); it ended up being 7 sections because of additional sections that yaml2obj added. Is this what you meant? | |
193 | I don't actually need the L it turns out. Without making it unsigned I get this warning comparison of integers of different signs on clang. |
I added getMutableSection(SectionRef) and changed getMutableSection(section_iterator Sec) to just call that with a *Sec. Maybe I should remove the section_iterator overload though?
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
144 | Kind of the whole point of references is that they allow you to "refer" to other objects without making copies of them. If you want to make a copy, drop the & and assign it to an ObjectFile (or ELFObjectFile, or something) |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
144 | Oh yeah. What am I talking about! |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
21 | Comment? | |
44 | How about a comment for MutableELFObject? | |
87 | "regardless if it" sounds off to me. How about "regardless of whether it" | |
118 | Sorry, missed that getHeader was private. Being able to modify the header contents is a required thing however, so you might want to consider how this could be done (possibly a different change though). | |
122 | I have no idea which method this was referring to any more... | |
141 | overrie -> override | |
142 | wether -> whether | |
169 | Sorry, I missed that moveSectionNext() is an override. | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
55 | Rather than reusing the same variable, create a new variable here. Then don't bother with the ObjFile variable above. It doesn't give us anything. | |
89 | Don't use auto here. | |
115 | Yes, that's right. | |
142 | EXPECT_EQ -> ASSERT_EQ to avoid dereferencing an invalid iterator later on. I think I either misread the code last time, as I don't know why I wanted this check here. However, you do show that the std::distance from section_begin to section_end is still correct after you called getMutableSection, but not until after you've done the iteration, meaning that your iterator could be invalid part way through your calls to compareSectionName. Move the later std::distance check up to here instead. | |
193 | Sounds like clang isn't doing a good enough job here - it's a spurious warning, because 2 is a positive integer literal so can clearly be handled properly in this comparison. Check elsewhere, but I thought it was more common in the code base to use lower-case for literal suffixes when they're needed. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
122 | Oops I think I responded to this before adding the comments so I forgot to add a what the comment was originally referring to. It was getSection(DataRefImpl). Its the function that I made virtual from ELFObjectFile. All the tests that I have now like FirstSec->getSize() test this method |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
21 | Not really sure what to say, is that enough you think? | |
44 | Same as above, not quite sure what to say. | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
55 | I think that it does give something, The ObjectFiles are made from the same yaml so I'm testing that the public methods return the same values. Its a whole thing because ObjectFile has no copy constructor. Also we don't want to use the ObjectFile after moving it. I also still think that doing something like this MutableELFObject<ELF64LE> MutableObject(std::move(*MutELFObjFile)); ELFObjectFile<ELF64LE> *ElfObject = &MutableObject; // test the interface will never produce any differences so it doesn't test that MutableELFObject has the same behavior as ELFObjectFile on an unmodified object file. | |
142 | Is this what you mean? And then remove the one on 156? Sorry wasn't quite sure. | |
193 | FWIW I think there is there's a function template instantiation somewhere in these macros and so its doing unsigned_expr == signed_expr not unsigned_expr == 2. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
21 | Why would you use this class? What is the overall aim here? It probably wants to say something like it is used by MutableELFObject to provide a mutable version of an ELFObjectFile Section or something to that effect. | |
44 | I think this comment needs to explain why this class exists. Essentially it should say why we can't use ELFObjectFile directly for our purposes. | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
55 | I think you misunderstood. Having MutELFObjFile is fine, as is constructing two objects from the YAML. ObjFile doesn't need to exist however. You can use ELFObjFile directly now that you're not reusing the local variable for the creation of MutableObject, so the two variables you operate on are ELFObjFile (created directly from YAML) and MutableObject (created indirectly from YAML via MutELFObjFile). | |
142 | Sorry, I obviously wasn't clear enough, and I realised something else whilst writing this comment. At the time of writing, the std::distance at line 127 (checks the section count before mutating), and the one at line 142 (checks it after mutating) are the ones that should exist. The one at line 155 is unnecessary, since you check it at line 142. The one at line 140 is necessary because it shows that the Iter variable is still valid after getting a mutable section. |
Aside from a comment change, this looks good to me. I'd wait for others to approve too though, and I wouldn't necessarily land it even then until some later patches are ready too, although I'm not sure about that.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
46 | I'd change this slightly: ... an ELFObjectFile. The ELFObjectFile class only provides ... |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
24 | You should be able to use the more common: using Elf_Shdr = typename ELFT::Shdr; | |
82–85 | "OriginalValues" is overloaded, so it's not clear which one is being referenced in the constructor A common hacky practice in LLVM is to name the constructor arg "TheOriginalValues" or something -- that's fine to use if you can't come up with any other different name. | |
135–136 | ditto | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
89–90 | Is it expected that an error here should not cause the test to fail? I think you want an assertion that Expect doesn't have an error instead. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
85 | llvm::transform(OriginalValues, ...) | |
114 | I think such Out of bounds assertions are probably not very necessary. They will certainly segfault if the user does have out-of-bounds accesses. | |
117 | Mappings[Index] = MappingType(NewValues.size(), MappingType::New); NewValues.emplace_back(std::forward<Args>(Arguments)...); | |
151 | typo: overridden the section_iterators in MutableELFObject -> MutableELFObject::section_begin ? (Just name it explicitly) | |
185 | for (SectionRef Sec : MutObj.sections()) SectionRef has just 2 words. It is usually passed by value. | |
188 | if (auto MutSecOrErr = MutObj.getMutableSection(Sec)) then; else handleError(...); | |
190 | typo: sh_addralign |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
114 | Won't that entirely depend on how the memory is laid out, compiler-generated checks etc? A segmentation fault hardly provides any useful information, whereas at least the assertion shows what's going wrong in builds with assertions enabled... |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
114 | I think if the user of this library makes an out-of-bounds access, the program will very quickly segfault with a clear stack trace (symbolization can happen with a symbol table, even if debug info is absent) that the problem is related to the use of MutableELFObject. There are lots of ways to detect errors: -D_GLIBCXX_DEBUG / asan / ... We can add some assert() in some tricky places (they also serve as documentation of some invariants) but here I think it is probably not very necessary. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
114 |
Having dealt with many prod crashes in the past, I strongly disagree. C++ optimizations can often make the crash happen far away from the actual bug. I think it's much more common to run tools in debug mode than in asan, so it's useful to have these. |
FYI, you'll have to rebase this whole patch series after rL368826 due to Section::getName() changing signature
I was safe for this patch it seems but not others, thanks for the heads up.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
114 | I haven't removed the assertions in this diff. I think worst case it looks ugly and I agree that they aren't super necessary but they have helped me track bugs down before so I think the benefit outweighs the cost. |
I've been thinking about this, is size_t safe to use? Some methods from inherited classes use uint64_t on my machine its fine because even unsigned long and unsigned long long have the same width and sign but on a 32 bit machine I think that assigning to a size_t (which is guaranteed to be word size, I think) from a uint64_t might produce a warning. I don't think there are any examples in this patch, but this seemed like the best place to ask.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
147 | Actually, here is an example, MutableTable::operator [](size_t) being called with DataRefImpl::p which is uint64_t. |
You should use size_t and consider casting for cases using uint64_t. In a 32-bit build, size_t is usually a 32-bit number (i.e. unsigned int), so theoretically, the compiler could produce warnings for truncation issues when converting from a uint64_t. Since the main thing these indexes are used for is for accessing a vector, the type should match the argument type of std::vector<T>::operator[], which is (usually) a size_t.
@MaskRay In the past I remember you finding usages on certain types (or similar) in the code base. Is there some kind of tool you use that could search every time a function that takes a size_t but is called with a unit64_t? Do you use clang-query for these kinds of things?