Implement -rpath option for install-name-tool via replacing the old
rpath entry with the new one in place.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
---|---|---|
853 | Unlike -delete_rpath, we thought it was necessary to check duplication in case of -rpath. This is because it would not be useful to do something like -rpath A B .... -rpath A C or something like -rpath A B ... -rpath B C. Additionally, this also matches XCode's -rpath behavior. | |
867 | This highlights a warning : Is this fine? Not sure what I should do about it? |
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
---|---|---|
853 | I can remove this check if people feel the need? |
llvm/test/tools/llvm-objcopy/MachO/install-name-tool-rpath.test | ||
---|---|---|
39–40 | Nit: indent the FileCheck line by an additional space, matching e.g. lines 8 and 19. | |
84 | It probably is best to do this after each individual error, rather than once at the end (just to make sure that one doesn't change the binary and the next change it back). |
llvm/test/tools/llvm-objcopy/MachO/install-name-tool-rpath.test | ||
---|---|---|
35 | Use cp rather than llvm-objcopy to copy the file. | |
68 | You probably want to show the behaviour when different values are used too. | |
75 | Ditto. | |
77–81 | How do you say "RPath" - "ah-path" or something else? If "ah-path", (which is how I read it), the "a" before should be "an". Also might be worth having another case here where -rpath has no arguments at all. | |
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
853 | No, it makes reasonable sense to me. I guess if people ware used to XCode's behaviour, it's okay to follow it. | |
854–855 | There's an llvm::find_if which just takes the container. There's an is_contained method too which does std::find + the end check, which might be worth extending to allow it to take a lambda (something like is_contained_if). You might want to do that (in a prerequisite patch), to simplify a lot of this code. | |
860 | adl_end(Config.RPathsToUpdate) -> Config.RPathsToUpdate.end() is more common syntax in LLVM. | |
867 | LLVM is built with C++14 so this will fail the build on most build bots. I'm actually struggling to follow the logic of this whole section, if I'm honest! If I follow it right, I think you could simplify it to something like below: if (is_contained(Config.RPathsToRemove), OldRPath) return creatStringError(...); if (is_contained(Config.RPathsToRemove), NewRPath) return creatStringError(...); if (is_contained(Config.RPathsToAdd), OldRPath) return creatStringError(...); if (is_contained(Config.RPathsToAdd), NewRPath) return creatStringError(...); Alternatively, with an is_contained_if you could fold it down to just two cases, one for each of the two existing Config variables. | |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
124–125 | These two lines can be folded together I believe using std::back_insertor in the std::copy line, right? | |
273 | NULL -> nullptr Same below. |
llvm/test/tools/llvm-objcopy/MachO/install-name-tool-rpath.test | ||
---|---|---|
68 | With different values it works fine - added the test now, thanks! | |
77–81 | yup, it should be "an", thanks 😄(added the test too) | |
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
854–855 | oh thanks, llvm::find_if helps. But, I don't want is_contained because it just returns true or false. I also need access to the element itself in case true is returned (error msg use the element). Therefore, I am explicitly checking whether It != RPaths.end() and using the iterator to access the element if the above condition is true. If I use is_contained, I would have to again iterate through the vector to find the element, right? | |
867 | ok, i got rid of the lambda. No warnings are thrown (the code is little cleaner too perhaps). For the pattern, if (is_contained(Config.RPathsToRemove), OldRPath) return creatStringError(...); the issue is again that I need access to the element (if is_contained is true) to construct the error msg which means iterating the vector again. Instead I am following this pattern: It = find_if(Config.RPathsToRemove, StringChecker) if (It != Config.RPathsToRemove.end()) return createStringError(..<use It here>..); It = find_if(Config.RPathsToAdd, StringChecker) if (It != Config.RPathsToAdd.end()) return createStringError(..<use It here>..); Is this fine? | |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
124–125 | So I tried folding the 2 lines into std::copy(NewName.begin(), NewName.end(), std::back_insertor(LC->Payload)); but it fails the rpath test giving the following error: truncated or malformed object (load command 1 extends past end of file). Surprisingly, even LC->Payload.assign(NewName.begin(), NewName.end()); didn't work (giving the same error as above). However, the initial approach seems to work lol, not sure what's the difference in behavior from these new approaches? |
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
---|---|---|
853–855 | Move this down to where it's used for the first time. | |
854–855 | Yeah, that's right. I didn't realise you wanted to use the element, so find_if looks good. | |
867 | Thanks. I find the new code much easier to follow. | |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
124–125 | That error message doesn't really give me enough info to understand what's going on. Did you perhaps write one too many bytes somewhere/not write the null byte? Regarding the copy, you'd also need to clear beforehand, to avoid it just appending the data to the end, so actually assign is probably want you want anyway, but I believe you may need to push a null byte to the end too, i.e. LC->Payload.assign(NewName.begin(), NewName.end()); LC->Payload.push_back('\0'); |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
124–125 | Oh I see now why it wasn't working - I didn't align the Payload to a multiple of 8. NewCmd has been aligned to a multiple of 8 as uint32_t NewCmdsize = alignTo(sizeof(T) + NewName.size() + 1, 8); So using the following approach LC->Payload.assign(NewName.begin(), NewName.end()); LC->Payload.push_back('\0'); still won't be sufficient as we need to align it to multiple of 8. (Something like pushing '\0' for NewCmdSize - sizeof(T) - NewName.size() times) I couldn't find a cleaner way of doing it beside the current implementation itself, what do you think? |
LGTM, but I would like @smeenai or @alexshap (or someone else with Mach-O knowledge) to be happy with the logic too.
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
---|---|---|
852 |
auto Match = [=](StringRef RPath) { return RPath == Old || RPath == New; }; auto It1 = find_if(Config.RPathsToUpdate, [](const std::pair<StringRef, StringRef> & OldNew) { return Match(OldNew.first) || Match(OldNew.second); }); | |
888 | Config.RPathsToUpdate.emplace_back(Old, New); | |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
120 | Different load commands have very different payloads, so i think this function is a bit misleading. static void updateRPathLoadCommand(LoadCommand &LC, StringRef NewRPath) { assert( .... ); // make sure that LC is indeed LC_RPATH; ... } | |
269 | StringRef Old, New; for (std::tie(Old, New) : Config.RPathsToUpdate) { ... } | |
276 | If I'm not mistaken there are multiple places where this construction is used, so i think we need to factor out it into a helper function. StringRef getRPath(const LoadCommand &LC) { assert(LC.MachOLoadCommand.load_command_data.cmd == MachO::LC_RPATH && "LC_RPATH is expected"); return StringRef(reinterpret_cast<char *>(LC.Payload.data()), LC.Payload.size()).rtrim('\0'); } Plus I would slightly reorganize the code (see below): StringRef Old, New; for (std::tie(Old, New) : Config.RPathsToUpdate) { auto It = find_if(Obj.LoadCommands, [](const LoadCommand &LC) { return LC.MachOLoadCommand.load_command_data.cmd == MachO::LC_RPATH && getRPath(LC) == Old); }); if (It == Obj.LoadCommands.end()) ... if (getRPath(*It) == New) ... updateRPathLoadCommand(*It, New); } |
llvm/tools/llvm-objcopy/CopyConfig.cpp | ||
---|---|---|
852 | nice, thanks! 😊 | |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
120 | So it turns out the way to update LC name follows the same logic (as above) for all the options -rpath, -id, and -change for llvm-install-name-tool. Here is how I was thinking about using it : -rpath : updateLoadCommandName<MachO::rpath_command>(&LC, NewRPath);, -id : updateLoadCommandName<MachO::dylib_command>(&LC, NewID); -change : updateLoadCommandName<MachO::dylib_command>(&LC, NewInstallName); This allowed me to refactor the common logic of updating the name payload of any load command (instead of making it specific for rpath). Would this not be ok? | |
269 | for (std::tie(Old, New) : Config.RPathsToUpdate) is perhaps not allowed : it says for range declaration must declare a variable. I have updated it to StringRef OldRPath, NewRPath; for (auto &RPair : Config.RPathsToUpdate) { std::tie(OldRPath, NewRPath) = RPair; ... } I hope this works? | |
276 | Ok, I refactored getting the RPath construction into a separate function and updated it to be used by add_rpath, delete_rpath and rpath options. For the reorganization, we need to check that New doesn't exists for any of the LC_RPATH commands. Please correct me in case I am wrong, but it seems that the above suggested code finds an LC_RPATH that matches Old. However, above, we throw an error only when either we can't find such a command (It == Obj.LoadCommands.end()) or if the command's rpath equals the New value(getRPath(*It) == New). But we need to throw the latter error in case any command's value equals New. Here is what I have done now: StringRef Old, New; for (auto &RPair : Config.RPathsToUpdate) { std::tie(Old, New) = RPair; auto Match = [&Obj] (StringRef RPath) { return find_if(Obj.LoadCommands, [=](const LoadCommand &LC) { return LC.MachOLoadCommand.load_command_data.cmd == MachO::LC_RPATH && getRPath(LC) == RPath; }); }; auto NewIt = Match(New); if (NewIt != Obj.LoadCommands.end()) return Error auto OldIt = Match(Old); if (OldIt == Obj.LoadCommands.end()) return Error updateLoadCommandName<MachO::rpath_command>(&(*OldIt), New); } Would this work? Thanks. |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
25 | nit: static P.S. but, please, take a look at my comment below (line 278), let's try to understand the broader picture. | |
120 | alright, let's give it a try. A few nits:
so I'd either use something like updateLoadCommand or updateLoadCommandPayload.
| |
273 | I'm not sure that Match is the best possible name here (it's more like findLoadCommand, but this requires more thinking), |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
25 | ok thanks 😊 I have refactored this too we'll be needing it for -change option as well. I have renamed it to getLoadCommandPayload. Is this fine? | |
273 | So it seems that this should not be generalized as -id and -change would not be making use of it. This is because:
here is what we are doing here in case of -rpath: for (RPair : Config.RPathsToUpdate) {. // iterate over provided (old,new) pairs ... find_if( Obj.LoadCommands, ...) // iterate over all LCs to find the right LC ... However, the iteration would be reversed in case of -change option as follows: for (LoadCommand &LC : Obj.LoadCommands) {. // iterate over all LCs first ... for (auto InstallNamePair : Config.InstallNamesToUpdate) // for each LC now, find the (old,new) pair ... You see how the order of for loops is reversed. This reversed iteration is necessary in case of -change option to match the behavior of XCode's install_name_tool. And because of this, we wouldn't be able to use the Match function above (as that requires taking in a name argument and iterating over LCs after). In light of this I have currently renamed Match to FindRPathLC (as it is rpath specific). What do you think? Also. in case we do end up changing -change option's behavior in a later diff, I can then perhaps refactor the common code out later? (For now though, it doesn't seems to be needing refactoring.) |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
25 | I haven't added an assert for LC_REEXPORT_DYLIB , LC_LOAD_UPWARD_DYLIB or LC_LAZY_LOAD_DYLIB as llvm-objcopy doesn't support them yet. I can add a TODO comment here if needed? |
@alexshap Please also take a brief look at how updateLoadCommandPayload function is being used again for -id option here: https://reviews.llvm.org/D82410. What do you think?
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
25 | TODO comments for missing functionality is usually reasonable, though I wonder whether we should just have an unsupported error somewhere for such cases. | |
121–129 | Since this assert is common between the two functions, it might be worth pulling it into a simple helper. Something like "isLoadCommandWithPayload" and just call it instead: assert(isLoadCommandWithPayload(LC && "unsupported load command encountered"); |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
25 | so there is currently an unsupported error thrown by MachO/MachOLayoutBuilder.cpp for such cases: ... case MachO::LC_SOURCE_VERSION: // Nothing to update. break; default: // Abort if it's unsupported in order to prevent corrupting the object. return createStringError(llvm::errc::not_supported, "unsupported load command (cmd=0x%x)", cmd); } ... But yeah I can add a TODO above as well. Thanks! | |
121–129 | nice, thanks 😊 |
Here is the diff for -change options as well: https://reviews.llvm.org/D82613. Please take a look at how I am using updateLoadCommandPayload for -id and -change. Thanks!
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
35 | In release mode warning: unused function 'isLoadCommandWithPayloadString'. Maybe you want to guard the function definition? #ifndef NDEBUG static bool isLoadCommandWithPayloadString(const LoadCommand &LC) { ... } #endif |
llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp | ||
---|---|---|
35 | Thanks for catching this! I was running my tests with asserts on. I have pushed a fix for the issue here: https://reviews.llvm.org/D82768 Please review. |
Use cp rather than llvm-objcopy to copy the file.