When processing linker scripts or dependent libraries, if there's a
directory of the same name as the library being searched for, either in
the current directory or earlier in the search order, LLD will try to
open it and report an error. This is because LLD currently uses file
existence check. While we could expand this check and verify that the
file type is not a directory, such a check could be susceptible to
races. Rather, we always try to open the file, but silently ignore the
error and continue with other cases.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm not yet sure if this is the best solution, but it's one possible solution to an issue we recently discovered.
lgtm
lld/test/ELF/deplibs-dir.s | ||
---|---|---|
14 | Another good thing to test here is -L%t.dir/notdir/foo when %t.dir/notdir exists as a regular file. With the old bug this will get "Not a directory" (ENOTDIR, which should also be treated the same as ENOENT). Ideally, I think we'd also test that a regular file that cannot be opened fails as expected rather than being skipped. That would need two matching files in the search path to be sure it doesn't go on and open the file that works. e.g., make the first match in the path be a file without read permissions. It might be hard to reproduce an existing but unopenable file across all operating systems. | |
18 | Matching the exact perror text might not be reliable across operating systems, but for a negative match that might be tolerable. Also of potential concern is if messages could ever get translated vs running under LC_ALL=C. |
ignoreError can renamed to something like skipDirectory.
Can you check whether the error_code is an EISDIR?
The code move to Driver.h feels a bit uneasy, hope someone may suggest whether there is a better way.
@arichardson @bd1976llvm @ikudrin
The functionality looks good and I am happy to LG if we can't think of a better way.
lld/ELF/Driver.h | ||
---|---|---|
97 | This change should be made separately. | |
107 | return func(name) || findFromSearchPaths(name, func); |
I also considered keeping the functions in DriverUtils.cpp and change the argument to take std::function<bool(StringRef)>. It is less efficient than the template argument, but these functions aren't on a critical path so that's probably not an issue. Would that be preferable?
Function definitions remaining in .cpp looks fine. You may use llvm::function_ref<bool(llvm::StringRef)> instead of std::function (which needs heap allocation).
Currently a deplib name (${name}) indicates either the literal ${name} or the -l style lib${name}.{a,so}.
We have a conflict if ${name} is foo and there is a directory named foo.
What if we do
if (name.endswith(".a") || name.endswith(".so")) findFromSearchPaths else searchLibraryBaseName
?
That should avoid the conflict, too. A directory is unlikely named .so or .a.
I think when the deplib extension was added, I got slightly uneasy with it attempted both findFromSearchPaths and searchLibraryBaseName. And you raised a justifying example which makes users confused...
lld/ELF/InputFiles.cpp | ||
---|---|---|
119 | Remove ENOTDIR. It is for diagnostics when the file type is expected to be a directory. We may need to check whether category() is https://en.cppreference.com/w/cpp/error/system_category | |
lld/ELF/ScriptParser.cpp | ||
319 | The comment says skipDirectory but it is actually used for sys::fs::exists(path). The semantic is a bit different which may suggest skipDirectory is not suitable. |
I don't think introducing more magical rules about naming patterns ever makes things easier.
It is surprising at first blush that deplibs searches for "foo" and not just what "-lfoo" would do, but it makes sense if you want to allow saying "libfoo.a" in deplibs to avoid matching "libfoo.so" in arcane cases.
So I think just the fix here to make all cases of not-existing-files candidates on search paths be ignored robustly is sufficient. The extra set of candidates that deplibs cases will attempt is tolerable and IMHO it's far preferable to having any more magic in the semantics of the behavior than there already is
lld/ELF/InputFiles.cpp | ||
---|---|---|
119 | This is inaccurate. ENOTDIR is also an expected error for certain cases of a nonexistent path, for example foo/bar/baz when foo/bar is an existing non-directory file is expected to be ignored and not stop continuing to search the next path candidate. The only case when open without O_DIRECTORY returns ENOTDIR is a case that should be treated as equivalent to ENOENT for this use. |
It's not adding more magical rules. The chained findFromSearchPaths and searchLibraryBaseName is replaced with testing just one form.
We can then get rid of llvm::function_ref<bool(StringRef)>.
I would rather keep the search scheme as simple and general as possible. The ".a" and ".so" suffixes are just conventions (not part of the ELF standard). My hope was to make the autolinking scheme implementable by any ELF linker.
LGTM to the proposed fix from me. If memory serves, I experimented with the same approach (as proposed here) when autolinking was first implemented; however, I decided to make the code as minimal as possible and see what fell out from real world usage as It wasn't clear to me what problems build systems would be able to workaround vs which would be blockers.
If we want to keep minimum code, we can just swap searchLibraryBaseName and findFromSearchPaths branches.
I'd really liked that the feature started with -lfoo (instead of foo) in the first place if it were augmented with the ability to link an arbitrary absolute/relative path, since that is unambiguous.
Perhaps the ship has sailed now.
Have we reached a decision on this? I'd like to land this change since it's currently blocking a Clang roll for us.
Just swap searchLibraryBaseName and findFromSearchPaths branches. It's the simplest/least intrusive fix.
That alone is not sufficient if the directory with the same name exists in the current directory as is the case in the attached test case, since we'll take this branch https://github.com/llvm/llvm-project/blob/249a21ab188419b019adea21716f5c779c063de3/lld/ELF/InputFiles.cpp#L493 and fail.
OK. Then move searchLibraryBaseName to the top. I think the comment https://reviews.llvm.org/D117933#3267443 is also relevant to decrease the number of filesystem probing.
return func(name) || findFromSearchPaths(name, func);