Page MenuHomePhabricator

[llvm-libtool-darwin] Print a warning if object file names are repeated
ClosedPublic

Authored by Roger on Nov 3 2021, 11:03 AM.

Details

Summary

Print a warning if llvm-libtool-darwin if any of the object
files provided by the user have the same file name.

The tool will now print a warning if there is a name collision across:

  • Two object files
  • An object file and an object file from within a static library
  • Two object files from different static libraries

Here is an example of the error:

$ llvm-libtool-darwin -static -o archive.a out.o out.o
error: file 'out.o' was specified multiple times.
in: out.o
in: out.o

$ llvm-libtool-darwin -static -o archive.a out.o
$ llvm-libtool-darwin -static -o combined.a archive.a out.o
error: file 'out.o' was specified multiple times.
in: archive.a
in: out.o

This change mimics apple's cctools libtool's behavior which always shows a warning in such cases.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
alexander-shaposhnikov requested changes to this revision.Nov 6 2021, 6:32 PM

Before reviewing the details I'd like to better understand the rationale for the proposed changes (especially given the discrepancy with cctools etc), it'd be good to include it into the summary of the diff.

This revision now requires changes to proceed.Nov 6 2021, 6:32 PM
Roger updated this revision to Diff 385364.Nov 7 2021, 12:33 PM

Change this feature from throwing an error to printing a warning.

Roger updated this revision to Diff 385369.Nov 7 2021, 1:36 PM
Roger marked 2 inline comments as done.

Fix and address comments in tests.

Roger updated this revision to Diff 385371.Nov 7 2021, 1:39 PM
Roger marked 3 inline comments as done.

Fix comment.

llvm/test/tools/llvm-libtool-darwin/create-static-lib.test
54–55

I am just removing that line since I'm changing this behavior to print an error.

Roger planned changes to this revision.Nov 7 2021, 1:39 PM

This change was motivated by an issue observed internally where llvm-libtool silently accepting two input files with the same basename and putting them into the same archive led to a crash. This was ultimately down to an ODR violation on our end, but we noticed that cctools' libtool gives a warning in the same scenario, which at least gives the user some hint of what might be going on.

We originally thought of an emitting an error instead of a warning because we couldn't think of a good reason you'd want two files with the same basename in the same archive (it also results in confusing linker diagnostics, ar extractions, etc.), and because warnings are super easy to miss in build output. However, there might be cases where this situation arises naturally (especially with third-party archives out of your control), so matching cctools behavior and emitting a warning instead of an error does make more sense (and we could always add an option to upgrade the warning to an error if we desired).

Roger updated this revision to Diff 385559.Nov 8 2021, 10:35 AM

Fix test

Roger planned changes to this revision.Nov 8 2021, 11:15 AM

This change was motivated by an issue observed internally where llvm-libtool silently accepting two input files with the same basename and putting them into the same archive led to a crash. This was ultimately down to an ODR violation on our end, but we noticed that cctools' libtool gives a warning in the same scenario, which at least gives the user some hint of what might be going on.

We originally thought of an emitting an error instead of a warning because we couldn't think of a good reason you'd want two files with the same basename in the same archive (it also results in confusing linker diagnostics, ar extractions, etc.), and because warnings are super easy to miss in build output. However, there might be cases where this situation arises naturally (especially with third-party archives out of your control), so matching cctools behavior and emitting a warning instead of an error does make more sense (and we could always add an option to upgrade the warning to an error if we desired).

Thanks fo the explanation, it makes sense to me.

Roger updated this revision to Diff 385913.Nov 9 2021, 11:42 AM

Rebase to new parent

Roger planned changes to this revision.Nov 9 2021, 12:06 PM
Roger planned changes to this revision.Nov 9 2021, 12:29 PM
Roger updated this revision to Diff 385942.Nov 9 2021, 12:45 PM

Stop manually passing FileName around.

Roger planned changes to this revision.Nov 9 2021, 12:46 PM
Roger retitled this revision from [llvm-libtool-darwin] Throw an error if object file names are repeated to [llvm-libtool-darwin] Print a warning if object file names are repeated.
Roger updated this revision to Diff 385981.Nov 9 2021, 2:40 PM

Update asserts

Roger planned changes to this revision.Nov 9 2021, 2:41 PM
Roger updated this revision to Diff 385984.Nov 9 2021, 2:44 PM

Clang format

Roger planned changes to this revision.Nov 9 2021, 2:45 PM
Roger updated this revision to Diff 386301.Nov 10 2021, 1:30 PM

Small changes

Roger planned changes to this revision.Nov 10 2021, 1:39 PM
Roger edited the summary of this revision. (Show Details)Nov 10 2021, 2:53 PM
Roger edited the summary of this revision. (Show Details)
Roger updated this revision to Diff 386328.Nov 10 2021, 2:56 PM
Roger marked 9 inline comments as done.

Respond to comments

Roger updated this revision to Diff 386329.Nov 10 2021, 2:59 PM

Addressing comments

Roger updated this revision to Diff 386332.Nov 10 2021, 3:05 PM

Formatting

Roger added inline comments.Nov 10 2021, 3:47 PM
llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
540–542

Done.

543

You're right I shouldn't use an ordered map. I didn't know about StringMap (am very new to the LLVM codebase). I'll use it :)

Roger added a comment.Nov 10 2021, 3:54 PM

I'm not an Apple/Mach-O developer, and don't really know much, if anything, about the platform or the file format, so take my thoughts with a pinch of salt. However, I wanted to question this:

  1. For traditional Unix static libraries, the filenames are largely irrelevant - you can have multiple library members without any issue and it'll just work at link time. This is because such library members are picked based on their symbol names, rather than anything to do with their filenames.
  2. If the existing cctools libtool reports a warning rather than an error, why not do the same here? It seems like reporting an error might cause problems where existing users are creating such a library for good reason. NB: if there is no possible good reason, i.e. the library is fundamentally broken, then an error makes sense - see my above point about not being too familiar with the format.

I changed the behavior to output a warning rather than throw an error for the reasons provided above

Some nits, but I haven't done a detailed review, and am leaving that to those actually invested in this tool.

llvm/test/tools/llvm-libtool-darwin/archive-flattening.test
69

Missing warning: prefix.

llvm/test/tools/llvm-libtool-darwin/create-static-lib.test
66
llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
559–561

Nit: delete braces for single-line loop.

Roger updated this revision to Diff 386589.Nov 11 2021, 10:49 AM
Roger marked 3 inline comments as done.

Address comments

llvm/test/tools/llvm-libtool-darwin/archive-flattening.test
69

This stanza is all part of one warning message so only the first line will have the prefix.

smeenai accepted this revision.Dec 1 2021, 2:33 PM

LGTM, thanks!

Roger updated this revision to Diff 399078.Jan 11 2022, 2:20 PM

Insert missing period from warning.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 11 2022, 2:50 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jan 11 2022, 7:20 PM

The test fails on Windows: http://45.33.8.238/win/52360/step_11.txt

Please take a look, and revert for now if it takes a while to fix.

Thanks for the heads up. We'll need to account for forward vs. back slashes in the test. I reverted for now and will recommit with a fix.

The Windows test failure here is pretty annoying :(

For the test where we specify one file via -l and the other via its path, the output on Windows looks like

warning: file 'L-and-l.test.tmp-input1.o' was specified multiple times.
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy/L-and-l.test.tmp-input1.o
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy\L-and-l.test.tmp-input1.o

Note that the last slash is a forward slash in one path and a backslash in the other. The forward slash is coming from our path specification in the test, and the backslash is coming from sys::path::append in libtool (to concatenate -L directories and -l names).

One way to fix this would be to introduce a new %{sep} substitution in lit (similar to the existing %{pathsep}), so we could use the appropriate slashes for the platform in our test automatically. Another would be to use sys::path::Style::windows_slash in our sys::path::append call, but that could break with \\?\ paths, which can't use forward slashes.

@mstorsjo, I believe you have a bunch of experience dealing with slash differences on Windows. Any suggestions on the best way to proceed here?

For the test where we specify one file via -l and the other via its path, the output on Windows looks like

warning: file 'L-and-l.test.tmp-input1.o' was specified multiple times.
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy/L-and-l.test.tmp-input1.o
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy\L-and-l.test.tmp-input1.o

Note that the last slash is a forward slash in one path and a backslash in the other. The forward slash is coming from our path specification in the test, and the backslash is coming from sys::path::append in libtool (to concatenate -L directories and -l names).

One way to fix this would be to introduce a new %{sep} substitution in lit (similar to the existing %{pathsep}), so we could use the appropriate slashes for the platform in our test automatically. Another would be to use sys::path::Style::windows_slash in our sys::path::append call, but that could break with \\?\ paths, which can't use forward slashes.

@mstorsjo, I believe you have a bunch of experience dealing with slash differences on Windows. Any suggestions on the best way to proceed here?

Hmm... Overall I'm not a huge fan of tests like FileCheck -DSOMETHING=%t/suffix or similar, because as we've noticed, the FileCheck literal checks are quite brittle for such things - especially as we can configure which kind of separator that sys::path::append uses too. Although I guess we could mirror that into the lit subtsitutions too. (Also, for such cases, we could maybe make other lit substitutions, like %t, use the same path separator as is configured to be the default for sys::path - that would help fix a bunch of test cases that still are broken with forward slash on Windows.) But it's quite unintuitive to write the match patterns for such cases, to try to mimic exactly what combination of slashes you end up with in each case (when considering people writing tests on unix without access to testing thigs on Windows.)

I don't think considering using sys::path::Style::windows_slash explicitly is the right way to go either. I don't think we should complicate the code to do something non-obvious just for the sake of making the tools produce the path that the test framework expects.

For cases within code itself, I've considered adding a sys::path::equals that accepts either slash direction, and which would be case insensitive on windows (and darwin?). I wonder how hard it would be to add something similar to FileCheck. I don't offhand know exactly how it would work... But I think something like that would be good for easily writing testcases on unix, without spending lots of unnecessary fiddling on making them pass on Windows.

For the test where we specify one file via -l and the other via its path, the output on Windows looks like

warning: file 'L-and-l.test.tmp-input1.o' was specified multiple times.
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy/L-and-l.test.tmp-input1.o
in: C:\Users\smeenai\llvm-project\build\Release\test\tools\llvm-libtool-darwin\Output\L-and-l.test.tmp/copy\L-and-l.test.tmp-input1.o

Note that the last slash is a forward slash in one path and a backslash in the other. The forward slash is coming from our path specification in the test, and the backslash is coming from sys::path::append in libtool (to concatenate -L directories and -l names).

One way to fix this would be to introduce a new %{sep} substitution in lit (similar to the existing %{pathsep}), so we could use the appropriate slashes for the platform in our test automatically. Another would be to use sys::path::Style::windows_slash in our sys::path::append call, but that could break with \\?\ paths, which can't use forward slashes.

@mstorsjo, I believe you have a bunch of experience dealing with slash differences on Windows. Any suggestions on the best way to proceed here?

Hmm... Overall I'm not a huge fan of tests like FileCheck -DSOMETHING=%t/suffix or similar, because as we've noticed, the FileCheck literal checks are quite brittle for such things - especially as we can configure which kind of separator that sys::path::append uses too. Although I guess we could mirror that into the lit subtsitutions too. (Also, for such cases, we could maybe make other lit substitutions, like %t, use the same path separator as is configured to be the default for sys::path - that would help fix a bunch of test cases that still are broken with forward slash on Windows.) But it's quite unintuitive to write the match patterns for such cases, to try to mimic exactly what combination of slashes you end up with in each case (when considering people writing tests on unix without access to testing thigs on Windows.)

I don't think considering using sys::path::Style::windows_slash explicitly is the right way to go either. I don't think we should complicate the code to do something non-obvious just for the sake of making the tools produce the path that the test framework expects.

For cases within code itself, I've considered adding a sys::path::equals that accepts either slash direction, and which would be case insensitive on windows (and darwin?). I wonder how hard it would be to add something similar to FileCheck. I don't offhand know exactly how it would work... But I think something like that would be good for easily writing testcases on unix, without spending lots of unnecessary fiddling on making them pass on Windows.

Thanks for the suggestions!

I'm pretty short on time right now, so I just loosened the test conditionals to make it pass. I agree that a slash-agnostic path comparison would be super useful for FileCheck. I imagine you could implement it via regex (which FileCheck already has support for), but I've never looked into FileCheck's code much to say for sure.

The required FileCheck regex pattern would be {{\\/}} or something to that effect, I believe. Could we add a lit substitution, similar to %\t or %/t, which is specifically for use within FileCheck patterns, so uses the regex pattern, maybe?