This flag is currently missing in llvm-nm, so I'd like to add it for compatibility with existing build scripts.
|1 ↗||(On Diff #194471)|
This is a test for testing llvm-nm's ability to handle no symbols in the output. You shouldn't test --special-syms here unless it is somehow special for no symbols. You should add it as a new test separately.
|5 ↗||(On Diff #194471)|
"error" -> "an error."
|168 ↗||(On Diff #194471)|
What does --special-syms do in GNU nm?
Okay, thanks. llvm-nm shows them by default, I take it?
I wonder if this is something we should change for compatibility?
I don't know. May be. I am personally fine with current behavior of llvm-nm - whenever I don't need special syms I just grep them out.
In general, we should try to emulate GNU nm behaviour (that was the conclusion from the recent LLVM conference). However, I'm slightly concerned about changing behaviour breaking users of the LLVM tools. It might be worth pinging llvm-dev to see if there's anybody who would be unhappy with changing the default at this point, or even just to let people know that this will happen, unless somebody objects. Personally, I have no objections to it, but I don't have any use for llvm-nm and ARM currently.
Also, the test still needs fixing, even if we decide to not change the default behaviour.
Just to see if I understand the comments so far: --special-syms is essentially already the default for llvm-nm (i.e. we print special symbols), but not for GNU nm. And so even though we print special symbols, a configure script that checks if $NM --special-syms is supported will fail when it shouldn't.
If so, then LGTM -- this patch makes things less worse in all regards. If it's not the case, e.g. llvm-nm doesn't print special symbols but we want llvm-nm --special-syms to work in order to trick configure scripts, I think this can lead to confusion -- it is much easier to track down failures if the flag is not at all supported than if it is silently ignored.
There's a counter-example to this though, which I'm concerned about. If we silently accept --special-syms, with it being the default always, there could be configure scripts that don't specify --special-syms that expect to NOT see those symbols (e.g. they're looking for all "normal" symbols). This in turn could result in a silent breakage.
Those people weren't using llvm-nm apparently so we won't break them. If they were they'd already be broken. I think this strictly increases compatibility. Either way we have to choose who to break. People who might use llvm-nm in the future in place of nm and have written their script in a very specific way such that it needs to have those symbols filtered out or people who are using this flag today and are 100% broken but the flag not existing.
The symlink option above would work as well and ensure that future users had a paved way forward. We seem to be hitting an increasing number of cases where "llvm-gnu-foo" tools might be useful. I think the solution to using the "--strip-all-gnu" issue in llvm-objcopy is exactly this as well.
This patch has my LGTM since I think this will only strictly improve active users. As James points out however this might break future users who switch over but I think on balance that's an unlikely subset.
Okay, thinking about it, I think that adding the switch and doing nothing is probably an improvement, although I have some concerns about some people's expectations if we do provide the switch. I agree that some form of GNU mode is probably needed one way or the other, so this is fine as long as my other comments are addressed, so that users can have a path forward who want to migrate. I'd prefer a tool naming option, similar to what we have done in other tools rather than a switch naming option.
That being said, I don't think that the proposed symlink quite works as it stands (i.e. nm versus llvm-nm). In particular, in both llvm-readelf/readobj and llvm-symbolizer/llvm-addr2line, the name is different such that it is possible for the code to decide the behaviour by just looking for a partial match in its own name (e.g. "addr2line" versus "symbolizer" etc). I have seen examples of systems where the executable name format is essentially fixed (e.g. <platform>-<tool>-<version> where the <tool> part has to exactly match an LLVM executable's name), but who also want drop-in compatibility with GNU with only having to change the executable name they are using in their script. I don't have a clear solution to this though for the tools with GNU names currently, where we want to diverge.
We have three options here:
- Don't change anything. New users who expect the switch just have to remove it from their scripts, and will get a clear error message indicating what the problem is. Those who are not expecting the symbols to appear without the switch will be broken. Existing users of llvm-nm are unaffected.
- Add the switch, do nothing. New users who require the switch's behaviour when enabled are happy. Those who are not expecting the symbols to appear without the switch will be broken. Existing users are unaffected. Clearly this is better than 1).
- Add the switch doing the right thing, and change the default behaviour. New users who require the switch's behaviour when enabled are happy. Those who are not expecting the symbols to appear without the switch will also be happy. Existing users relying on the old default behaviour will be broken. Other existing users will be unaffected.
I don't agree that 2) is clearly better than 3), because I suspect the number of users of llvm-nm who rely on the different behaviour is likely quite small currently, whereas there are certainly many people who are transitioning or want to do so and we have no idea of their requirements. However, I cannot guarantee any of that. I only have the discussion from the BoF, where one point was that if we are going to break anything, do so now, and make sure there is a path forwards to work around the issue (in this case it would be to add the switch). 2) and 3) aren't strictly mutually exclusive (i.e. doing 2) before 3) is probably an option - both are better than 1).
|2 ↗||(On Diff #197268)|
is no-op -> is a no-op
We should probably avoid abbreviations like w/o since they aren't necessary and may be hard to read for some people.
|8 ↗||(On Diff #197268)|
Can you confirm that if you run GNU nm on %t output that it produces/does not produce the symbol as appropriate? I just want to make sure that the symbols have been correctly setup to flag up any behaviour change if there is any.
|21 ↗||(On Diff #197268)|
You can make this test slightly more concise by getting rid of the AddressAlign fields at least, and probably Size too.
|33–34 ↗||(On Diff #197268)|
You can delete Value and Size here and below. I don't think you need to explicitly state the Type either.
|8 ↗||(On Diff #197268)|
Yes, I did. GNU nm doesn't print anything unless you provide --special-syms when using yaml2obj output from the test case attached. With --special-syms output exactly matches one from llvm-nm