lit can do substitution in an argument place (like llvm-ar in ln -s llvm-ar) which
is counterintuitive. Replace the symlink mechanism with the more intuitive
EXE=$(command -v) (POSIX shell comformant).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
I believe if we do this without removing the substitutions, we're effectively running EXE=$(command -v /full/path/to/llvm-ranlib) which is a bit weird and IMO doesn't make sense as a final state.
I haven't done much investigation into the history/motivation of these substitutions, but I imagine we can remove at least some of them, and I'm happy to help out.
How do you think we should sequence this?
Agreed that the lit feature is weird and should be removed. AFAIK I don't think the feature is used in other places. We should be able to drop the feature.
How do you think we should sequence this?
Apologies that as a non-native speaker I don't know what "sequence" means in the sentence. If you mean "dropping the feature from lit", are you signing off for the work? :) I'd be happy to review (I don't know enough about lit to locate the feature...)
What you do mean by "in an argument place"? Does llvm-ar appear in an "argument place" in not --crash llvm-ar or env -i LANG=C llvm-ar?
Sorry to be unclear, I was on my phone and being lazy!
I mean do we want to e.g.:
- land patches removing particular substitutions and their uses, and eventually remove the feature if they can all be eliminated
- determine whether removing the feature is feasible, then land patches like this one, then remove the feature
- land patches like this one, then determine whether removing the feature is feasible, then remove the feature (I'm afraid this way will get stuck in a bad state)
But to back up a bit, I'm a bit worried about a chesterton's fence situation - why are these substitutions there. Unfortunately r315085 deleted the explanation for these substitutions, which read:
# For each occurrence of an llvm tool name as its own word, replace it # with the full path to the build directory holding that tool. This # ensures that we are testing the tools just built and not some random # tools that might happen to be in the user's PATH. Thus this list # includes every tool placed in $(LLVM_OBJ_ROOT)/$(BuildMode)/bin # (llvm_tools_dir in lit parlance).
This was originally added to resolve https://bugs.llvm.org/show_bug.cgi?id=8199 in https://github.com/llvm/llvm-project/commit/dc276c315cec9e33df8e0e171b686f79143f651d and is still described in the testing guide.
I'm not sure the causes of that bug are gone, so we might just be reintroducing it. It's easy for me to say "it's better if the path is set correctly, and possibly hermetically" but I'm not confident I have a good enough handle on the various platforms and build modes to know how to get there...
No particular opinion on this change itself, and I might be misunderstanding some of hte comments, but I wouldn't want to see the feature of expanding tool names to full paths to be dropped, at least for some cases. It is useful to me when I'm debugging a test as I can copy the tool path straight from the lit output to the command I wish to execute to see what actually is happening/tweak stuff/make sure I'm running what I think I'm running etc.
The expansion to full paths will be retained at the command position but not argument positions. This is to emulate a shell. The feature is very rarely used: AFAIK only these tool-name.test need the feature. There may be a couple of others but we can probably migrate them.
(I haven't investigated enough about lit to understand the previous comment https://reviews.llvm.org/D83578#2145139 )
not --crash llvm-ar or env -i LANG=C llvm-ar
llvm-ar are in comment positions. It seems that these rely on the bin directory being first in PATH.
Fix llvm-objdump
I think this is still better than the current replacement at non argv[0] place.
This does add more reliance on PATH (which is already being relied upon), but if this is deemed brittle, we can let lit recognize command -v tool as a special form.
lit can do substitution in an argument place (like llvm-ar in ln -s llvm-ar) which is counterintuitive.
Why is it counterintuitive? To me, it is most intuitive if references to an LLVM executable are always to the version in the build directory. By substituting the full path, it is always possible to see which version of a tool is actually being referenced, which is important for test debugging, if nothing else.
FWIW, I find the patch makes the testd less readable. Admittedly, I'm not a massive Linux user, but I had to look up the meaning of command -v online, and it's quite likely I'll forget it again later. It's also not clear to future test writers why one might need to use this pattern.
The utilities and the shell are all of Unix style. It is well established that argv[0] is magic and the shell/OS can find it from PATH. argv[i] (i!=0) is not and an argument such as`ls`, ln, llvm, and objdump in a non argv[0] place does not do anything special.
For example, echo ls prints ls, not the absolute path to ls. I think Windows cmd.exe behaves this way as well. Some three-letter utilities are already widely used, and the replacement at non-argv[0] place can be quite confusing echo llc => /path/to/llc.
There is a little mental gap for readers who haven't used command -v before, but this probably still looks better than ln -s llvm-ranlib .... which gives a false impression that this command is incorrect.
We could also add a replacement and use ln -s %llvm-objdump ... but all uses except the few tool-name.test tests use the raw names. The replacement rules seem a bit overkill (needs some code in lit.*.py files.)
I get what you're saying, but that just isn't how I think about the test, if I'm honest. When I read it, I see a reference to an executable, and assume that this is the executable under test, rather than just an arbitrary string that happened to match a tool name. I could see changing the behaviour to not do the substitution to have the potential to cause confusion.
There is a little mental gap for readers who haven't used command -v before, but this probably still looks better than ln -s llvm-ranlib .... which gives a false impression that this command is incorrect.
It doesn't give a false impression to me, but maybe I'm an exception rather than the rule?
We could also add a replacement and use ln -s %llvm-objdump ... but all uses except the few tool-name.test tests use the raw names. The replacement rules seem a bit overkill (needs some code in lit.*.py files.)
Agreed.
I don't view the problem in the same way as you do (in fact, I don't view it as a problem at all), and as said before, I find the new test harder to follow than the old test. That being said, if multiple others view it differently to me, don't let me stop this proceeding.
FWIW I'm pretty ambivalent either way on this.
But this patch says it replaces the feature - which I don't think it does, since it doesn't remove the functionality from lit (& not sure if replaces all uses of the feature). I think if it's going to be removed, the functionality should be removed or we'll get more of it & more confusion about which way things should be done.
It replaces a usage, not the feature. The usage would be the blocker if we remove the functionality from lit.
Right - what I mean is I don't think it's super valuable to remove the usage without also removing the functionality (& part of that - knowing we've removed all usage of the feature)
This likely removes all the usage. I hoped someone more familiar with lit can remove the functionality.
I think avoiding using a deprecated feature justifies a patch, not necessarily removing the functionality in the same patch.
However, from the patch I see some people may disagree that this is a deprecated feature.
I may have to agree to disagree.
Generally deprecation is used when there's a lot of use we can't practically clean up in the short term - if all the uses are removed I think it's appropriate to remove the functionality. Otherwise I think it'll be pretty easy for new uses to creep in.
& yeah, if it's only this one point of confusion - I guess it's a bit hard to say if this is an improvement. (on the one hand if this is all the usage, it's not a lot of usage so people haven't come across this much - but on the other hand folks don't seem to have had trouble with it so far)