As per discussion in D69207
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Please add a test to lit's test suite, perhaps something like llvm/utils/lit/tests/Inputs/shtest-shell/stdout-encoding.txt.
Thanks for working on this.
llvm/utils/lit/lit/display.py | ||
---|---|---|
89 | Please add a comment documenting the platform where this problem can be seen. Include the python version. If people try to simplify this subtle passage of code later, perhaps to abandon python 2 support, such comments should prove helpful. |
Hi @jdenny
Agreed on adding a comment and test - the original patch was a bit information-light, sorry about that.
I am really struggling to figure out how to add a test for this and wonder if you can help/advise.
My environment is python 3.5, ubuntu 16.04 LTS, bash shell with this locale:
LANG=en_US.UTF-8
LANGUAGE=en_US:
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C
In my environment, the test added under D69207 - shtest-shell/stdout-encoding.txt - hits the UnicodeDecodeError when I run it directly with llvm-lit -a. Setting LC_ALL=en_US.utf-8 avoids the error, obviously. When running the test via make with LIT_OPTS="-a --filter=shtest-shell" make check-lit it does not fail, which is, I assume, why the problem has gone un-detected. I cannot work out how to build a test that runs the 'inner lit' in such an environment where the error would be raised. I can make a test that is adapted from shtest-shell/stdout-encoding.txt with this RUN line
# RUN: not env PYTHONIOENCODING=ascii %{lit} -j 1 -v %{inputs}/shtest-shell-ascii > %t.out
which will trigger the error for me when run with llvm-lit without -a (although the failure mode is to hang, so pretty nasty) but this test passes in make check-all.
I can continue to look further but can you see a trick I am missing? Alternatively, am I on a fool's errand here and a new test will not be possible or practical?
Hi. Sorry for the delay.
I can make a test that is adapted from shtest-shell/stdout-encoding.txt with this RUN line
# RUN: not env PYTHONIOENCODING=ascii %{lit} -j 1 -v %{inputs}/shtest-shell-ascii > %t.outwhich will trigger the error for me when run with llvm-lit without -a (although the failure mode is to hang, so pretty nasty)
Thanks for finding this reproducer!
but this test passes in make check-all.
By "passes", I think you're saying that it doesn't reproduce the error. Right?
If so, I suspect that llvm-lit called directly uses python2 on your system, but check-all uses python3. Can you confirm?
I also suspect this is a python2-specific bug.
By "passes", I think you're saying that it doesn't reproduce the error. Right?
Yes, sorry. I was trying to write a test that would fail _without_ my patch applied so had got myself turned around.
If so, I suspect that llvm-lit called directly uses python2 on your system, but check-all uses python3. Can you confirm?
I also suspect this is a python2-specific bug.
Spot on - I can confirm this is what is happening. Thanks for clearing that up for me.
I will push an update with a new comment and my new test. I guess the evilness of it when/if it were to regress is ok - make sure we don't regress I suppose!
Arg - finger trouble using arc!
I pushed the Flang test that I was writing that made me stumble across this
issue as part of this patch by mistake. Remove it from the new version.
Thanks!
I guess the evilness of it when/if it were to regress is ok - make sure we don't regress I suppose!
I agree that the possibility of regressions causing hangs is not nice, but I think it's better for them to occur in our test suite immediately than in the wild later.
llvm/utils/lit/lit/display.py | ||
---|---|---|
89 | When I first read this new comment, I thought it might be repeating the info from the earlier comment but at a more specific location in the code. To make it clear this is a different issue, I'd prefer "can raise UnicodeDecodeError" -> "can raise UnicodeDecodeError here too". Don't forget the . on the last sentence. | |
llvm/utils/lit/tests/shtest-shell-ascii.py | ||
7 ↗ | (On Diff #276096) | This test mostly copies a piece of shtest-shell.py but with PYTHONIOENCODING=ascii. In doing so, it separates the original shtest-encoding.txt from the new one even though they're covering related bugs. I wonder if it would be better to keep everything together and avoid duplication by just extending shtest-shell.py with these new RUN: lines. A comment could explain that shtest-encoding.txt is the main focus but that the other tests are covered by these new RUN: lines too just in case a problem crops up in them. What do you think? |
llvm/utils/lit/tests/shtest-shell-ascii.py | ||
---|---|---|
7 ↗ | (On Diff #276096) | Agreed. We have to slightly relax the expected output to work for both though as now the shell will skip a character rather than emit one. |
LGTM. Thanks for the changes!
llvm/utils/lit/tests/shtest-shell-ascii.py | ||
---|---|---|
7 ↗ | (On Diff #276096) | Ah, I didn't think of that. But it's probably fine. |
Please add a comment documenting the platform where this problem can be seen. Include the python version. If people try to simplify this subtle passage of code later, perhaps to abandon python 2 support, such comments should prove helpful.