This is an archive of the discontinued LLVM Phabricator instance.

[lit] Read command stdout/stderr as text on Windows
ClosedPublic

Authored by mstorsjo on Feb 26 2022, 2:42 PM.

Details

Summary

This takes care of normalizing newlines back to single LF instead
of CRLF.

This on itself breaks on a couple tests that accidentally seem to
be writing binary data to stdout; make sure those cases are piped
to /dev/null instead of actually written to a terminal.

This is an alternative to half of the fixes in D120546.

Diff Detail

Event Timeline

mstorsjo created this revision.Feb 26 2022, 2:42 PM
mstorsjo requested review of this revision.Feb 26 2022, 2:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2022, 2:42 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
rnk accepted this revision.Feb 28 2022, 5:01 PM

lgtm

This will probably have the side effect of reducing the double newlines I have seen in lit test output... That could be nice.

llvm/utils/lit/lit/TestRunner.py
315

The intra-pipeline file descriptors are all *binary*, which is key to making the test suite pass.

787

docs suggest that text=True is the preferred spelling:
https://docs.python.org/3/library/subprocess.html#frequently-used-arguments

My reading of the docs is that only the files opened internally for the STDOUT / STDERR sentinel pipe values are affected by this setting, so it's safe. Anything being read by lit will be printed, and won't be important binary data.

This revision is now accepted and ready to land.Feb 28 2022, 5:01 PM
mstorsjo added inline comments.Mar 1 2022, 12:51 AM
llvm/utils/lit/lit/TestRunner.py
787

The new option spelling text seems to exist since Python 3.7, but our build requirement is Python 3.6, so I’d stick with the older name for now.

mstorsjo updated this revision to Diff 411997.Mar 1 2022, 12:56 AM

Squashed in a libcxx test update, removing an XFAIL, rerunning premerge tests in that form.

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 12:56 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
This revision now requires review to proceed.Mar 1 2022, 12:56 AM
This revision was not accepted when it landed; it landed in state Needs Review.Mar 1 2022, 4:25 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

This broke a lit test in utils/lit/tests/Inputs/shtest-shell/stdout-encoding.txt. (Too bad that the premerge tests are broken due to unrelated issues, hiding such failures.)

mstorsjo reopened this revision.Mar 1 2022, 4:50 AM

@rnk What do you think about the test in https://github.com/llvm/llvm-project/blob/main/llvm/utils/lit/tests/Inputs/shtest-shell/stdout-encoding.txt which gets broken by this patch? It was added in https://github.com/llvm/llvm-project/commit/27fdf8a29d1e0740c342d428fa48eda7b088ac8e / D69207. Should we just conclude that we shouldn't print non-ascii to stdout in lit tests and remove that testcase?

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 1:16 PM
rnk added a comment.Mar 1 2022, 1:23 PM

@rnk What do you think about the test in https://github.com/llvm/llvm-project/blob/main/llvm/utils/lit/tests/Inputs/shtest-shell/stdout-encoding.txt which gets broken by this patch? It was added in https://github.com/llvm/llvm-project/commit/27fdf8a29d1e0740c342d428fa48eda7b088ac8e / D69207. Should we just conclude that we shouldn't print non-ascii to stdout in lit tests and remove that testcase?

What is the failure mode? Does lit behave reasonably when binary output is printed to stdout, or does it through an exception? We should expect developers to do this by accident from time to time, and we should adjust that test to express what we want from lit. I guess I'm saying we should keep the test and update the expectations in the outer test.

@rnk What do you think about the test in https://github.com/llvm/llvm-project/blob/main/llvm/utils/lit/tests/Inputs/shtest-shell/stdout-encoding.txt which gets broken by this patch? It was added in https://github.com/llvm/llvm-project/commit/27fdf8a29d1e0740c342d428fa48eda7b088ac8e / D69207. Should we just conclude that we shouldn't print non-ascii to stdout in lit tests and remove that testcase?

What is the failure mode? Does lit behave reasonably when binary output is printed to stdout, or does it through an exception? We should expect developers to do this by accident from time to time, and we should adjust that test to express what we want from lit. I guess I'm saying we should keep the test and update the expectations in the outer test.

The failure mode is pretty nonobvious - it will indeed come up occasionally. When that happens, we get a python exception:

UNRESOLVED: shtest-shell :: stdout-encoding.txt (1 of 1)
******************** TEST 'shtest-shell :: stdout-encoding.txt' FAILED ********************
Exception during script execution:
Traceback (most recent call last):
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/worker.py", line 76, in _execute_test_handle_errors
    result = test.config.test_format.execute(test, lit_config)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/formats/shtest.py", line 27, in execute
    return lit.TestRunner.executeShTest(test, litConfig,
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1608, in executeShTest
    return _runShTest(test, litConfig, useExternalSh, script, tmpBase)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1556, in _runShTest 
    res = runOnce(execdir)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1535, in runOnce
    res = executeScriptInternal(test, litConfig, tmpBase, script, execdir)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 915, in executeScriptInternal
    exitCode, timeoutInfo = executeShCmd(cmd, shenv, results, timeout=litConfig.maxIndividualTestTime)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 161, in executeShCmd
    finalExitCode = _executeShCmd(cmd, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 605, in _executeShCmd
    res = _executeShCmd(cmd.lhs, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 593, in _executeShCmd
    return _executeShCmd(cmd.rhs, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 816, in _executeShCmd
    procData[-1] = procs[-1].communicate()
  File "/usr/lib/python3.8/subprocess.py", line 1028, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.8/subprocess.py", line 1906, in _communicate
    stdout = self._translate_newlines(stdout, 
  File "/usr/lib/python3.8/subprocess.py", line 905, in _translate_newlines
    data = data.decode(encoding, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 14: invalid start byte
     

********************
******************** 
Unresolved Tests (1):
  shtest-shell :: stdout-encoding.txt
mstorsjo updated this revision to Diff 412261.Mar 1 2022, 2:47 PM

Update the lit tests references

What is the failure mode? Does lit behave reasonably when binary output is printed to stdout, or does it through an exception? We should expect developers to do this by accident from time to time, and we should adjust that test to express what we want from lit. I guess I'm saying we should keep the test and update the expectations in the outer test.

The failure mode is pretty nonobvious - it will indeed come up occasionally. When that happens, we get a python exception:

UNRESOLVED: shtest-shell :: stdout-encoding.txt (1 of 1)
******************** TEST 'shtest-shell :: stdout-encoding.txt' FAILED ********************
Exception during script execution:
Traceback (most recent call last):
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/worker.py", line 76, in _execute_test_handle_errors
    result = test.config.test_format.execute(test, lit_config)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/formats/shtest.py", line 27, in execute
    return lit.TestRunner.executeShTest(test, litConfig,
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1608, in executeShTest
    return _runShTest(test, litConfig, useExternalSh, script, tmpBase)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1556, in _runShTest 
    res = runOnce(execdir)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1535, in runOnce
    res = executeScriptInternal(test, litConfig, tmpBase, script, execdir)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 915, in executeScriptInternal
    exitCode, timeoutInfo = executeShCmd(cmd, shenv, results, timeout=litConfig.maxIndividualTestTime)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 161, in executeShCmd
    finalExitCode = _executeShCmd(cmd, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 605, in _executeShCmd
    res = _executeShCmd(cmd.lhs, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 593, in _executeShCmd
    return _executeShCmd(cmd.rhs, shenv, results, timeoutHelper)
  File "/home/martin/code/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 816, in _executeShCmd
    procData[-1] = procs[-1].communicate()
  File "/usr/lib/python3.8/subprocess.py", line 1028, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.8/subprocess.py", line 1906, in _communicate
    stdout = self._translate_newlines(stdout, 
  File "/usr/lib/python3.8/subprocess.py", line 905, in _translate_newlines
    data = data.decode(encoding, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 14: invalid start byte
     

********************
******************** 
Unresolved Tests (1):
  shtest-shell :: stdout-encoding.txt

@rnk Do you think this failure mode is tolerable?

rnk added inline comments.Mar 2 2022, 4:21 PM
llvm/utils/lit/lit/TestRunner.py
787

Can we pass errors="replace" to get something like the previous behavior? I think it's more helpful to show something like ?f?o?o when tests fail.

mstorsjo marked an inline comment as done.Mar 3 2022, 1:55 AM
mstorsjo added inline comments.
llvm/utils/lit/lit/TestRunner.py
787

Oh, yes, that seems to work nicely!

mstorsjo updated this revision to Diff 412638.Mar 3 2022, 2:02 AM
mstorsjo marked an inline comment as done.

Add the errors = 'replace' parameter to the Popen call.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 3 2022, 3:32 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.