Due to unavailability of Flang testing on Windows, the shell scripts are being ported to Python. The following changes are being made in this patch: removed test_symbols.sh and common.sh, and ported them to Python. Changes to the tests themselves so that they use the python scripts instead of the shell script.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Some tests are failing, (see message below). I think we will have to retain common.sh till all scripts that use it are ported to python.
Command Output (stderr): -- /var/lib/buildkite-agent/builds/llvm-project/flang/test/Semantics/test_errors.sh: line 7: /var/lib/buildkite-agent/builds/llvm-project/flang/test/Semantics/common.sh: No such file or directory
flang/test/Semantics/OpenACC/acc-symbols01.f90 | ||
---|---|---|
1 | The python used here should be the PYTHON_EXECUTABLE discovered by cmake, not the one in the default path. Typically, the lit.cfg.py determines what the correct python executable. You could define a placeholder %test_symbols in lit.cfg.py that expands to the correct python executable and test_symbols.py program (similar to %flang_fc1 or %clang_cc1 by clang). Otherwise, I think llvm-lit even expands some standard tools itself, I know it does in LLVM for opt. Could you confirm this is done for python? | |
flang/test/Semantics/common.py | ||
10–14 | [not a change request] It does not seem necessary to make this a class. I would understand if test_symbols.py would derive from it to e.g. customize main, but now its a mixed of object-oriented and script style. | |
67–69 | AFAIU, this file is never executed directly. Could you remove this? | |
flang/test/Semantics/test_symbols.py | ||
22–24 | This script write intermediate strings into temporary files just to read them back in again. I don't see reason to do that. | |
25 | diffs is written to, but never used. | |
33 | [typo] blank | |
53 | Prefer subprocess.run or subprocess.check_output | |
73 |
LGTM. Thanks for all the work.
There would be possibilities to use re.compile and/or re.MULTILINE for better perfomance, but I don't think its a bottleneck.
flang/test/Semantics/test_symbols.py | ||
---|---|---|
20 | diff3 is overwritten later, no need set it to an empty string in advance. |
A few minor comments.
flang/test/Semantics/common.py | ||
---|---|---|
2 | Nit: Is this script directly executed? If not do we need this here? | |
37 | Nit: you can rename f18-command as flang-command. | |
flang/test/Semantics/test_symbols.py | ||
48 | Would it be better to rename 1.f90 to source.f90 and 3.f90 to compiled.f90? |
flang/test/Semantics/test_symbols.py | ||
---|---|---|
48 | Also, in the original test script it was $temp/1.90: https://github.com/llvm/llvm-project/blob/main/flang/test/Semantics/test_symbols.sh#L12-L14. IIUC, $temp is a directory unique to the corresponding test. Are you preserving this here? Otherwise tests will start failing in weird way when run in parallel. |
flang/test/Semantics/test_symbols.py | ||
---|---|---|
48 | difflib is a pure-python implementation, it does not write to those files. They are only used in to indicate what to appear in the output as: --- 1.90 +++ 3.f90 See https://docs.python.org/3/library/difflib.html#difflib.unified_diff They also don't need to be filenames, but could also be descriptions such as "Actual output" and "Expected output". The files themselves (1.f90, 2.f90 and 3.f90) were just used in the shell script to pass data between commands, as shell scripts work. This python version keeps the data in strings. |
Fixed a bug in a test file. And also updated the scripts to use docstrings instead of normal comments.
When I run tests after this patch, a directory (flang/test/Semantics/__pycache__) gets created in my checked-out source tree.
Thanks for drawing our attention to this. Not sure what the best approach would be here. See https://reviews.llvm.org/D108023 for a suggestion.
Nothing to do here. This is normal python behavior and improves execution speed. When running tests, a __pycache__ is also created in llvm/utils/lit/lit since this is where the python source code resides. The .pyc files it creates are already in the global .gitignore.
If you don't want these to be created, set PYTHONDONTWRITEBYTECODE=1 in your shell environment.
I don't care whether they are created or not so long as a git commit -a doesn't pick them up.
It shouldn't (from git commit --help):
-a, --all Tell the command to automatically stage files that have been modified and deleted, but new files you have not told Git about are not affected.
Are you seeing different behavior?
Yes, I was, while doing out-of-tree development, since I typically run "git add flang" before reformatting before committing. I'll modify my process accordingly to work around this change.
Do you have a sparse git checkout that is missing the top-level .gitignore (sibling of the llvm, flang, clang etc directories)? Normally, git outright refuses to add files that match .gitignore, unless you pass -f.
@klausler mentions above that he uses an out of tree build which does not have the top level gitignore file. I have created a patch to add pycache and pyc to gitgnore in flang directory.
https://reviews.llvm.org/D108057
A "out-of-tree build" is a build directory that is not the same as the source directory (mandatory for LLVM) and is independent of how the source tree is organized.
I haven't understood @klausler's workflow, as long as -f is not used for the git add command, ignored files mentioned in the .gitignore should not be added.
In the context of LLVM Flang, "out-of-tree" usually means that Flang itself is out-of-tree: https://github.com/llvm/llvm-project/tree/main/flang#building-flang-out-of-tree. This means building in two stages:
- stage 1 - build LLVM, MLIR, Clang
- stage 2 - build Flang against the binaries generated in stage 1 (i.e. as if Flang was an out-of-tree project)
In this workflow, LLVM Flang is like an independent project. It's more like "out-of-tree development" than "out-of-tree build" and .gitignore from LLVM doesn't really exist in this set-up.
This would usually be referred to as a standalone build: https://github.com/llvm/llvm-project/blob/94b4598d77fe0585a8a3bd2a798fc7ce15a6aa56/flang/CMakeLists.txt#L25
In this workflow, LLVM Flang is like an independent project. It's more like "out-of-tree development" than "out-of-tree build" and .gitignore from LLVM doesn't really exist in this set-up.
Your linked readme suggests that the source is checked out in ~/flang/src. This does not correspond to structure from a git clone of the llvm-project repository. Maybe it's a symlink to the flang directory of the llvm-project repository? Why not just ask cmake to use that directory for the source directory? However, I also don't have any problem with adding additional files to flang's .gitignore (D108057) but would ask someone who thinks it is useful to approve that patch.
That's a typo and probably a leftover from when F18 was a separate project (i.e. before merging into llvm-project). Updated in https://reviews.llvm.org/D108198.
This does not correspond to structure from a git clone of the llvm-project repository.
When I test "standalone" builds, I clone llvm-project once, but then build twice (i.e. in two stages). I'm not sure whether that's the intended workflow - I only use it when testing changes that can potentially break it.
Why not just ask cmake to use that directory for the source directory?
But with LLVM's CMake. you can't really build just one sub-project, can you?
We don't really use this workflow, so we just focus on making sure that our changes don't break it.
You can have an LLVM build and a flang standalone build with just a single git checkout. That is:
$ git clone https://github.com/llvm/llvm-project.git $ cmake -S llvm-project/llvm -B build-llvm -GNinja ... $ (cd build-llvm && ninja) $ cmake -S llvm-project/flang -B build-flang -GNinja -DLLVM_DIR=../build-llvm ...
My point is, it is irrelevant how you build LLVM & flang, you will have that same layout of the source checkout in all cases. This layout has a top-level .gitignore that already excludes *.pyc and automatically used by git while doing any operation in any subdirectory of the source checkout. git doesn't even know/care where/how the build directories are located.
I tested what happens if you have a symlink into a subdirectory of the llvm-project checkout. Turns out that git recognizes it is a symlink and still finds the .gitignore of the parent directory the symlink points to. That is, the only means I can think of where git does not recognize the .gitignore is when copying the flang subdirectory somewhere else (and copy back when upstreaming a change) or using git sparse checkouts. The former is not how git was intended to be used and the latter only has experimental support.
Despite doing exactly that, I completely missed that indeed the top .gitignore would still be there and that Git would use it. My bad, sorry. Thank you for the detailed explanation!
The new script doesn't produce anything like the output of the old one, which was a lot more readable.
If I change the second occurrence of "Implicit" to "xxx" in symbol01.f90 to make it fail, the output starts like this.
Command Output (stdout): -- ---Expectedoutput +++Actualoutput @@-1,586+1,591@@ !DEF:/mModule modulem !DEF:/m/fPRIVATE,PURE,RECURSIVE(Function)SubprogramREAL(4) private::f contains !DEF:/m/sBIND(C),PUBLIC,PURE(Subroutine)Subprogram !DEF:/m/s/xINTENT(IN)(Implicit)ObjectEntityREAL(4) !DEF:/m/s/yINTENT(INOUT)(-x-x-x+I+m+p+l+i+c+i+t)ObjectEntityREAL(4) puresubroutines(x,y)bind(c)
With the old script it was
Command Output (stdout): -- !DEF: /m Module module m !DEF: /m/f PRIVATE, PURE, RECURSIVE (Function) Subprogram REAL(4) private :: f contains !DEF: /m/s BIND(C), PUBLIC, PURE (Subroutine) Subprogram !DEF: /m/s/x INTENT(IN) (Implicit) ObjectEntity REAL(4) - !DEF: /m/s/y INTENT(INOUT) (xxx) ObjectEntity REAL(4) + !DEF: /m/s/y INTENT(INOUT) (Implicit) ObjectEntity REAL(4) pure subroutine s (x, y) bind(c)
A file might have a line with whitespace, such as real x(100, 100), whereas the output would have stripped some of the whitespace and have it as real x(100,100). However, Python's difflib library[1] unfortunately doesn't have support for ignoring whitespace, unlike diff which was used in the shell scripts. There are some threads[2][3] asking about charjunk mentioned in the Python docs but it's poorly documented and it doesn't actually ignore characters.
[1] https://docs.python.org/3/library/difflib.html
[2] https://stackoverflow.com/questions/63893283/difflib-ignore-whitespace-diffs-w-ndiff
[3] https://stackoverflow.com/questions/65780782/how-to-do-a-diff-in-python-of-two-text-files-and-ignore-white-space-and-blank-li
flang/test/Semantics/test_symbols.py | ||
---|---|---|
46–50 | "unified_diff" returns a generator. diff_check = "".join(list(unified_diff(diff1, diff3, n=999999, fromfile="1.90", tofile="3.f90"))) |
The python used here should be the PYTHON_EXECUTABLE discovered by cmake, not the one in the default path. Typically, the lit.cfg.py determines what the correct python executable. You could define a placeholder %test_symbols in lit.cfg.py that expands to the correct python executable and test_symbols.py program (similar to %flang_fc1 or %clang_cc1 by clang). Otherwise, I think llvm-lit even expands some standard tools itself, I know it does in LLVM for opt. Could you confirm this is done for python?