This is an archive of the discontinued LLVM Phabricator instance.

cmake: Specify reference outputs in llvm_test_data()
ClosedPublic

Authored by MatzeB on Aug 21 2018, 10:36 AM.

Details

Summary

Specify reference outputs in llvm_test_data() so they get copied to the builddir and used from there.

This is a follow-up to https://reviews.llvm.org/D50209

Diff Detail

Event Timeline

MatzeB created this revision.Aug 21 2018, 10:36 AM

After applying this patch, the tests fail for me because it is taking the hash of stdout instead of the file mentioned in llvm_test_verify_hash_program_output macro.
Changing line 64 of cmake/modules/SingleMultiSource.cmake to ${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${file} will fix this.

Using llvm_test_verify(WORKDIR ${CMAKE_CURRENT_BINARY_DIR} after llvm_test_verify_hash_program_output is redundant as it is already called in llvm_test_verify_hash_program_output.

I think it might be better if llvm_test_verify_hash_program_output supports multiple files at a time as it will decrease the length of cmake files (cmake files of Blur, Dither, and Interpolation have multiple reference files and other benchmarks might also have multiple reference files (if not now maybe in future)).

cmake/modules/SingleMultiSource.cmake
64

${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${file}

MatzeB updated this revision to Diff 163428.Aug 30 2018, 3:15 PM

After applying this patch, the tests fail for me because it is taking the hash of stdout instead of the file mentioned in llvm_test_verify_hash_program_output macro.
Changing line 64 of cmake/modules/SingleMultiSource.cmake to ${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${file} will fix this.

Yes sorry, I had fixed this in my local copy too but failed to update the review.

Using llvm_test_verify(WORKDIR ${CMAKE_CURRENT_BINARY_DIR} after llvm_test_verify_hash_program_output is redundant as it is already called in llvm_test_verify_hash_program_output.

I don't understand this. llvm_test_verify_hash_program_output() only inserts a hashing step to the verification script, you still need the final fpcmp invocation (and the WORKDIR setting is separate for every step).

I think it might be better if llvm_test_verify_hash_program_output supports multiple files at a time as it will decrease the length of cmake files (cmake files of Blur, Dither, and Interpolation have multiple reference files and other benchmarks might also have multiple reference files (if not now maybe in future)).

My motivation there was to not rewrite these CMakeFiles much more than necessary. You can probably shorten them by introducing a higher level macro that combines the hashing + fpcmp for multiple files. But that should be done in a separate commit.

And for some context about the seemingly excessive use of WORKDIR ${CMAKE_CURRENT_BINARY_DIR} here: One of my original plans was to make the test-suite build directory relocatable, i.e. after building you could move it to any other directory and still correctly run the benchmarks. That would mean have to stop using absolute paths and need to express everything relative to the benchmark binary. I hoped that an intermediate step towards this is changing as much as possible to WORKDIR ${CMAKE_CURRENT_BINARY_DIR} + relative paths. A next step would then be to change the default workdir to the binary dir.

This needs some more changes especially in the SPEC benchmarks though. Also it seems like I don't really need relocatability for my use cases though so I currently don't plan to push to finish this. Either way making some of the changes here and in my other commits didn't hurt...

proton accepted this revision.Aug 31 2018, 4:04 AM

LGTM

I don't understand this. llvm_test_verify_hash_program_output() only inserts a hashing step to the verification script, you still need the final fpcmp invocation (and the WORKDIR setting is separate for every step).

In the definition of llvm_test_verify_hash_program_output(), llvm_test_verify(WORKDIR ${CMAKE_CURRENT_BINARY_DIR} ${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${_file}) is called which sets the work directory to current binary directory. So we need not call llvm_test_verify with WORKDIR ${CMAKE_CURRENT_BINARY_DIR} argument after llvm_test_verify_hash_program_output, simply llvm_test_verify(${FPCMP} bilinear.reference_output bilinearOutput.txt) will work. Passing WORKDIR argument does not cause any harm though.

My motivation there was to not rewrite these CMakeFiles much more than necessary. You can probably shorten them by introducing a higher level macro that combines the hashing + fpcmp for multiple files. But that should be done in a separate commit.

Okay.

This revision is now accepted and ready to land.Aug 31 2018, 4:04 AM
This revision was automatically updated to reflect the committed changes.
tra added a comment.Aug 31 2018, 5:33 PM

I've ran into an unexpected problem after this patches have landed.

CUDA test suite generates multiple tests that use the same reference output.
I can't just use llvm_test_data(test_XYZ, "test.reference_output"), because there can only be one test.reference_output in the build directory.
I also can't manually create a symlink $SRC/test.reference_output -> $BUILD_DIR/test-XYZ.reference_output and then use llvm_test_data(SOURCE_DIR=$BUILD_DIR test-XYZ.reference_output) because it can't link to itself.
The only way around is to create a new directory, symlink original file there as test_XYZ and then use the temp dir as SOURCE_DIR to create test_XYZ.reference_output in the build dir. This is rather convoluted.

We probably need llvm_test_data with an optional argument that would allow providing a different name for the file name in the build dir, or make it unique automatically.

MatzeB added a comment.EditedAug 31 2018, 5:39 PM
In D51048#1221423, @tra wrote:

I've ran into an unexpected problem after this patches have landed.

CUDA test suite generates multiple tests that use the same reference output.
I can't just use llvm_test_data(test_XYZ, "test.reference_output"), because there can only be one test.reference_output in the build directory.
I also can't manually create a symlink $SRC/test.reference_output -> $BUILD_DIR/test-XYZ.reference_output and then use llvm_test_data(SOURCE_DIR=$BUILD_DIR test-XYZ.reference_output) because it can't link to itself.
The only way around is to create a new directory, symlink original file there as test_XYZ and then use the temp dir as SOURCE_DIR to create test_XYZ.reference_output in the build dir. This is rather convoluted.

We probably need llvm_test_data with an optional argument that would allow providing a different name for the file name in the build dir, or make it unique automatically.

Possible fix for your case (untested as I don't have a CUDA environment):

diff --git a/External/CUDA/CMakeLists.txt b/External/CUDA/CMakeLists.txt
index 071ffbc9..63ab1e4c 100644
--- a/External/CUDA/CMakeLists.txt
+++ b/External/CUDA/CMakeLists.txt
@@ -58,13 +58,16 @@ macro(create_one_local_test_f Name FileGlob FilterRegex)
     set(_executable ${Name}-${VariantSuffix})
     set(_executable_path ${CMAKE_CURRENT_BINARY_DIR}/${_executable})
     # Verify reference output if it exists.
+    llvm_test_run()
+    set(REFERENCE_OUTPUT)
     if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${Name}.reference_output)
-      llvm_test_traditional(${Name})
-    else()
-      # otherwise just run the executable.
-      llvm_test_run()
+      set(REFERENCE_OUTPUT ${Name}.reference_output)
+      llvm_test_verify(WORKDIR ${CMAKE_CURRENT_BINARY_DIR}
+        ${FPCMP} %o ${REFERENCE_OUTPUT}
+      )
     endif()
     llvm_test_executable(${_executable} ${_sources})
+    llvm_test_data(${_executable} ${REFERENCE_OUTPUT})
     target_compile_options(${_executable} PUBLIC ${VariantCPPFLAGS})
     if(VariantLibs)
       target_link_libraries(${_executable} ${VariantLibs})
In D51048#1221423, @tra wrote:

I've ran into an unexpected problem after this patches have landed.

CUDA test suite generates multiple tests that use the same reference output.
I can't just use llvm_test_data(test_XYZ, "test.reference_output"), because there can only be one test.reference_output in the build directory.
I also can't manually create a symlink $SRC/test.reference_output -> $BUILD_DIR/test-XYZ.reference_output and then use llvm_test_data(SOURCE_DIR=$BUILD_DIR test-XYZ.reference_output) because it can't link to itself.
The only way around is to create a new directory, symlink original file there as test_XYZ and then use the temp dir as SOURCE_DIR to create test_XYZ.reference_output in the build dir. This is rather convoluted.

We probably need llvm_test_data with an optional argument that would allow providing a different name for the file name in the build dir, or make it unique automatically.

Possible fix for your case (untested as I don't have a CUDA environment):

diff --git a/External/CUDA/CMakeLists.txt b/External/CUDA/CMakeLists.txt
index 071ffbc9..63ab1e4c 100644
--- a/External/CUDA/CMakeLists.txt
+++ b/External/CUDA/CMakeLists.txt
@@ -58,13 +58,16 @@ macro(create_one_local_test_f Name FileGlob FilterRegex)
     set(_executable ${Name}-${VariantSuffix})
     set(_executable_path ${CMAKE_CURRENT_BINARY_DIR}/${_executable})
     # Verify reference output if it exists.
+    llvm_test_run()
+    set(REFERENCE_OUTPUT)
     if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${Name}.reference_output)
-      llvm_test_traditional(${Name})
-    else()
-      # otherwise just run the executable.
-      llvm_test_run()
+      set(REFERENCE_OUTPUT ${Name}.reference_output)
+      llvm_test_verify(WORKDIR ${CMAKE_CURRENT_BINARY_DIR}
+        ${FPCMP} %o ${REFERENCE_OUTPUT}
+      )
     endif()
     llvm_test_executable(${_executable} ${_sources})
+    llvm_test_data(${_executable} ${REFERENCE_OUTPUT})
     target_compile_options(${_executable} PUBLIC ${VariantCPPFLAGS})
     if(VariantLibs)
       target_link_libraries(${_executable} ${VariantLibs})

Though thinking about it, I don't really understand your problem:

How can there be multiple outputs with the same name? They have different names in the repository (algorithm.reference_output, assert.reference_output, ...)

tra added a comment.Sep 4 2018, 11:41 AM

Though thinking about it, I don't really understand your problem:

How can there be multiple outputs with the same name? They have different names in the repository (algorithm.reference_output, assert.reference_output, ...)

For CUDA tests, from each test source file we generate a lot of test executables compiled for combination of {CUDA version, C++ standard, C++ standard library}. All of those test executables will run and need to have their output compared to the same reference file.

In D51048#1223622, @tra wrote:

Though thinking about it, I don't really understand your problem:

How can there be multiple outputs with the same name? They have different names in the repository (algorithm.reference_output, assert.reference_output, ...)

For CUDA tests, from each test source file we generate a lot of test executables compiled for combination of {CUDA version, C++ standard, C++ standard library}. All of those test executables will run and need to have their output compared to the same reference file.

Ok but that would just mean that the name of the reference file doesn't necessarily match the name of the executable. That should be easily doable and hopefolly something similar to what I proposed in the comment is enough to fix the issues...

tra added a comment.Sep 4 2018, 12:02 PM

Your patch above *almost* works, except that each test variant wants to create the same symlink $BUILD/ExternalCUDA/$TEST.teferece_output -> $SRC/External/CUDA/$TEST.reference_output. If symlinks are created at different points in time ninja manages to avoid conflicts, but typically I get one or two attempts to create the symlink launches simultaneously and one of them fails with "symlink already exists". Perhaps the symlink target name should be uniquified in some way.