This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Added missing functions.
ClosedPublic

Authored by tra on Feb 21 2018, 5:03 PM.

Details

Summary

The initial commit missed sincos(float), llabs() and a few atomics that we
used to pull in from device_functions.hpp, which we no longer include.

Event Timeline

tra created this revision. Feb 21 2018, 5:03 PM
tra updated this revision to Diff 135348. Feb 21 2018, 5:06 PM

Added missing __threadfence_system().

jlebar accepted this revision. Feb 21 2018, 5:13 PM

For my information, how are we verifying that we've caught everything?

This revision is now accepted and ready to land. Feb 21 2018, 5:13 PM
tra added a comment. Feb 21 2018, 5:34 PM

For my information, how are we verifying that we've caught everything?

for v in 8.0 9.0 9.1; do
  /usr/local/cuda-$v/bin/nvcc -c -x cu /dev/null -o /tmp/null.o -arch=sm_60 -keep-dir=nvcc-$v -keep -v
  dump-func-sig nvcc-$v/empty.cpp1.ii -- -x cuda-cpp-output -nocudainc -nocudalib --cuda-host-only -ferror-limit=0 -std=c++11 > nvcc-$v/decls
  dump-func-sig /dev/null -- -x cuda --cuda-path=/usr/local/cuda-$v --cuda-host-only -ferror-limit=0 -std=c++11 --cuda-gpu-arch=sm_60 \
      | grep -v typename | grep -v curand | grep -v _Complex | grep -v fetch_builtin > clang-$v
done

dump-func-sig is a tool I hacked together which uses clang tooling to parse the files as much as it can and then prints out each reconstructed function signature and its location.

Then I diff clang-N against nvcc-N/decls, with a lot of regex filtering for argument names, etc. No diff tool does a good enough job, so I've missed a few functions.
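
For illustration only, the normalize-then-compare step could be sketched roughly as below. This is not the script actually used. The "<signature> @ <file>:<line>" line format is an assumption, since the real dump-func-sig output is not shown here, and the helper names (normalize, strip_param_name, missing_from_clang) are hypothetical. The parameter-name stripping is a heuristic that does not handle function-pointer parameters or unnamed typedef'd types such as "const size_t".

```python
import re

# Assumed (hypothetical) line format: "<signature> @ <file>:<line>".
LOCATION = re.compile(r"\s*@\s*\S+\s*$")
# A flat parameter list; nested parentheses are not handled.
PARAMS = re.compile(r"\(([^()]*)\)")
# Trailing words that belong to the type rather than naming a parameter.
TYPE_WORDS = {"void", "bool", "char", "short", "int", "long", "float",
              "double", "unsigned", "signed", "const", "volatile"}


def strip_param_name(param):
    """Drop a trailing identifier when it looks like a parameter name."""
    param = param.strip()
    m = re.match(r"^(.*[\s*&])([A-Za-z_]\w*)$", param)
    if m and m.group(2) not in TYPE_WORDS:
        return m.group(1).strip()
    return param


def rewrite_params(match):
    """Rebuild a parameter list with the parameter names removed."""
    params = [strip_param_name(p) for p in match.group(1).split(",")]
    return "(" + ", ".join(params) + ")"


def normalize(line):
    """Remove the location suffix, parameter names and extra whitespace."""
    sig = LOCATION.sub("", line.strip())
    sig = PARAMS.sub(rewrite_params, sig)
    return re.sub(r"\s+", " ", sig)


def missing_from_clang(nvcc_lines, clang_lines):
    """Signatures seen in the nvcc dump but absent from the clang dump."""
    clang = {normalize(line) for line in clang_lines}
    nvcc = {normalize(line) for line in nvcc_lines}
    return sorted(nvcc - clang)
```

Comparing sets of normalized signatures instead of running a line-oriented diff sidesteps ordering and argument-name noise, but the result is only as good as the normalization, so the output still needs a manual look.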

This revision was automatically updated to reflect the committed changes.