This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] make load on global readonly memory to use ldg
ClosedPublic

Authored by wengxt on Jul 17 2015, 2:51 PM.

Details

Summary

As described in [1], nvcc may use ld.global.nc to load memory when
restrict is used and the compiler can determine that the read-only data
cache is safe to use.

This patch checks whether ldg is safe to use and, where it is, uses it in
place of ld.global. The change improves performance by 18~29% on the
affected kernels (ratt*_kernel and rwdot*_kernel) in the S3D benchmark of
SHOC [2].
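
For illustration only, here is a minimal sketch of the kind of kernel this optimization is aimed at. The kernel name `scale`, the helper `run_scale_check`, and the sizes are hypothetical and do not come from the patch or from SHOC. It assumes that marking the input pointer `const` and `__restrict__` is enough for the compiler to prove the memory is read-only for the lifetime of the kernel.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the patch or SHOC).
// `in` is const and __restrict__, so the compiler may be able to prove that
// it is never written during the kernel and may load it through the
// read-only data cache (ld.global.nc) instead of a plain ld.global.
__global__ void scale(const float* __restrict__ in,
                      float* __restrict__ out,
                      float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    out[i] = in[i] * factor;
}

// Hypothetical host-side helper: runs the kernel once and verifies the result.
bool run_scale_check() {
  const int n = 1 << 16;
  const float factor = 2.0f;
  std::vector<float> host_in(n), host_out(n, 0.0f);
  for (int i = 0; i < n; ++i)
    host_in[i] = static_cast<float>(i);

  float* dev_in = nullptr;
  float* dev_out = nullptr;
  if (cudaMalloc(&dev_in, n * sizeof(float)) != cudaSuccess)
    return false;
  if (cudaMalloc(&dev_out, n * sizeof(float)) != cudaSuccess) {
    cudaFree(dev_in);
    return false;
  }
  cudaMemcpy(dev_in, host_in.data(), n * sizeof(float),
             cudaMemcpyHostToDevice);

  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  scale<<<blocks, threads>>>(dev_in, dev_out, factor, n);

  // The device-to-host copy waits for the kernel to finish.
  cudaError_t err = cudaMemcpy(host_out.data(), dev_out, n * sizeof(float),
                               cudaMemcpyDeviceToHost);
  cudaFree(dev_in);
  cudaFree(dev_out);
  if (err != cudaSuccess)
    return false;

  for (int i = 0; i < n; ++i)
    if (host_out[i] != host_in[i] * factor)
      return false;
  return true;
}

int main() { return run_scale_check() ? 0 : 1; }
```

Whether the load from `in` is actually emitted as ld.global.nc depends on the compiler and on the target architecture (the read-only data cache path needs sm_35 or newer), so the generated PTX should be inspected rather than assumed. With nvcc, for example, `nvcc -arch=sm_35 -ptx` writes the PTX to a file for inspection.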

Patch by Xuetian Weng.

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Event Timeline

wengxt updated this revision to Diff 30042. Jul 17 2015, 2:51 PM
wengxt retitled this revision from to [NVPTX] make load on global readonly memory to use ldg.
wengxt updated this object.
wengxt added reviewers: jholewinski, jingyue.
wengxt added a subscriber: llvm-commits.
jingyue edited edge metadata. Jul 17 2015, 2:59 PM

"As describe in [1]"

I don't see the definition of [1].

You should mention the performance gain you got. SHOC is open-source.

wengxt updated this revision to Diff 30178. Jul 20 2015, 12:05 PM
wengxt edited edge metadata.

update commit message

wengxt updated this object. Jul 20 2015, 12:06 PM
jingyue accepted this revision. Jul 20 2015, 1:27 PM
jingyue edited edge metadata.

Can you mention the specific benchmarks (benchmark name or kernel name) and how much speedup they enjoy individually? This gives people something to refer to later. SHOC is open-source and frequently cited, so people would love to see in detail how your work affects that.

Otherwise, LGTM. Thanks!

This revision is now accepted and ready to land. Jul 20 2015, 1:27 PM
wengxt updated this object. Jul 20 2015, 1:34 PM
wengxt edited edge metadata.
jingyue updated this object. Jul 20 2015, 2:21 PM
jingyue closed this revision. Jul 20 2015, 2:29 PM