
[NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.

Authored by jlebar on Jan 16 2017, 11:18 PM.



There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.

For example, if flush-denormals-to-zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. If ftz is disabled, we can't do
this, because @llvm.ceil.f32 would be lowered to a non-ftz PTX
instruction. With ftz disabled we can, however, simplify the non-ftz
nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.
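
The rule above can be sketched as a small lookup table. This is only an illustration of the decision described in the summary, not the code from the patch: the names FtzNeed, Rule, Rules, and upgradeTarget are hypothetical, and intrinsics are modelled as plain strings rather than LLVM intrinsic IDs.

```cpp
#include <cstring>

// Hypothetical model: each NVVM intrinsic records which ftz setting must be
// in effect before it may be rewritten to the target-generic intrinsic.
enum class FtzNeed { MustBeOn, MustBeOff };

struct Rule {
  const char *Nvvm;    // NVVM-specific intrinsic name
  const char *Generic; // target-generic replacement
  FtzNeed Need;        // ftz mode required for the rewrite to be sound
};

static const Rule Rules[] = {
    // The ftz variant matches llvm.ceil.f32 only when ftz is enabled.
    {"llvm.nvvm.ceil.ftz.f", "llvm.ceil.f32", FtzNeed::MustBeOn},
    // The non-ftz variant matches llvm.ceil.f32 only when ftz is disabled.
    {"llvm.nvvm.ceil.f", "llvm.ceil.f32", FtzNeed::MustBeOff},
};

// Returns the generic intrinsic to use, or nullptr if the NVVM intrinsic
// must be kept as-is (unknown name, or the ftz mode does not match).
const char *upgradeTarget(const char *Name, bool FtzEnabled) {
  for (const Rule &R : Rules) {
    if (std::strcmp(R.Nvvm, Name) != 0)
      continue;
    bool ModeMatches = (R.Need == FtzNeed::MustBeOn) == FtzEnabled;
    return ModeMatches ? R.Generic : nullptr;
  }
  return nullptr;
}
```

In this sketch, exactly one of the two ceil variants is rewritable for any given ftz setting, which mirrors the asymmetry described above.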

These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.

Event Timeline

jlebar created this revision. Jan 16 2017, 11:18 PM
tra accepted this revision. Jan 17 2017, 2:50 PM
This revision is now accepted and ready to land. Jan 17 2017, 2:50 PM
majnemer added inline comments. Jan 17 2017, 4:43 PM
1412 ↗(On Diff #84634)


jlebar added inline comments. Jan 17 2017, 4:46 PM
1412 ↗(On Diff #84634)

Is there a reason to prefer that over this syntax? This is fewer chars, which is why I chose to do it this way.

majnemer added inline comments. Jan 17 2017, 4:51 PM
1412 ↗(On Diff #84634)

The constructor of SimplifyAction could be trivial because the compiler can see what's going on. This would not be so with a user-defined constructor.
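
A minimal sketch of the distinction, assuming the alternative under discussion was an empty-bodied constructor. The two structs below are hypothetical stand-ins, not the SimplifyAction from the patch; they only show how the standard type trait reports triviality for each spelling.

```cpp
#include <type_traits>

// Hypothetical stand-in: an empty-bodied constructor is user-provided,
// so the type is never trivially default constructible.
struct UserProvided {
  int Kind;
  UserProvided() {}
};

// Hypothetical stand-in: a constructor defaulted on its first declaration
// is not user-provided, so it can be trivial when the members allow it.
struct Defaulted {
  int Kind;
  Defaulted() = default;
};

static_assert(!std::is_trivially_default_constructible<UserProvided>::value,
              "user-provided constructor is not trivial");
static_assert(std::is_trivially_default_constructible<Defaulted>::value,
              "defaulted constructor is trivial here");
```

Whether the defaulted constructor ends up trivial for a real type still depends on its members; members with non-trivial constructors or default member initializers prevent it.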

jlebar updated this revision to Diff 85077. Jan 19 2017, 5:43 PM

Explicitly default the default constructor.

jlebar marked 3 inline comments as done. Jan 19 2017, 5:43 PM
jlebar added inline comments.
1412 ↗(On Diff #84634)

Thanks, done.

jlebar updated this revision to Diff 85301. Jan 22 2017, 3:09 PM
jlebar marked an inline comment as done.

Rebase atop latest patches, which make it unnecessary to special-case sqrt.

This revision was automatically updated to reflect the committed changes.