This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Add --cuda-flush-denormals-to-zero.
ClosedPublic

Authored by jlebar on Mar 31 2016, 1:18 PM.

Details

Summary

Setting this flag causes all functions are annotated with the
"nvvm-f32ftz" = "true" attribute.

In addition, we annotate the module with "nvvm-reflect-ftz" set
to 0 or 1, depending on whether -cuda-flush-denormals-to-zero is set.
This is read by the NVVMReflect pass.

Diff Detail

Event Timeline

jlebar updated this revision to Diff 52273.Mar 31 2016, 1:18 PM
jlebar retitled this revision from to [CUDA] Add --cuda-flush-denormals-to-zero..
jlebar updated this object.
jlebar added reviewers: tra, rnk.
jlebar added a subscriber: cfe-commits.
tra added inline comments.Mar 31 2016, 1:34 PM
include/clang/Driver/Options.td
385

We need a provide a way to both enable and disable this. We either need a "-no" variant or make it an option with value.

Also, can we shorten it to --cuda-ftz? I would probably mistype current name more often than not.

386

Is there an equivalent for ftz fo host-side FP operations? It would be good to keep identical host and device side calculations as close as we can.

lib/Driver/ToolChains.cpp
4212

Perhaps we don't need different flags at driver and CC1 levels. Top-level "-f*" options in OPT_f_group are passed to CC1 automatically.

jlebar updated this revision to Diff 52292.Mar 31 2016, 2:40 PM

Address tra's review comments.

I also decided no longer to turn this on when -menable-unsafe-fp-math is on:
That's a cc1 flag that's implied by various clang flags, but now ftz is a clang
flag, so turning it on implicitly didn't make a ton of sense to me.

Thank you for the review, Art!

include/clang/Driver/Options.td
385

We need a provide a way to both enable and disable this. We either need a "-no" variant or make it an option with value.

Many (most) of the -f flags don't have -fno variants -- how do we decide which ones get an -fno and which don't?

Also, can we shorten it to --cuda-ftz? I would probably mistype current name more often than not.

Well, you and I both were calling it "ctz" about 50% of the time, so I'm not sure --cuda-ftz would solve the problem! :) (In all seriousness, that was one of the reasons I chose not to abbreviate it.)

Maybe "ftz" is a well-known acronym. Doesn't quite look like it from googling, though.

I looked through the flags and concluded that "ftz" was more abbreviated than most of them. Although "flush-denormals-to-zero" is at the verbose end of the spectrum. I considered "flush-denormals", thought that was a big ambiguous -- flush them how?

386

Is there an equivalent for ftz fo host-side FP operations?

Not that I can tell. The only other one I saw was opencl's equivalent flag, which does nothing at the moment.

lib/Driver/ToolChains.cpp
4212

Aha, much better, thank you!

rnk accepted this revision.Mar 31 2016, 2:46 PM
rnk edited edge metadata.
rnk added inline comments.
include/clang/Driver/Options.td
385

Let's stick with the long descriptive name for now.

This revision is now accepted and ready to land.Mar 31 2016, 2:46 PM
rnk added inline comments.Mar 31 2016, 2:50 PM
include/clang/Driver/Options.td
385

Personally I would add an -fno variant, and add code to the driver that conditionally forwards the flag to cc1.

jlebar updated this revision to Diff 52302.Mar 31 2016, 3:38 PM
jlebar marked an inline comment as done.
jlebar edited edge metadata.

Add -fno variant.

jlebar marked an inline comment as done.Mar 31 2016, 3:38 PM
jlebar added inline comments.
include/clang/Driver/Options.td
385

OK, added an -fno variant. But it seems simpler to pass both the -f flag and -fno flag to cc1 and let it figure things out? lmk what you think.

rnk requested changes to this revision.Mar 31 2016, 3:50 PM
rnk edited edge metadata.
rnk added inline comments.
include/clang/Driver/Options.td
385

I agree it would be simpler, but it's not what we do elsewhere. There's this idea that the -cc1 line is a canonicalized command line, where no-op options are removed, like -fno-exceptions when EH is off by default.

lib/Frontend/CompilerInvocation.cpp
1567

The right idiom for this is:

Opts.Thing = Args.hasFlag(OPT_fthing, OPT_fno_thing, false);
This revision now requires changes to proceed.Mar 31 2016, 3:50 PM
jlebar updated this revision to Diff 52310.Mar 31 2016, 4:46 PM
jlebar edited edge metadata.
jlebar marked an inline comment as done.

Update flags so we only pass -fcuda-flush-denormals-to-zero to cc1 if appropriate.

jlebar added inline comments.Mar 31 2016, 4:47 PM
lib/Frontend/CompilerInvocation.cpp
1567

Aha, I knew there had to be a better way to do that. Thanks.

Thank you for explaining that, Reid!

Reid, are we good here?

rnk accepted this revision.Apr 5 2016, 11:15 AM
rnk edited edge metadata.

Yep, lgtm

This revision is now accepted and ready to land.Apr 5 2016, 11:15 AM
This revision was automatically updated to reflect the committed changes.