This is an archive of the discontinued LLVM Phabricator instance.

Work on cleaning up denormal mode handling
ClosedPublic

Authored by arsenm on Oct 29 2019, 6:31 PM.

Details

Summary

Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place. Also begin migrating towards
the current IEEE-754's standard's preferred terminology of subnormal
over denormal.

This is in preparation for introducing FP type specific variants of
the denorm-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.

Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.

The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.

There is a problem with this patch as is. The one place currently
checking this attribute in buildSqrtEstimateImpl (added by D42323)
oddly doesn't assume the ieee behavior if the attribute isn't
specified as I would expect. The question of why this behaves this way
needs to be resolved to proceed as tests fail due to assuming IEEE
behavior on undecorated functions.

Diff Detail

Event Timeline

arsenm created this revision.Oct 29 2019, 6:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 29 2019, 6:31 PM
simoll added a subscriber: simoll.Nov 4 2019, 8:54 AM
arsenm updated this revision to Diff 227775.Nov 4 2019, 1:46 PM
arsenm retitled this revision from WIP: Work on cleaning up denormal mode handling to Work on cleaning up denormal mode handling.

Defer any behavior changes until a future patch, so all tests now pass

spatel added inline comments.Nov 5 2019, 1:19 PM
clang/lib/CodeGen/CGCall.cpp
1745–1746

Do you plan to change the attribute string from "denormal" to "subnormal" as part of upgrading it to work per-FP-type? Would we need to auto-upgrade old IR as part of making the string consistent with the code?

Can we stash the attribute string name inside a getter function in the new ADT file, so clang and LLVM have a common source of truth for the attribute name?

arsenm marked an inline comment as done.Nov 5 2019, 1:36 PM
arsenm added inline comments.
clang/lib/CodeGen/CGCall.cpp
1745–1746

I'm considering it, but at the moment I'm trying to avoid changes. The next step I'm working on is adding denormal-fp-math-f32 (or maybe subnormal-fp-math-f32), which will co-exist and override the current attribute if the type matches

spatel added inline comments.Nov 6 2019, 7:26 AM
clang/lib/CodeGen/CGCall.cpp
1745–1746

I think it would be better to not change the vocabulary incrementally then. Ie, keep everything "denormal" in this patch, and then universally change the terminology to "subnormal" in one step. That way we won't have any inconsistency/confusion between the attribute name and the code.

arsenm marked an inline comment as done.Nov 6 2019, 9:23 AM
arsenm added inline comments.
clang/lib/CodeGen/CGCall.cpp
1745–1746

In the follow up patch, the new attribute uses the old denormal name. The clang option handling maintains the old name to match the flag, but the new internal enums and functions use the subnormal name. Is that a reasonable state? I don’t want to spread the old name through the new utilities, but also don’t want to have to auto upgrade bitcode for a name change

spatel added inline comments.Nov 6 2019, 11:05 AM
clang/lib/CodeGen/CGCall.cpp
1745–1746

I'm not seeing the value in using "subnormal" relative to the confusion cost then. If we're always going to use the "denormal" flag in clang, then I'd prefer to have the code be consistent with that name. That's what I'd grep for, so I think that's what anyone viewing the code for the first time would expect too.

arsenm updated this revision to Diff 228161.Nov 6 2019, 3:35 PM

Rename to denormal

arsenm marked an inline comment as done.Nov 6 2019, 3:36 PM
arsenm added inline comments.
clang/lib/CodeGen/CGCall.cpp
1745–1746

I thought it might be good to move towards the current standard terminology, but it's not critical

arsenm updated this revision to Diff 228168.Nov 6 2019, 5:12 PM

Missed a spot to rename

pengfei added a subscriber: pengfei.Nov 6 2019, 9:30 PM
spatel accepted this revision.Nov 7 2019, 4:56 AM

LGTM - see inline for some leftover naming diffs.

clang/lib/CodeGen/CGCall.cpp
1744–1745

If I'm seeing it correctly, this can't build as-is? FPSubnormalMode is called FPDenormalMode in the header change above here.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
20469

SubnormMode -> DenormMode

This revision is now accepted and ready to land.Nov 7 2019, 4:56 AM

I'm unclear as to the expectations surrounding this option. I suppose this is somewhat beyond the scope of the current changes, but I'm confused by clang's current behavior with regard to denormals.

The -fdenromal-fp-math option causes a function attribute to be set indicating the desired flushing behavior, and I guess in some cases that has an effect on instruction selection, but it seems to be orthogonal to whether or not we're actually setting the processor controls to do flushing (at least for most targets). I really only know what happens in the x86 case, and I don't know if this behavior is consistent across architectures, but in the x86 case setting or not setting the processor control flags depends on the fast math flags and whether or not we find crtfastmath.o when we link.

This leads me to my other point of confusion. Setting the "denormal-fp-math" option on a per-function basis seems wrong for targets that have a global FP control.

clang/include/clang/Basic/CodeGenOptions.h
167

Why is "Invalid" the default here? If you don't use the "fdenormal-fp-math" option, shouldn't you get IEEE?

arsenm marked an inline comment as done.Nov 8 2019, 9:44 PM

I'm unclear as to the expectations surrounding this option. I suppose this is somewhat beyond the scope of the current changes, but I'm confused by clang's current behavior with regard to denormals.

Yes, the current usage is underspecified and broken by default. I complained about this in this post: http://lists.llvm.org/pipermail/llvm-dev/2019-November/136449.html
The difference between whether input denormals are implicitly flushed and whether denormal results are flushed to zero does matter in the current codegen use.

The -fdenromal-fp-math option causes a function attribute to be set indicating the desired flushing behavior, and I guess in some cases that has an effect on instruction selection, but it seems to be orthogonal to whether or not we're actually setting the processor controls to do flushing (at least for most targets). I really only know what happens in the x86 case, and I don't know if this behavior is consistent across architectures, but in the x86 case setting or not setting the processor control flags depends on the fast math flags and whether or not we find crtfastmath.o when we link.

The current user needs to know if it can safely ignore a denormal input to avoid miscompiling sqrt. For AMDGPU this changes some lowering and instructions that are safely selectable. I would also like to be able possibly use this for constant folding llvm.canonicalize, although I'm unsure if we need a "may flush" or "must flush" distinction.

This leads me to my other point of confusion. Setting the "denormal-fp-math" option on a per-function basis seems wrong for targets that have a global FP control.

That's really going to be all targets. For AMDGPU we can directly set the FP mode for the kernels/entry points from this attribute, but not an arbitrary callable function. I do think the floating point environment bits should be a considered a property of the calling convention, with attributes that override them. A function which calls a function with a different mode would be responsible for switching the mode before the call. This would require people actually caring about getting this right to really implement

clang/include/clang/Basic/CodeGenOptions.h
167

Because the current users are broken, and this minimizes changes in the cleanup patches. The follow up patches fix this and switch the default

andrew.w.kaylor accepted this revision.Nov 12 2019, 12:09 PM

Thanks. I understand your direction for denormal handling now, and I'm OK with this patch apart from the remaining references to subnormal that Sanjay mentioned.

I do think the floating point environment bits should be a considered a property of the calling convention, with attributes that override them. A function which calls a function with a different mode would be responsible for switching the mode before the call. This would require people actually caring about getting this right to really implement

Do you mean the compiler should insert code to restore the FP environment on function transitions? Or do you mean that the function itself (i.e. the user's code) is responsible for switching the mode? I have some reservations about this, but I think the C standard specification for FENV_ACCESS is clear that the it is the programmer's responsibility to manage the floating point environment correctly. Yes, that's a sure recipe for broken code, but that's what it says. Obviously LLVM IR is not bound by the C standard and we could take a different approach, but I have concerns about the performance implications because in general the compiler won't know when the environment needs to be restored so it would have to take a conservative approach.

I've been meaning to write some documentation on the expected behavior at function boundaries of the FP environment. Perhaps we can continue this discussion there.

Thanks. I understand your direction for denormal handling now, and I'm OK with this patch apart from the remaining references to subnormal that Sanjay mentioned.

I do think the floating point environment bits should be a considered a property of the calling convention, with attributes that override them. A function which calls a function with a different mode would be responsible for switching the mode before the call. This would require people actually caring about getting this right to really implement

Do you mean the compiler should insert code to restore the FP environment on function transitions?

When calling a convention with a different FP mode, yes. For example graphics shaders have a different default FP mode than compute functions. Theoretically a graphics shader could link a compute function library, which would require switching the mode around the call. My consideration isn't really for users using FENV_ACCESS

arsenm closed this revision.Nov 19 2019, 8:32 AM

7fe9435dc88050ee78eb1d4adec87610dce468f7

This does now need to be merged with the FPEnv.h header