Page MenuHomePhabricator

[Driver, CodeGen] pass through and apply -fassociative-math
ClosedPublic

Authored by spatel on Nov 8 2017, 11:17 AM.

Details

Summary

There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:

  1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the interactions with other flags like -ffast-math. This was mostly already done - we just needed to pass the flag on as a codegen option. The test file is complicated because there are many potential combinations of flags here.
  1. In codegen, we map the option to FMF in the IR builder. This is simple code and corresponding test.

For the motivating example from PR27372:

float foo(float a, float x) { return ((a + x) - x); }

$ ./clang -O2 27372.c -S -o - -ffast-math  -fno-associative-math -emit-llvm  | egrep 'fadd|fsub'
  %add = fadd nnan ninf nsz arcp contract float %0, %1
  %sub = fsub nnan ninf nsz arcp contract float %add, %2

So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch). This case now works as expected end-to-end although the underlying logic is still wrong:

$ ./clang  -O2 27372.c -S -o - -ffast-math  -fno-associative-math | grep xmm
	addss	%xmm1, %xmm0
	subss	%xmm1, %xmm0

We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:

$ ./clang  -O2 27372.c -S -o - -fassociative-math -emit-llvm  | grep fadd
  %add = fadd reassoc float %0, %1

$ ./clang -O2  27372.c -S -o - -fassociative-math | grep xmm
	addss	%xmm1, %xmm0
	subss	%xmm1, %xmm0

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Nov 8 2017, 11:17 AM

I just reviewed the gcc docs:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

"[-fassociative-math] requires that both -fno-signed-zeros and -fno-trapping-math be in effect."

If we want to match that behavior, I need to change this patch to light up 'nsz' and the no-trapping-math function attribute.

wristow edited edge metadata.Nov 8 2017, 11:44 AM

I just reviewed the gcc docs:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

"[-fassociative-math] requires that both -fno-signed-zeros and -fno-trapping-math be in effect."

If we want to match that behavior, I need to change this patch to light up 'nsz' and the no-trapping-math function attribute.

Yes, I think we want to match that behavior. But by "light up" 'nsz' and the no-trapping-math attribute, do you mean automatically turn them on when '-fassociative-math' is specified? I'd think it should be the other way around: Suppress the effect of '-fassociative-math' unless both '-fno-signed-zeros' and '-fno-trapping-math' are also specified.

spatel added a comment.Nov 8 2017, 1:00 PM

Yes, I think we want to match that behavior. But by "light up" 'nsz' and the no-trapping-math attribute, do you mean automatically turn them on when '-fassociative-math' is specified? I'd think it should be the other way around: Suppress the effect of '-fassociative-math' unless both '-fno-signed-zeros' and '-fno-trapping-math' are also specified.

Yes, my reading of the gcc docs was off. The code appears to match your understanding:
https://godbolt.org/g/Vau3cv

So we'll restrict the effect of -fassociative-math based on the presence of the other flags...more spaghetti needed. :)

spatel updated this revision to Diff 122155.Nov 8 2017, 2:29 PM

Patch updated:
Match gcc's handling of -fassociative-math with -fno-signed-zeros and -fno-trapping-math. I've created a new codegen option (-mreassociate) that maps directly to LLVM's 'reassoc' flag, so we limit the mix-and-match logic of these flags to the driver/frontend.

So now we have:

$ ./clang  27372.c -S -o -  -fassociative-math -O2 -emit-llvm | egrep 'fadd|fsub'
  %add = fadd float %a, %x
  %sub = fsub float %add, %x
$ ./clang  27372.c -S -o -  -fassociative-math -fno-signed-zeros -O2 -emit-llvm | egrep 'fadd|fsub'
  %add = fadd nsz float %a, %x
  %sub = fsub nsz float %add, %x
$ ./clang  27372.c -S -o -  -fassociative-math -fno-signed-zeros -fno-trapping-math -O2 -emit-llvm | egrep 'fadd|fsub'
  %add = fadd reassoc nsz float %a, %x
  %sub = fsub reassoc nsz float %add, %x
spatel updated this revision to Diff 122158.Nov 8 2017, 2:32 PM

Updated again - the last upload was missing the driver's test file.

hfinkel accepted this revision.Dec 13 2017, 3:07 PM

LGTM

This revision is now accepted and ready to land.Dec 13 2017, 3:07 PM
This revision was automatically updated to reflect the committed changes.