This is an archive of the discontinued LLVM Phabricator instance.

don't transform splats of vector FP (PR20358)
AbandonedPublic

Authored by spatel on Jul 18 2014, 11:20 AM.

Details

Summary

This patch addresses the performance and security problem shown here:
http://llvm.org/bugs/show_bug.cgi?id=20358

Because of unintended operations on denorms, it may be a severe performance loss to transform shuffle ops on FP vectors. Don't do that unless we're working with unsafe algebra (ie, -ffast-math) where it's likely that denorms are being ignored or flushed.

Diff Detail

Event Timeline

spatel updated this revision to Diff 11652.Jul 18 2014, 11:20 AM
spatel retitled this revision from to don't transform splats of vector FP (PR20358).
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added a subscriber: Unknown Object (MLST).
spatel updated this revision to Diff 40871.Nov 21 2015, 10:21 AM
spatel updated this object.
spatel added a reviewer: RKSimon.

Patch updated:
It's likely that denorms are not a concern in a -ffast-math setting, so limit this patch to only fire (ie, bail out of the transform) on FP ops that are not 'fast'.

As noted in the bug report, there are now known security exploits related to denorm execution (!), so there's another reason for this patch besides the potentially giant performance loss:
http://cseweb.ucsd.edu/~dkohlbre/papers/subnormal.pdf

Also - and I may be going for a world-record here - it's been ~490 days, so...

Ping. :)

aschwaighofer edited edge metadata.

This might penalize programs that don't traffic in denormals (most programs as I understand it). Did you measure the performance impact?

Did the bug report originate from a real program where this is an issue?

Though, I agree that putting a denormal on the execution path where it was not before seems like bad form.

This might penalize programs that don't traffic in denormals (most programs as I understand it). Did you measure the performance impact?

I agree with the rarity of denormals assessment. Although I would alter it slightly to "most programs don't think they traffic in denormals." Every once in a while, you run into a performance or profile mystery that seems inexplicable until you discover that some op has dipped into denorm territory. It's hard enough to spot that when the program actually causes it. If the compiler is introducing the denormal hit, that's a debugging problem I don't think anyone ever wants to see.

So yes, there is a potential perf loss: we're not eliminating one splat instruction with this patch. I view this as a small price to pay for not introducing performance unpredictability into a program. That said, I don't see any perf differences using test-suite on x86-64 with this change.

Did the bug report originate from a real program where this is an issue?

No. But I hope I never see that real program. :)

Also, I don't know what denorm penalties look like outside of x86 and some old PPC chips. If there are targets where denorm ops are free/cheap, that would be a good argument to make this target-specific.

I just connected the dots between this case and other recent speculative execution bugs/patches:
https://llvm.org/bugs/show_bug.cgi?id=24343
https://llvm.org/bugs/show_bug.cgi?id=24818
https://llvm.org/bugs/show_bug.cgi?id=25572
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297
http://reviews.llvm.org/D14630

This splat is just a special (and probably extremely rare) case of the general problem:

#define DENORM ((float)(1.0e-39))
float foo(float x, float y) {
   if (x > DENORM) return x * y;
   else return y;
}

$ ./clang -O2 denorm.c -S -o - -emit-llvm

define float @foo(float %x, float %y) #0 {
  %cmp = fcmp ogt float %x, 0x37D5C73000000000
  %mul = fmul float %x, %y    <--- crazy expensive op that we wanted to avoid just got executed
  %retval.0 = select i1 %cmp, float %mul, float %y
  ret float %retval.0
}

Unless there is objection, I'll abandon this patch. If we really want to solve this, we'd have to add some global flag (-fno-speculative-fp-math?) or instruction-level annotation to include FP ops under !isSafeToSpeculativelyExecute().

hfinkel edited edge metadata.Nov 23 2015, 5:03 PM

I just connected the dots between this case and other recent speculative execution bugs/patches:
https://llvm.org/bugs/show_bug.cgi?id=24343
https://llvm.org/bugs/show_bug.cgi?id=24818
https://llvm.org/bugs/show_bug.cgi?id=25572
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297
http://reviews.llvm.org/D14630

This splat is just a special (and probably extremely rare) case of the general problem:

#define DENORM ((float)(1.0e-39))
float foo(float x, float y) {
   if (x > DENORM) return x * y;
   else return y;
}

$ ./clang -O2 denorm.c -S -o - -emit-llvm

define float @foo(float %x, float %y) #0 {
  %cmp = fcmp ogt float %x, 0x37D5C73000000000
  %mul = fmul float %x, %y    <--- crazy expensive op that we wanted to avoid just got executed
  %retval.0 = select i1 %cmp, float %mul, float %y
  ret float %retval.0
}

Unless there is objection, I'll abandon this patch. If we really want to solve this, we'd have to add some global flag (-fno-speculative-fp-math?) or instruction-level annotation to include FP ops under !isSafeToSpeculativelyExecute().

I agree. Also, this seems related to Sergey Dmitrouk's work on enabling FP env access, etc.

I agree. Also, this seems related to Sergey Dmitrouk's work on enabling FP env access, etc.

Thanks, Hal. I wasn't aware of that work.
cc'ing Sergey on this thread for one more thing to consider in the design.

spatel abandoned this revision.Nov 24 2015, 8:02 AM
lib/Transforms/InstCombine/InstructionCombining.cpp