Page MenuHomePhabricator

[complex] Teach Clang to preserve different-type operands to arithmetic operators where one type is a C complex type, and to emit both the efficient and correct implementation for complex arithmetic according to C11 Annex G using this extra...
ClosedPublic

Authored by chandlerc on Oct 9 2014, 3:16 AM.

Details

Summary

...information.

For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.

However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though, without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on assymetric type operands to
complex math operations.

The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduce, most sped up and some by a huge amount.

TODO: In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is a broken
port of the simplifications from C's Annex G to the long-hand formulation of
the algorithm.

Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc updated this revision to Diff 14644.Oct 9 2014, 3:16 AM
chandlerc retitled this revision from to [complex] Teach Clang to preserve different-type operands to arithmetic operators where one type is a C complex type, and to emit both the efficient and correct implementation for complex arithmetic according to C11 Annex G using this extra....
chandlerc edited the test plan for this revision. (Show Details)
chandlerc added reviewers: hfinkel, scanon, resistor.
chandlerc updated this object.
chandlerc added a subscriber: Unknown Object (MLST).

A couple of other points on this patch.

  • We could get a lot more out of these kinds of simplification, and

generally improve the sanity of Clang's model here, by adding support for
the C11 Imaginary type and using that to build up things. But this is more
work than I can sign up for right now, and I'm not the best person to do it.

  • You might wonder why not add a complex type to the IR so that the

properties here are exposed at that level. There are some compelling
reasons not to do this IMO.

  • Would need to "lower" this at some point which would be ugly. No chip

(that LLVM supports) has complex registers and arithmetic operations....

  • Incredibly hard to change the IR in general, so bad cost/benefit

tradeoff.

  • While this patch handles the simplifications to complex arithmetic we can

do when the typesystem itself shows one operand to be a real, it doesn't
handle large other categories such as when inlining or other optimizations
propagate away the imaginary part for example.

  • All of these should be easily caught by a target library optimization
  • I'll work on just such a target library pass to help polish off the

last few missed optimzations.

  • This has uncovered some nasty behavior with SROA. Need to fix that.....
scanon accepted this revision.Oct 9 2014, 8:44 AM
scanon edited edge metadata.

Thanks for working on this, Chandler, it’s an enormous improvement over the current state of affairs.

I am a bit concerned about the performance impact of unconditionally turning mul and div into libcalls; I expect we’ll see large regressions on a few focused benchmarks from this. In the long run, we likely want CX_LIMITED_RANGE on semantics to be the default, but that will require implementing support for the pragma or some other means to control the behavior (-fcx-limited-range=on?).

Nonetheless, this is a large performance improvement for cases that are actually used in a relatively mainstream project (and it’s strictly an improvement w.r.t. correctness), so I am perfectly content to put such issues into the bug tracker to be handled separately.

This revision is now accepted and ready to land.Oct 9 2014, 8:44 AM
hfinkel accepted this revision.Oct 9 2014, 1:18 PM
hfinkel edited edge metadata.

I agree with Steve, this looks good. To add one thing: I'd also like to skip the libcalls when possible in fast-math mode.

lib/CodeGen/CGExprComplex.cpp
590 ↗(On Diff #14644)

Indent?

rsmith added a subscriber: rsmith.Oct 9 2014, 3:56 PM

I'd like to see some added tests for constant folding here.

lib/AST/ExprConstant.cpp
7694–7697 ↗(On Diff #14644)

How do we reach this case? The below handling for real operands calls EvaluateFloat rather than EvaluateComplex.

7883 ↗(On Diff #14644)

You should assign true to these somewhere =)

7909 ↗(On Diff #14644)

Maybe assert(!(LHSReal && RHSReal)); here?

7933 ↗(On Diff #14644)

Is changeSign right here? Should 0 - (0+0i) be 0+0i or 0-0i?

8013–8014 ↗(On Diff #14644)

Indentation looks off here. Maybe it's just Phab?

8023 ↗(On Diff #14644)

Does this do the right thing with negative zeros? Eg, 1 / (1 + 0i) will give 1-0i but should presumably be 1+0i.

lib/CodeGen/CGExprComplex.cpp
556–557 ↗(On Diff #14644)

Same question as before regarding negative zeros.

593 ↗(On Diff #14644)

My 80-col sense is tingling.

618–619 ↗(On Diff #14644)

Do we need to do anything about underflow/overflow in the constant evaluator?

chandlerc updated this revision to Diff 14713.Oct 10 2014, 4:57 AM
chandlerc edited edge metadata.

Completely re-work the constant folding code to correctly follow the same logic
as the library calls do, but in terms of APFloat. Add extensive compile time
folding tests. These tests have to be written in C++ despite testing the
C complex number functionality because we need C++ static assertions in order
to fold these complex things. Yes, pun intended. =]

chandlerc updated this revision to Diff 14714.Oct 10 2014, 4:59 AM

Run clang-format over the patch to fix some lingering formatting issues.

chandlerc updated this revision to Diff 14716.Oct 10 2014, 5:26 AM

Address Richard's comments.

All comments addressed. Answers to questions below. Fresh patches uploaded and ready for review.

The constant folding is now in good shape I deem it. Took some hacking on APFloat to get there. Added test cases as well. Had fun testing complex division with Wolfram Alpha. =] We seem to get the right answers.

One area I'm not testing heavily is sign propagation in the face of infinities. I'd appreciate if someone (yea, I'm looking at you Steve) could contribute more thorough testing of the sign propagation properties of these operations, and/or tighten or relax the assertions on the real vs. imaginary components to match what numerics folks actually want to enshrine in the constant folder. Hopefully my test case is good enough to make it clear how to extend things going forward.

I also have plans to address both the libcall performance concerns in general and the fast-math concerns in follow-up patches, but I'd like to get a baseline of correctness in first. =]

lib/AST/ExprConstant.cpp
7694–7697 ↗(On Diff #14644)

We don't. This was a speculative edit I shouldn't have made.

7883 ↗(On Diff #14644)

Indeed. As you noticed (and I tried but failed to make clear) once I knew I would have to re-implement this from the principled version in our libcalls, I stopped testing it and this was completely broken and wrong.

7909 ↗(On Diff #14644)

Sure. I had some of that later, but nicer to state it up front.

7933 ↗(On Diff #14644)

Quite surprisingly, I believe this is correct according to the table in G.5.2p2. It stipulates that a complex value subtracted from a real value as x - (u + iv) == (x - u) - iv. Because subtraction is represented as the sum of a negative, especially for complex numbers, this indicates that 'v' must be negated here, even if that would form 0.0 - 0.0i. The 'negative zero' here which is surprising resulting from subtraction of two values with the same sign isn't a big deal because in complex form it just feeds another subtraction.

8013–8014 ↗(On Diff #14644)

No, formatting was off here.

lib/CodeGen/CGExprComplex.cpp
556–557 ↗(On Diff #14644)

Same answer.

590 ↗(On Diff #14644)

Fixed.

593 ↗(On Diff #14644)

Fixed....

BY THE POWER OF CLANG-FORMAT!

618–619 ↗(On Diff #14644)

Not necessarily. Unlike with integers, such events should (IMO) simply produce the relevant infinity or zero value.

My inclination is that the constant folder should error when a full operation produces a NaN. This would let the complex arithmetic logic shield it from certain NaN conditions, but still bail when we go Too Far.

rsmith added inline comments.Oct 10 2014, 4:39 PM
lib/AST/ExprConstant.cpp
8047–8048 ↗(On Diff #14716)

It looks like B is not initialized if LHSReal is true here.

chandlerc closed this revision.Oct 10 2014, 6:07 PM
chandlerc updated this revision to Diff 14762.

Closed by commit rL219557 (authored by @chandlerc).