Page MenuHomePhabricator

Part 1 to fix x86_64 fp128 calling convention.

Authored by chh on Dec 1 2015, 4:44 PM.



This is part 1 subset of,
without the .td and new fp128 unit tests.

Most of the changes here only apply to the new x86_64 f128 type configuration,
which is not enable in this part.



  • Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true.
  • Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node.
  • Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. Enhance related Lower* functions to handle f128 type.
  • Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls.
  • Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
  • Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node.
  • Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
  • Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
  • Fix printOperand to handle null operand.
  • Enhance ISD::BITCAST node to handle f128 constant.
  • Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
  • Enhance X86AsmPrinter to emit f128 values in comments.

Diff Detail


Event Timeline

chh updated this revision to Diff 41578.Dec 1 2015, 4:44 PM
chh retitled this revision from to Part 1 to fix x86_64 fp128 calling convention..
chh updated this object.
chh added reviewers: davidxl, rnk, echristo.
chh added subscribers: llvm-commits, grosbach, resistor and 6 others.
davidxl added inline comments.Dec 1 2015, 5:46 PM
52 ↗(On Diff #41578)

Push this change down to cases where No softening is needed.

73 ↗(On Diff #41578)

Unrelated format change.

117 ↗(On Diff #41578)

Put those comments and replace code into a small wrapper -- 'replaceValueIfNeeded'.

734 ↗(On Diff #41578)

This comment seems out of place. Either remove it or put it in better place.

741 ↗(On Diff #41578)

Move this comment also into the wrapper function. Explain better why it does not need to be softened again -- e.g, ReplaceValueWith is called eagerly ...

742 ↗(On Diff #41578)

Move this into a helper function with a proper name? Also add comment why only these subset of opcodes can be skipped.

785 ↗(On Diff #41578)

can be held.

787 ↗(On Diff #41578)

Is the first condition needed? If Res.getNode() != N, ReplaceValueWith should also be called (see below), so this condition does not look correct.

797 ↗(On Diff #41578)

typo re-analyze

241 ↗(On Diff #41578)

Move this comment to SoftFloatResult where short cut was taken:

' Return false to tell caller to scan operands ...'

391 ↗(On Diff #41578)

In that case,

394 ↗(On Diff #41578)

Put it in a helper and add some DEBUG check what kind of Opcode are expected to not 'softened' at this point.

chh marked 11 inline comments as done.Dec 2 2015, 1:48 PM
chh added inline comments.
117 ↗(On Diff #41578)

Use a new small wrapper called ReplaceSoftenFloatResult.

787 ↗(On Diff #41578)

Yes, it is necessary. It's not about Res.getNode()!=N.
When Res.getNode() == N,
(a) if LegalInHWReg, return false and do not reanalyze
(b) Otherwise, at line 799, return true and reanalyze

chh updated this revision to Diff 41671.Dec 2 2015, 1:50 PM
chh updated this object.
chh marked an inline comment as done.
davidxl added inline comments.Dec 2 2015, 3:03 PM
8776 ↗(On Diff #41671)

Move the comment and additional check into a small helper for readability:

if ( (.... || ...) && legalToSimplifyCopysign(N1))

122 ↗(On Diff #41671)

the caller.

775 ↗(On Diff #41671)

llvm_unreachable? When this function is called, isLegalInHWReg must be true.

400 ↗(On Diff #41671)

if (!SoftenedOp.getNode()) {

assert(isSimpleLegalType(...));  // or should it be assert(isLegalInHWReg(...)) ?
return Op;


414 ↗(On Diff #41671)

It is clearer to do:

if (!isLegal...)

if (...)
463 ↗(On Diff #41671)

soften -> softened

464 ↗(On Diff #41671)

because the operand has already been softened in SoftenFloatResult with calls to ReplaceValueWith.

silvas added a subscriber: silvas.Dec 2 2015, 3:13 PM

Each of your bullet points under Changes: sounds like it is about the right size for individual patches. Can you split this up?

chh added a comment.Dec 2 2015, 3:44 PM

Sean, only the "Fix printOperand to handle null operand" in SelectionDAGDumper.cpp might have some value by itself.
All other changes are not useful before x86_64 f128 type is configured to fix the calling convention problem.
This part 1 change should have no impact at all before the rest of is merged in.
Getting this part 1 in first is just to check for any regression.
I think all these changes should go in or be reverted when necessary.
It would be easier to keep track of these changes together, right?

chh marked 4 inline comments as done.Dec 2 2015, 4:32 PM
chh added inline comments.
8776 ↗(On Diff #41671)

Maybe you want something like IsOkayToConvertRoundOrExtendToCopySign(N1)?
The name legalToSimplifyCopysign(N1) here seems confusing,
as there are other 6 cases before this one to combine FABS, FNEG, and FCOPYSIGN operands
Maybe we could keep this condition close to its single user for now
until we know a clear abstraction?

775 ↗(On Diff #41671)

Yes, but it should be better for this function to return false,
then the current caller will reuse existing dbg dump and llvm_unreachable.

400 ↗(On Diff #41671)

I am not sure about assert here.
Original code will assert(SoftenOp.getNode() && ...) after RemapValue(SoftenedOp);
I cannot prove any error there, so I only want to have a condition here to return Op.

414 ↗(On Diff #41671)

Would it be acceptable to test (NewRes.getNode() != N) first, as it should be faster than isLegalInHWReg and not always true.

chh updated this revision to Diff 41693.Dec 2 2015, 4:32 PM
chh marked an inline comment as done.
davidxl added inline comments.Dec 2 2015, 4:54 PM
8776 ↗(On Diff #41693)

The comment of the method should clarify the intention -- I am not worried about naming.

775 ↗(On Diff #41693)

ok -- better change the name of the function to 'CanSkipSoft...'

400 ↗(On Diff #41693)

With the original code, this branch should never be taken -- so the new assertion should be safe, right?

414 ↗(On Diff #41693)


silvas added a comment.Dec 2 2015, 6:21 PM
In D15134#300978, @chh wrote:

It would be easier to keep track of these changes together, right?

No, it is actually the opposite. The more fine-grained the individual commits, the easier it is for a reviewer to review, and the easier it is to identify what caused an issue if an issue does show up. Tests are also easier to write for fine-grained commits (or, it is easier to identify that a fine-grained commit does not change externally visible behavior and so doesn't require a test).

Also, the changes you are proposing here seem to fix bugs / introduce new functionality. Therefore tests are required.

chh added a subscriber: chh.Dec 3 2015, 9:48 AM

Sean, could you review the test cases in too?

New features (or bug fixes) introduced in this patch really should only be
executed when x86_64 f128 type is configured to stay in SSE registers.
We don't turn on that configuration in this partial patch, so all unit
tests in D11438 won't apply. Is there some new test we can add to test
other types for this patch?

After my initial submit of D11438 in Jul 22 2015, I have received virtually
no comment on those unit tests yet. I do feel the risk of test coverage for
both new x86_64 f128 config and existing configurations.

I explained before in patch D11438 comments that a partial change of x86_64
f128 is worse than no change at all, as we would rather have a working
non-gcc compatible x86_64 long double type than a buggy type or crashing
compiler. So a partial commit like this has only benefit to verify partial
regression to non x86_64 f128 types, which has lower possibility and I have
no trouble to revert them all at once.

The main battle will be for Android platform to verify the correctness of
x86_64 f128 type after the configuration is submitted in the follow up

chh marked an inline comment as done.Dec 3 2015, 10:45 AM
chh added inline comments.
8776 ↗(On Diff #41693)

Fine. Please see the new split out function CanCombineFCOPYSIGN_EXTEND_ROUND.

400 ↗(On Diff #41693)

Right, but the new assertion will be different from the old assertion at line 403 when it happens. As we are so carefully to avoid change of behavior to non x86_64 f128 type, I would avoid the change to assertion too.

chh updated this revision to Diff 41777.Dec 3 2015, 10:46 AM
davidxl accepted this revision.Dec 3 2015, 11:03 AM
davidxl edited edge metadata.

f128 related changes are pretty isolated at this point, so I will lgtm and the further changes can iterated later.

This revision is now accepted and ready to land.Dec 3 2015, 11:03 AM
chh added a comment.Dec 3 2015, 1:49 PM

David, thanks a lot for the review. This patch won't be this clean without your suggestions.
I'm going to submit this and watch for any regression.

This revision was automatically updated to reflect the committed changes.
spatel added a subscriber: spatel.Jan 8 2016, 11:59 AM
spatel added inline comments.

Why is an x86-specific limitation being handled in DAGCombiner? Maybe no target supports f128, so it doesn't matter, but this seems wrong.

chh added inline comments.Jan 8 2016, 3:23 PM

I thought this optimization won't work for f128 given that current
llvm targets I know do not keep f128 in one register and have not defined
required instruction selection patterns. Instead of keeping this
optimization here for all targets and disable it in a target specific
way, it seems simpler to disable it here for f128 and enable such optimization
for any target later.

That being said, I didn't check all ad-hoc llvm floating point optimizations.
If my assumption is wrong or if there is a simpler way to handle this,
please feel free to suggest any different implementation.