This is an archive of the discontinued LLVM Phabricator instance.

Correct x86_64 fp128 calling convention
ClosedPublic

Authored by rnk on Jul 22 2015, 3:58 PM.

Diff Detail

Event Timeline

chh updated this revision to Diff 30416.Jul 22 2015, 3:58 PM
chh retitled this revision from to Correct x86_64 fp128 mangled name and return/varargs types .
chh updated this object.
chh added a subscriber: davidxl.Jul 22 2015, 4:13 PM

Please split this into two patches: one for the CC change and another for the mangling change.

majnemer edited subscribers, added: cfe-commits; removed: llvm-commits.Jul 22 2015, 4:33 PM
rnk added a comment.Jul 22 2015, 4:35 PM

We really should not use the general LLVM IR type conversion machinery to decide how we are classifying our arguments. There are existing instances of us doing this, but we should strive to eliminate them. I think the right approach is probably to check getLongDoubleFormat(), see if it is APFloat::IEEEquad or APFloat::x86DoubleExtended, and pick memory, sse, or x87 classifications based on that.

majnemer added inline comments.Jul 22 2015, 4:45 PM
lib/CodeGen/TargetInfo.cpp
1974–1989

I'd phrase this as:

} else if (ET == getContext().DoubleTy) {
  Lo = Hi = SSE;
} else if (ET == getContext().LongDoubleTy) {
  const llvm::fltSemantics *LDF = &getTarget().getLongDoubleFormat();
  if (LDF == &llvm::APFloat::IEEEquad)
    Current = Memory;
  else if (LDF == &llvm::APFloat::x87DoubleExtended)
    Current = ComplexX87;
  else if (LDF == &llvm::APFloat::IEEEdouble)
    Lo = Hi = SSE;
  else
    llvm_unreachable("unexpected long double representation!");
}
chh added a comment.Jul 23 2015, 10:23 AM

Separated the name mangling change into http://reviews.llvm.org/D11466.
I will fix the CC part in this patch soon.

chh updated this revision to Diff 30523.Jul 23 2015, 1:48 PM
chh retitled this revision from Correct x86_64 fp128 mangled name and return/varargs types to Correct x86_64 fp128 calling convention.
chh updated this object.

Change conditions from

CGT.ConvertType(RetTy)->isFP128Ty()

to

BT->getKind() == BuiltinType::LongDouble &&
&getTarget().getLongDoubleFormat() == &llvm::APFloat::IEEEquad

Use %clang_cc1 in unit tests.
Some checked IL output still depends on -O.

rnk added inline comments.Jul 23 2015, 1:56 PM
lib/CodeGen/TargetInfo.cpp
1861–1866

Any reason we can't do the same fp classification here like we do below for complex types?

1975–1976

Nice, I like this simplification.

2528

Why can't we classify IEEEQuad long doubles as SSE in the usual classify() implementation?

2667

ditto

chh marked an inline comment as done.Jul 23 2015, 4:22 PM

I tried to make X86_64ABIInfo::classify to return (Lo=SSE, Hi=NoClass) for fp128 long double type, or (Lo=SSE, Hi=SSEUp). That is not enough, although making fp128 Complex type to "Memory" worked.

X86_64ABIInfo::classifyArgumentType and classifyReturnType
will classify fp128 type as "double" through the help of GetSSETypeAtOffset.
These two or three functions still need more changes to handle fp128.
So I used the special cases for fp128, which seemed simpler with lower risk.

The mapping to register classes is quite complicated to decide converted parameter or return types. Although AMD64 spec has lengthy rules written this way, the rules are quite difficult to understand the mapping of fp128 type.

Is there other way to simplify these classification functions?

lib/CodeGen/TargetInfo.cpp
1973–1987

Done.

chh updated this revision to Diff 31268.Aug 3 2015, 2:37 PM
chh marked an inline comment as done.

New svn diff after svn update.

chh added a comment.Aug 3 2015, 2:59 PM

Ping. I hope the new synced diff will be easier to review.

rnk added a comment.Aug 6 2015, 10:54 AM
In D11437#211165, @chh wrote:

I tried to make X86_64ABIInfo::classify to return (Lo=SSE, Hi=NoClass) for fp128 long double type, or (Lo=SSE, Hi=SSEUp). That is not enough, although making fp128 Complex type to "Memory" worked.

X86_64ABIInfo::classifyArgumentType and classifyReturnType
will classify fp128 type as "double" through the help of GetSSETypeAtOffset.
These two or three functions still need more changes to handle fp128.
So I used the special cases for fp128, which seemed simpler with lower risk.

The mapping to register classes is quite complicated to decide converted parameter or return types. Although AMD64 spec has lengthy rules written this way, the rules are quite difficult to understand the mapping of fp128 type.

Is there other way to simplify these classification functions?

I think the right approach is to classify as SSE+SSEUp. It didn't work for you because GetByteVectorType was turning fp128 types into <2 x double>, which will correctly use XMM registers, but is not the IR you wanted.

I have a patch that fixes the TODOs and simplifies the code, do you mind if I land that?

rnk commandeered this revision.Aug 6 2015, 10:56 AM
rnk added a reviewer: chh.

Comandeering so I can upload my diff.

rnk updated this revision to Diff 31457.Aug 6 2015, 10:56 AM
rnk updated this object.
  • Update classify and GetByteVectorType
chh edited edge metadata.Aug 6 2015, 3:55 PM

Reid, thanks a lot for fixing my hacks!

I tried your new diff 31457 and it worked for Android libm and all my other tests.
I am still waiting for some review of the back end changes in http://reviews.llvm.org/D11438.
This patch can be submitted now or later with D11438.
Would you like to submit this one?

chh accepted this revision.Aug 6 2015, 3:56 PM
chh edited edge metadata.
This revision is now accepted and ready to land.Aug 6 2015, 3:56 PM
This revision was automatically updated to reflect the committed changes.