This is an archive of the discontinued LLVM Phabricator instance.

[COFF, ARM64] Fix ABI implementation of struct returns
ClosedPublic

Authored by mgrang on Apr 5 2019, 5:35 PM.

Diff Detail

Event Timeline

mgrang created this revision.Apr 5 2019, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2019, 5:35 PM
mgrang added a comment.Apr 5 2019, 5:42 PM

Got rid of the confusing SuppressSRet logic. The "inreg" attribute is now used to indicate sret returns in X0.

ostannard requested changes to this revision.Apr 11 2019, 6:23 AM
ostannard added a subscriber: ostannard.

The document you linked in the LLVM change (https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values) says that small POD types are returned directly in X0 or X0 and X1, but this looks like it will always return them indirectly. I think we also need to check the size of the type, and fall back the the plain C ABI for small types.

lib/CodeGen/MicrosoftCXXABI.cpp
1095

Nit: this will use X1 for functions with a this parameter, not X0.

1097

This should also check aarch64_be.

This revision now requires changes to proceed.Apr 11 2019, 6:23 AM
efriedma added inline comments.Apr 12 2019, 12:51 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1097

aarch64_be-windows isn't a thing; it doesn't make sense to check for that explicitly.

That said, there is an isAArch64() helper which is probably more readable anyway.

mgrang updated this revision to Diff 194954.Apr 12 2019, 1:41 PM
mgrang marked an inline comment as done.

The document you linked in the LLVM change (https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values) says that small POD types are returned directly in X0 or X0 and X1, but this looks like it will always return them indirectly. I think we also need to check the size of the type, and fall back the the plain C ABI for small types.

richard.townsend.arm added inline comments.
lib/CodeGen/MicrosoftCXXABI.cpp
1093

Microsoft have updated the spec since this was written, it now says to check for aggregate-ness, trivial copy, and trivial destruct, instead of POD.

ssijaric added inline comments.Apr 17 2019, 5:21 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1093

This is only true for aggregates. We can have non-aggregate return types that are less or equal to 16 bytes. In this case, they are returned on the stack.

1110

The size returned is in bits, not bytes.

As mentioned above, this applies to aggregates with trivial copy assignment operators and destructors. I will provide a function to check for this. At that point, the check can be removed from here, and the check below can be replaced with something like:

bool isIndirectReturn = !canReturnInRegisters();

lib/CodeGen/MicrosoftCXXABI.cpp
1104

Based on https://reviews.llvm.org/D60348 - lib/Target/AArch64/AArch64CallingConvention.td (line 44), should this be (isIndirectReturn || isInstanceMethod)?

mgrang updated this revision to Diff 195932.Apr 19 2019, 4:23 PM
mgrang updated this revision to Diff 196109.Apr 22 2019, 11:55 AM
efriedma added inline comments.Apr 22 2019, 12:49 PM
lib/Sema/SemaDeclCXX.cpp
5906

The isAggregate predicate is not appropriate to check here; it varies depending on whether the language version is C++14 or C++17. For example, C++17 aggregates are allowed to have base classes. You have to split out checking the relevant properties (getNumBases(), isDynamicClass(), etc.).

mgrang updated this revision to Diff 196144.Apr 22 2019, 3:37 PM
mgrang marked an inline comment as done.Apr 22 2019, 3:42 PM
efriedma added inline comments.Apr 22 2019, 4:23 PM
lib/Sema/SemaDeclCXX.cpp
5904

I'm not entirely sure it makes sense to do all of these Windows-specific checks here... I think it makes trivial_abi not work the way it should. But that's not something we need to worry about for the initial patch, I think.

5916

D->getAccess() is the access specifier for the class itself (if the class is nested inside another class), not the access specifier for any of its members.

I think you're looking for HasProtectedFields and HasPrivateFields. (I think there currently isn't any getter on CXXRecordDecl for them at the moment, but you can add one.)

5922

Why are you checking needsImplicitCopyAssignment()?

lib/Sema/SemaDeclCXX.cpp
5908

I think these checks are in the wrong place - the aggregate restriction applies to return values, but it doesn't apply to arguments. Placing this logic here breaks argument passing between MSVC and Clang, because Clang (with this logic in place) will pass and expect arguments with an extra layer of indirection: for example, when passing https://cs.chromium.org/chromium/src/v8/include/v8.h?q=v8::Local&sq=package:chromium&g=0&l=183 v8::Local, what you actually see in the register bank is supposed to be its only private member. Moving the logic to MicrosoftCXXABI and applying it only to the return value resolves this problem.

5916

So the issue that I mentioned in https://bugs.llvm.org/show_bug.cgi?id=41135#c18 is resolved by checking HasProtectedFields/HasPrivateFields.

mgrang updated this revision to Diff 196341.Apr 23 2019, 3:13 PM
mgrang marked 3 inline comments as done.
mgrang updated this revision to Diff 196342.Apr 23 2019, 3:16 PM

It looks like there's some missing documentation in the ARM64 ABI document involving homogeneous aggregates... in particular, it looks like non-POD types are never homogeneous, or something along those lines. I guess we can address that in a followup, though.

@TomTan could you look into updating the ARM64 ABI documentation?

Testcase:

struct Pod {
  double b[2];
};
struct NotAggregate {
  NotAggregate();
  double b[2];
};
struct NotPod {
  NotAggregate x;
};
Pod copy(Pod *x) { return *x; }  // ldp d0,d1,[x0]
NotAggregate copy(NotAggregate *x) { return *x; } // stp x8,x9,[x0]
NotPod copy(NotPod *x) { return *x; } // ldp x0,x1,[x8]
lib/Sema/SemaDeclCXX.cpp
5959

just did some quick tests; I think this rule does in fact apply to AArch64, except that the limit is 128 bits, instead of 64 bits.

It looks like there's some missing documentation in the ARM64 ABI document involving homogeneous aggregates... in particular, it looks like non-POD types are never homogeneous, or something along those lines. I guess we can address that in a followup, though.

@TomTan could you look into updating the ARM64 ABI documentation?

Testcase:

struct Pod {
  double b[2];
};
struct NotAggregate {
  NotAggregate();
  double b[2];
};
struct NotPod {
  NotAggregate x;
};
Pod copy(Pod *x) { return *x; }  // ldp d0,d1,[x0]
NotAggregate copy(NotAggregate *x) { return *x; } // stp x8,x9,[x0]
NotPod copy(NotPod *x) { return *x; } // ldp x0,x1,[x8]

@efriedma from your test sample above, which part is missing in the ARM64 ABI document? For NotPod, it is aggregate which is specific in the document.

For NotPod, it is aggregate which is specific in the document

Yes, it's an aggregate which is returned in registers... but it's returned in integer registers, unlike Pod which is returned in floating-point registers.

For NotPod, it is aggregate which is specific in the document

Yes, it's an aggregate which is returned in registers... but it's returned in integer registers, unlike Pod which is returned in floating-point registers.

struct Pod is HFA (Homogenous Floating-Point Aggregates) which is returned in floating-point register, struct NotPod is not HFA so still returned in integer registers. The ARM64 ABI document will be updated to reflect this, thanks.

lib/CodeGen/MicrosoftCXXABI.cpp
1074

Should this function also check for user-provided constructors?

lib/CodeGen/MicrosoftCXXABI.cpp
1074

I think it should: I speculatively added these two lines

if (RD->hasUserDeclaredConstructor())
  return true;

and it resolved the problem with std::setw I mentioned in the bug tracker (which means Electron could start).

efriedma added inline comments.Apr 25 2019, 12:32 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1074

The comment says it's supposed to check for user-provided constructors, but there isn't any code to perform that check, yes.

hasUserDeclaredConstructor() isn't precisely correct; from the C++ standard, "A function is user-provided if it is user-declared and not explicitly defaulted or deleted on its first declaration."

For NotPod, it is aggregate which is specific in the document

Yes, it's an aggregate which is returned in registers... but it's returned in integer registers, unlike Pod which is returned in floating-point registers.

struct Pod is HFA (Homogenous Floating-Point Aggregates) which is returned in floating-point register, struct NotPod is not HFA so still returned in integer registers. The ARM64 ABI document will be updated to reflect this, thanks.

@efriedma Return info for HFA and HVA is updated in https://github.com/MicrosoftDocs/cpp-docs/pull/970.

Return info for HFA and HVA is updated

That's helpful, but not really sufficient; according to the AAPCS rules, both "Pod" and "NotPod" from my testcase are HFAs.

Return info for HFA and HVA is updated

That's helpful, but not really sufficient; according to the AAPCS rules, both "Pod" and "NotPod" from my testcase are HFAs.

Could you provide some more details on why NotPod is HFA, according to AAPCS?

Could you provide some more details on why NotPod is HFA, according to AAPCS?

In general, computing whether a composite type is a "homogeneous aggregate" according to section 4.3.5 depends only on the "fundamental data types". It looks through "aggregates" (C structs/C++ classes), unions, and arrays. The test applies "without regard to access control or other source language restrictions". Section 7.1.6 mentions there are additional rules that apply to non-POD structs, but that's just referring to the Itanium C++ ABI rule that types which are "non-trivial for the purposes of calls" are passed and returned indirectly. Both clang and gcc agree that "NotPod" is an HFA on AArch64 Linux.

So the question is, what rule is MSVC using to exclude "NotPod" from being an HFA?

mgrang updated this revision to Diff 196774.Apr 25 2019, 6:31 PM
efriedma added inline comments.Apr 25 2019, 6:54 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1078

Like I mentioned before, the rule says "user-provided", not "user-declared".

1080

The rule says "trivial" not "user-declared".

Confirmed just now that the current diff (196774) does work, but it'd be good to correct user-provided constructors and trivial destructors before this lands.

mgrang updated this revision to Diff 196894.Apr 26 2019, 11:56 AM

The current diff (196894) seems to have the same std::setw issue as the previous one. Here's what it's trying to compile:

_MRTIMP2 _Smanip<streamsize> __cdecl setw(streamsize wide)
{	// manipulator to set width
      return (_Smanip<streamsize>(&swfun, wide));
 }

Here's the definition for _Smanip:

  template<class _Arg>
    struct _Smanip
      {	// store function pointer and argument value
        _Smanip(void (__cdecl *_Left)(ios_base&, _Arg), _Arg _Val)
		: _Pfun(_Left), _Manarg(_Val)
		{	// construct from function pointer and argument value
		}

               void (__cdecl *_Pfun)(ios_base&, _Arg);	// the function pointer
                   _Arg _Manarg;	// the argument value
     };

Apologies, I meant to write "here's what Clang is trying to call" in the previous comment. It's clear from looking at MSVC's output that MSVC expects to return indirectly via X0, implying that _Smanip is not aggregate by MSVC's definition.

lib/CodeGen/MicrosoftCXXABI.cpp
1080

So I think that the problem with this new check is that it doesn't check all of the constructors. I replaced it with

for (auto it = RD->ctor_begin(); it != RD->ctor_end(); ++it) {
  if (it->isUserProvided())
    return true;
}

And that seems to resolve the setw problem.

mgrang updated this revision to Diff 197158.Apr 29 2019, 12:03 PM

Thanks @richard.townsend.arm . I have updated the patch.

I'd like to see a few more tests to cover the interesting cases that came up during development of this patch:

  1. The user-provided constructor issue: there should be testcases with an explicit user-provided constructor, a non-trivial constructor that's explicitly defaulted, and a non-trivial constructor that's implicitly defaulted.
  2. The difference between a user-defined and a non-trivial destructor: there should be a testcase with an implicit non-trivial destructor (which is non-trivial because a member has a non-trivial destructor).
  3. A testcase with a trivial copy constructor, but a non-trivial copy-assignment operator.
  4. A testcase with a struct with a base class.
lib/CodeGen/MicrosoftCXXABI.cpp
1105

I'm not sure what the IsSizeGreaterThan128 check is doing here - if the return type is over 128 bits, then it will be indirectly returned in X8 with this check, which is not always what we want (e.g. in https://bugs.llvm.org/show_bug.cgi?id=41135 ostream.cpp, MSVC expects the value returned in X1). Another check of hasMicrosoftABIRestrictions might be OK, but that's also not quite right due to the size check.

mgrang updated this revision to Diff 197436.Apr 30 2019, 2:11 PM
efriedma added inline comments.Apr 30 2019, 2:38 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1105

I think the check here is intended to counteract the similar check in hasMicrosoftABIRestrictions... but yes, I don't think that works correctly. Looks like we're missing a testcase for a non-aggregate type with size greater than 128 bits.

lib/CodeGen/MicrosoftCXXABI.cpp
1105

Ah, OK. I think the problem is that we need to check whether the return value meets the rest of aggregate definition (i.e. !hasMicrosoftABIRestrictions), as well as being more than 128 bits in size. May be worth splitting the logic up. Will experiment.

efriedma added inline comments.Apr 30 2019, 2:59 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1105

I think you might get the right behavior by just erasing both IsSizeGreaterThan128 checks. That matches the way the ABI document describes the rules. Haven't tried it, though.

rnk added inline comments.Apr 30 2019, 3:08 PM
lib/CodeGen/MicrosoftCXXABI.cpp
55–56

These are both trivial wrappers for RD->canPassInRegisters() which has the real logic. I would just call that to remove some indirection, even if the name is the logical opposite of the property we're trying to check.

1079

Can this be just: for (const CXXConstructorDecl *Ctor : RD->ctors())?

dmajor added a subscriber: dmajor.Apr 30 2019, 3:20 PM
lib/CodeGen/MicrosoftCXXABI.cpp
1105

So I tried that, and everything mostly worked, but there are still some crashes in electron. I think the solution will be something like:

  1. Remove the size restriction from the hasMicrosoftABIRestrictions and just decide if the result is aggregate/not aggregate according to the ABI spec.
  2. Replace the check with !(isAggregate && IsSizeGreaterThan128(RD))

If both of those things are are true (i.e. it's an aggregate more than >128bits) we should end up calling FI.getReturnInfo().setInReg(false), which implies a return in x8. I haven't confirmed that this works yet because I've run out of time today, but the result should be in tomorrow. May need similar logic on the outer if.

And with those modifications, everything seems to work correctly. I'd be fine with the LLVM portion landing at this stage, but one more rev and another test cycle for this patch would be ideal.

lib/CodeGen/MicrosoftCXXABI.cpp
1085

So... to get the latest diff working, I had to remove this check...

1095

I also had to amend this to:

bool isIndirectReturn = isAArch64 ?
    (passClassIndirect(RD) || hasMicrosoftABIRestrictions(RD)
       || IsSizeGreaterThan128(RD)) :
    !RD->isPOD();
1105

... and also amend this to FI.getReturnInfo().setInReg(isAArch64 && !(isAggregate && IsSizeGreaterThan128(RD)));

mgrang updated this revision to Diff 197831.May 2 2019, 11:06 AM
mgrang edited the summary of this revision. (Show Details)
efriedma accepted this revision.May 2 2019, 12:19 PM

LGTM, assuming it passes testing on electron

LGTM, assuming it passes testing on electron

Thanks Eli. @richard.townsend.arm Can you please confirm whether Electron works with this patch and D60348?

Just completed testing... everything seems to be working correctly. So long as the all the tests pass, let's commit! :D

This revision was not accepted when it landed; it landed in state Needs Review.May 3 2019, 2:11 PM
This revision was automatically updated to reflect the committed changes.