This is an archive of the discontinued LLVM Phabricator instance.

x86-64 ABI: unwrap single element structs / arrays of 256-bit vectors to pass and return in registers
ClosedPublic

Authored by spatel on Feb 13 2015, 7:56 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 19891. Feb 13 2015, 7:56 AM
spatel retitled this revision from to x86-64 ABI: unwrap single element structs / arrays of 256-bit vectors to pass and return in registers.
spatel updated this object.
spatel edited the test plan for this revision.
spatel added reviewers: hfinkel, bruno.
spatel added a subscriber: Unknown Object (MLST).
hfinkel accepted this revision. Feb 16 2015, 1:49 AM
hfinkel edited edge metadata.

Indeed, this looks like an odd mismatch between X86_64ABIInfo::GetByteVectorType and the recursive logic in X86_64ABIInfo::classify (which will recurse and handle the array types).

FWIW, I wonder if there are more bugs hiding in here too. When X86_64ABIInfo::classify recurses on the array types, it looks like it is just looking at the total size of the array. isSingleElementStruct specifically handles only arrays of 1 element. What happens when you try to pass a struct of { v4f32 x[2]; }?

If that works, then LGTM (otherwise, we obviously need to fix that too).

This revision is now accepted and ready to land. Feb 16 2015, 1:49 AM

What happens when you try to pass a struct of { v4f32 x[2]; }?

Great question. There's a check in classify() around here:
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/TargetInfo.cpp#L2044

// The only case a 256-bit wide vector could be used is when the struct
// contains a single 256-bit element. Since Lo and Hi logic isn't extended
// to work for sizes wider than 128, early check and fallback to memory.
//
if (Size > 128 && getContext().getTypeSize(i->getType()) != 256) {

And so we don't try to pass/return anything in registers with an array of two 128-bit vectors.

And this also triggers on the even simpler case of:
struct v4f32_wrapper {
  v4f32 v1;
  v4f32 v2;
};

Based on my reading of the ABI, I would've thought that could be passed as 2 xmm registers, but we pass by memory for this too. FWIW, gcc 4.9 and icc 15 do the same in these cases, so I think this is just the way things are supposed to be.
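
To make the effect of that condition concrete, here is a minimal sketch that models only the quoted `if` as a standalone predicate. The function name `fieldForcesMemory` and its parameters are hypothetical and are not part of clang; the sketch does not model the recursion into array element types or the rest of the Lo/Hi classification.

```cpp
#include <cstdint>

// Hypothetical stand-in for the quoted condition in classify():
//   if (Size > 128 && getContext().getTypeSize(i->getType()) != 256)
// aggregateBits plays the role of Size (the whole struct, in bits);
// fieldBits plays the role of the size of one direct field, in bits.
constexpr bool fieldForcesMemory(std::uint64_t aggregateBits,
                                 std::uint64_t fieldBits) {
  return aggregateBits > 128 && fieldBits != 256;
}

// struct { v4f32 v1; v4f32 v2; }: 256 bits total, each field 128 bits.
static_assert(fieldForcesMemory(256, 128),
              "two 128-bit vector fields trip the early check");

// A struct whose only field is 256 bits wide: the check does not fire.
static_assert(!fieldForcesMemory(256, 256),
              "a single 256-bit field passes the early check");

// A struct holding one 128-bit vector: Size is not greater than 128.
static_assert(!fieldForcesMemory(128, 128),
              "a 128-bit aggregate never reaches the fallback");
```

Under this model, v4f32_wrapper is rejected on its first field, which matches the pass-by-memory behavior described above.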

Okay, great. Please commit.

Also, if the ABI spec differs from implementation practice, we should send a note to whoever maintains it and try to get it clarified.

This revision was automatically updated to reflect the committed changes.