This is an archive of the discontinued LLVM Phabricator instance.

[X86] Don't consider functions ABI compatible for ArgumentPromotion pass if they would legalize 512-bit vectors differently.
ClosedPublic

Authored by craig.topper on Feb 19 2019, 9:36 AM.

Details

Summary

The use of the -mprefer-vector-width=256 command line option mixed with functions using vector intrinsics can create situations where one function thinks 512 vectors are legal, but another fucntion does not.

If a 512 bit vector is passed between them via a pointer, its possible ArgumentPromotion might try to pass by value instead. This will result in type legalization for the two functions handling the 512 bit vector differently leading to runtime failures.

Had the 512 bit vector been passed by value from clang codegen, both functions would have been tagged with a min-legal-vector-width=512 function attribute. That would make them be legalized the same way.

I observed this issue in 32-bit mode where a union containing a 512 bit vector was being passed by a function that used intrinsics to one that did not. The caller ended up passing in zmm0 and the callee tried to read it from ymm0 and ymm1.

The fix implemented here is just to consider it a mismatch if two functions would handle 512 bit differently without looking at the types that are being considered. This is the easist and safest fix, but it can be improved in the future.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Feb 19 2019, 9:36 AM
echristo added inline comments.Feb 19 2019, 9:50 AM
lib/Target/X86/X86TargetTransformInfo.cpp
3085–3089 ↗(On Diff #187394)

I'm guessing removing the single use local variables is much less readable than having them?

craig.topper marked an inline comment as done.Feb 19 2019, 10:02 AM
craig.topper added inline comments.
lib/Target/X86/X86TargetTransformInfo.cpp
3085–3089 ↗(On Diff #187394)

clang-format gave me this

return static_cast<const X86Subtarget *>(TM.getSubtargetImpl(*Caller))
           ->useAVX512Regs() ==
       static_cast<const X86Subtarget *>(TM.getSubtargetImpl(*Callee))
           ->useAVX512Regs();
echristo added inline comments.Feb 19 2019, 10:05 AM
lib/Target/X86/X86TargetTransformInfo.cpp
3085–3089 ↗(On Diff #187394)

Actually:

class X86TTIImpl : public BasicTTIImplBase<X86TTIImpl> {

typedef BasicTTIImplBase<X86TTIImpl> BaseT;
typedef TargetTransformInfo TTI;
friend BaseT;

const X86Subtarget *ST;
const X86TargetLowering *TLI;

The subtarget is already there :)

Use TM.getSubtarget<X86Subtarget> to shorten code.

echristo accepted this revision.Feb 19 2019, 11:38 AM
This revision is now accepted and ready to land.Feb 19 2019, 11:38 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2019, 12:14 PM