This is an archive of the discontinued LLVM Phabricator instance.

[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it.
ClosedPublic

Authored by craig.topper on Jun 26 2018, 4:04 PM.

Details

Summary

This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.

Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.

To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.

To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.

To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.

To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.

There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.

Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.

Diff Detail

Event Timeline

craig.topper created this revision.Jun 26 2018, 4:04 PM
aaron.ballman added a subscriber: aaron.ballman.
aaron.ballman added inline comments.
include/clang/Basic/Attr.td
1949

Should this apply to Objective-C methods? What about other function-like interfaces such as function pointers?

1950

No new, undocumented attributes, please.

lib/Sema/SemaDeclAttr.cpp
2955

Do you want to drop one of the attributes in this case?

test/Sema/attr-min-vector-width.c
9

Also missing tests for applying the attribute to the wrong subject and with the incorrect number of arguments.

craig.topper added inline comments.Jul 2 2018, 11:59 AM
test/Sema/attr-min-vector-width.c
9

How many possible subjects are there? And how many wrong subjects do you want me to test?

-Rebase the intrinsic headers and builtins file
-Add documentation for the attribute. Open to feedback on improvements here
-Add tests for wrong number of arguments to the attribute.

craig.topper added inline comments.Jul 2 2018, 12:04 PM
include/clang/Basic/Attr.td
1949

I think maybe it should apply to objective-C. but I"m not sure because it doesn't look like the target attribute applies there?

aaron.ballman added inline comments.Jul 2 2018, 12:06 PM
test/Sema/attr-min-vector-width.c
9

Lots of possible subjects, but testing just one will suffice as a failure test case (you already covered the successful subject test cases). We've had issues in the past where lacking this test coverage caused problems that cost a lot of time to track down, so it's a checklist test item for me.

aaron.ballman added inline comments.Jul 2 2018, 12:44 PM
include/clang/Basic/Attr.td
1949

I don't have strong opinions on the question, I just wasn't sure if this attribute would be something an ObjC method would want to make use of. If that's unlikely, it's reasonable to leave it off until a use case appears.

include/clang/Basic/AttrDocs.td
1502

clang support -> Clang supports

1503

that backend -> the backend

1504

Target specific -> Target-specific

1505

"Apply" how?

I don't see any logic that diagnoses the situation where the user requests something larger than the maximum vector width or smaller than the minimum vector width supported by the target. I would expect the attribute to be diagnosed and dropped in that case; is there a reason not to do that?

This is attribute it -> This attribute is

1511

This suggests to me that the user can override the attribute on the builtins if you want a different behavior; is that correct?

lib/Basic/Builtins.cpp
119–122

I think you want to use the ending position from strtol() to assert that it ended on the : and not some other random character. e.g., I think this will parse just fine: V:1e28: and run into issues elsewhere.

lib/CodeGen/CodeGenFunction.cpp
449

Missing full stop at the end of the comment.

1198

const auto *

lib/CodeGen/CodeGenFunction.h
1466

with -> width

FWIW, I looked at an early version of this patch and am generally happy with the target-specific / IR-specific behavior aspects of it. Totally leaving the detailed review of the attribute stuff to you Aaron, as you're already doing an amazing job there. Minor clarification on use cases below.

include/clang/Basic/Attr.td
1949

Should it in theory? Yes.

But if/when we want to make good on that in practice, we would need to do the same for the target attribute as Craig mentions. I think it would be good to defer doing anything here to that point -- we don't need these to be *more* powerful than the target attribute in that regard.

aaron.ballman added inline comments.Jul 2 2018, 1:39 PM
include/clang/Basic/Attr.td
1949

Strong agreement; that was the conclusion @craig.topper and I came to on IRC as well.

craig.topper marked 4 inline comments as done.

-Added a negative test
-Hopefully fixed all the grammatical/spelling errors.
-Attempted to clarify some more about prefer-vector-width and builtins.

aaron.ballman accepted this revision.Jul 3 2018, 3:48 PM

Aside from a new round of minor doc nits, I think this is looking good.

One remaining question I have is whether the attribute should diagnose an argument for a width that's not supported by the target and drop the attribute explicitly. e.g., if a user picks a min width that's 10240000 and the target cannot handle it, I think it's better to diagnose than to silently truncate to a value the user wasn't expecting. However, this can be handled in a follow-up patch if you want to land this and not have to deal with constant rebasing headaches.

include/clang/Basic/AttrDocs.td
1512

Backticks around the command line option; same below (and for the attribute name as well). Basically, put the code things into the code font.

This revision is now accepted and ready to land.Jul 3 2018, 3:48 PM
This revision was automatically updated to reflect the committed changes.
include/clang/Basic/Builtins.def