This is an archive of the discontinued LLVM Phabricator instance.

[HLSL] Add ExternalSemaSource & vector alias
ClosedPublic

Authored by beanz on Jun 16 2022, 2:56 PM.

Details

Summary

HLSL vector types are ext_vector types, but they are also exposed via a
template syntax vector<T, #>. This is morally equavalent to the code:

c++
template <typename T, int Size>
using vector = T __attribute__((ext_vector_type(Size)))

The problem is that templates aren't supported before HLSL 2021, and
type aliases still aren't supported in HLSL.

To resolve this (and other issues where HLSL can't represent its own
types), we rely on an external AST & Sema source being registered for
HLSL code.

This patch adds the HLSLExternalSemaSource and registers the vector
type alias.

Depends on D127802

Diff Detail

Event Timeline

beanz created this revision.Jun 16 2022, 2:56 PM
beanz requested review of this revision.Jun 16 2022, 2:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 2:56 PM
python3kgae added inline comments.Jun 16 2022, 3:20 PM
clang/lib/Frontend/FrontendAction.cpp
1022

Does this mean features like PCH which use ExternalSource will not work anymore? Will we add these features into HLSLSema?

beanz added inline comments.Jun 16 2022, 3:45 PM
clang/lib/Frontend/FrontendAction.cpp
1022

We can always use an external source multiplexer or add support through the HLSL source. Both are viable options as appropriate.

I wonder if accepting type alias with identifier vector while parsing would be simpler to implement? Then you could just add those type aliases into the internal header. I assume user code is not allowed to redefine those i.e. it would result in undefined behavior or something like for most of C++ standard library features...

It has a drawback of needing to check names for every type alias but it is an invalid program when used in other cases anyway so it should not be a concern for the parsing speed.

beanz added a comment.Jun 17 2022, 6:24 AM

For just the type alias, adding it during parsing would probably work. Where things get more complicated is we have a whole mess of other data types that we’ll need to add too eventually. One of the advantages of using an external sema source is that we can create incomplete records during initialization, and complete them when the sema source gets the callback from lookup. That gives us cheap and efficient lazy-initialization of our buffer types which (in DXC) has a huge impact on compile time.

For just the type alias, adding it during parsing would probably work. Where things get more complicated is we have a whole mess of other data types that we’ll need to add too eventually. One of the advantages of using an external sema source is that we can create incomplete records during initialization, and complete them when the sema source gets the callback from lookup. That gives us cheap and efficient lazy-initialization of our buffer types which (in DXC) has a huge impact on compile time.

aha, are you having a lot of these types then? Ok that sounds similar problem to OpenCL builtin functions as we had to go away from a simple header include due to long parsing time.

beanz added a comment.Jun 17 2022, 7:57 AM

aha, are you having a lot of these types then? Ok that sounds similar problem to OpenCL builtin functions as we had to go away from a simple header include due to long parsing time.

Very similar problem. We have a two-part problem where we both have enough of them that parsing a header would be too slow _and_ the HLSL language actually can't represent them. The later we're working to fix. My hope is that eventually we can write the HLSL libraries in HLSL and use a pre-compilation step (PCH or Module-style) to get past the performance issues.

aha, are you having a lot of these types then? Ok that sounds similar problem to OpenCL builtin functions as we had to go away from a simple header include due to long parsing time.

Very similar problem. We have a two-part problem where we both have enough of them that parsing a header would be too slow _and_ the HLSL language actually can't represent them. The later we're working to fix. My hope is that eventually we can write the HLSL libraries in HLSL and use a pre-compilation step (PCH or Module-style) to get past the performance issues.

PCH have large in-memory size compared to the sources, but if it's not an issue then it should be reasonable.

beanz added a comment.Jun 21 2022, 7:41 AM

PCH have large in-memory size compared to the sources, but if it's not an issue then it should be reasonable.

Yea... I think we may have some flexibility on memory size. Our current on-device source->dxil compiler is a dylib that runs in the user's process. My intent for the 2.0 implementation which will based on modern clang is to move the compiler out of process, which will give us more flexibility on memory usage, especially if the memory is read-only.

beanz added a comment.Jul 1 2022, 7:40 AM

gentle ping :)

mizvekov added inline comments.
clang/lib/Sema/HLSLExternalSemaSource.cpp
53
beanz updated this revision to Diff 441759.Jul 1 2022, 11:49 AM

Updates based on review feedback.

mizvekov accepted this revision.Jul 1 2022, 12:10 PM

This looks reasonable, LGTM for what it's worth :)

This revision is now accepted and ready to land.Jul 1 2022, 12:10 PM
aaron.ballman added inline comments.Jul 5 2022, 7:55 AM
clang/include/clang/Sema/HLSLExternalSemaSource.h
23

It looks like you're missing the classof and isA support that actually uses this.

clang/lib/Headers/hlsl/hlsl_basic_types.h
30–35

Can you explain these changes in light of:

The problem is that templates aren't supported before HLSL 2021, and type aliases still aren't supported in HLSL.

This still looks like it's using template and type aliases.

clang/lib/Sema/HLSLExternalSemaSource.cpp
42–45

And users won't find this behavior surprising? I would be worried that the user has their own globally declared type that this hidden using directive will then cause to be ambiguous: https://godbolt.org/z/jMsW54vWe -- when users hit this themselves, the location of the conflict is going to point into nowhere-land, which is also rather unfortunate.

Actually, the more I think about this, the less comfortable I am with it. This also means you have to be *very* careful about conflicts with STL names, and it means that any new declarations added to the hlsl namespace automatically risk breaking code.

Aside: any particular reason the namespace declaration is implicit but the using directive declaration is not?

clang/test/AST/HLSL/vector-alias.hlsl
21

It's good that there are tests to verify the AST behavior, but there should also be tests showing the behavior in failure cases (like instantiating the vector incorrectly, ambiguous names as described above, etc) to see if we can live with the behavior.

This revision was landed with ongoing or failed builds.Jul 5 2022, 9:30 AM
This revision was automatically updated to reflect the committed changes.
beanz added a comment.Jul 5 2022, 9:38 AM

@aaron.ballman, I just had a total brain-asleep moment, and pushed this without realizing that I had feedback from you. Are you okay with me addressing that most-commit or would you prefer a revert? Huge apologies...

beanz added inline comments.Jul 5 2022, 9:49 AM
clang/lib/Headers/hlsl/hlsl_basic_types.h
30–35

This is an extreme total hack of insanity, but it is how HLSL works.

Basically, user-defined templates aren't supported, but built-in templates have been around for years. In DXC, the vector template is a class containing an ext_vector but there are horrible hacks in member expr handling to make the . operator return ext_vector swizzles... Using an alias allows me to get source compatibility here without having to do any of the member expr hacking.

clang/lib/Sema/HLSLExternalSemaSource.cpp
42–45

As HLSL is implanted today in DXC the built-in types and functions are all globally defined (not in any namespace). My goal here was to start nesting them in an hlsl namespace, and to eventually do that for DXC as well in a new language version. HLSL just got support for using namespace like 3 months ago, so we can't do that in a header for compatibility, so my intent was to throw it in the directive here.

My hope is the next version of HLSL will move the builtins into the hlsl namespace so I can make this only enabled for older language versions in the near future.

In terms of conflicts with STL, I don't think the STL sources will compile for HLSL anytime soon. Maybe in a future version of HLSL, but we've got a long way to go to get there. For existing HLSL code, namespaces are very sparingly used, and while we may find conflicts in user code I think moving HLSL's builtins into a namespace will be worth the pain... but I might be wrong about how much pain that will be.

@aaron.ballman, I just had a total brain-asleep moment, and pushed this without realizing that I had feedback from you. Are you okay with me addressing that most-commit or would you prefer a revert? Huge apologies...

No worries, it happens! Post-commit is fine by me; no need to have significant churn.

clang/lib/Headers/hlsl/hlsl_basic_types.h
30–35

Interesting and horrifying in equal measures. :-)

clang/lib/Sema/HLSLExternalSemaSource.cpp
42–45

So basically this trades redefinition errors for ambiguous lookup errors in terms of conflicts with user-declared code?

In terms of conflicts with STL, I don't think the STL sources will compile for HLSL anytime soon. Maybe in a future version of HLSL, but we've got a long way to go to get there. For existing HLSL code, namespaces are very sparingly used, and while we may find conflicts in user code I think moving HLSL's builtins into a namespace will be worth the pain... but I might be wrong about how much pain that will be.

I think moving them into a namespace is a lovely idea in theory, but it's still discomforting to add the using directive automatically for the user. The user is going to have the pain of dealing with the names moving to a new namespace at some point, so why not start from a cleaner design? It seems not entirely unreasonable to expect users to type using namespace hlsl; themselves if you're moving the types into a namespace, and users who need to handle older compilers can use the preprocessor to conditionally compile that line out. Or is that not really a viable option?

beanz added inline comments.Jul 5 2022, 11:49 AM
clang/lib/Sema/HLSLExternalSemaSource.cpp
42–45

Yea, basically in DXC declaring a top-level vector class would be a redefinition error, where as with this implementation it would be a lookup error.

The test case I just wrote is:

// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -fsyntax-only -verify %s

// This code is rejected by clang because clang puts the HLSL built-in types
// into the HLSL namespace.
namespace hlsl {
  struct vector {}; // expected-error {{redefinition of 'vector'}}
}

// This code is rejected by dxc because dxc puts the HLSL built-in types
// into the global space, but clang will allow it even though it will shadow the
// vector template.
struct vector {}; // expected-note {{candidate found by name lookup is 'vector'}}

vector<int,2> VecInt2; // expected-error {{reference to 'vector' is ambiguous}}

// expected-note@*:* {{candidate found by name lookup is 'hlsl::vector'}}

My concern here is really about compatibility with existing code and things get a bit funky because we don't always have the ability to modify the code. The Windows and Xbox OSs ship HLSL compilers as part of the OS, and some degree of explicit source compatibility for older language versions is needed.

I'm really hoping moving things into the hlsl namespace will land in our next language version (HLSL 202x), but since 2021 still isn't fully out the door... I don't really know when that will be. I could preemptively gate the using declaration here on the version being < hlsl202x, which would put clang one step ahead of DXC.

Giving the posthumous review my LGTM so it's clear. :-)

clang/lib/Sema/HLSLExternalSemaSource.cpp
42–45

My concern here is really about compatibility with existing code and things get a bit funky because we don't always have the ability to modify the code. The Windows and Xbox OSs ship HLSL compilers as part of the OS, and some degree of explicit source compatibility for older language versions is needed.

Whelp, that pretty much ties our hands. :-D I guess I'll disagree but commit to the changes anyway -- I think this is another instance of HLSL being a questionable production quality language where I have to hold my nose. :-)

beanz added a comment.Jul 5 2022, 2:34 PM

Posted an update in rGa6e63e35ede4, which removes the not-fully-implemented RTTI bits, and adds an extra test case to cover instantiation errors.