This is an archive of the discontinued LLVM Phabricator instance.

[HLSL] Add WaveActiveCountBits as Langugage builtin function for HLSL
ClosedPublic

Authored by python3kgae on Jun 2 2022, 12:28 AM.

Details

Summary

One clang builtins are introduced
uint WaveActiveCountBits( bool bBit ) as Langugage builtin function for HLSL.

The detail for WaveActiveCountBits is at
https://github.com/microsoft/DirectXShaderCompiler/wiki/Wave-Intrinsics#uint-waveactivecountbits-bool-bbit-

This is only clang part change to make WaveActiveCountBits into AST.
llvm intrinsic for WaveActiveCountBits will be add in separate PR.

Diff Detail

Event Timeline

python3kgae created this revision.Jun 2 2022, 12:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 12:28 AM
python3kgae requested review of this revision.Jun 2 2022, 12:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 12:28 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

ok, so the reason you are adding this to clang is that it needs to be mapped into a target intrinsic?

ok, so the reason you are adding this to clang is that it needs to be mapped into a target intrinsic?

Yes. The plan is to create a target intrinsic for WaveActiveCountBits which map to this clang builtin directly.

Anastasia added inline comments.Jun 2 2022, 11:56 AM
clang/include/clang/Basic/Builtins.def
1697

FYI we most of time try to add a builtin name using a reserved identifier which is not part of the language standard (mostly prefixed by __). Then we add regular function that just calls the clang builtins. This way provides more flexibility to the implementers. However you might not need this... in which case using original name avoids an extra call.

clang/test/SemaHLSL/Wave.hlsl
8

if you are planning to add a proper CodeGen test later then here it might be sufficient to have a -verify test and check that the builtin is accepted with the right arguments... you might want to add a test checking that a diagnostic is provided when the arguments are wrong... although this is not strictly necessary as we already test clang builtins quite extensively.

python3kgae marked 2 inline comments as done.

Change to verify test.

clang/include/clang/Basic/Builtins.def
1697

Yes. For this one, it is without prefix to avoid extra call.
I'm following things like enqueue_kernel in opencl.
For other things support overload which has to make extra call, I'll add the prefix.

clang/test/SemaHLSL/Wave.hlsl
8

Changed to verify test.

Anastasia added inline comments.Jun 2 2022, 12:26 PM
clang/include/clang/Basic/Builtins.def
1697

Ok, although enqueue_kernel was implemented as a builtin because it has a weird variadic prototype that can't be implemented using normal features of C/C++ languages. Hence it is a builtin with custom SemaChecking.

clang/test/SemaHLSL/Wave.hlsl
8

ok, then also -ast-dump -o - can be removed from the Run line. Also you probably don't need -finclude-default-header here wither?

Simplify test.

python3kgae marked 2 inline comments as done.Jun 2 2022, 12:37 PM
python3kgae added inline comments.
clang/include/clang/Basic/Builtins.def
1697

I see.
Since HLSL also has things cannot implemented using HLSL itself, we cannot put all HLSL intrinsic in one place anyway.
So when it is possible to reduce an extra call, I just reduce it.

Anastasia accepted this revision.Jun 2 2022, 12:56 PM

LGTM! Thanks

This revision is now accepted and ready to land.Jun 2 2022, 12:56 PM
This revision was landed with ongoing or failed builds.Jun 2 2022, 1:06 PM
This revision was automatically updated to reflect the committed changes.
python3kgae marked an inline comment as done.
beanz added inline comments.Jun 23 2022, 7:46 AM
clang/include/clang/Basic/Builtins.def
1697

I don't think this was the right decision. It is trivial for the optimizer to remove a function that is more or less a tail-call of another function, and since HLSL has no call semantics anyways they will disappear.

It would be much better if this was a function that called the builtin, this could even be trivially implemented as a sub-header of hlsl.h as a wrapper function.

beanz added inline comments.Jun 28 2022, 2:19 PM
clang/include/clang/Basic/Builtins.def
1697

Let me expand a little here. I think we need to change this. We should name builtins as builtins (i.e. __builtin_hlsl_...), and we can implement a wrapper function in one of the headers from hlsl.h.

There are two reasons I think we should take this approach:

  1. Matching existing builtin patterns makes it clear in the code
  2. Clang builtins don't populate in the AST for AST queries, which impacts how the function for IDE tooling. Having the HLSL user-surfaced functions exist in the AST will allow for better IDE tooling for autocomplete.

If hlsl.h gets too slow to parse, there are other approaches we can take like using the external sema source, modules or PCH to speed it up later.