This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Add OpenCL builtin test generator
ClosedPublic

Authored by svenvh on Mar 3 2021, 9:26 AM.

Details

Summary

Add a new clang-tblgen flag -gen-clang-opencl-builtin-tests that
generates a .cl file containing calls to every builtin function
defined in the .td input.

Example output:

...

float test6092_cos(float arg1) {
  return cos(arg1);
}
double test6093_cos(double arg1) {
  return cos(arg1);
}
half test6094_cos(half arg1) {
  return cos(arg1);
}
float2 test6095_cos(float2 arg1) {
  return cos(arg1);
}
double2 test6096_cos(double2 arg1) {
  return cos(arg1);
}

...

On the current (31 March 2021) main branch, this emitter generates about 62k lines of test code.
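To make the generation scheme concrete, here is a minimal Python sketch of the kind of numbering logic such an emitter performs. The `emit_tests` helper and the tuple format are hypothetical illustrations, not the actual TableGen emitter's API:

```python
# Hypothetical sketch of the numbering scheme: one uniquely numbered
# wrapper function per builtin overload, mirroring the example output.
def emit_tests(overloads, start=1):
    """overloads: list of (name, return_type, [arg_types]) tuples."""
    lines = []
    for index, (name, ret, args) in enumerate(overloads, start=start):
        params = ", ".join(f"{ty} arg{i + 1}" for i, ty in enumerate(args))
        call = ", ".join(f"arg{i + 1}" for i in range(len(args)))
        lines += [f"{ret} test{index}_{name}({params}) {{",
                  f"  return {name}({call});",
                  "}"]
    return "\n".join(lines)

# Reproduces the first two functions of the example output above.
print(emit_tests([("cos", "float", ["float"]),
                  ("cos", "double", ["double"])], start=6092))
```

Each overload gets its own top-level function so that a failure pinpoints exactly one builtin signature.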

The test consists of parsing a 60k line generated input file, which
takes about 90s with a debug build. This doesn't seem excessive
compared to other tests in clang/test/Headers (one of the tests
takes 120s for example).

Diff Detail

Event Timeline

svenvh created this revision.Mar 3 2021, 9:26 AM
svenvh requested review of this revision.Mar 3 2021, 9:26 AM
svenvh edited the summary of this revision. (Show Details)Mar 3 2021, 9:28 AM

If it is possible to add a fragment of what is generated, that would be great!

```
float test6092_cos(float arg1) {
  return cos(arg1);
}
double test6093_cos(double arg1) {
  return cos(arg1);
}
half test6094_cos(half arg1) {
  return cos(arg1);
}
float2 test6095_cos(float2 arg1) {
  return cos(arg1);
}
double2 test6096_cos(double2 arg1) {
  return cos(arg1);
}
```

I was just thinking if we could combine the calls into one function to minimize the number of lines to parse? Perhaps this will make the Tablegen generator too complex?

Also would it be possible to handle optional functionality - extensions and functions available in certain versions?

svenvh added a comment.Mar 4 2021, 4:11 AM

I was just thinking if we could combine the calls into one function to minimize the number of lines to parse? Perhaps this will make the Tablegen generator too complex?

That will increase the emitter's complexity. Especially when taking into account the next point about handling optional functionality. Hard to say in advance if paying for the added emitter complexity is worth the parsing time reduction, though the parsing time reduction is probably not going to be an order of magnitude difference.

Also would it be possible to handle optional functionality - extensions and functions available in certain versions?

Yes, I think this should be possible; I already have a TODO in the emitter. It will make the resulting file bigger of course, because parts will have to be guarded with #ifdefs. But I believe it would be a good way to test extension and version handling for builtins.
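Guarding optional functionality could look roughly like this. This is a minimal sketch with a hypothetical `guard` helper; `cl_khr_fp64` and `__OPENCL_C_VERSION__` are examples of the real OpenCL macros such guards would use, but this is not the emitter's actual implementation:

```python
# Hypothetical sketch: wrap a generated test function in preprocessor
# guards for optional functionality (an extension and/or a minimum
# OpenCL version), as the emitter's #if guards would.
def guard(body, extension=None, min_version=None):
    pre, post = [], []
    if min_version is not None:
        pre.append(f"#if __OPENCL_C_VERSION__ >= {min_version}")
        post.append("#endif")
    if extension is not None:
        pre.append(f"#ifdef {extension}")
        post.append("#endif")
    return "\n".join(pre + [body] + post)

print(guard("double test_cos(double arg1) { return cos(arg1); }",
            extension="cl_khr_fp64"))
```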

That's awesome!

I'm thinking of how you can track the correctness of generated built-ins according to the spec. Perhaps you can compare the number of distinct declarations for each built-in in the spec and in tablegen and diagnose if they are not equal. This is very straightforward, but I can't think of another solution yet (in the absence of CTS coverage)...

That's awesome!

I'm thinking of how you can track the correctness of generated built-ins according to the spec. Perhaps you can compare the number of distinct declarations for each built-in in the spec and in tablegen and diagnose if they are not equal. This is very straightforward, but I can't think of another solution yet (in the absence of CTS coverage)...

The advantage of adding standard clang testing through the whole frontend invocation is that we can test all of the frontend features together to make sure that they indeed work as expected. For example, there are types or macros added by clang and declarations from opencl-c-base.h that interact with the headers declaring builtin functions, so I think it would be desirable to use the standard clang testing strategy. There is also another advantage: it is more intuitive for the community to understand and maintain further.

But of course simple diffing should be fast I assume. We could look into this if we get push back on the traditional testing or we could consider it as a hybrid approach too.

I was just thinking if we could combine the calls into one function to minimize the number of lines to parse? Perhaps this will make the Tablegen generator too complex?

That will increase the emitter's complexity. Especially when taking into account the next point about handling optional functionality. Hard to say in advance if paying for the added emitter complexity is worth the parsing time reduction, though the parsing time reduction is probably not going to be an order of magnitude difference.

We could also consider generating the file with a simple emitter and then partitioning it manually before adding into the repo. But I think you are not suggesting checking it in?

Could you explain a bit more about the overall approach?

svenvh added a comment.Mar 5 2021, 7:27 AM

To be honest, I don't have a concrete picture yet of how to deploy this emitter for testing. So here are a few thoughts about the various options that come to mind:

1. Regenerate the test file every time we run check-clang and use it to validate the -fdeclare-opencl-builtins option.

Advantage:

  • This gives confidence that each builtin described in OpenCLBuiltins.td is usable in OpenCL code.

Disadvantages:

  • This doesn't protect against unintended removals of builtins/overloads, as the same source of truth is used for implementation and testing.
  • It is quick to generate (<1 second), but it takes some time to parse the entire test with a debug build of Clang (35 seconds).
2. As 1, but also keep maintaining opencl-c.h and test with both opencl-c.h and -fdeclare-opencl-builtins.

This addresses the disadvantage of option 1 at the expense of maintaining opencl-c.h and increased testing time.

3. Generate the test file once, manually verify that it is correct, check it in to git, and maintain it from there.

Advantage:

  • Different sources of truth for implementation (OpenCLBuiltins.td) and testing (the generated and maintained test).

Disadvantages:

  • The output is big, currently already 56k lines, and I expect it will still get a bit bigger when we add conditionals. I don't feel comfortable about putting such big autogenerated files under git (regardless of whether we add it as a single file, or as a collection of smaller files). Ideally the repo should only contain the source from which we generate things.
  • Updates to OpenCLBuiltins.td will give large diffs when regenerating the tests (regardless of whether the file is split or not).
  • Manually verifying that the generated test is actually correct is not trivial.
4. Generate a condensed summary of all builtins, check that in to git.

Then regenerate the summary during testing and diff it against the checked in reference. This builds upon @azabaznov's suggestion. We could for example use a count for each builtin and maintain that. We could maintain multiple counts to cover different configurations (e.g. CL versions).
Advantages:

  • There are only about 1000 different builtin function names, so this reduces the size of the reference data considerably, making it easier to verify the initial checkin and making maintenance easier.
  • Generating the summary and diffing it against a reference is quick (<1 second).
  • It would catch the case when builtins are accidentally removed from a configuration: if the count in the summary is not updated, the test will fail. I believe this is an important scenario to test in light of the OpenCL 3 work?

Disadvantage:

  • Not giving perfect coverage, so subtle errors might slip through, such as for example changing the argument type of a builtin.
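Option 4's condensed summary can be sketched as follows. The `summarize` helper and the overload list are hypothetical, and the counts are illustrative rather than taken from OpenCLBuiltins.td:

```python
from collections import Counter

# Sketch of option 4: reduce the full overload list to a per-name
# count that is cheap to check in and diff. The overload tuples here
# are illustrative, not the real OpenCLBuiltins.td contents.
def summarize(overloads):
    counts = Counter(name for name, _, _ in overloads)
    return "\n".join(f"{name} {n}" for name, n in sorted(counts.items()))

overloads = [("acos", "float", ["float"]),
             ("acos", "double", ["double"]),
             ("abs", "int", ["int"])]
# A regression that drops an overload shows up as a changed count.
print(summarize(overloads))
```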

Other ideas / feedback welcome!

Thanks for the comprehensive summary. I don't have many other suggestions in mind but I would like to add a few thoughts.

We could always consider writing the tests manually from scratch too. But it doesn't feel like a good cost/benefit trade-off.

Regarding 2 and 4, I think we should drive towards deprecating opencl-c.h as it is a maintenance overhead, but could we convert it into a test instead?

Aside from option 3, all other options will require adding non-standard testing formats in Clang. This means that if there are any core refactoring changes, even outside of OpenCL, affecting the functionality in the headers, it would be required to learn how we do the testing in order to address any breakages. Perhaps this can be mitigated by documentation, but overall it is an extra burden on the community. Speaking from my personal experience, I have been in situations where I broke functionality outside of OpenCL, and having a unified testing format is a big advantage in understanding and addressing the issues in unfamiliar code.

The output is big, currently already 56k lines, and I expect it will still get a bit bigger when we add conditionals. I don't feel comfortable about putting such big autogenerated files under git (regardless of whether we add it as a single file, or as a collection of smaller files). Ideally, the repo should only contain the source from which we generate things.

What problems do you see with checking in autogenerated files? There are some existing tests that check similar functionality and they seem to contain repetitive patterns. Honestly, I wouldn't be able to say if they were written with copy/paste or auto-generated, but I am not sure whether it is very important. Of course, it would have been better if we had built up the functionality incrementally, and it is still an option. But it hasn't been possible in the past, and as a matter of fact the original header was committed as one large file (around 15K lines).

Updates to OpenCLBuiltins.td will give large diffs when regenerating the tests (regardless of whether the file is split or not).

Could you provide more details? I don't follow this.

Manually verifying that the generated test is actually correct is not trivial.

I think the way to approach this would be to just assume that the test is correct and then fix any issues we discover as we go along with the development. There is nothing wrong with this though as even small manually written tests have bugs.

Regarding 2 and 4, I think we should drive towards deprecating opencl-c.h as it is a maintenance overhead, but could we convert it into a test instead?

Perhaps you misunderstood option 4, as that does not require maintaining opencl-c.h. Only for option 2 would we have to maintain opencl-c.h. For option 4, the checked-in files could look something like this:

clang/test/SemaOpenCL/builtins-opencl2.0.check
; RUN: clang-tblgen -emit-opencl-builtin-count -cl-std=cl2.0 OpenCLBuiltins.td | FileCheck '%s'

; This file lists the number of builtin function declarations per builtin function name
; available in the OpenCL 2.0 language mode.

CHECK: absdiff 48
CHECK: abs 48
CHECK: acos 18
CHECK: acosh 18
CHECK: acospi 18
...

To illustrate the use, suppose a developer writes a patch that accidentally makes the double variants of acos unavailable in OpenCL 2.0. The reported number of acos builtins would be different so FileCheck would fail on the line containing acos 18. If we use option 4 in combination with option 1, a Clang developer can regenerate the test and observe the differences for the acos tests: from there it becomes obvious that the double variants of acos disappeared.

Aside from option 3, all other options will require adding non-standard testing formats in Clang. This means that if there are any core refactoring changes, even outside of OpenCL, affecting the functionality in the headers, it would be required to learn how we do the testing in order to address any breakages. Perhaps this can be mitigated by documentation, but overall it is an extra burden on the community. Speaking from my personal experience, I have been in situations where I broke functionality outside of OpenCL, and having a unified testing format is a big advantage in understanding and addressing the issues in unfamiliar code.

Do you expect this to be a problem in practice for the other proposals, and if so could you elaborate in more detail about any particular concerns for the other proposals?

What problems do you see with checking in autogenerated files? There are some existing tests that check similar functionality and they seem to contain repetitive patterns. Honestly, I wouldn't be able to say if they were written with copy/paste or auto-generated, but I am not sure whether it is very important. Of course, it would have been better if we had built up the functionality incrementally, and it is still an option. But it hasn't been possible in the past, and as a matter of fact the original header was committed as one large file (around 15K lines).

Updates to OpenCLBuiltins.td will give large diffs when regenerating the tests (regardless of whether the file is split or not).

Could you provide more details? I don't follow this.

There are two problems here that compound each other: checking in autogenerated files, and the fact that those files are large. The problem I anticipate with regenerating the test files (at least with the current generator) is that inserting a new builtin in the .td file results in renumbering of the test functions, which results in a large diff.

Other problems I see with checking in large autogenerated files:

  • Merge conflicts in the generated files when reordering/rebasing/porting patches. A single conflict in the .td description could expand to many conflicts in the autogenerated file(s).
  • Diffs in the autogenerated files clutter the output of for example git log -p.
  • Code reviews of changes to those files are harder due to the size.
  • It bloats the repository.

Disclaimer: I may be biased by my previous experience on projects that had large autogenerated files checked in, that's why I feel that we should avoid doing so if we reasonably can. I have probably revealed now that option 3 is my least favorite option. :-)

Manually verifying that the generated test is actually correct is not trivial.

I think the way to approach this would be to just assume that the test is correct and then fix any issues we discover as we go along with the development. There is nothing wrong with this though as even small manually written tests have bugs.

Fair enough.

Regarding 2 and 4, I think we should drive towards deprecating opencl-c.h as it is a maintenance overhead, but could we convert it into a test instead?

Perhaps you misunderstood option 4, as that does not require maintaining opencl-c.h. Only for option 2 would we have to maintain opencl-c.h. For option 4, the checked-in files could look something like this:

clang/test/SemaOpenCL/builtins-opencl2.0.check
; RUN: clang-tblgen -emit-opencl-builtin-count -cl-std=cl2.0 OpenCLBuiltins.td | FileCheck '%s'

; This file lists the number of builtin function declarations per builtin function name
; available in the OpenCL 2.0 language mode.

CHECK: absdiff 48
CHECK: abs 48
CHECK: acos 18
CHECK: acosh 18
CHECK: acospi 18
...

To illustrate the use, suppose a developer writes a patch that accidentally makes the double variants of acos unavailable in OpenCL 2.0. The reported number of acos builtins would be different so FileCheck would fail on the line containing acos 18. If we use option 4 in combination with option 1, a Clang developer can regenerate the test and observe the differences for the acos tests: from there it becomes obvious that the double variants of acos disappeared.

Ok, I see. It is clearer now. I like that it is easy enough to understand. Still, it has the issue of non-standard testing, but we have mentioned that already.

Aside from option 3, all other options will require adding non-standard testing formats in Clang. This means that if there are any core refactoring changes, even outside of OpenCL, affecting the functionality in the headers, it would be required to learn how we do the testing in order to address any breakages. Perhaps this can be mitigated by documentation, but overall it is an extra burden on the community. Speaking from my personal experience, I have been in situations where I broke functionality outside of OpenCL, and having a unified testing format is a big advantage in understanding and addressing the issues in unfamiliar code.

Do you expect this to be a problem in practice for the other proposals, and if so could you elaborate in more detail about any particular concerns for the other proposals?

Not sure I understand what you mean by "other proposals" though. I think this is only an issue for the proposals that don't use a regular clang invocation for testing. It is probably not a widespread problem, but it could potentially bother someone someday.

What problems do you see with checking in autogenerated files? There are some existing tests that check similar functionality and they seem to contain repetitive patterns. Honestly, I wouldn't be able to say if they were written with copy/paste or auto-generated, but I am not sure whether it is very important. Of course, it would have been better if we had built up the functionality incrementally, and it is still an option. But it hasn't been possible in the past, and as a matter of fact the original header was committed as one large file (around 15K lines).

Updates to OpenCLBuiltins.td will give large diffs when regenerating the tests (regardless of whether the file is split or not).

Could you provide more details? I don't follow this.

There are two problems here that compound each other: checking in autogenerated files, and the fact that those files are large. The problem I anticipate with regenerating the test files (at least with the current generator) is that inserting a new builtin in the .td file results in renumbering of the test functions, which results in a large diff.

Other problems I see with checking in large autogenerated files:

  • Merge conflicts in the generated files when reordering/rebasing/porting patches. A single conflict in the .td description could expand to many conflicts in the autogenerated file(s).
  • Diffs in the autogenerated files clutter the output of for example git log -p.
  • Code reviews of changes to those files are harder due to the size.
  • It bloats the repository.

Disclaimer: I may be biased by my previous experience on projects that had large autogenerated files checked in, that's why I feel that we should avoid doing so if we reasonably can. I have probably revealed now that option 3 is my least favorite option. :-)

Ok, fair point. :D However, your option 3 and my option 3 are apparently not the same. I was thinking we would only generate the test once and then manually modify and extend it every time new functionality is added. We could see it as if the file was written with copy-paste or perfect formatting. So every time someone adds anything in the headers, they would also need to modify the test to reflect the functionality being added. So we could see the test generator as a tool to help us get started with testing. The advantage of extending the test manually is that it makes clearer what functionality is being implemented, because you can see the OpenCL code lines, and it also separates the implementation from testing to minimize the probability of mistakes. The disadvantage is that for every few lines of Tablegen definitions we might need hundreds or thousands of lines in the test.

If we instead regenerate the test every time, then I agree with you that we should just generate it when the tests are built and never check it into the repo.

I have one more thought.

I like the idea of turning opencl-c.h into the test: as it is already in the repo and has already been used for quite a while, we can treat it as the baseline for now. I think the first step should be to test that -fdeclare-opencl-builtins generates the same built-ins which are declared in opencl-c.h. If we can't do this in a programmatic way, we can use AST pretty-printing for the tablegen header and opencl-c.h and compare these with diff (we can also clean up the AST files with rm while running the tests).

Once this is done we can incrementally modify either the opencl-c.h header or the tablegen header. Testing changes to either of them can combine a sanity check as @svenvh suggested in builtins-opencl2.0.check and a diff of the AST pretty-printing.
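One way to sketch such a comparison; the `normalize` helper and the declaration strings are hypothetical, not actual clang-check --ast-print output handling:

```python
# Hypothetical sketch of the proposed comparison: normalize the
# pretty-printed declarations from each path, then diff them as sets
# so ordering differences don't matter. The declaration strings are
# illustrative, not real header contents.
def normalize(dump):
    decls = set()
    for line in dump.splitlines():
        line = " ".join(line.split())  # collapse whitespace
        if line.endswith(";"):
            decls.add(line)
    return decls

header_decls = "float acos(float);\ndouble  acos(double);"
tblgen_decls = "double acos(double);\nfloat acos(float);"
missing = normalize(header_decls) - normalize(tblgen_decls)
print(sorted(missing))  # empty when both paths declare the same set
```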

Advantages:

  • Time of testing. Locally I get nice results: 
$ time ./build_release/bin/clang-check -extra-arg=-cl-std=CL2.0 --ast-print  llvm-project/clang/lib/Headers/opencl-c.h &> header.h
real    0m0.182s
user    0m0.162s
sys     0m0.020s

  But not yet clear how much time such printing will take for tablegen header. I assume it won't take a lot longer.

  • This will keep changes to the opencl-c.h header and the tablegen header consistent (until opencl-c.h is deprecated)

 
Disadvantages:

  • Still doesn't eliminate subtle errors; we need to carefully collect information from the spec about the number of built-ins

I have one more thought.

I like the idea of turning opencl-c.h into the test: as it is already in the repo and has already been used for quite a while, we can treat it as the baseline for now. I think the first step should be to test that -fdeclare-opencl-builtins generates the same built-ins which are declared in opencl-c.h. If we can't do this in a programmatic way, we can use AST pretty-printing for the tablegen header and opencl-c.h and compare these with diff (we can also clean up the AST files with rm while running the tests).

Once this is done we can incrementally modify either the opencl-c.h header or the tablegen header. Testing changes to either of them can combine a sanity check as @svenvh suggested in builtins-opencl2.0.check and a diff of the AST pretty-printing.

Advantages:

  • Time of testing. Locally I get nice results: 
$ time ./build_release/bin/clang-check -extra-arg=-cl-std=CL2.0 --ast-print  llvm-project/clang/lib/Headers/opencl-c.h &> header.h
real    0m0.182s
user    0m0.162s
sys     0m0.020s

  But not yet clear how much time such printing will take for tablegen header. I assume it won't take a lot longer.

  • This will keep changes to the opencl-c.h header and the tablegen header consistent (until opencl-c.h is deprecated)

 
Disadvantages:

  • Still doesn't eliminate subtle errors; we need to carefully collect information from the spec about the number of built-ins

Thanks for elaborating on this option. Yes, I think this could make the testing quick. The biggest concern I see with this is that it doesn't actually show that the header is working through the clang frontend. For example, we could easily change something very significant, like removing the definition of macros guarding the function declarations, and not notice that it means the declarations are no longer available to the users of clang. So I would say that while it covers all the builtin overloads, it doesn't give the confidence that they are all exposed correctly to the users. We could of course work around this by providing complementary partial testing using a clang invocation, but it means our testing will become harder to understand.

Overall I think we have made good progress on brainstorming here, so I suggest giving the ideas broader visibility via an RFC to cfe-dev. We could include the following:

  • Start from standard clang testing idea i.e. adding a large test file(s) using clang invocation and calling the functions (either written manually or autogenerated).
  • Summarize the issues with the standard approach and highlight that similar tests with other clang features (i.e. target intrinsics) show time bottlenecks.
  • Enumerate various alternative options and hybrid options to avoid adding large test files with long execution time highlighting adv/disadv for each of them.

Does this sound reasonable? Is there anything else we should include in such RFC?

svenvh updated this revision to Diff 332675.Mar 23 2021, 7:45 AM

Emit #if guards for extensions and versions.

I have done some measurements using the test produced from this Tablegen emitter (59K lines).

I have used the test in two files:

  1. SemaOpenCL/all-std-buitins.cl that has the following RUN line appended 6 times (for every supported OpenCL version v1.0, v1.1, v1.2, v2.0, v3.0, C++)
//RUN: %clang_cc1 %s -triple=spir -fsyntax-only -verify -cl-std=CL2.0 -finclude-default-header -fdeclare-opencl-builtins
  2. SemaOpenCL/all-std-buitins-slow-header.cl that has the following RUN line appended 6 times (for every supported OpenCL version v1.0, v1.1, v1.2, v2.0, v3.0, C++)
//RUN: %clang_cc1 %s -triple=spir -fsyntax-only -verify -cl-std=CL2.0 -finclude-default-header

So I am getting the following testing time breakdown then:

201.61s: Clang :: SemaOpenCL/all-std-buitins-slow-header.cl
199.70s: Clang :: SemaOpenCL/all-std-buitins.cl
85.14s: Clang :: Headers/arm-neon-header.c
68.06s: Clang :: OpenMP/nesting_of_regions.cpp
65.23s: Clang :: Driver/crash-report.c
60.26s: Clang :: Analysis/PR24184.cpp
57.80s: Clang :: CodeGen/X86/rot-intrinsics.c
57.58s: Clang :: CodeGen/X86/x86_64-xsave.c
56.34s: Clang :: Headers/opencl-c-header.cl
55.68s: Clang :: CodeGen/X86/x86_32-xsave.c
44.83s: Clang :: Driver/crash-report-with-asserts.c
40.38s: Clang :: Lexer/SourceLocationsOverflow.c
37.44s: Clang :: Headers/x86intrin-2.c
36.53s: Clang :: OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
34.09s: Clang :: CodeGen/X86/avx512f-builtins-constrained.c
33.41s: Clang :: CodeGen/X86/sse-builtins-constrained.c
32.82s: Clang :: Analysis/iterator-modeling.cpp
31.37s: Clang :: OpenMP/target_teams_distribute_simd_codegen_registration.cpp
31.10s: Clang :: OpenMP/target_parallel_for_simd_codegen_registration.cpp
30.78s: Clang :: Analysis/use-after-move.cpp

I am very confused though about why the difference between Tablegen and opencl-c.h is so insignificant. FYI, also for a single clang invocation with Tablegen and opencl-c.h the difference in parsing time of this test is very insignificant - 20.794s vs 21.401s. This is really interesting because with small files the difference is huge - 0.043s vs 3.990s on a test with an empty kernel.


I also timed check-clang invocation on my 8 core machine:

  1. with both tests - 697.70s
  2. with all-std-buitins.cl only - 684.43s
  3. without any new tests - 673.00s

The change in total testing time appears to be insignificant. I guess this is due to parallel execution?
Btw, one thing I have thought of: since OpenCL v1.0/v1.1 don't differ a lot for builtin functions and they are not modified much either, perhaps we only need to test v1.2? That would reduce the number of clang invocations to 4 in each test. Then the measurements are as follows:

134.13s: Clang :: SemaOpenCL/all-std-buitins-slow-header.cl
131.52s: Clang :: SemaOpenCL/all-std-buitins.cl
85.81s: Clang :: Headers/arm-neon-header.c
69.14s: Clang :: OpenMP/nesting_of_regions.cpp
60.08s: Clang :: Driver/crash-report.c
59.67s: Clang :: Analysis/PR24184.cpp
57.27s: Clang :: CodeGen/X86/rot-intrinsics.c
56.93s: Clang :: CodeGen/X86/x86_32-xsave.c
56.59s: Clang :: CodeGen/X86/x86_64-xsave.c
55.68s: Clang :: Headers/opencl-c-header.cl
40.71s: Clang :: Driver/crash-report-with-asserts.c
39.44s: Clang :: Lexer/SourceLocationsOverflow.c
38.02s: Clang :: OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
37.07s: Clang :: Headers/x86intrin-2.c
32.61s: Clang :: CodeGen/X86/avx512f-builtins-constrained.c
32.58s: Clang :: CodeGen/X86/sse-builtins-constrained.c
32.19s: Clang :: Analysis/use-after-move.cpp
31.96s: Clang :: Analysis/iterator-modeling.cpp
31.02s: Clang :: OpenMP/target_teams_distribute_simd_codegen_registration.cpp
30.59s: Clang :: OpenMP/target_parallel_for_simd_codegen_registration.cpp

with a total testing time 688.61s

Conclusion:

  • If we test the whole functionality, the test will be at least 2x slower than the slowest clang test so far, but it hardly affects the full testing time of check-clang on modern HW due to parallel execution. Also related to this, partitioning the test files could help with latency due to parallel execution.
  • Testing opencl-c.h as well doubles the testing time.
svenvh edited the summary of this revision. (Show Details)Mar 31 2021, 9:29 AM
svenvh updated this revision to Diff 347428.May 24 2021, 9:46 AM
svenvh edited the summary of this revision. (Show Details)

Fix formatting issues, rebase patch, add test.

I think this is a very good starting point for the testing of built-in functions. Btw should we now remove '[Draft]' from the title?

clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
242

Should be emit to adhere to coding style?

969

I would add a documentation for this function, it is not very obvious.

1004

Maybe this should be lifted as documentation of the header of this function?

1033

I don't get this comment and its exact reference to the code.

svenvh updated this revision to Diff 350314.Jun 7 2021, 8:56 AM
svenvh marked an inline comment as done.
svenvh retitled this revision from [OpenCL][Draft] Add OpenCL builtin test generator to [OpenCL] Add OpenCL builtin test generator.

Improve comments.

clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
969

Not sure what documentation you are expecting? Are there any particular lines that are "not very obvious"?
Note that there is already a comment for this method up in the class declaration.

1004

This comment only applies to the first loop-nest. It documents an implementation detail, so I don't think this comment should be lifted to the method's declaration comment.

1033

I have expanded the comment, and also factored out an index into a variable. Hopefully that makes it clear?

Anastasia accepted this revision.Jun 9 2021, 3:56 AM

LGTM! Thanks!

This revision is now accepted and ready to land.Jun 9 2021, 3:56 AM

@svenvh, are you going to add header-like output emitting?

svenvh added a comment.Jun 9 2021, 4:27 AM

@svenvh, are you going to add header-like output emitting?

Yes, I can add that if there is interest. I already have it on a local branch, so if there is interest (and it seems there is? :-)), then I will rebase that, tidy it up, and create a separate review.

@svenvh, are you going to add header-like output emitting?

Yes, I can add that if there is interest. I already have it on a local branch, so if there is interest (and it seems there is? :-)), then I will rebase that, tidy it up, and create a separate review.

Sounds great :) Thanks! Yeah, it may be needed as an alternative way of testing changes to opencl-c.h, and for testing the OpenCL TableGen header as well.

This revision was landed with ongoing or failed builds.Jun 9 2021, 6:08 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2021, 6:08 AM
Herald added a subscriber: ldrumm. · View Herald Transcript