This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Include opencl-c.h by default as a clang module
ClosedPublic

Authored by yaxunl on May 19 2016, 12:30 PM.

Details

Summary

Include opencl-c.h by default as a module to utilize the automatic AST caching mechanism of clang modules.

Add an option -finclude-default-header to enable default header for OpenCL, which is off by default.

Diff Detail

Repository
rL LLVM

Event Timeline

yaxunl updated this revision to Diff 57837.May 19 2016, 12:30 PM
yaxunl retitled this revision from to [OpenCL] Include opencl-c.h by default as a module.
yaxunl updated this object.
yaxunl added reviewers: Anastasia, pxli168, bader, rsmith.
yaxunl added a subscriber: cfe-commits.
yaxunl updated this revision to Diff 58297.May 24 2016, 12:54 PM

Fix regressions in unit test due to ConstructJob disables builtin include path.

Anastasia added inline comments.May 25 2016, 8:47 AM
include/clang/Basic/LangOptions.def
219 ↗(On Diff #58297)

I was just thinking whether it would make sense to do the opposite i.e. to have a flag that would enable loading of the header (which won't be loaded by default). This might be better for compatibility.

The issues might arise because the users already have their own BIF declaration mechanisms in place in Clang or because of the parsing speed increase. The latter can affect testing significantly. Does the module mechanism use PCH?

Have you compared the speed with and without including the OpenCL header? Is there any difference?

yaxunl updated this revision to Diff 58830.May 27 2016, 1:29 PM
yaxunl retitled this revision from [OpenCL] Include opencl-c.h by default as a module to [OpenCL] Include opencl-c.h by default as a clang module.
yaxunl updated this object.

Revised as Anastasia suggested.

Add option -finclude-default-header to include OpenCL default header file, which is off by default.
Add test to make sure the AST of the header is really cached and works for different OpenCL versions.

yaxunl marked an inline comment as done.May 27 2016, 1:38 PM
yaxunl added inline comments.
include/clang/Basic/LangOptions.def
220 ↗(On Diff #58830)

I agree and has made the change.

The module caches AST of the header file, which is essentially the same thing as PCH.

I checked a typical OpenCL progam. Without including header,

---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
0.0160 (100.0%)   0.0080 (100.0%)   0.0240 (100.0%)   0.0211 (100.0%)  Clang front-end timer
0.0160 (100.0%)   0.0080 (100.0%)   0.0240 (100.0%)   0.0211 (100.0%)  Total

With header but without module caching,

---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
0.2160 (100.0%)   0.0120 (100.0%)   0.2280 (100.0%)   0.2271 (100.0%)  Clang front-end timer
0.2160 (100.0%)   0.0120 (100.0%)   0.2280 (100.0%)   0.2271 (100.0%)  Total

With header and module caching,

---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
0.0160 (100.0%)   0.0080 (100.0%)   0.0240 (100.0%)   0.0222 ( 93.6%)  Clang front-end timer
0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0008 (  3.5%)  Reading modules
0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0007 (  2.9%)  Loading /home/yaxunl/llvm-tot/llvm/tools/clang/test/CodeGenOpenCL/./JJHF9CF78G8Z/opencl_c-3RXFI66A6CUM7.pcm
0.0160 (100.0%)   0.0080 (100.0%)   0.0240 (100.0%)   0.0237 (100.0%)  Total

So the caching is very effective and essentially makes the header loading time to be ignored.

pxli168 accepted this revision.May 29 2016, 7:29 PM
pxli168 edited edge metadata.

LGTM!
Thanks!

This revision is now accepted and ready to land.May 29 2016, 7:29 PM
Anastasia added inline comments.Jun 1 2016, 9:03 AM
lib/Frontend/CompilerInvocation.cpp
1782 ↗(On Diff #58830)

Is this change being tested somewhere?

bader accepted this revision.Jun 2 2016, 8:12 AM
bader edited edge metadata.
bader added inline comments.
test/Headers/opencl-c-header.cl
30–31 ↗(On Diff #58830)

It would be nice to re-use that part of module that agnostic to OpenCL version. Most of the OpenCL 2.0 declarations are already in the module for OpenCL 1.0.
Pointer/size_t arguments could another reason to re-parse the header.

Just FYI: at the moment we use chained PCHes to reduce the footprint of the PCHes we ship with OpenCL implementation.
The chain looks like: <common part> + <standard specific part> + <platform specific part>, where 60% is occupied by <common part>, which is shared across all possible compilations.

yaxunl updated this revision to Diff 59398.Jun 2 2016, 9:23 AM
yaxunl edited edge metadata.
yaxunl marked an inline comment as done.

Add a test for -fblocks as Anastasia suggested.

yaxunl added inline comments.Jun 2 2016, 9:35 AM
test/Headers/opencl-c-header.cl
53–54 ↗(On Diff #59398)

We can discuss that as the next step for improving the header.

Anastasia added inline comments.Jun 2 2016, 9:39 AM
test/Frontend/opencl-blocks.cl
11 ↗(On Diff #59398)

This is not part of this change, but I was just wondering whether it would make sense to update the message to refer to OpenCL version that allows blocks (similarly to other cases).

We might need to create different diagnostic or use select option?

yaxunl updated this revision to Diff 59431.Jun 2 2016, 11:12 AM

Improve diag msg for fblocks.

Anastasia added inline comments.Jun 3 2016, 9:34 AM
include/clang/Basic/DiagnosticSemaKinds.td
7347 ↗(On Diff #59431)

-> for OpenCL version 2.0 or above

test/Headers/opencl-c-header.cl
50 ↗(On Diff #59431)

Sam, what was the aim of this test?

I was just wondering if we should instead check that the file is exactly the same i.e. hasn't been overwritten with exactly the same content?

53–54 ↗(On Diff #59431)

Indeed PCH size is an issue. I don't quite get why it expands from the original source code size by a factor or 2. Something to look at!

70 ↗(On Diff #59431)

This line seems exactly the same as line 67.

yaxunl updated this revision to Diff 59733.Jun 6 2016, 8:50 AM
yaxunl updated this object.
yaxunl marked 8 inline comments as done.

Update the diagnositic msg for fblocks.

test/Headers/opencl-c-header.cl
50 ↗(On Diff #59431)

This checks whether the content has changed, which is the same approach used by test/Modules/fmodules-validate-once-per-build-session.c to make sure a module is not re-compiled.

It seems there is no good way otherwise. stat -c %Y filename can get the modification time but it is in seconds, which is not accurate enough.

53–54 ↗(On Diff #59431)

I can take a look why the PCH size is much larger than the header file itself.

70 ↗(On Diff #59431)

Right. The first compilation generates the module and the second compilation uses the cached module. The module is compiled for different OpenCL version and triple to make sure the cached module still works.

Anastasia added inline comments.Jun 6 2016, 9:19 AM
test/Headers/opencl-c-header.cl
50 ↗(On Diff #59733)

I see, but is diffing accurate here? Because if the file is regenerated but with exactly the same content it won't be caught...

53–54 ↗(On Diff #59733)

Sounds cool. Thanks!

70 ↗(On Diff #59733)

So in this line it will be regenerated because the line above used different triple?

yaxunl marked 4 inline comments as done.Jun 6 2016, 12:00 PM
yaxunl added inline comments.
test/Headers/opencl-c-header.cl
50 ↗(On Diff #59733)

It cannot detect if the file was re-written with the same content.

There is one way we can do that:
get the modified time of the file
sleep 1 second
get the modified time of the file again and compare

but it will slow down the test by 1 second. do we really want to do that?

70 ↗(On Diff #59733)

No. It should use the cached module.

Anastasia added inline comments.Jun 8 2016, 9:44 AM
test/Headers/opencl-c-header.cl
50 ↗(On Diff #59733)

Not desirable to increase the testing time. I was wondering if we could amend the attribute of the file let's say run chmod on it? I guess if it's regenerated it would get default attributes again?

Otherwise, I would rather skip testing uniqueness, if we can't do it properly. We rely on the existing modules functionality anyways which is already being tested elsewhere.

70 ↗(On Diff #59733)

Ok, but it doesn't seem like there is something different being tested to line 67.

yaxunl updated this revision to Diff 60078.Jun 8 2016, 12:16 PM
yaxunl marked 2 inline comments as done.

Modified the test for module as Anastasia suggested.

yaxunl marked 5 inline comments as done.Jun 8 2016, 12:19 PM
yaxunl added inline comments.
test/Headers/opencl-c-header.cl
50 ↗(On Diff #60078)

I changed the module to read only. If Clang tries to create a new module, it will fail. Also add a check to make sure module is read.

70 ↗(On Diff #60078)

I added check to the second compilation to make sure module is read, also changed the modules to be read only so that they won't be created again.

Anastasia added inline comments.Jun 9 2016, 11:17 AM
test/Headers/opencl-c-header.cl
70 ↗(On Diff #60078)

Ok, now I see what you are testing here. :)

Do you think we could add:

CHECK-NOT: Reading modules

For the cases the modules are regenerated new?

yaxunl marked 2 inline comments as done.Jun 10 2016, 1:07 PM
yaxunl added inline comments.
test/Headers/opencl-c-header.cl
70 ↗(On Diff #60078)

In the case the modules are generated as new, Clang will generate the module first and then load it. So in the time report, you still see 'Reading modules'.

yaxunl added inline comments.Jun 10 2016, 1:50 PM
test/Headers/opencl-c-header.cl
53–54 ↗(On Diff #60078)

I checked the size of the pch file. The largest chunk is for function declarations. Since it contains about 30k declarations, each one is about 50 bytes on average, so total size is about 1.6MB.

The AST of function decl is written by ASTDeclWriter::VisitFunctionDecl http://clang.llvm.org/doxygen/ASTWriterDecl_8cpp_source.html

A code snippet is as follows:

515 Record.push_back((int)D->SClass); // FIXME: stable encoding

516   Record.push_back(D->IsInline);
517   Record.push_back(D->IsInlineSpecified);
518   Record.push_back(D->IsVirtualAsWritten);
519   Record.push_back(D->IsPure);
520   Record.push_back(D->HasInheritedPrototype);
521   Record.push_back(D->HasWrittenPrototype);
522   Record.push_back(D->IsDeleted);
523   Record.push_back(D->IsTrivial);
524   Record.push_back(D->IsDefaulted);
525   Record.push_back(D->IsExplicitlyDefaulted);
526   Record.push_back(D->HasImplicitReturnZero);
527   Record.push_back(D->IsConstexpr);
528   Record.push_back(D->HasSkippedBody);
529   Record.push_back(D->IsLateTemplateParsed);
530   Record.push_back(D->getLinkageInternal());
531   Record.AddSourceLocation(D->getLocEnd());

Record is like a buffer which will be written to file. It uses a vector of int64 to store the values, so it takes space.

Anastasia added inline comments.Jun 13 2016, 10:34 AM
test/Headers/opencl-c-header.cl
53–54 ↗(On Diff #60078)

It feels the problem is that it has to record all the attributes even though most of them are missing. Quite wasteful! Wondering if variable length solution would be possible...

70 ↗(On Diff #60078)

In this case, should you be testing it in regenerated mode too?

Also would it make sense to add:

CHECK-NOT: _Z3ctzc

to make sure there isn't OpenCL 2.0 functionality.

yaxunl updated this revision to Diff 60568.Jun 13 2016, 11:34 AM

Update the test as Anastasia suggested.

Anastasia accepted this revision.Jun 14 2016, 8:39 AM
Anastasia edited edge metadata.

LGTM, I think it would be nice if the block related code goes in a separate commit.

Thanks!

LGTM, I think it would be nice if the block related code goes in a separate commit.

Thanks!

Will do. Thanks.

This revision was automatically updated to reflect the committed changes.