This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Fix clang-tidy to support parsing of assembly statements.
Needs RevisionPublic

Authored by etienneb on Mar 8 2016, 9:15 PM.

Details

Summary

Summary:
Clang-tidy fails when parsing MSVC inline assembly statements. The native target and asm parser aren't initialized.

The following patch is fixing the issue by using the same code than clang-check.
The tool clang-check has the following code in main to initialize the required components.

// Initialize targets for clang module support.
llvm::InitializeAllTargets();
llvm::InitializeAllTargetMCs();
llvm::InitializeAllAsmPrinters();
llvm::InitializeAllAsmParsers();

Apparently, it is sufficient to initialize the native target and the asm parser.
see:
https://code.google.com/p/chromium/codesearch#chromium/src/tools/clang/rewrite_scoped_refptr/RewriteScopedRefptr.cpp&l=262

llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmParser();

Diff Detail

Event Timeline

etienneb updated this revision to Diff 50103.Mar 8 2016, 9:15 PM
etienneb retitled this revision from to [clang-tidy] Fix clang-tidy to support parsing of assembly statements..
etienneb updated this object.
etienneb added reviewers: alexfh, rnk.
etienneb added a subscriber: cfe-commits.
rnk updated this object.Mar 10 2016, 1:06 PM
rnk added a reviewer: bkramer.
rnk edited edge metadata.Mar 10 2016, 1:15 PM

This will definitely work, but it will substantially increase the size of clang-tidy:
$ du -cksh bin/clang-tidy.exe bin/clang.exe
21M bin/clang-tidy.exe
47M bin/clang.exe
68M total

The difference is mostly from the backends, and that's with only the X86 and ARM backends enabled.

Alex, how should clang-tidy behave when it sees MS inline asm? If we ignore the error about not finding a registered target, then we treat the asm statement as empty. This may cause false positives where clang-tidy doesn't see Decl uses from inline asm, but that may be acceptable, since they mostly come from system headers.

alexfh edited edge metadata.Mar 10 2016, 10:54 PM

If this issue is specific to MS inline asm, then the additional dependencies should only be added, when we care about MS inline asm (windows builds only?).

Also, should we do this for all Clang tools?

rnk added a comment.Mar 11 2016, 8:46 AM

If this issue is specific to MS inline asm, then the additional dependencies should only be added, when we care about MS inline asm (windows builds only?).

Clang is always a cross compiler, we never know what we are targeting until we see the command line.

In D17981#373153, @rnk wrote:

If this issue is specific to MS inline asm, then the additional dependencies should only be added, when we care about MS inline asm (windows builds only?).

Clang is always a cross compiler, we never know what we are targeting until we see the command line.

Fair enough, but I guess there are situations, where the one who builds the binary knows for sure that it won't be used for MS targets. We could add a configuration option to enable handling of MS inline assembly in Clang tools.

$ du -cksh bin/clang-tidy.exe bin/clang.exe
21M bin/clang-tidy.exe
47M bin/clang.exe
68M total

This is a huge difference. I didn't expect dependencies to bring so much code.
I'm not a fan of having an empty statement and increasing false positives ratio.
Would it be possible to skip whole declarations with asm-stm, and flag them as "ignored / not parsable"?

Also, should we do this for all Clang tools?

I would prefer a generic solution for every tools. I saw different cases where developers fix this issue differently.

where the one who builds the binary knows for sure that it won't be used for MS targets

We could gate this code under a define. I'm not a fan of define, but it seems to be a compromise for the size.

Something like: LIBTOOLING_ENABLE_INLINE_ASM_PARSER

If we decide to pursue that direction, then it should probably be for every tools.

rnk added a comment.Mar 14 2016, 2:31 PM

This is a huge difference. I didn't expect dependencies to bring so much code.
I'm not a fan of having an empty statement and increasing false positives ratio.
Would it be possible to skip whole declarations with asm-stm, and flag them as "ignored / not parsable"?

I don't actually think there are that many false positives, but I wanted to hear from Alex in case I'm wrong. I was hoping he had better ideas on how to suppress a diagnostic error in clang and run the clang-tidy checks anyway. My best idea is that we make this diagnostic a default-error warning and then build with -Wno-unparseable-assembly or something. That's not a very good solution, though. =\

We could gate this code under a define. I'm not a fan of define, but it seems to be a compromise for the size.

Something like: LIBTOOLING_ENABLE_INLINE_ASM_PARSER

If we decide to pursue that direction, then it should probably be for every tools.

I'd really rather not do that.

Adding Manuel, who might have better ideas.

In D17981#374904, @rnk wrote:

This is a huge difference. I didn't expect dependencies to bring so much code.
I'm not a fan of having an empty statement and increasing false positives ratio.
Would it be possible to skip whole declarations with asm-stm, and flag them as "ignored / not parsable"?

I don't actually think there are that many false positives, but I wanted to hear from Alex in case I'm wrong. I was hoping he had better ideas on how to suppress a diagnostic error in clang and run the clang-tidy checks anyway.

I'm not familiar with the capabilities of MS-asm blocks, but if they can contain function calls, for example, we might lose valuable information, if we skip them. The possibility of a false positive depends on a specific check, it's hard to tell in general. There's also a more generic thing that can stop working properly: finding compilation errors inside asm blocks and applying fixes to these errors (if there are any fixes generated from parsing MS-asm blocks). Not sure how frequently this will happen though.

My best idea is that we make this diagnostic a default-error warning and then build with -Wno-unparseable-assembly or something. That's not a very good solution, though. =\

Yes, not a very good solution for the reasons outlined above.

We could gate this code under a define. I'm not a fan of define, but it seems to be a compromise for the size.

Something like: LIBTOOLING_ENABLE_INLINE_ASM_PARSER

If we decide to pursue that direction, then it should probably be for every tools.

I'd really rather not do that.

What's your concern? If we want parsing code inside MS asm blocks like the compiler does, we'll need to pay the cost of the asm parser. If the binary size is a large problem here, can we maybe reduce the set of dependencies of the MS asm parser?

In any case, I'm sure many users don't need this functionality, so it should be made compile-time configurable.

Adding Manuel, who might have better ideas.

In D17981#374904, @rnk wrote:

This is a huge difference. I didn't expect dependencies to bring so much code.
I'm not a fan of having an empty statement and increasing false positives ratio.
Would it be possible to skip whole declarations with asm-stm, and flag them as "ignored / not parsable"?

I don't actually think there are that many false positives, but I wanted to hear from Alex in case I'm wrong. I was hoping he had better ideas on how to suppress a diagnostic error in clang and run the clang-tidy checks anyway.

I'm not familiar with the capabilities of MS-asm blocks, but if they can contain function calls, for example, we might lose valuable information, if we skip them. The possibility of a false positive depends on a specific check, it's hard to tell in general. There's also a more generic thing that can stop working properly: finding compilation errors inside asm blocks and applying fixes to these errors (if there are any fixes generated from parsing MS-asm blocks). Not sure how frequently this will happen though.

My best idea is that we make this diagnostic a default-error warning and then build with -Wno-unparseable-assembly or something. That's not a very good solution, though. =\

Yes, not a very good solution for the reasons outlined above.

We could gate this code under a define. I'm not a fan of define, but it seems to be a compromise for the size.

Something like: LIBTOOLING_ENABLE_INLINE_ASM_PARSER

If we decide to pursue that direction, then it should probably be for every tools.

I'd really rather not do that.

What's your concern? If we want parsing code inside MS asm blocks like the compiler does, we'll need to pay the cost of the asm parser. If the binary size is a large problem here, can we maybe reduce the set of dependencies of the MS asm parser?

In any case, I'm sure many users don't need this functionality, so it should be made compile-time configurable.

The alternative was to plug a dummy asm parser in clang-tidy, to mock the real one in the back-end.
I think it's doable, but didn't investigate it.

Or we can add a fake browser to avoid paying the cost of the full one and all the dependencies.

Both direction make sense, and I will let you choose:

  • Adding the real parser and paying the cost
  • Finding a way to parse and ignore any errors related to assembly statement

If we choose to add the expensive dependencies, then we need to choose to gate it or not under a "define".

Also, a point to bring here: this is applicable to every tool built with libtooling.

Adding Manuel, who might have better ideas.

... and who is unavailable for the next couple of weeks, unfortunately. To unblock you, I'd suggest the following solution for now:

  1. move this code to clang/lib/Tooling
  2. gate the code and its dependencies on a CMake option (off by default)
  3. commit it and explore alternative solutions
  4. after Manuel comes back, discuss possible alternatives once again with a better understanding of possible alternatives

WDYT?

In D17981#374904, @rnk wrote:

This is a huge difference. I didn't expect dependencies to bring so much code.
I'm not a fan of having an empty statement and increasing false positives ratio.
Would it be possible to skip whole declarations with asm-stm, and flag them as "ignored / not parsable"?

I don't actually think there are that many false positives, but I wanted to hear from Alex in case I'm wrong. I was hoping he had better ideas on how to suppress a diagnostic error in clang and run the clang-tidy checks anyway.

I'm not familiar with the capabilities of MS-asm blocks, but if they can contain function calls, for example, we might lose valuable information, if we skip them. The possibility of a false positive depends on a specific check, it's hard to tell in general. There's also a more generic thing that can stop working properly: finding compilation errors inside asm blocks and applying fixes to these errors (if there are any fixes generated from parsing MS-asm blocks). Not sure how frequently this will happen though.

My best idea is that we make this diagnostic a default-error warning and then build with -Wno-unparseable-assembly or something. That's not a very good solution, though. =\

Yes, not a very good solution for the reasons outlined above.

We could gate this code under a define. I'm not a fan of define, but it seems to be a compromise for the size.

Something like: LIBTOOLING_ENABLE_INLINE_ASM_PARSER

If we decide to pursue that direction, then it should probably be for every tools.

I'd really rather not do that.

What's your concern? If we want parsing code inside MS asm blocks like the compiler does, we'll need to pay the cost of the asm parser. If the binary size is a large problem here, can we maybe reduce the set of dependencies of the MS asm parser?

In any case, I'm sure many users don't need this functionality, so it should be made compile-time configurable.

The alternative was to plug a dummy asm parser in clang-tidy, to mock the real one in the back-end.
I think it's doable, but didn't investigate it.

It would be useful to know whether it's possible with a reasonable effort and what the implications will be.

Or we can add a fake browser to avoid paying the cost of the full one and all the dependencies.

I'm not sure I understand what you mean by "fake browser".

Both direction make sense, and I will let you choose:

  • Adding the real parser and paying the cost
  • Finding a way to parse and ignore any errors related to assembly statement

If we choose to add the expensive dependencies, then we need to choose to gate it or not under a "define".

If we choose to add these dependencies, we should definitely gate them on a CMake option.

Also, a point to bring here: this is applicable to every tool built with libtooling.

Yes, so it at least makes sense to move the code to clang/lib/Tooling.

... and who is unavailable for the next couple of weeks, unfortunately. To unblock you, I'd suggest the following solution for now:

  1. move this code to clang/lib/Tooling
  2. gate the code and its dependencies on a CMake option (off by default)
  3. commit it and explore alternative solutions
  4. after Manuel comes back, discuss possible alternatives once again with a better understanding of possible alternatives

WDYT?

We can wait until more people can discuss.

I'm not sure I understand what you mean by "fake browser".

Wow. I was thinking "fake parser". Ouf! This is funny.

If we choose to add these dependencies, we should definitely gate them on a CMake option.

This seems to be the best option to me.

Yes, so it at least makes sense to move the code to clang/lib/Tooling.

Yes. I believe this is making sense for other tools too.

So, short story.... I'm not blocked by this patch. I'm keeping this patch applied over my local repo for the moment, and I'm keep moving forward.
We can wait that Manuel is available and try to land the best approach.

alexfh requested changes to this revision.Mar 24 2016, 12:11 PM
alexfh edited edge metadata.

OK, if you're not blocked on this, let's just wait then.

This revision now requires changes to proceed.Mar 24 2016, 12:11 PM

Manuel, can you take a look at this? We need your input.

Manuel, ping.

klimek edited edge metadata.May 13 2016, 1:28 AM

We had problems with binary size in multiple places.
Overall, I think if we care about binary size we should try to make binary size smaller at a higher level (perhaps by hiding visibility of symbols by default; not sure how much that would affect the exe's though).

Until we have a better solution for binary size, I'd add a FIXME and accept possible false positives / false negatives until we get more bug reports.

jckooijman added a subscriber: jckooijman.EditedJan 4 2017, 5:00 AM

Hi, this is my first time posting here, my apologies if i am giving you not enough information.

I found this bugreport from: https://blog.yuo.be/2016/05/11/columns-ui-can-now-be-compiled-using-clang/

I am getting the exact same issue using clang-tidy e.g.

C:\Program Files (x86)\Windows Kits\8.0\include\um\winnt.h:833:5: error: MS-styl
e inline assembly is not available: Unable to find target for this triple (no ta
rgets are registered) [clang-diagnostic-error]
    __asm    {
    ^

I am trying to use clang-tidy on a windows 32bit source. As you stated clang-tidy seems to simply ignore the error and move on, but i think it should be fixed regardless (my error log gets spammed by this)

jckooijman, this issue is still not solved on ToT.

You can apply this patch on your local repo.
It's fixing the problem, but your executable size will increase.