This is an archive of the discontinued LLVM Phabricator instance.

[CodeCompletion] Generally consider header files without extension
ClosedPublic

Authored by ckandeler on Nov 2 2021, 3:15 AM.

Details

Summary

Real-world use case: The Qt framework's headers have the same name
as the respective class defined in them, and Qt's traditional qmake
build tool uses -I (rather than -isystem) to pull them in.

Diff Detail

Event Timeline

ckandeler requested review of this revision.Nov 2 2021, 3:15 AM
ckandeler created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2021, 3:15 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I hope this isn't too controversial. After all, the files are located in include directories, so there shouldn't be any random garbage getting picked up.

the files are located in include directories

These are directories that may contain headers, not directories that only contain headers. (Which we mostly expect -Isystem to be).
For example, many projects keep headers next to sources, and so have sources on their include path. And the directory containing the current file is always on the include path.
We'd expect to see Makefile, SConstruct, BUILD files etc here. And maybe a smattering of "random" files that don't follow particular conventions.

Trying to support QT headers seems very reasonable though. Seems like our options are:

  • current behavior with false negatives on QT
  • proposed behavior with false positives on Makefile etc
  • current behavior and try to detect QT as an exception
  • proposed behavior and try to detect Makefile etc as exceptions

WDYT about detecting QT headers specifically? It seems hacky, but I don't see a way out of this that doesn't involve hardcoding some filenames. Are they in a directory like "qt-11/QFoo" that we can recognize? Even Q followed by another capital letter might be a good enough heuristic.
(The docs suggest it's just <QFoo> but the docs also say to use angle brackets so I'm not sure whether to believe them)

WDYT about detecting QT headers specifically? It seems hacky, but I don't see a way out of this that doesn't involve hardcoding some filenames. Are they in a directory like "qt-11/QFoo" that we can recognize? Even Q followed by another capital letter might be a good enough heuristic.
(The docs suggest it's just <QFoo> but the docs also say to use angle brackets so I'm not sure whether to believe them)

The headers are, as far as I can tell, always located in a directory whose name starts with "Qt". This parent directory is also in the include path, so e.g. to get access to QString, which is located under QtCore/, you'd typically just write:

#include <QString>

This is the recommended, documented way of pulling in headers.
Though you could also write:

#include <QtCore/QString>

as the parent parent directory is also in the list of include paths.

Looking at the code, it seems we have access to the parent directory, so we could do that name check (which I suppose has less potential for false positives than checking the file name).
For framework builds, the directory would be "Headers", which also seems safe.

WDYT about detecting QT headers specifically? It seems hacky, but I don't see a way out of this that doesn't involve hardcoding some filenames. Are they in a directory like "qt-11/QFoo" that we can recognize? Even Q followed by another capital letter might be a good enough heuristic.
(The docs suggest it's just <QFoo> but the docs also say to use angle brackets so I'm not sure whether to believe them)

The headers are, as far as I can tell, always located in a directory whose name starts with "Qt". This parent directory is also in the include path, so e.g. to get access to QString, which is located under QtCore/, you'd typically just write:

#include <QString>

This is the recommended, documented way of pulling in headers.
Though you could also write:

#include <QtCore/QString>

as the parent parent directory is also in the list of include paths.

Looking at the code, it seems we have access to the parent directory, so we could do that name check (which I suppose has less potential for false positives than checking the file name).

Yeah, that makes sense to me. Also a bit cheaper since we only have to do this once per parent dir.

For framework builds, the directory would be "Headers", which also seems safe.

I agree extensionless headers in frameworks seem fine to show.
We already know which includepath entries are frameworks, so we can just use that info directly (see line 9674)

ckandeler updated this revision to Diff 384407.Nov 3 2021, 5:43 AM

Limited the matching to Qt headers.

For framework builds, the directory would be "Headers", which also seems safe.

I agree extensionless headers in frameworks seem fine to show.
We already know which includepath entries are frameworks, so we can just use that info directly (see line 9674)

These aren't technically framework includes, as in order to make prefix-less headers work, the include path has to point directly into the Headers directory.

For framework builds, the directory would be "Headers", which also seems safe.

I agree extensionless headers in frameworks seem fine to show.
We already know which includepath entries are frameworks, so we can just use that info directly (see line 9674)

These aren't technically framework includes, as in order to make prefix-less headers work, the include path has to point directly into the Headers directory.

I see. In that case we can detect this pattern in any framework by checking whether the directory contains ".framework/Headers", right? Headers itself doesn't seem specific enough.

clang/lib/Sema/SemaCodeComplete.cpp
9647

can you extract this up near dirname, and give a name that explains the idea (e.g. ExtensionlessHeaders)

9647

we've lost the condition that there's no extension, and now accept *any* extension (e.g. .DS_Store!)
Unless this is important I think we should keep that check.

And to keep the number of cases/paths low, I think it's fine to bundle system headers into the same bucket - really this is just for the C++ standard library, which does not have extensions.

So some logic like:

bool ExtensionlessHeaders = IsSystem || Dir.endswith(".framework/Headers") || Dirname.startswith("Qt");
...
case regular_file:
  bool IsHeader = Filename.endswith(...) || Filename.endswith(...) || (ExtensionlessHeaders && !Filename.contains('.'));
  if (!IsHeader) break;
ckandeler updated this revision to Diff 384485.Nov 3 2021, 9:09 AM

Addressed comments.

ckandeler marked 2 inline comments as done.Nov 3 2021, 9:09 AM
sammccall accepted this revision.Nov 3 2021, 9:12 AM

LG, thanks! can we the tests again though?

This revision is now accepted and ready to land.Nov 3 2021, 9:12 AM
ckandeler added inline comments.Nov 3 2021, 9:19 AM
clang/lib/Sema/SemaCodeComplete.cpp
9619

I'm just noticing that this is often a symlink into a versioned directory (e.g. xxx.framework/Version/5/Headers). The include paths are not canonicalized, I hope?

can we the tests again though?

Sorry, I don't understand what you mean by that.

ckandeler updated this revision to Diff 385453.Nov 8 2021, 4:25 AM

Addressed formatting comments.

Can someone please merge this?

can we the tests again though?

Sorry, I don't understand what you mean by that.

Apologies, I mangled the comment.

This should have a test.

ckandeler updated this revision to Diff 385826.Nov 9 2021, 7:54 AM

Added test cases.

Thanks Christian! Sorry for the delay.

I have a similar structure to Qt regarding the headers and use clangd trough the QtCreator, so for my own libraries I cannot use the auto completion.

the files are located in include directories

These are directories that may contain headers, not directories that only contain headers. (Which we mostly expect -Isystem to be).
For example, many projects keep headers next to sources, and so have sources on their include path. And the directory containing the current file is always on the include path.
We'd expect to see Makefile, SConstruct, BUILD files etc here. And maybe a smattering of "random" files that don't follow particular conventions.

Trying to support QT headers seems very reasonable though. Seems like our options are:

  • current behavior with false negatives on QT
  • proposed behavior with false positives on Makefile etc
  • current behavior and try to detect QT as an exception
  • proposed behavior and try to detect Makefile etc as exceptions

WDYT about detecting QT headers specifically? It seems hacky, but I don't see a way out of this that doesn't involve hardcoding some filenames. Are they in a directory like "qt-11/QFoo" that we can recognize? Even Q followed by another capital letter might be a good enough heuristic.
(The docs suggest it's just <QFoo> but the docs also say to use angle brackets so I'm not sure whether to believe them)

Could we extend the heuristic to match directories called include? For me it is

Path to Library                   .../code/cast
- include                         .../code/cast/include
  - LibraryName                   .../code/cast/include/Cast
    - Header                      .../code/cast/include/Cast/Signed

And for that I want #include <Cast/ to offer Signed>.