This is an archive of the discontinued LLVM Phabricator instance.

[clangd][query-driver] Extract target
ClosedPublic

Authored by ArcsinX on Nov 24 2020, 1:27 AM.

Details

Summary

In some cases system includes extractions is not enough, we also need target specific defines.
The problems appears when clang default target is not the same as toolchain's one (GCC cross-compiler, MinGW on Windows).
After this patch query-driver also extracts target and adds --target=<extracted target> compile option.

Diff Detail

Event Timeline

ArcsinX created this revision.Nov 24 2020, 1:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2020, 1:27 AM
ArcsinX requested review of this revision.Nov 24 2020, 1:27 AM

I'm not sure if this solution is elegant enough. I will be happy to hear advices.

Thanks, this seems really useful!

clang-tools-extra/clangd/QueryDriverDatabase.cpp
62

define a little struct holding these two, for clarity?

92

This part seems pretty confusing :-)
You could just search for Target: independently of the current scanning - I'm not terribly worried about it appearing in the middle of the include list!

If we want to be strict though, it might be time to rewrite to be more stateful e.g. a loop consuming chunks from an ArrayRef<StringRef>, or a state machine, or something.

130

this changes the error behavior: intentional?

245

clang -target foo test.cc seems to be a hard error in the driver if the target is unknown.
(vs likely *some* functionality if we just didn't set the driver)

so this could regress some scenarios. Can we mitigate that?
(It's possible that we're running the driver in a mode where we proceed anyway, but I can't remember :-()

338

nit: while here, add std::move?

kadircet added inline comments.Nov 24 2020, 12:21 PM
clang-tools-extra/clangd/QueryDriverDatabase.cpp
245

what if target already exists in Cmd?

also it would be nice to use --target=X format to be consistent with target inference from invocation name as in https://github.com/llvm/llvm-project/blob/master/clang/lib/Tooling/Tooling.cpp#L278.

ArcsinX updated this revision to Diff 307585.Nov 25 2020, 4:49 AM

Add structure to hold system includes and target.
Use state machine to go though lines of the driver output.
Handle invalid target
Use '--target=X' instead of '-target X'

ArcsinX marked 5 inline comments as done.Nov 25 2020, 4:56 AM
ArcsinX added inline comments.
clang-tools-extra/clangd/QueryDriverDatabase.cpp
134–135

Unsure about this.
Should we toss all collected data if end markes was not found?

245

I failed to find options to process in case of invalid target, so added target validation.

245

We could specify several --target options, the last one will be used. But I am not sure should we override existing or not.

kadircet added inline comments.Nov 25 2020, 5:06 AM
clang-tools-extra/clangd/QueryDriverDatabase.cpp
67

i think you can just do TargetRegistry::lookupTarget

229

nit: I would fold these two into a single log statement.

245

We could specify several --target options, the last one will be used. But I am not sure should we override existing or not.

Having multiple targets sounds confusing. I would prefer keeping a target specifically mentioned by the user but it is also possible for the target to be inferred by clang (and that might be wrong). This is similar to our stand against predefined macros though, i think we should fix clang's inference logic if it has any false positives. So i am slightly leaning towards not overriding the target if it exists.

336–337

nit:

if(auto Info = ...)
  setTarget(addSystemIncludes(...), ...);
return std::move(Cmd);
sammccall accepted this revision.Nov 25 2020, 5:07 AM

Thanks! just nits, apart from the "-target already explicitly set" issue Kadir raised.

clang-tools-extra/clangd/QueryDriverDatabase.cpp
99

I'm not sure these early returns are worth the complexity

102

also not sure whether initial vs includesextracted is worth distinguishing - this doesn't scale well if we add a third thing we're tracking (do we need a state for each combination that we could have seen already?)

Maybe just a bool for SeenIncludes or something to detect duplicate/missing sections would be simpler

106

this mostly echoes the code: "Only detect targets that clang understands"?

108

I'd prefix with "System include extraction: " and bump up to elog - this is fairly likely to break things (and we'll only log it ~once thanks to caching)

112

also add a prefix here so it's clear what the context is in the log

226

we want to log this even if the set is empty, so the reader can tell the extraction worked
(it's fine to only optionally log target)

230

The target extractor isn't a separate component from the system includes extractor, so the log message seems a bit misleading.
I'd suggest just using "System includes extractor" for both, even if it's not totally accurate

245

I think the question of "target already explicitly set" should probably be answered.
Checking for an arg equal to "-target" or beginning with "--target" is probably enough.

This revision is now accepted and ready to land.Nov 25 2020, 5:07 AM

Sorry, code reviews are racy :-)

clang-tools-extra/clangd/QueryDriverDatabase.cpp
134–135

Yup, at least for this patch (this is the existing behavior).
I think the logic is "this is a fragile text protocol, we've never seen this happen, so don't really know what it would mean".

245

I think we should definitely not override the target if it already exists.

The scenario is: compile_commands.json has /my/driver --target=bar foo.cc.
/my/driver reports "Target: foo"

So foo is the default target, but it's being overridden on the command-line as part of the build.

Now if we add the extracted target to the end of the compile command, to produce /my/driver --target=bar foo.cc --target=foo, we're effectively reversing the precedence: now the default target is used even if the compile command overrode it.

ArcsinX updated this revision to Diff 307773.Nov 26 2020, 12:30 AM

Do not override existing target.
Fix clang-tidy warning.
Address review comments.

ArcsinX marked 14 inline comments as done.Nov 26 2020, 12:38 AM
ArcsinX added inline comments.
clang-tools-extra/clangd/QueryDriverDatabase.cpp
67

Seems we can't. TargetRegistry::lookupTarget returns nullptr for arm-linux-gnueabihf with error message: No available targets are compatible with triple "arm-linux-gnueabihf"

336–337

I used your advice except std::move(Cmd), seems we do not need std::move here

kadircet added inline comments.Nov 26 2020, 12:56 AM
clang-tools-extra/clangd/QueryDriverDatabase.cpp
344

ofc we don't need it. but the thing is we are returning an Optional<tooling::Command> hence the return statement needs to invoke a constructor for Optional and without std::move it would invoke a copy constructor rather than a move-based one.

And even though rest of the calculations are cached per driver&language name, this copy is going to happen for every translation unit. So in theory it would be nice to prevent that.

kadircet added inline comments.Nov 26 2020, 12:58 AM
clang-tools-extra/clangd/QueryDriverDatabase.cpp
344

oh actually never mind, Cmd itself is also an Optional<tooling::Command>, i thought it was a naked tooling::Command

sammccall accepted this revision.Nov 26 2020, 3:48 AM

Still LG, thanks!

clang-tools-extra/clangd/QueryDriverDatabase.cpp
344

It's moot here, but this isn't the case anymore, return x can call a converting move constructor: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1579

This is part of C++14, and so we should be able to rely on it in LLVM. It wasn't true in the original C++11 spec, but since it was a defect report, reasonably-modern compilers allow the optimization even in C++11 mode.

Buggy compilers certainly exist, but I don't think we need to work around them for small performance issues. (If the code fails to compile on old-gcc without std::move, that's another story...)

ArcsinX edited the summary of this revision. (Show Details)Nov 26 2020, 3:57 AM
ArcsinX marked 4 inline comments as done.Nov 26 2020, 4:02 AM

Thanks for review!

This revision was automatically updated to reflect the committed changes.