This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Populate declarations inside TypeLocs for some invalid types
ClosedPublic

Authored by ilya-biryukov on Mar 27 2023, 8:06 AM.

Details

Summary

This also reverts 282cae0b9a602267ad7ef622f770066491332a11 as the
particular crash is now handled by the new code.

Before this change Clang would always leave declarations inside the
type-locs as null if the declarator had an invalid type. This patch
populates declarations even for invalid types if the structure of the
type and the type-locs match.

There are certain cases that may still cause crashes. These happen when
Clang recovers the type in a way that is not reflected in the
declarator's structure, e.g. adding a pointer when it was not present in
the code for ObjC interfaces or ignoring pointers written in the code
in C++ with auto return type (auto* foo() -> int). Those cases look
fixable with a better recovery strategy and I plan to follow up with
more patches to address those.

The first attempt caused 31 tests from check-clang to crash due to
different structure of the types and type-locs after certain errors. The
good news is that the failure is localized and mismatch in structures is
discovered by assertions inside DeclaratorLocFiller. Some notable
cases caught by existing tests:

  • Invalid chunks when type is fully ignored and replace with int or now. Crashed in C/C2x/n2838.c.
  • Invalid return types in lambdas. Crashed in CXX/drs/dr6xx.cpp.
  • Invalid member pointers. Crashed in CXX/dcl.dcl/dcl.spec/dcl.type/dcl.spec.auto/p3-generic-lambda-1y.cpp
  • ObjC recovery that adds pointers. Crashed in SemaObjC/blocks.m

This change also updates the output of Index/complete-blocks.m.
Not entirely sure what causes the change, but the new function signature
is closer to the source code, so this seems like an improvement.

Diff Detail

Event Timeline

ilya-biryukov created this revision.Mar 27 2023, 8:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2023, 8:06 AM
Herald added a subscriber: arphaman. · View Herald Transcript
ilya-biryukov requested review of this revision.Mar 27 2023, 8:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2023, 8:06 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

One other thing we probably should do is have an assert when creating a function type that none of its params are null. WDYT?

clang/lib/Sema/SemaType.cpp
5962

Shouldn't the D.setInvalidType(true) in each of the branches work here? So this variable is unnecessary? Else this is a good change IMO.

One other thing we probably should do is have an assert when creating a function type that none of its params are null. WDYT?

This would definitely be great, however I don't think this is possible without some large redesign.
Currently, FunctionType does not store parameter decls. Instead, they are stored in the FunctionTypeLoc.
The latter gets created by allocating the necessary number of bytes and filling the empty state.
In principle, it would be much easier if FuncionTypeLoc and FunctionType were created in parallel, ensuring consistency there would be much easier. I have also though about this, but this looks like a large change and I didn't explore further.

clang/lib/Sema/SemaType.cpp
5962

(Let me know if I'm misreading the suggestion, I was not sure if my understanding is correct).
If we call setInvalidType more, we would actually get more crashes as the problematic branch is the one that calls getTrivialTypeSourceInfo.
To avoid the extra variable, we could instead ensure that the type always gets replaced with something trivial T = Context.IntTy. But I didn't want to go this path because this causes worse error recovery (going from something like void(<recovered-to:int>) to <recovered-to:int>) and possibly correctness issues (e.g. will we start getting function-decls or lambdas that do not have function types and other assertions may fire).

One other thing we probably should do is have an assert when creating a function type that none of its params are null. WDYT?

This would definitely be great, however I don't think this is possible without some large redesign.
Currently, FunctionType does not store parameter decls. Instead, they are stored in the FunctionTypeLoc.
The latter gets created by allocating the necessary number of bytes and filling the empty state.
In principle, it would be much easier if FuncionTypeLoc and FunctionType were created in parallel, ensuring consistency there would be much easier. I have also though about this, but this looks like a large change and I didn't explore further.

Ah, hrmph. I guess I was just hoping for some assert (perhaps even in GetFullTypeForDeclarator?) to assert in order to give future folks a hint as to why their change crashes? Basically, whenever the FunctionType/FunctionTypeLoc are 'finalized' as a check perhaps? Not really looking for a refactor, so much as an assert to prevent us from getting here again.

clang/lib/Sema/SemaType.cpp
5962

My suggestion was that setInvalidType is called everywhere that AreDeclatorChunksValid above (sans 1 perhaps?). So my question was whether this introduced variable was necessary.

i've added a reproducer for https://reviews.llvm.org/D146634, can you patch that into here? unfortunately i couldn't get it to crash in any place other than through clangd.

Ah, hrmph. I guess I was just hoping for some assert (perhaps even in GetFullTypeForDeclarator?) to assert in order to give future folks a hint as to why their change crashes? Basically, whenever the FunctionType/FunctionTypeLoc are 'finalized' as a check perhaps? Not really looking for a refactor, so much as an assert to prevent us from getting here again.

Yeah, that makes a lot of sense. We need to fix the FIXMEs for that to work, I'm eager to try this in follow-ups. But probably not in this particular change as it seems like a lot of work and this is already moving in the right direction.

We probably need asserts in two places: GetFullTypeForDeclarator and getTrivialTypeSourceInfo.
I'm less worried about the former, but fixing getTrivialTypeSourceInfo might be harder as it has a lot of call sites and I suspect some of them actually pass FunctionType as inputs. The assert would basically prohibit creating FunctionTypeLoc without parameter declarations.
I don't think it's unreasonable, the code that produces FunctionTypeLoc without having parameters should either find the corresponding parameters or be rewritten to use QualType instead of TypeLoc, which does not need parameters.
But I expect this to be a lot of work to actually update the callsites, and this will probably take quite some time.

clang/lib/Sema/SemaType.cpp
5962

Ah, thanks for clarifying the question.
setInvalidType is called in much more places and the point of this patch is to capture a subcategory of invalid types that still follow the declarator structure.

erichkeane accepted this revision.Mar 28 2023, 6:02 AM
erichkeane added inline comments.
clang/lib/Sema/SemaType.cpp
5962

Got it, thanks for the clarification. I'm OK with this as-is, then.

This revision is now accepted and ready to land.Mar 28 2023, 6:02 AM
aaron.ballman accepted this revision.Mar 28 2023, 7:37 AM

LGTM, though the change should come with a release note. Suggestion you can take or leave as you see fit: should we turn the places where we removed the null pointer check into assertions that the pointer is nonnull? Or are we hoping the crash will be sufficient to tell us when we've missed a case?

  • Add assertions for removed null checks
  • Another forgotten non-null assertion
ilya-biryukov added a comment.EditedApr 6 2023, 10:02 AM

Sorry for taking so long to land this, it fell off my radar.

LGTM, though the change should come with a release note. Suggestion you can take or leave as you see fit: should we turn the places where we removed the null pointer check into assertions that the pointer is nonnull? Or are we hoping the crash will be sufficient to tell us when we've missed a case?

Added the check. I think they're useful, at the very least if folks hit them they have a better chance of finding this change and reaching out to the right people.

This revision was landed with ongoing or failed builds.Apr 6 2023, 10:03 AM
This revision was automatically updated to reflect the committed changes.