User Details
- User Since
- Feb 4 2021, 10:11 AM (137 w, 4 d)
Feb 20 2023
Suggested a few adjustments in LexTokenInternal you might want to test for perf improvements (orthogonal to this patch, but could boost its numbers :).
And also noted that Lexer::getBuffer() has same issue as getBufferLocation() re potential for pointer invalidation IIUC. At a minimum we need comments on these functions explaining any risks; better still to remove them from the public interface. If downstream users use these functions and complain, good - they need to be aware of this change.
Feb 15 2023
Only had a chance to give it a once over, I will look through more closely later, def by this weekend. Main thing is I think we shouldn't be exposing the buffer pointers after this change, i.e. no public function should return const char *, unless I'm missing something. If that box is checked and performance cost is negligible I'd give this the thumbs up.
Oct 15 2022
I like the late changes, just need to add comments to the public methods, and maybe move getReplacedTemplateParameterList over to Decl.
Oct 11 2022
Looks good, over to @ChuanqiXu
Oct 1 2022
First thank you for having separated out the public AST changes into other patches, it makes these mostly-Sema changes much easier to review.
Sep 27 2022
Sep 26 2022
I will give an example using a Typedef for exposition, but the same motivation applies with UsingType.
Say we have this code:
template <class T> struct A { using type1 = T; }; using Int = int; using type2 = A<Int>::type1;See this example live (with that resugaring patch) at: https://godbolt.org/z/jP64Pern8
We want the underlying type of type2 to have that Int sugar. It would also be good if we could keep the TypedefType sugar referencing the instantiation of type1 from the A<int> template.
The problem here is that the TypedefType is currently limited to having the declaration's underlying type, which is just from an instantiation of A<int>, and knows nothing about camel case Int.
This is similar to the infamous problem with std::string turning into std::basic_string in templates.We could emit a new TypedefDecl with the underlying type that we want, but creating declarations is expensive, so we want to avoid that.
We could create a new AST node that represented more accurately what was happening. But the question is, is this worth the cost?
Do you see use cases for this yourself?We want resugaring to be cheap and introduce as little changes in the AST as possible, to get a better chance of affording to have it to be on all the time.
If we introduce new nodes with more information, that might increase the cost and complexity, and it's hard to justify without an use case.
My concerns have been addressed, except for some more nitpicking on names which I think are uncontroversial.
Sep 20 2022
In general, forgetting about just the above particular way of creating them, there might still exist only one TypedefDecl in the whole program, which every TypedefType is referencing, and they can still be divergent, ie have different sugared underlying type.
Agree this needs a brief commit message, just explaining that previously the underlying type had to be canonical, but now it can be sugared, allowing a better/more informative representation.
Sep 19 2022
If I understand this correctly: whenever isDivergent() is true, getDecl() or getFoundDecl() is an arbitrary choice between multiple candidate (re)declarations of a typedef or using decl. In particular it will be the first redeclaration, even though each different redeclaration can contain different sugar information.
Sep 15 2022
Sep 13 2022
The lines you changed (clang/lib/Sema/SemaExpr.cpp:1091-1150) look good (IIUC you just change cast to cast_as and dyn_cast to getAs, and reorganize for clarity). I suggested some renaming and a few more comments to further clarify.
Sep 12 2022
Sep 8 2022
Sep 7 2022
Sep 5 2022
LGTM
I hope I'm not stepping on any toes given the recent changes in code ownership, but I'm accepting this and D130308 because
- They are a nice improvement to the AST that naturally follows from the earlier work adding sugar to type deductions,
- I have given it a fairly close look and it LGTM,
- @rsmith has taken a step back and so may not be available to give it a final look,
- @mizvekov has verified that the performance impact is negligible, and
- Other reviewers seem to have at least given it a once over and have not identified major issues.
Anyone object/need more time to review?
LGTM aside from a nit
This looks good, except for the confusion around the name getReplacedDecl - I would really like the public AST methods to be clearly named and documented, so please address that.
Aug 27 2022
Great examples.
Check this example out: https://godbolt.org/z/rsGsM6GrM
template <class ...Ts> struct A { template <class ...Us> struct B { using type1 = void ((void (*...fps)(Ts, Us))); }; }; using type2 = A<int, char>::B<short, bool>::type1;TypeAliasDecl 0x55ffe8b45368 <line:8:1, col:45> col:7 type2 'A<int, char>::B<short, bool>::type1':'void ((void (*)(int, short), void (*)(char, bool)))' `-ElaboratedType 0x55ffe8b452f0 'A<int, char>::B<short, bool>::type1' sugar `-TypedefType 0x55ffe8b452d0 'A<int, char>::B<short, bool>::type1' sugar |-TypeAlias 0x55ffe8b45258 'type1' `-FunctionProtoType 0x55ffe8b451e0 'void ((void (*)(int, short), void (*)(char, bool)))' cdecl |-ParenType 0x55ffe8b12150 'void' sugar | `-BuiltinType 0x55ffe8ac6370 'void' |-PointerType 0x55ffe8b44550 'void (*)(int, short)' | `-ParenType 0x55ffe8b444f0 'void (int, short)' sugar | `-FunctionProtoType 0x55ffe8b444b0 'void (int, short)' cdecl | |-BuiltinType 0x55ffe8ac6370 'void' | |-SubstTemplateTypeParmType 0x55ffe8b44310 'int' sugar | | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack | | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts' | | `-BuiltinType 0x55ffe8ac6410 'int' | `-SubstTemplateTypeParmType 0x55ffe8b443c0 'short' sugar | |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack | | `-TemplateTypeParm 0x55ffe8b118d8 'Us' | `-BuiltinType 0x55ffe8ac63f0 'short' `-PointerType 0x55ffe8b450c0 'void (*)(char, bool)' `-ParenType 0x55ffe8b45060 'void (char, bool)' sugar `-FunctionProtoType 0x55ffe8b447d0 'void (char, bool)' cdecl |-BuiltinType 0x55ffe8ac6370 'void' |-SubstTemplateTypeParmType 0x55ffe8b44630 'char' sugar | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts' | `-BuiltinType 0x55ffe8ac63b0 'char' `-SubstTemplateTypeParmType 0x55ffe8b446e0 'bool' sugar |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack | `-TemplateTypeParm 0x55ffe8b118d8 'Us' `-BuiltinType 0x55ffe8ac6390 'bool'
If it were as simple as the above case the Resugarer TreeTransform could say keep a map of TemplateTypeParmTypes to current pack indices, and iterate those during each each TransformSubstTemplateTypeParmType call, but...
Two last thoughts:
1: To reiterate it doesn't seem right to view this (or any patch adding not-semantically-relevant info to the AST) as a one-size-fits-all costs vs. benefits optimization. On the one hand, the additional AST info hurts compilation performance. On the other, the info might make debugging faster, or be useful to a tool. A flag governing how much non-semantically-necessary info to add to the AST really seems the better solution in general, and warrants more discussion in cases like this (e.g. Matheus's other resugaring patches, which may cause this debate to resurface regardless of the approach he takes here). The user could set the flag high when debugging, and decrease it when diagnostics/debug info are less expected/not needed. (In light of the performance tests done here, the performance benefit of allowing the user to omit *other* non-semantically-necessary nodes from the AST could be significant; such a flag could allow that.)
Aug 19 2022
Aug 17 2022
Aug 16 2022
Once the FIXME is removed this looks good, but I was involved in this so better if @sammccall can give the thumbs up at least to the RecursiveASTVisitor code.
Aug 14 2022
It was very good to separate this out, thanks. Can you can do some TMP performance testing, to verify the impacts are negligible before taking resugaring into consideration, to allay potential concerns?
The second paragraph is talking about 'Canonical nodes', not 'Canonical types'.
A canonical node is a type node for which 'isSugared' method returns false.
Aug 11 2022
This part of the description is confusing:
Aug 10 2022
Aug 8 2022
This corrects a genuine deficiency in the AST, and the patch LGTM. Can we knock this off Matheus' stack?
Feb 2 2022
Jan 31 2022
Jan 16 2022
Jan 15 2022
Looks good, thanks for fixing this!
Jan 7 2022
There are already two way more sophisticated (forgive me my bias) implementations in Clang that are for checking if two statements or decls are the same.
- ODRHash, used in modules to discover ODR violations
- ASTStructuralEquivalenceContext, used in ASTImporter to discover if two AST nodes are the same or not (as a side effect we diagnose ODR violations as well).
It is not the first time, when such a similarity check is needed (see https://reviews.llvm.org/D75041). Of course reusing the before mentioned components would require some architectural changes, but it might be beneficial.
Dec 21 2021
Dec 16 2021
throughUsingDecl seems like a good solution, though I defer to regular ASTMatchers users.
Nov 29 2021
A couple thoughts/cases to consider...
Looks great, thanks for identifying the need and banging this out so quickly.
Hope you had some time to enjoy the holiday with your family!
Dave
Nov 19 2021
Looks good, a few notes.
Jun 12 2021
Was this performance hit when using the static analyzer? A quick search suggests isAnyDestructorNoReturn() is only called within the analyzer, whereas comparable CXXRecordDecl methods whose results are stored (hasIrrelevantDestructor() etc.) seem to be called somewhere by Sema.
May 24 2021
Looks good, thanks for doing this!
May 22 2021
Sorry for the delay.
Richard should probably weigh in before absolutely committing to
- BaseUsingDecl/UsingEnumDecl/UsingDecl (as implemented here) vs. single UsingDecl (Nathan's original approach) and
- Renaming getUsingDecl() to getIntroducer() (if it is to return a BaseUsingDecl).