The interfaces for designators (i.e. C99 designated initializers) was
done in two slightly different ways. This was rather wasteful as the
differences could be combined into one.
Details
- Reviewers
aaron.ballman rjmccall rsmith dblaikie shafik - Group Reviewers
Restricted Project - Commits
- rG3c07db5f58e9: [Clang] Refactor "Designators" into a unified implementation [NFC]
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/include/clang/AST/Designator.h | ||
---|---|---|
87–92 | This looks very odd -- these two index fields represent completely different kinds of information, one of which is specific to an implementation detail of InitListExpr / DesignatedInitExpr. Perhaps this class could be templated over its representation of an expression, instead of having both cases be possibilities here, and having the user of this class keep track of whether they're using the kind of designator that stores an expression pointer or the kind that stores an expression index? (We'd end up storing two different indexes for array range designators this way, but that seems like it should not be problematic.) | |
98 | Does this really need to be mutable? The AST version shouldn't be exposing a way to mutate this field! It seems like we should have a non-const overload of Designation::getDesignator for Sema to use if it really needs to modify this field. | |
313–319 | AST classes shouldn't reference Sema. But these functions do nothing; can they be removed? | |
340–346 | AST classes shouldn't reference Sema. But these functions do nothing; can they be removed? |
Combine the array designator internals and templatize them. It should make this a bit simpler.
clang/include/clang/AST/Designator.h | ||
---|---|---|
313–319 | Looks like they're dead. Removed. |
Drive by comment (I'll give a more thorough review when I have a moment): precommit CI seems to have found an issue to address.
Assertion failed: false && "Invalid accessor", file C:\ws\w5\llvm-project\premerge-checks\clang\include\clang/AST/Designator.h, line 276
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | Can we move the templating out from here to the whole Designator and Designation classes? It shouldn't be possible to mix the two kinds in the same Designation. |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | Grr...My previous comment was eaten. I'll give it a shot. However, I'm a bit surprised at how designators are handled by Clang. I expected that a Designation would be an Expr with the Designators being L-values (e.g. MemberExprs / ArraySubscriptExprs), but instead the Designation exists just long enough to be turned into an explicit initialization list. Is there a reason to do it that way instead of using expressions? |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | So it looks like moving the template outside of the class won't work. The ability to switch between Expr and unsigned while retaining the same overall type is hardwired into things like the ASTImporter. This is kind of a massive mess. Maybe we shouldn't even allow them to use both Expr and unsigned but instead require them to use one or the other? Maybe we could require unsigned with the understanding that the Expr can be converted into a constant? |
Friendly ping!
BTW, I'm planning on future changes to designated initializers. This is the first step. I hope to get rid of the template stuff in the followup.
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | I'm not understanding something. Currently the ASTImporter only deals with DesignatedInitExpr::Designators , which only ever store integer indexes. Basically, today, we have two different classes:
You want to refactor them to share code, which makes sense, because they are basically the same other than how they refer to expressions. (Not quite: DesignatedInitExpr can apparently refer to a field either as an IdentifierInfo* or as a FieldDecl*, whereas the Sema version always uses the IdentifierInfo* representation.) Each current user of one of these two classes uses only one of the two, which means they're either exclusively using integers to refer to expressions or exclusively using Expr*. So it seems to me that you should be able to update each user to use either Designator<unsigned> or Designator<Expr*>, depending on which class they used before. What am I missing? |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | I'm still allowing them to use a Designator<unsigned> / Designator<Expr*> as they see fit, only it's hidden from them via the Create methods. I personally find the use of two different versions (one using unsigned and one using Expr*) completely baffling. Why can't they all use Expr*? Also the ASTImporter only outputs the start of an array init range, which is at the very least counter-intuitive. That's one of the issues I'd like to tackle with follow-up patches, hopefully getting rid of the need for this template all together. This does mean that in the interim a non-array range designator will have extra End & EllisisLoc fields that aren't used, but that shouldn't be too horrible, given that they'd be there anyway because of the union. |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | The reason that's jumping out at me for having separate integer / Expr* implementations here is space-efficiency -- we get to make array range designators (and hence designators overall) be only 16 bytes rather than the 32 bytes they occupy in this patch (assuming 64-bit pointers) by storing indexes instead of pointers. If your eventual plan is to remove the children list from DesignatedInitExpr, and store the pointers only in the designators, that seems to cost 8 bytes per designator in the two common cases:
... plus it'll presumably be painful to make the Stmt child iterator be able to handle this. If you don't remove the separate children list from DesignatedInitExpr, then it seems like this approach will cost 16 bytes per designator in all cases, and we'll need to be careful in AST serialization / deserialization that we don't accidentally duplicate the Exprs that now have two pointers pointing to them instead of one, and likewise anywhere else that assumes each Expr is only reachable by one path through the AST (eg, TreeTransform, the recursive AST visitor). I think some more visibility into the eventual plan would help. |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | The plan isn't detailed, but I basically want to address several of the points you mentioned here. In particular, I think the structure of DesignatedInitExpr is backwards from how every other Expr is handled in Clang. For instance, the Expr for something like s.t.u is a MemberExpr with a MemberExpr as its sub-expression and so on. DesignatedInitExpr on the other hand basically has a list of maybe expressions, maybe integers that refer to parts of the structure / array. It seems cleaner to me to use the MemberExpr / ArraySubscriptExpr way of referring to the member being initialized rather than using a specialized list that has to be handled differently from other Expr's. The first step in my evil plot is to do this simple refactoring, so that there's no initial functionality change, before I do the more invasive changes that may break things. I'm doing this because I'm working on a feature that uses the DIE syntax, and it would be much simpler to have it be a MemberExpr. Am I completely off base here? |
I think this is fine as a short-term stepping stone to a different representation.
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | Thanks, that's really helpful. OK, so the end design would be something like (just building out some details here so I can think about this better; I'm not expecting you would necessarily do exactly this!):
If something like that is the plan, then yes, I think that's completely reasonable, and seems like a nice improvement -- and I think it's fine if the intermediate state increases the AST size for designated initializers, as this patch does, especially if you're aiming for this to be completed within a single release cycle. (Just quickly checking the size of the representation: a single-field DesignatedInitExpr is currently 40 bytes + 16 bytes for the Designator array, and I think could easily be 40 bytes *total* with the new representation; with two fields, it's currently 40 + 32 and could be 40 + 40 with the new representation. I think that's going to be a win essentially always.) |
clang/include/clang/AST/Designator.h | ||
---|---|---|
89 | Thanks! I'm going to dig into the DIE stuff more and will have a much more detailed plan that I can give you to see what you think before I go too far down the wrong path. :-) |
I think these two bug reports: https://github.com/llvm/llvm-project/issues/46132 and https://github.com/llvm/llvm-project/issues/61118 may be related to this change.
Can we move the templating out from here to the whole Designator and Designation classes? It shouldn't be possible to mix the two kinds in the same Designation.