Don't store the init statement and condition variable if not needed.
This cuts the size of ForStmt by up to 2 pointers, and 1 pointer in
the common case. The condition and increment are stored unconditionally
since for statements without a condition and increment are quite uncommon.
The order of the children is kept the same.
Details
Diff Detail
- Repository
- rC Clang
Event Timeline
Do you have evidence that this complexity is worthwhile? (I wouldn't imagine we have very many ForStmts per translation unit, so saving 16 bytes for each of them seems unlikely to matter.)
include/clang/AST/Stmt.h | ||
---|---|---|
1871 | Please static_cast rather than reinterpret_cast throughout. (If the Stmt base class weren't at offset 0 within Expr, we'd want to apply the adjustment.) | |
1879 | No need for an explicit cast from Expr to base class Stmt. | |
test/Import/for-stmt/test.cpp | ||
3–8 | This seems like a (minor) problem: -ast-dump no longer shows what the things it's dumping mean. Maybe add the has_init / has_var flags to the dump of the ForStmt, or add labels to the substatements, or something? |
Strikes me that some data would be useful here, to help prioritize. Here's a histogram of occurrence counts for a libc++ module:
Count # Bits b/Rec % Abv Record Kind 43731 5471391 125.1 87.46 EXPR_DECL_REF 35751 822273 23.0 DECL_OMP_DECLARE_REDUCTION 29734 3431612 115.4 TYPE_TEMPLATE_SPECIALIZATION 25750 7472591 290.2 55.89 DECL_PARM_VAR 14986 651081 43.4 98.81 EXPR_IMPLICIT_CAST 14847 1620549 109.1 EXPR_CALL 13153 1968371 149.7 TYPE_FUNCTION_PROTO 12605 3797017 301.2 100.00 DECL_CONTEXT_LEXICAL 12515 715141 57.1 TYPE_TEMPLATE_TYPE_PARM 12480 2608644 209.0 DECL_TEMPLATE_TYPE_PARM 10571 1200811 113.6 EXPR_BINARY_OPERATOR 10300 955610 92.8 STMT_COMPOUND 10254 9421030 918.8 17.66 DECL_CXX_METHOD 10220 2252926 220.4 EXPR_CXX_DEPENDENT_SCOPE_MEMBER 10083 231909 23.0 STMT_NULL_PTR 9731 5196865 534.1 EXPR_CXX_UNRESOLVED_LOOKUP 8015 580911 72.5 87.16 EXPR_INTEGER_LITERAL 7935 3298497 415.7 EXPR_CXX_DEPENDENT_SCOPE_DECL_REF 7934 3379054 425.9 DECL_CXX_RECORD 7790 946360 121.5 EXPR_CXX_THIS 7508 350806 46.7 LOCAL_REDECLARATIONS 7155 1239819 173.3 EXPR_MEMBER 6754 1264508 187.2 EXPR_CXX_OPERATOR_CALL 6607 5819360 880.8 100.00 DECL_CONTEXT_VISIBLE 6461 736861 114.0 EXPR_UNARY_OPERATOR 6117 284391 46.5 TYPE_LVALUE_REFERENCE 6081 372753 61.3 STMT_RETURN 6066 1964810 323.9 99.64 DECL_TYPEDEF 5659 249455 44.1 TYPE_RECORD 5252 4884856 930.1 DECL_CLASS_TEMPLATE_SPECIALIZATION 5196 313722 60.4 TYPE_SUBST_TEMPLATE_TYPE_PARM 5189 532009 102.5 TYPE_DEPENDENT_NAME 5083 2145083 422.0 1.65 DECL_VAR 4886 296440 60.7 TYPE_TYPEDEF 4340 473950 109.2 STMT_DECL 4314 4078644 945.4 DECL_FUNCTION 4150 1436618 346.2 DECL_FUNCTION_TEMPLATE 3686 343246 93.1 TYPE_ELABORATED 3629 144649 39.9 TYPE_POINTER 3435 2907387 846.4 DECL_CXX_CONSTRUCTOR 3341 896701 268.4 DECL_CXX_BASE_SPECIFIERS 2847 376713 132.3 EXPR_PAREN 2684 271156 101.0 EXPR_CXX_BOOL_LITERAL 2651 208771 78.8 STMT_IF 2053 550511 268.1 EXPR_CXX_UNRESOLVED_CONSTRUCT 1938 268044 138.3 DECL_ACCESS_SPEC 1725 472401 273.9 DECL_NON_TYPE_TEMPLATE_PARM 1647 224673 136.4 EXPR_PAREN_LIST 1610 64624 40.1 TYPE_RVALUE_REFERENCE 1542 380997 247.1 48.57 DECL_FIELD 1446 392676 271.6 EXPR_CXX_UNRESOLVED_MEMBER 1411 283553 201.0 DECL_USING_SHADOW 1411 87833 62.2 TYPE_INJECTED_CLASS_NAME 1339 164195 122.6 EXPR_SUBST_NON_TYPE_TEMPLATE_PARM 1226 311278 253.9 DECL_CXX_CTOR_INITIALIZERS 1110 215604 194.2 EXPR_SIZEOF_PACK 1054 476456 452.0 DECL_CLASS_TEMPLATE 987 112161 113.6 EXPR_CXX_MEMBER_CALL 943 195005 206.8 EXPR_CXX_CONSTRUCT 941 271069 288.1 EXPR_CXX_STATIC_CAST 879 171231 194.8 EXPR_TYPE_TRAIT 771 40707 52.8 TYPE_PACK_EXPANSION 727 106103 145.9 DECL_IMPORT 696 146286 210.2 DECL_FRIEND 678 136788 201.8 EXPR_CSTYLE_CAST 664 70292 105.9 EXPR_ARRAY_SUBSCRIPT 628 67550 107.6 EXPR_PACK_EXPANSION 601 84473 140.6 EXPR_COMPOUND_ASSIGN_OPERATOR 564 71760 127.2 STMT_FOR 557 57643 103.5 EXPR_CXX_NULL_PTR_LITERAL 545 350959 644.0 DECL_CXX_DESTRUCTOR 523 867947 1659.6 DECL_CLASS_TEMPLATE_PARTIAL_SPECIALIZATION 495 120219 242.9 DECL_USING 476 87196 183.2 EXPR_SIZEOF_ALIGN_OF 447 292917 655.3 EXPR_STRING_LITERAL 434 32634 75.2 100.00 EXPR_CHARACTER_LITERAL 432 57714 133.6 EXPR_CONDITIONAL_OPERATOR 428 79870 186.6 DECL_ENUM_CONSTANT 418 38972 93.2 EXPR_MATERIALIZE_TEMPORARY 372 13302 35.8 TYPE_DECLTYPE 358 82004 229.1 EXPR_CXX_FUNCTIONAL_CAST 352 31172 88.6 EXPR_EXPR_WITH_CLEANUPS 342 49746 145.5 DECL_STATIC_ASSERT 334 168164 503.5 DECL_TYPEALIAS 315 76503 242.9 DECL_NAMESPACE 277 15299 55.2 STMT_BREAK 263 26857 102.1 STMT_CASE 234 75990 324.7 DECL_TYPE_ALIAS_TEMPLATE 225 13455 59.8 STMT_WHILE 217 21383 98.5 EXPR_CXX_BIND_TEMPORARY 217 34247 157.8 EXPR_INIT_LIST 191 68275 357.5 EXPR_CXX_TEMPORARY_OBJECT 172 22736 132.2 EXPR_CXX_NOEXCEPT 156 10176 65.2 TYPE_CONSTANT_ARRAY 153 7167 46.8 TYPE_PAREN 142 50666 356.8 EXPR_CXX_NEW 137 18637 136.0 EXPR_CXX_DEFAULT_ARG 116 6952 59.9 STMT_CXX_TRY 116 6952 59.9 STMT_CXX_CATCH 115 20981 182.4 EXPR_FLOATING_LITERAL 112 18062 161.3 DECL_LINKAGE_SPEC 108 6792 62.9 TYPE_MEMBER_POINTER
So, this patch would save about 8KB of memory when parsing all of libc++. That's not completely irrelevant as part of a patch series applying this more broadly, but it does suggest that your time might be better spent if you prioritize more common AST nodes for your improvements.
Yes indeed the result of saving a pointer in each ForStmt is under the noise, but it adds up in the aggregate.
Your data seems to mirror roughly the benchmark I am using (all of Boost). I went through all of the expression
and C++ expression classes and it seems that it would be possible to cut the space used by all
statements/expressions by ~11% just by moving some data to the bit-fields of Stmt.
Some cases which stands out in particular:
- Move the SourceLocation in DeclRefExpr to the bit-fields of Stmt. Would save 8 bytes.
- Move some of the data in BinaryOperator to the bit-fields of Stmt. Would save 8 bytes.
But there is a long tail of statements/expressions which can be packed tighter.
The reason I dealt with ForStmt first is only because I did similar modification to the
other basic statements. For example the size of IfStmtcan be cut by up to 3 pointers
+ 1 SourceLocation. (D53607).
For reference here are the other patches which are in review
IfStmt (D53607), CaseStmt (D53609), ReturnStmt (D53716),
WhileStmt (D53715), SwitchStmt (D53714) and PredefinedExpr (D53605).
include/clang/AST/Stmt.h | ||
---|---|---|
1871 | I would gladly get rid of these ugly reinterpret_cast, As you say these reinterpret_cast only work now because of the layout. A solution would be to move Expr and Stmt to something like StmtBase.h | |
1879 | same | |
test/Import/for-stmt/test.cpp | ||
8 | Yes indeed this is a good point. The same can be said of the other similar |
Add a flag in the output of -ast-dump indicating which sub-statement
the ForStmt is storing. This removes the ambiguity in the output
of -ast-dump.
include/clang/AST/Stmt.h | ||
---|---|---|
1871 | Ugh, I see, thanks. Yes, pulling Expr and Stmt out into a StmtBase.h or similar seems like a good approach to me. |
Let's close it for now. I agree that the complexity is probably not worth it. We can always
resurrect it in the future. I will submit a patch doing the move into a new StmtBase.h/.cpp
after I am done with packing the other expression classes.
Please static_cast rather than reinterpret_cast throughout. (If the Stmt base class weren't at offset 0 within Expr, we'd want to apply the adjustment.)