[clang] Improve Serialization/Importing of APValues
ClosedPublic

Authored by Tyker on Jun 21 2019, 3:07 AM.

Details

Summary

Changes:

  • Initializer expressions of constexpr variables are now wrapped in a ConstantExpr. This is mainly used for testing purposes. The old caching system has not yet been removed.
  • Add all the missing Serialization and Importing for APValue.
  • Clean up leftovers from the last patch.
  • Add Tests for Import and serialization.

Event Timeline

Tyker added inline comments.Sep 23 2019, 12:29 PM
clang/include/clang/AST/APValue.h
653

We're horribly inconsistent in this class

This class has many flaws, but it is far too broadly used to fix.

clang/include/clang/AST/ASTContext.h
267

The change to use addDestruction() was made in a previous revision (https://reviews.llvm.org/D63376).
It is currently used on master in ConstantExpr::MoveIntoResult, in the RSK_APValue case of the switch.
This just removes an unused member.

clang/include/clang/AST/TextNodeDumper.h
149 ↗(On Diff #217600)

No worries, I wrote the original bug. I added APValue::dumpPretty, which has almost the same output as APValue::printPretty but doesn't need an ASTContext, and it is used for TextNodeDumper.

clang/lib/AST/APValue.cpp
790

This helper is not intended to be used outside of importing and serialization; it is logically part of initialization.
Normal users are intended to use ArrayRef<APValue::LValuePathEntry> APValue::getLValuePath() const.

clang/test/ASTMerge/APValue/APValue.cpp
1 ↗(On Diff #221169)

It wasn't intentional. I added it via git add, and I don't think I did anything weird. Is it a problem?

2–3 ↗(On Diff #221169)

I don't know if it is normal, but I am getting an error when I am not using -x c++:
error: invalid argument '-std=gnu++2a' not allowed with 'C'

28 ↗(On Diff #221169)

I don't intend to add them in this patch or subsequent patches. I don't know how to use the features that have these representations, and I don't even know if they can be stored in the AST, so this is untested code.
That said, these representations aren't complex. The importing for FixedPoint, ComplexInt, and ComplexFloat is a no-op, and for AddrLabelDiff it is trivial. For serialization, I can put an llvm_unreachable to mark them as untested if you want?

aaron.ballman marked an inline comment as done.Sep 23 2019, 1:02 PM
aaron.ballman added inline comments.
clang/include/clang/AST/APValue.h
653

Agreed -- I wasn't suggesting to fix the whole class, but just the new APIs that we add to the class. It looks like the private functions most consistently use a capital letter in this class, unfortunately. Best to stick with the local convention when in conflict.

clang/include/clang/AST/ASTContext.h
267

Ahhh, thank you for the explanation, I was missing that context.

clang/include/clang/AST/PrettyPrinter.h
212

Wether -> Whether

clang/lib/AST/APValue.cpp
552

Since you're touching the code anyway, this can be const auto *.

623

Are you sure this doesn't change behavior? See the implementation of ASTContext::getAsArrayType(). Same question applies below.

638

const auto * and same question about behavior changes.

790

Nothing about this API suggests that. The name looks like a generic getter. Perhaps a more descriptive name and some comments would help?

clang/test/ASTMerge/APValue/APValue.cpp
1 ↗(On Diff #221169)

No idea; I'm on a platform where file modes are ignored. You should probably drop the svn property.

2–3 ↗(On Diff #221169)

There's no %s on that line for the source file, which is why you get that diagnostic. I'm not certain what that RUN line does, actually -- it does an AST merge with... nothing... and then prints it out?

If that's intended, then you only need the -x c++ on that one RUN line.

28 ↗(On Diff #221169)

I don't think llvm_unreachable makes a whole lot of sense unless the code is truly unreachable because there's no way to get an AST with that representation. By code inspection, the code looks reasonable but it does make me a bit uncomfortable to adopt it without tests. I suppose the FIXME is a reasonable compromise in this case, but if you have some spare cycles to investigate ways to get those representations, it would be appreciated.

Tyker updated this revision to Diff 221568.Sep 24 2019, 11:57 AM
Tyker marked 7 inline comments as done.

fixed most comments

clang/lib/AST/APValue.cpp
623

I ran the test suite after the change and there weren't any test failures, but the tests on dumping APValue are probably not as thorough as we would like them to be.
From analysis of ASTContext::getAsArrayType(), the only effects I see on the element type are de-sugaring and canonicalization, which shouldn't affect correctness of the output. De-sugaring requires the ASTContext but canonicalization doesn't.

I think the best way to have higher confidence is to ask rsmith what he thinks.

clang/test/ASTMerge/APValue/APValue.cpp
28 ↗(On Diff #221169)

The reason I proposed llvm_unreachable was that it passes the tests and prevents future developers from depending on this code on the assumption that it works.

aaron.ballman added inline comments.Sep 25 2019, 8:52 AM
clang/include/clang/AST/APValue.h
657

ReserveVector

664

SetLValueEmptyPath

clang/lib/AST/APValue.cpp
623

Yeah, I doubt we have good test coverage for all the various behaviors here. I was wondering if the qualifiers bit was handled properly with a simple cast. @rsmith is a good person to weigh in.

clang/test/ASTMerge/APValue/APValue.cpp
28 ↗(On Diff #221169)

We typically only use llvm_unreachable for situations where we believe the code path is impossible to reach, which is why I think that's the wrong approach. We could use an assertion to test the theory, however.
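
The difference between the two approaches can be illustrated outside of LLVM. The following is a minimal sketch in standard C++, not code from the patch: `ValueKind` and `serializeKind` are hypothetical names, and a plain `assert` stands in for the "test the theory" idea. Unlike an unreachable marker, the asserted path still contains working code when assertions are compiled out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a few APValue kinds; not the real enum.
enum class ValueKind { Int, Float, FixedPoint };

// Sketch: tested kinds are handled normally. The untested kind keeps its
// implementation, and the assertion records the theory that no current
// test reaches it. With assertions disabled the code below it still runs.
inline std::string serializeKind(ValueKind K) {
  switch (K) {
  case ValueKind::Int:
    return "int";
  case ValueKind::Float:
    return "float";
  case ValueKind::FixedPoint:
    // FIXME: untested path.
    assert(false && "FixedPoint serialization is untested");
    return "fixed-point";
  }
  return "unknown";
}
```

If a later test does reach the `FixedPoint` case, the assertion fires in a debug build and signals that the theory was wrong and a real test can now be written.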

Tyker updated this revision to Diff 222169.Sep 27 2019, 7:18 AM
Tyker marked 3 inline comments as done.

made renamings

rsmith added inline comments.Sep 27 2019, 11:53 AM
clang/lib/AST/Expr.cpp
328

Can you use llvm_unreachable here? (Are there cases where we use RSK_None and then later find we actually have a value to store into the ConstantExpr?)

clang/lib/Serialization/ASTReader.cpp
8912

This is problematic.

ReadExpr will read a new copy of the expression, creating a distinct object. But in the case where we reach this when deserializing (for a MaterializeTemporaryExpr), we need to refer to the existing MaterializeTemporaryExpr in the initializer of its lifetime-extending declaration. We will also need to serialize the ASTContext's MaterializedTemporaryValues collection so that the temporaries lifetime-extended in a constant initializer get properly handled.

That all sounds very messy, so I think we should reconsider the model that we use for lifetime-extended materialized temporaries. As a half-baked idea:

  • When we lifetime-extend a temporary, create a MaterializedTemporaryDecl to hold its value, and modify MaterializeTemporaryExpr to refer to the MaterializedTemporaryDecl rather than to just hold the subexpression for the temporary.
  • Change the LValueBase representation to denote the declaration rather than the expression.
  • Store the constant evaluated value for a materialized temporary on the MaterializedTemporaryDecl rather than on a side-table in the ASTContext.

With that done, we should verify that all remaining Expr*s used as LValueBases are either only transiently used during evaluation or don't have these kinds of identity problems.
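
As a rough illustration of the half-baked idea above, the sketch below stores the evaluated value on a declaration that the expression refers to, instead of in a side table keyed by the expression. All names (`ToyTemporaryDecl`, `ToyMaterializeExpr`, and the two helpers) are hypothetical toy types, not Clang's classes, and a `long` stands in for an APValue.

```cpp
// Hypothetical toy model of the proposed design; none of these are real
// Clang types.
struct ToyTemporaryDecl {
  bool HasValue = false;
  long Value = 0; // stands in for the constant-evaluated value
};

struct ToyMaterializeExpr {
  ToyTemporaryDecl *Temporary; // refers to the decl instead of owning state
};

// Stores the evaluated value on the declaration.
inline void setEvaluatedValue(ToyMaterializeExpr &E, long V) {
  E.Temporary->Value = V;
  E.Temporary->HasValue = true;
}

// Reads the value back through any expression that refers to the same decl.
inline bool getEvaluatedValue(const ToyMaterializeExpr &E, long &Out) {
  if (!E.Temporary->HasValue)
    return false;
  Out = E.Temporary->Value;
  return true;
}
```

In this toy model, two expression objects that refer to the same declaration observe the same value, so the identity of the expression object no longer matters.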

Tyker marked 10 inline comments as done.Oct 6 2019, 2:34 AM

update done tasks.

clang/lib/AST/APValue.cpp
623

the original question we had is whether it is correct to replace Ctx.ASTContext::getAsArrayType(ElemTy) by cast<ArrayType>(ElemTy.getCanonicalType()) in this context and the other comment below.

clang/lib/AST/Expr.cpp
328

We can put llvm_unreachable in the switch because of the if (!Value.hasValue()) above the switch, but we can't remove the if (!Value.hasValue()).
All cases I have seen where if (!Value.hasValue()) is taken occur after a semantic error occurred.

clang/lib/Serialization/ASTReader.cpp
8912

Would it be possible to adapt serialization/deserialization so that they make sure that each MaterializeTemporaryExpr is unique?
By:

  • When serializing a MaterializeTemporaryExpr, serialize a key obtained from the pointer to the expression, as it is unique.
  • When deserializing a MaterializeTemporaryExpr, deserialize the key, and then have a cache for previously deserialized expressions that need to be unique.

This would make it easier to add new Exprs that require uniqueness and seems less complicated.
What do you think?
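
The two steps proposed above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the ASTWriter/ASTReader code: `Node`, `KeyWriter`, and `KeyReader` are hypothetical names, and the key is a small integer assigned per distinct pointer rather than the pointer value itself.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for an expression node whose identity matters.
struct Node {
  int Payload;
};

// Writer side: hands out one stable key per distinct pointer.
class KeyWriter {
  std::unordered_map<const Node *, std::uint32_t> Keys;

public:
  std::uint32_t keyFor(const Node *N) {
    auto It = Keys.find(N);
    if (It != Keys.end())
      return It->second;
    std::uint32_t Key = static_cast<std::uint32_t>(Keys.size()) + 1;
    Keys.emplace(N, Key);
    return Key;
  }
};

// Reader side: caches nodes by key, so reading the same key twice yields
// the same object instead of a fresh copy.
class KeyReader {
  std::unordered_map<std::uint32_t, std::unique_ptr<Node>> Cache;

public:
  Node *readNode(std::uint32_t Key, int Payload) {
    std::unique_ptr<Node> &Slot = Cache[Key];
    if (!Slot)
      Slot = std::make_unique<Node>(Node{Payload});
    return Slot.get();
  }
};
```

The cache is what preserves identity: every reference to the same key resolves to one object on the reading side.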

Tyker marked an inline comment as done.Oct 23 2019, 4:04 PM
Tyker added inline comments.
clang/lib/Serialization/ASTReader.cpp
8912

i added a review that does the refactoring https://reviews.llvm.org/D69360.

Tyker added a comment.Dec 10 2019, 1:03 PM

Now that the issue with uniqueness of expressions is solved, we should be able to keep going on this review @rsmith.
https://reviews.llvm.org/D63960 should, I think, be close to completion. So maybe for testing I could use immediate invocations as a source of ConstantExpr instead of the code I added to make constexpr variables emit ConstantExpr?

Tyker updated this revision to Diff 236707.Jan 7 2020, 3:36 PM

rebased

Are we at a point where we can test this now? Perhaps by adding an assert in codegen that we always have an evaluated value for any constexpr variable that we emit?

clang/lib/Sema/SemaDecl.cpp
11883–11885

This may create additional AST nodes outside the ConstantExpr, which VarDecl::evaluateValue is not expecting to see (in particular, if we have cleanups for the initializer). Should the ConstantExpr go outside those nodes rather than inside?

clang/lib/Serialization/ASTReader.cpp
8912

What are the cases for which we still encounter expressions as lvalue bases during serialization? I think all the other ones should be OK, but maybe there's another interesting one we've overlooked.

Tyker updated this revision to Diff 282682.EditedAug 3 2020, 10:50 AM
Tyker edited the summary of this revision.

Sorry for the delay

Are we at a point where we can test this now?

Yes, we can use consteval to test it, so I removed the changes to constexpr AST modeling.

I split this patch into 2: this patch deals with serialization and importing, the other (D85144) improves the dumping of APValues to properly test this patch.

Tyker retitled this revision from [clang] Improve Serialization/Importing/Dumping of APValues to [clang] Improve Serialization/Importing of APValues.Aug 3 2020, 10:55 AM
Tyker edited the summary of this revision.
Tyker added inline comments.Aug 3 2020, 10:57 AM
clang/lib/Sema/SemaDecl.cpp
11883–11885

i removed the changes to the storage of constexpr values since it was only used for testing purposes and we can now use consteval for that purpose.

clang/lib/Serialization/ASTReader.cpp
8912

The 2 examples that come to mind are string literals and type_info; there are probably more.

Tyker updated this revision to Diff 283985.Aug 7 2020, 12:19 PM

rebased

rsmith added inline comments.Sep 28 2020, 12:13 AM
clang/include/clang/AST/APValue.h
654–656

Maybe something like this?

657

This function changes the size of the vector, so Reserve doesn't seem right to me. How about setVectorUninit? (Generally including Uninit in the names of the setters that leave things uninitialized would be useful.)

Also, instead of exposing GetInternal functions, how about adding functions that return a MutableArrayRef holding the array:

MutableArrayRef<APValue> setVectorUninit(unsigned N);
MutableArrayRef<LValuePathEntry> setLValueUninit(LValueBase B, const CharUnits &O, unsigned Size, bool OnePastTheEnd, bool IsNullPtr);
MutableArrayRef<const CXXRecordDecl*> MakeMemberPointerUninit(const ValueDecl *Member, bool IsDerivedMember, unsigned Size);

(It would be nice if MakeMemberPointer weren't so inconsistent with everything else this class does. But this whole interface is a mess.)
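
The shape of the suggested interface, a setter that sizes the storage and hands back a mutable view for the caller to fill, can be sketched with a toy class. This is only an illustration: `MutableView` is a simplified stand-in for llvm::MutableArrayRef, `ToyValue` is hypothetical, and the toy zero-initializes its elements whereas the real setter would leave them uninitialized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for llvm::MutableArrayRef: a pointer plus a length.
template <typename T> struct MutableView {
  T *Data;
  std::size_t Size;
  T &operator[](std::size_t I) {
    assert(I < Size && "index out of range");
    return Data[I];
  }
  std::size_t size() const { return Size; }
};

// Hypothetical toy value class; not the real APValue.
class ToyValue {
  std::vector<int> Elts;

public:
  // Sizes the storage and returns a view that the caller fills in.
  // The toy zero-initializes; the real API would not, hence "Uninit".
  MutableView<int> setVectorUninit(unsigned N) {
    Elts.assign(N, 0);
    return {Elts.data(), Elts.size()};
  }
  unsigned getVectorLength() const {
    return static_cast<unsigned>(Elts.size());
  }
  int getVectorElt(unsigned I) const { return Elts.at(I); }
};
```

A deserializer would call `setVectorUninit(N)` once and then write each element through the returned view, so no accessor to the internal storage needs to be exposed.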

clang/lib/AST/APValue.cpp
859–860

Similarly maybe mentioning Uninit here would be useful.

clang/lib/AST/ASTImporter.cpp
6623–6626

Delete this commented-out code, please.

8996

We want the path in an APValue to be canonical, but importing a canonical decl might result in a non-canonical decl.

9016–9017

If you're intentionally not handling DynamicAllocLValue here (because those should always be transient), a comment to that effect would be useful.

9057
clang/lib/Serialization/ASTReader.cpp
8837–8953

You can assume here (and assert in the writer) that both floats in a ComplexFloat have the same fltSemantics.
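
One way to read this suggestion: the writer asserts that both parts share their semantics and emits the semantics once, and the reader applies that single value to both parts. The sketch below uses hypothetical toy types (`ToySemantics`, `ToyFloat`, `ToyComplex`) and a plain vector as the record; it is not the ASTWriter/ASTReader code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for fltSemantics and APFloat.
enum class ToySemantics : std::uint64_t { Single = 0, Double = 1 };

struct ToyFloat {
  ToySemantics Sem;
  std::uint64_t Bits;
};

struct ToyComplex {
  ToyFloat Real;
  ToyFloat Imag;
};

// Writer: checks the shared-semantics assumption, then emits it once.
inline void writeComplex(const ToyComplex &C,
                         std::vector<std::uint64_t> &Record) {
  assert(C.Real.Sem == C.Imag.Sem && "complex parts must share semantics");
  Record.push_back(static_cast<std::uint64_t>(C.Real.Sem));
  Record.push_back(C.Real.Bits);
  Record.push_back(C.Imag.Bits);
}

// Reader: reads the semantics once and applies it to both parts.
inline ToyComplex readComplex(const std::vector<std::uint64_t> &Record,
                              std::size_t &Idx) {
  ToySemantics Sem = static_cast<ToySemantics>(Record[Idx++]);
  std::uint64_t RealBits = Record[Idx++];
  std::uint64_t ImagBits = Record[Idx++];
  return ToyComplex{{Sem, RealBits}, {Sem, ImagBits}};
}
```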

8891
Tyker updated this revision to Diff 295626.Oct 1 2020, 10:55 AM
Tyker marked 15 inline comments as done.

addressed comments

Tyker added inline comments.Oct 1 2020, 10:56 AM
clang/lib/AST/ASTImporter.cpp
8996

but importing a canonical decl might result in a non-canonical decl.

This is quite surprising behavior.

9016–9017

I added an assert with a message.

rsmith accepted this revision.Oct 1 2020, 9:48 PM

Looks great, thanks!

This revision is now accepted and ready to land.Oct 1 2020, 9:48 PM
martong requested changes to this revision.Oct 2 2020, 6:44 AM

Sorry for the late review, I just noticed something which is not a logical error, but we could make the ASTImporter code much cleaner.

clang/lib/AST/ASTImporter.cpp
9036–9047

A series of importChecked calls would result in much cleaner code by omitting the second check. Also, instead of breaking out from the switch we can immediately return with the error.

This applies for the other cases where there are more import calls after each other (e.g. AddrLabelDiff, Union, ...).

Giving it more thought, you could use importChecked instead of Import everywhere in this function (for consistency).
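
The pattern behind this suggestion is that each call becomes a no-op once an error has been recorded, so a run of imports needs only one check at the end. The sketch below shows that pattern with toy types; `ToyError`, `importOne`, `ToyPair`, and `importPair` are hypothetical, and this `importChecked` is a simplified free function, not the ASTNodeImporter member, which works with llvm::Error.

```cpp
#include <string>

// Toy error slot; the real code uses llvm::Error.
struct ToyError {
  bool Failed = false;
  std::string Message;
};

// Hypothetical fallible import: rejects negative inputs.
inline bool importOne(int From, int &To) {
  if (From < 0)
    return false;
  To = From + 100;
  return true;
}

// Once Err is set, later calls do nothing and return a default value.
inline int importChecked(ToyError &Err, int From) {
  if (Err.Failed)
    return 0;
  int To = 0;
  if (!importOne(From, To)) {
    Err.Failed = true;
    Err.Message = "import failed for " + std::to_string(From);
    return 0;
  }
  return To;
}

struct ToyPair {
  int First = 0;
  int Second = 0;
};

// Two imports in a row, one error check, immediate return on failure.
inline bool importPair(int A, int B, ToyPair &Out, ToyError &Err) {
  Out.First = importChecked(Err, A);
  Out.Second = importChecked(Err, B);
  return !Err.Failed;
}
```

Because the first failure is kept and later calls are skipped, the caller sees the error from the earliest failing import.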

This revision now requires changes to proceed.Oct 2 2020, 6:44 AM
martong added inline comments.Oct 2 2020, 7:01 AM
clang/lib/AST/ASTImporter.cpp
8996

but importing a canonical decl might result in a non-canonical decl.

This is quite surprising behavior.

If you already have an existing redecl chain in the destination ASTContext (let's say A->B->C, where A is the canonical decl), then importing a canonical decl (let's say D) would result in an A->B->C->D' chain, where D' is the imported version of D.
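
The chain described here can be modelled with a toy linked list. In this sketch `ToyDecl` and `ToyContext` are hypothetical and the "import" is reduced to appending a redeclaration to the destination chain; it only illustrates why a declaration that is canonical in the source can end up non-canonical in the destination.

```cpp
#include <memory>
#include <vector>

// Hypothetical toy declaration; Previous links to the prior redeclaration.
struct ToyDecl {
  ToyDecl *Previous = nullptr;

  // The canonical declaration is the first one in the chain.
  ToyDecl *getCanonical() {
    ToyDecl *D = this;
    while (D->Previous)
      D = D->Previous;
    return D;
  }
  bool isCanonical() { return getCanonical() == this; }
};

// Toy destination context that owns a redeclaration chain.
struct ToyContext {
  std::vector<std::unique_ptr<ToyDecl>> Decls;
  ToyDecl *Latest = nullptr;

  // Appends a new redeclaration to the end of the existing chain, which is
  // where an imported declaration of an already-known entity would land.
  ToyDecl *appendRedecl() {
    Decls.push_back(std::make_unique<ToyDecl>());
    ToyDecl *D = Decls.back().get();
    D->Previous = Latest;
    Latest = D;
    return D;
  }
};
```

In this model, code that needs a canonical declaration after importing would call `getCanonical()` on the imported result rather than assume the result is already canonical.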

Reverse ping: I have a patch implementing class type non-type template parameters that's blocked on this landing. If you won't have time to address @martong's comments soon, do you mind if I take this over and land it?

It is okay for me to commit this patch in its current state. The changes I suggested could result in cleaner code, but I can do those changes after we land this.

Tyker updated this revision to Diff 298318.Oct 15 2020, 12:27 AM

Tried to apply martong's suggestions.

Tyker added a comment.EditedOct 15 2020, 12:28 AM

Reverse ping: I have a patch implementing class type non-type template parameters that's blocked on this landing. If you won't have time to address @martong's comments soon, do you mind if I take this over and land it?

nice to see this coming.

It is okay for me to commit this patch in its current state. The changes I suggested could result in a cleaner code, but I can do those changes after we land this.

I couldn't apply martong's suggestion completely because importChecked is part of ASTNodeImporter, not ASTImporter, but I cleaned up some code.
But the "real" blocker is that the testing depends on D85144.
We could land it with the tests marked XFAIL and fix them when the dumping improvements arrive.

i couldn't apply martong's suggestion completely because importChecked is part of ASTNodeImporter not ASTImporter, but cleaned up some code.

This indicates that the newly added Import(APValue *) function should be part of the ASTNodeImporter. I came up with a draft patch on top of your patch:
https://github.com/llvm/llvm-project/commit/ac738cf854bdafa83a23c400bd5b2a90520566f9
You can see, this way we can eliminate some redundant ifs and casts.

but the "real" blocker is that the testing depends on D85144 for testing.
we could land it marking the tests XFAIL and correct it when dumping improvements arrive.

I'd be OK with that. We'll have coverage for this in various forms landing pretty soon.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 21 2020, 10:04 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

I applied martong's patch and landed it, marking the test as XFAIL.