This is an archive of the discontinued LLVM Phabricator instance.

[ConstExprPreter] Updated constant interpreter documentation
ClosedPublic

Authored by nand on Mar 6 2020, 12:59 AM.

Details

Summary

Updated the documentation to better reflect features implemented on the
constexpr branch at https://github.com/nandor/llvm-project and extended
the TODO list with known missing features

Diff Detail

Unit TestsFailed

Event Timeline

nand created this revision.Mar 6 2020, 12:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 6 2020, 12:59 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This file needs linewrapping

nand updated this revision to Diff 249992.Mar 12 2020, 11:20 AM

Wrapped to ~80 chars/line

rsmith added inline comments.Apr 7 2020, 3:39 PM
clang/docs/ConstantInterpreter.rst
118

object -> objects

167

Descriptor -> Descriptors

169–170

This sounds like this is talking about the memory layout of descriptors, but I think the text below is actually talking about the memory layout of blocks?

174

reserved -> reserves

How do you distinguish between initialized and in-lifetime-but-uninitialized primitives? Eg:

constexpr int f() {
  int a;
  return a; // error, a is in-lifetime but uninitialized
}
174

How do you represent the result of casting a pointer to an integer (which we permit only when constant-folding, but nonetheless we do permit)?

182

when -> When

The overwhelmingly common case is the set of initialized elements is [0, 1, ..., K) for some K. Have you considered instead storing this value as a union of an InitMap* and an integer, using the bottom bit to indicate which case we're in? (We don't need to allocate the map at all except in weird cases where someone makes a hole in the array through a pseudo-destructor or allocates out-of-order with construct_at or similar.)

How do you distinguish between in-lifetime-but-uninitialized elements and out-of-lifetime elements? For example:

using T = int;
constexpr int f(bool b) {
  int arr[5];
  arr[3].~T();
  arr[0] = 1; // ok, uninitialized -> initialized
  if (!b)
    arr[3] = 1; // error, arr[3] is not in lifetime
  else
    std::construct_at(arr + 3, 0); // ok, not in lifetime -> in lifetime and initialized
  return arr[3];
}

Maybe we should use two bits per primitive in the InitMap case and store both "initialized" and "in-lifetime"?

194–195

field -> fields

From the description below, it looks like sizeof(InlineDescriptor) is currently 16. That seems unnecessary: We could easily get this down to 8 bytes by bit-packing the offset and the flags. (Restricting ourselves to 2^59 bytes for each record seems unproblematic.) I suspect it might even be OK to pack this into 4 bytes; that'd still allow us to support objects whose representations are up to 128 MiB -- though perhaps that's getting a little too close to territory that programs might actually want to enter.

201

It's not completely clear to me what this would mean for a base class. Is the idea that this tracks whether base classes are in their period of construction / destruction, so that you can determine the dynamic type of the object?

204

I assume this is actually more generally tracking whether the subobject is within its lifetime?

223–224

This seems problematic -- we need to distinguish between pointers with value zero and null pointers to support targets where they are different. Perhaps we could instead represent null pointers as a new tag in the tagged union, and reserve a zero TargetPointer for only those cases where a pointer is zero but not null? (See APValue::isNullPointer() and LValue::IsNullPtr for how we track that in the old constant evaluator.)

231–232

I would say "pointers and pointer-like types", since at least member pointers are not actually a kind of pointer.

328

behavious -> behaviour

329–334

Why do we need all three of these? If this is replacing the APValue forms with a null base and only an offset, a single value seems sufficient.

331

ajusts -> adjusts

nand updated this revision to Diff 256346.Apr 9 2020, 11:42 AM
nand marked 15 inline comments as done.

addressed comments, fixed typos

nand added a comment.Apr 9 2020, 11:43 AM

Thanks for the comments! I tried to clarify what could be done in the future and what is already supported.

clang/docs/ConstantInterpreter.rst
174

Lifetime is handled by invalidating pointers, whereas descriptors and inline descriptors keep track of initialised bits.

In this particular case, which is not yet supported since the interpreter presently expects all variables to be initialised, the byte code generator will have to emit an opcode that fetches the local and checks the initialised bit. Since currently all locals are assumed to be initialised, the opcode for fetching a local does not perform this check. For primitives, this bit will probably be part of the block.

174

This is still on the TODO list.

First of all, there should be a fast path which detects common pointer -> int -> pointer conversions and avoids all cast, creating opcodes to emit diagnostics.

For the generic case, codegen for pointer-to-int will fail with a note and compilation will be re-attempted, classifying pointer-wide integers and pointers as a primitive type capable of holding either a pointer or an integer.

182

This is a great suggestion - I think the initmap could also be a rolling counter.

I was not aware of the case that you have mentioned - destructors do pose a problem and another map will be required.

194–195

Descriptors could be compacted - for now, I have been focusing on feature completeness and tried to avoid performance optimisations.

201

Yes, updated the docs.

204

It's only for active union fields - maybe it can be merged with the initialised bit later on.

223–224

Hmm, I was not aware of this - another item could be added to the union without too much problem. Do you have an example of such a target?

329–334

Required for compatibility with the atIndex() method, which is used by the opcode computing array offsets.

rsmith accepted this revision.Apr 14 2020, 7:48 PM
rsmith marked an inline comment as done.

Thanks! I'm happy for you to go ahead landing patches that implement the direction documented here. (Let me know if you want me to review them anyway, otherwise I'm happy to assume you've had someone else look over them already.)

clang/docs/ConstantInterpreter.rst
167–170

distinguished -> distinguishes

223–224

It looks like the only case where this happens right now is for the OpenCL local address space on AMDGPU. Example:

echo 'unsigned long f() { constexpr local int *p = 0; constexpr unsigned long n = __builtin_constant_p(1) ? (unsigned long)p : 0; return n; }' | ./build/bin/clang -x cl -cl-std=clc++ -target amdgcn - -emit-llvm -S -o -

reveals the null pointer value there is -1.

328

(Typo still present.)

329–334

OK. Tracking all three seems like it shouldn't be necessary for correctness, but if it makes the implementation easier, then that seems fine.

331

(Typo still present.)

This revision is now accepted and ready to land.Apr 14 2020, 7:48 PM
nand updated this revision to Diff 257637.Apr 15 2020, 2:21 AM
nand marked 2 inline comments as done.

fixed typos

This revision was automatically updated to reflect the committed changes.