This is a first cut.
- The constexpr stuff is not tested; I don't think it's going to work. Still investigating.
- Only GCC and clang are supported at the moment. I need to add other compilers.
Differential D54966
Implement P1007R3 `std::assume_aligned`
Authored by ldionne on Nov 27 2018, 12:25 PM.
Chandler has said to me (via IRC) that when called at constexpr time, this should just be a `return __p`. I'll add that after we have std::is_constant_evaluated.
Updated the tests; added constexpr tests (which are currently disabled).
Can you update the cxx2a_status page to say that we're missing constexpr support for the feature?
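A minimal sketch of the shape described in the comment above, not the patch itself. The names `assume_aligned_sketch`, `table`, `first` and `sum_table` are hypothetical, and `__builtin_is_constant_evaluated()` stands in for std::is_constant_evaluated so that the example also builds before C++20. This assumes a reasonably recent GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::assume_aligned; not the code under review.
template <std::size_t N, class T>
constexpr T* assume_aligned_sketch(T* p) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  // Assumption: GCC/Clang builtin with the same meaning as
  // std::is_constant_evaluated().
  if (__builtin_is_constant_evaluated()) {
    return p;  // at constexpr time, just hand the pointer back
  }
  // At run time, attach the alignment assumption for the optimizer.
  return static_cast<T*>(__builtin_assume_aligned(p, N));
}

alignas(16) constexpr int table[4] = {1, 2, 3, 4};

// Constant evaluation takes the `return p` branch.
constexpr const int* first = assume_aligned_sketch<16>(table);
static_assert(*first == 1, "usable in a constant expression");

// A run-time call takes the builtin branch.
inline int sum_table() {
  const int* p = assume_aligned_sketch<16>(table);
  return p[0] + p[1] + p[2] + p[3];
}
```

The point of the split is that the builtin is never reached during constant evaluation, so its constant-expression behavior stops mattering for constexpr callers.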
We don't need to use __builtin_assume_aligned to implement this. We can use __attribute__((assume_aligned(N))) instead. That solves our constexpr problem.
More than that, we can't use __builtin_assume_aligned, as currently implemented in Clang and GCC, to implement this -- it is non-constant if it cannot prove that the pointer is properly aligned. The library specification for std::assume_aligned doesn't permit that behavior, and instead appears to require that a call is a constant subexpression whenever p is properly aligned (even if that can't be determined until runtime), meaning that it must also be a constant subexpression in cases where you cannot prove alignment one way or the other. Example:

    #include <cassert>
    #include <cstdint>
    #include <memory>

    extern char p; // might be defined with `alignas(16)`
    extern char *q;
    char *r = q; // required to be initialized to `&p` if `p` is properly aligned, may be `&p` or `nullptr` otherwise
    char *q = std::assume_aligned<16>(&p); // required to be statically initialized if `p` is properly aligned
    int main() { assert((uintptr_t)&p & 15 || r == &p); }

If you use __builtin_assume_aligned to implement std::assume_aligned, the above assertion can fail, because &p can't be proven to be properly aligned. But if you use the attribute, then no checking will be done during constant evaluation, and the assertion will always pass.
(Try it out yourself: https://godbolt.org/z/YKIdWF -- try changing the alignment from 1 to 2 and back, and observe that q switches between static and dynamic initialization, and r switches from being initialized to &p to being initialized to nullptr, even though the compiler does not know that p will not be 2-byte aligned.)
It's OK that assume_aligned is a constant subexpression even when such a call would be UB; the fact that we might have undefined behavior within a constant expression evaluation of the library function std::assume_aligned is explicitly permitted by [expr.const]p4 (in the text after the long list of bullets).
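The attribute-based approach suggested above can be sketched as follows. This is an illustration under assumptions, not the implementation from the patch: `assume_aligned_16` and `buffer` are hypothetical names, the alignment is fixed at 16 rather than taken from a template parameter, and the GNU attribute syntax limits it to GCC and Clang.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-alignment helper using GNU attribute syntax.
// The attribute promises callers that the returned pointer is
// 16-byte aligned.
__attribute__((assume_aligned(16)))
constexpr char* assume_aligned_16(char* p) {
  return p;  // plain return: no builtin call in the function body
}

alignas(16) char buffer[32];

// The call is an ordinary constexpr function call, so `q` can be
// statically initialized.
constexpr char* q = assume_aligned_16(buffer);
static_assert(q == buffer, "folds to the address of buffer");
```

Because the function body is only `return p`, the promise lives entirely in the attribute, which is why this form sidesteps the constant-evaluation check that the builtin performs.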
@chandlerc, @hfinkel: does an attribute-only implementation (with no constant evaluation enforcement) materially hurt the ability for the optimizer to use this annotation? E.g., in:

    extern char k[16];
    void f() {
      // the initializer of p is a constant expression and might get
      // constant-folded to &k by the frontend
      char *p = std::assume_aligned<16>(&k);
      // ...do stuff...
    }

the alignment assumption may well never be emitted as IR. Is that acceptable?
__attribute__((assume_aligned(N))) and __builtin_assume_aligned are, at the IR level, implemented in a very similar way. For functions with that attribute on the return type, we essentially emit an alignment assumption on the return value at every call site (in CodeGenFunction::EmitCall). Thus, from the optimizer's perspective, I don't think that it makes a big difference.
The point here is that you may well get no IR annotation whatsoever for the above call, because the frontend might constant-evaluate the initializer of p down to just &k, and then emit IR that just initializes p to &k with no alignment assumption. Whereas if we treated assume_aligned<N>(p) as non-constant in the cases where we cannot prove that p is suitably aligned (as __builtin_assume_aligned does), then we would emit IR for the alignment assumption, but the downside is that the initializer of p would no longer be a constant expression. Essentially, what I'm trying to gauge here is: is it OK that you probably don't actually get an alignment assumption in a function like the f() above, because it will probably be constant-evaluated away to nothing? Or do we need the constant evaluator to have some kind of side channel by which it can communicate back to the code generator that an alignment assumption should be applied to k?
Or, indeed, should assume_aligned<N>(p) not be treated as a constant expression unless we can prove during constant expression evaluation that p is in fact suitably aligned -- as GCC and Clang currently do for __builtin_assume_aligned?
I thought about this side channel option for k, but I don't think that we can do it, because we'd need to prove that the f() function is always executed, and that's likely not generally possible. I think that the side channel would apply only to p.
From a C++ perspective, this seems suboptimal. I don't want people to duplicate code, some with assume_aligned, some without, if I want the same code to work both in a constexpr context and outside of one. A side channel would be better. It is a trade-off, however, and I'd need to think more about it.
After thinking more about it, I think users of p in the function f really should be able to assume the alignment, and whether they succeed at that should not be determined by whether the initializer of p happens to fail to be a constant expression for some reason. Let me lay out my reasoning; it comes from considering a few examples.
1a) The fact that the address happens to fold to a constant and thus the initializer is a constant as well is not going to be enough to reliably optimize the uses. Just because we know the address of some very hot vector data is a global and thus a constant "relocation" that we can fold does *not* mean that we will be able to reconstruct the alignment guarantees if we need to do so. This means that we would be missing real optimization opportunities here, and indeed, the exact opportunities that std::assume_aligned was intended to open up.
1b) Indeed, I could imagine a collection of routines which all use the same constant-initialized address but which make different alignment assumptions. And I could imagine code dispatching (potentially dynamically) to the correctly aligned routine. That seems like reasonable code to expect to be able to write given this facility, and yet it would be directly undermined by what you describe.
When considering these kinds of situations, it really seems like this needs to be propagated. And not just locally, but as far as the constant evaluation proceeds, walking past as many constant-evaluated wrappers as needed. =/ I don't really think of this as needing a side channel so much as addresses in the constant evaluation needing to track alignment in some way such that these get propagated as one would expect.
Yeah, I really don't think we want to kill constant evaluation just to preserve alignment assumptions. Instead, we want to model those assumptions in the evaluator IMO (even if we don't allow them to fold away in core constant expressions or parts of the ABI).
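The dispatch scenario from point 1b) above could look roughly like the sketch below. Everything here is hypothetical (`samples`, `sum_assuming`, `sum_dispatch`), and it uses `__builtin_assume_aligned` directly so it builds without C++20 library support; this assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One global whose address is a constant.
alignas(64) float samples[256];

// Hypothetical routines: same loop, different alignment promise
// made to the optimizer.
template <std::size_t Align>
float sum_assuming(const float* p, std::size_t n) {
  const float* a =
      static_cast<const float*>(__builtin_assume_aligned(p, Align));
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i) total += a[i];
  return total;
}

// Dynamic dispatch on the alignment the pointer actually has.
float sum_dispatch(const float* p, std::size_t n) {
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  if (addr % 64 == 0) return sum_assuming<64>(p, n);
  if (addr % 16 == 0) return sum_assuming<16>(p, n);
  return sum_assuming<alignof(float)>(p, n);
}
```

Each instantiation is only useful if its alignment promise survives into code generation, which is the concern raised above about the assumption being constant-folded away.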
_LIBCPP_INLINE_VISIBILITY?