Page MenuHomePhabricator

Add overloaded versions of builtin mem* functions
Needs RevisionPublic

Authored by jfb on May 1 2020, 6:01 PM.

Details

Summary

The mem* builtins are often used (or should be used) in places where time-of-check time-of-use security issues are important (e.g. copying from untrusted buffers), because it prevents multiple reads / multiple writes from occurring at the untrusted memory location. The current builtins don't accept volatile pointee parameters in C++, and merely warn about such parameters in C, which leads to confusion. In these settings, it's useful to overload the builtin and permit volatile pointee parameters. The code generation then directly emits the existing volatile variant of the mem* builtin function call, which ensures that the affected memory location is only accessed once (thereby preventing double-reads under an adversarial memory mapping).

Instead of just handling volatile, this patch handles other overloads when they make sense, specifically around pointer qualification. It also cleans up some of the custom diagnostic handling to reduce duplication.

The patch also happens to be a decent base to implement C's memset_s from Annex K, as well as an alternate (and arguably better approach) that the C++ Committee's proposal for byte-wise atomic memcpy as described in https://wg21.link/P1478 (in particular, it lets developer own the fences, a topic discussed in https://wg21.link/p2061).

Side-note: yes, ToCToU avoidance is a valid use for volatile https://wg21.link/p1152r0#uses.

RFC for this patch: http://lists.llvm.org/pipermail/cfe-dev/2020-May/065385.html

Diff Detail

Unit TestsFailed

TimeTest
50 mslinux > Clang.Sema::builtin-sized-memfns.cpp
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc /mnt/disks/ssd0/agent/llvm-project/clang/test/Sema/builtin-sized-memfns.cpp -verify -fsyntax-only -triple=arm64-unknown-unknown -fms-extensions -DCPY=1
60 mswindows > Clang.Sema::builtin-sized-memfns.cpp
Script: -- : 'RUN: at line 1'; c:\ws\w16n2-1\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\w16n2-1\llvm-project\premerge-checks\build\lib\clang\12.0.0\include -nostdsysteminc C:\ws\w16n2-1\llvm-project\premerge-checks\clang\test\Sema\builtin-sized-memfns.cpp -verify -fsyntax-only -triple=arm64-unknown-unknown -fms-extensions -DCPY=1

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rsmith added a comment.Aug 4 2020, 6:24 PM

Thanks. I'd like @rjmccall to approve too, but the design of these intrinsics and the Sema and constant evaluation parts seem fine. (We don't strictly need constant evaluation to abort on library UB, so I think not catching the misalignment case is OK.)

clang/docs/LanguageExtensions.rst
2456–2458

"runtime constraint violation" is an odd phrase; in C, "constraint violation" means a diagnostic is required. Can we instead say that it results in undefined behavior?

2463

Please document the trivial copy check.

clang/lib/CodeGen/CGBuiltin.cpp
2707–2708

The example I gave should produce a non-volatile memcpy, and used to do so (we passed false as the fourth parameter to CreateMemCpy). With this patch, getPtrArgTypewill strip off the implicit conversion from volatile void* to void* in the argument type, so isVolatile below will be true, so I think it will now create a volatile memcpy for the same testcase. If that's not what's happening, then I'd like to understand why not :)

I'm not saying that's necessarily a bad change, but it is a change, and it's one we should make intentionally if we make it at all.

Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded memcpys for ordinary __builtin_memcpy.

clang/docs/LanguageExtensions.rst
2455

"*The* element size must..." But I would suggest using "access size" consistently rather than "element size".

jfb updated this revision to Diff 283121.Aug 4 2020, 9:21 PM
jfb marked 2 inline comments as done.

Update docs

jfb added a comment.Aug 4 2020, 9:21 PM

Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded memcpys for ordinary __builtin_memcpy.

Yeah I think that's a leftover from the first patch. Should I drop it? At the same time, address spaces are currently accidentally "working", should I drop that in a patch before this change? Or leave as-is?

clang/docs/LanguageExtensions.rst
2455

I'm being consistent with the naming for IR, which uses "element" as well. I'm not enamored with the naming, but wanted to point out the purposeful consistency to make sure you preferred "access size". Without precedent I would indeed prefer "access size", but have a slight preference for consistency here. This is extremely weakly held preference.

(I fixed "the").

2463

Should I bubble this to the rest of the builtin in a follow-up patch? I know there are cases where that'll cause issues, but I worry that it would be a pretty noisy diagnostic (especially if we instead bubble it to C's memcpy instead).

clang/lib/CodeGen/CGBuiltin.cpp
2707–2708

Oh yes, sorry I thought you were talking about something that getPtrArgType did implicitly! Indeed the C code behaves differently in that it doesn't just strip volatile anymore.

I'm not super thrilled by the default C behavior, and I think this new behavior removes a gotcha, and is in fact what I was going for in the first iteration of the patch. Now that I've separated the builtin I agree that it's a bit odd... but it seems like the right thing to do anyways? But it no longer matches the C library function to do so.

FWIW, this currently "works as you'd expect":

void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}

https://godbolt.org/z/dcWxcK

and I think that's completely accidental because the C library function doesn't (and, as John pointed out earlier, the builtin is meant to act like the C function).

vsk added a subscriber: vsk.Aug 5 2020, 10:46 AM
vsk added inline comments.
compiler-rt/lib/ubsan/ubsan_handlers.cpp
656–657

It looks like __ubsan_handle_invalid_builtin is meant to be recoverable, so I think this should be GET_REPORT_OPTIONS(false). Marking this unrecoverable makes it impossible to suppress redundant diagnostics at the same source location. It looks this isn't code you've added: feel free to punt this to me. If you don't mind folding in a fix, adding a test would be simple (perform UB in a loop and verify only one diagnostic is printed).

rjmccall added inline comments.Aug 5 2020, 11:29 AM
clang/docs/LanguageExtensions.rst
2455

IR naming is generally random fluff plucked from the mind of an inspired compiler engineer. User documentation is the point where we're supposed to put our bad choices behind us and do something that makes sense to users. :)

Two observations that are new to me in this review:

  1. We already treat all builtins as being overloaded on address space.
  2. The revised patch treats __builtin_mem*_overloaded as being overloaded *only* on address space, volatility, and atomicity. (We've tuned the design to a point where the other qualifiers don't matter any more.)

So, we're adding three features here: overloading on (a) address space, (b) volatility, and (c) atomicity. (a) is already available in the non-_overloaded form, and we seem to be approaching agreement that (b) should be available in the non-_overloaded form too. So that only leaves (c), which is really not _overloaded but _atomic.

Based on those observations I'm now thinking that we might prefer a somewhat different approach (but one that should require only minimal changes to the patch in front of us). Specifically:

  1. Stop treating lib builtins (eg, plain memcpy) as overloaded on address space. That's a (pre-existing) conformance bug, at least for the Embedded C TR.
  2. Keep __builtin_ forms of lib builtins overloaded on address space. (No change.)
  3. Also overload __builtin_ forms of lib builtins on volatility where it makes sense, instead of adding new builtin names __builtin_mem*_overloaded.
  4. Add a new name for the builtin for the atomic forms of memcpy and memset (__builtin_memcpy_unordered_atomic maybe?).
  5. Convert the "trivial types" check from an error to a warning and apply it to all the mem* overloads. (Though I think we might actually already have such a check, so this might only require extending it to cover the atomic builtin.)

What do you think?

clang/docs/LanguageExtensions.rst
2443

Does restrict really make sense here? It seems like the only difference it could possibly make would be to treat memcpy as memmove if either operand is marked restrict, but
(a) if the caller wants that, they can just use memcpy directly, and
(b) it's not correct to propagate restrict-ness from the caller to the callee anyway, because restrict-ness is really a property of the declaration of the identifier in its scope, not a property of its type:

void f(void *restrict p) {
  __builtin_memmove(p, p + 1, 4);
}

(c) a restrict qualifier on the pointee type is irrelevant to memcpy and a restrict qualifier on the pointer type isn't part of QUAL.

2444

I don't think `__unaligned matters any more. We always take the actual alignment inferred from the pointer arguments, just like we do for non-overloaded memcpy`.

2464–2465
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708

FWIW, this currently "works as you'd expect":

void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}

The same is true even if you remove the __builtin_ (and add a suitable include), and that seems like a bug to me. It looks like we have special logic that treats *all* builtins taking pointers as being overloaded on address space, which seems wrong at least for lib builtins. C TR 18037:2008 is quite explicit about this. Section 5.1.4 says:

"""
The standard C library (ISO/IEC 9899:1999 clause 7 - Libraries) is unchanged; the library's
functions and objects continue to be declared only with regard to the generic address space. One
consequence is that pointers into named address spaces cannot be passed as arguments to library
functions except in the special case that the named address spaces are subsets of the generic
address space.
"""

We could retain that special rule for __builtin_-spelled variants of lib builtins. If we do, then maybe we shouldn't be adding __builtin_memcpy_overloaded at all and should only extend the behavior of __builtin_memcpy to also propagate volatility (and add a new builtin for the atomic case).

Regarding volatile, consider:

void maybe_volatile_memcpy(volatile void *dst, const volatile void *src, int sz, _Bool is_volatile) {
  if (is_volatile) {
#ifdef __clang__
    __builtin_memcpy_overloaded(dst, src, sz);
#elif __GNUC__
    // ...
#else
    // volatile char copy loop
#endif
  }
  memcpy(dst, src, sz);
}

With this patch, the above code will always perform a volatile memcpy. I think that's a surprise. A call to memcpy should follow the C semantics, even if we choose to change the semantics of __builtin_memcpy.

clang/lib/Sema/SemaChecking.cpp
5732

You need to call Sema::isCompleteType first before asking this question, in order to trigger class instantiation when necessary in C++. (Likewise for the checks in the previous function.)

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

If that's not true, or if we're willing to ignore it, I agree that making __builtin_memcpy do the right thing for qualifiers in general is the right thing to do.

jfb updated this revision to Diff 283390.Aug 5 2020, 2:02 PM

Do UBSan change suggested by Vedant.

jfb updated this revision to Diff 283400.Aug 5 2020, 2:36 PM
jfb marked an inline comment as done.

Add loop test requested by Vedant

compiler-rt/lib/ubsan/ubsan_handlers.cpp
656–657

I folded this into the patch.

jfb updated this revision to Diff 283402.Aug 5 2020, 2:42 PM
jfb marked an inline comment as done.

Use 'access size' instead of 'element size'.

clang/docs/LanguageExtensions.rst
2455

"access size" it is :)

jfb updated this revision to Diff 283406.Aug 5 2020, 3:02 PM
jfb marked 5 inline comments as done.

Remove restrict, update docs, call isCompleteType

jfb added a comment.Aug 5 2020, 3:02 PM

Two observations that are new to me in this review:

  1. We already treat all builtins as being overloaded on address space.
  2. The revised patch treats __builtin_mem*_overloaded as being overloaded *only* on address space, volatility, and atomicity. (We've tuned the design to a point where the other qualifiers don't matter any more.)

So, we're adding three features here: overloading on (a) address space, (b) volatility, and (c) atomicity. (a) is already available in the non-_overloaded form, and we seem to be approaching agreement that (b) should be available in the non-_overloaded form too. So that only leaves (c), which is really not _overloaded but _atomic.

Based on those observations I'm now thinking that we might prefer a somewhat different approach (but one that should require only minimal changes to the patch in front of us). Specifically:

  1. Stop treating lib builtins (eg, plain memcpy) as overloaded on address space. That's a (pre-existing) conformance bug, at least for the Embedded C TR.
  2. Keep __builtin_ forms of lib builtins overloaded on address space. (No change.)
  3. Also overload __builtin_ forms of lib builtins on volatility where it makes sense, instead of adding new builtin names __builtin_mem*_overloaded.
  4. Add a new name for the builtin for the atomic forms of memcpy and memset (__builtin_memcpy_unordered_atomic maybe?).
  5. Convert the "trivial types" check from an error to a warning and apply it to all the mem* overloads. (Though I think we might actually already have such a check, so this might only require extending it to cover the atomic builtin.)

What do you think?

That's fine with me, but as John noted that's inconsistent with what he thought builtins allowed (i.e. #define memcpy(dst, src, sz) __builtin_memcpy(dst, src, sz). If that's Not A Thing then your plan works. If it is then we need to tune it a bit.

Also note that the _atomic builtin also needs to support some overloading, at least for address spaces (and maybe volatile in the future).

So, let me know what you'd both rather see, so I don't ping-pong code too much.

clang/docs/LanguageExtensions.rst
2443

I dropped restrict.

2444

It's still allowed as a qualifier, though.

clang/lib/Sema/SemaChecking.cpp
5732

Before the condition, right? LMK if I added the right thing!

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless. It doesn't look like glibc does anything like this; do you know of a C standard library implementation that does?

If we want to follow that path, then we'll presumably (eventually) want address-space-_overloaded versions of all lib builtins that take pointers -- looks like that's around 60 functions total. That said, I do wonder how many of the functions in question that we're implicitly overloading on address space actually support such overloading -- certainly any of them that we lower to a call to a library function is going to go wrong at runtime.

+@tstellar, who added this functionality in r233706 -- what was the intent here?

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless. It doesn't look like glibc does anything like this; do you know of a C standard library implementation that does?

If we want to follow that path, then we'll presumably (eventually) want address-space-_overloaded versions of all lib builtins that take pointers -- looks like that's around 60 functions total. That said, I do wonder how many of the functions in question that we're implicitly overloading on address space actually support such overloading -- certainly any of them that we lower to a call to a library function is going to go wrong at runtime.

+@tstellar, who added this functionality in r233706 -- what was the intent here?

The goal of this patch was to avoid having to overload all the builtin with address spaces, which would be a lot of new builtins, but this functionality was added for targets that do not have a memcpy lib call, so I didn't consider the case where a libcall would be emitted.

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless.

It wouldn't be pointless; it would enable optimization of direct calls to memcpy (the 99% case) without the compiler having to special-case a function by name. And you don't need an #undef, since &memcpy doesn't trigger the preprocessor when memcpy is a function-like macro. I seem to remember this being widely used in some math libraries, but it's entirely possible that it's never been used for memcpy and the like. It's also entirely possible that I'm passing around folklore.

If we just want memcpy to do the right thing when called directly, that's not ridiculous. I don't think it would actually have any conformance problems: a volatile memcpy is just a less optimizable memcpy, and to the extent that an address-space-aware memcpy is different from the standard definition, it's probably UB to use the standard definition to copy memory in non-generic address spaces.

rsmith added a comment.Aug 6 2020, 4:50 PM

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

For what it's worth, giving __builtin_memcpy broader semantics than memcpy wouldn't be unprecedented: it's already the case that __builtin_memcpy is usable in constant expressions where plain memcpy is not (but there's no macro risk in that case at least since C++ memcpy can't be a macro, and a call to memcpy is never an ICE in C).

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless.

It wouldn't be pointless; it would enable optimization of direct calls to memcpy (the 99% case) without the compiler having to special-case a function by name.

I mean, yes, in an environment where the compiler didn't special-case the function by name anyway it wouldn't be pointless. I'm not aware of any such environment in popular usage, but that could just be ignorance on my part.

If we just want memcpy to do the right thing when called directly, that's not ridiculous. I don't think it would actually have any conformance problems: a volatile memcpy is just a less optimizable memcpy, and to the extent that an address-space-aware memcpy is different from the standard definition, it's probably UB to use the standard definition to copy memory in non-generic address spaces.

I think this is a constraint violation in C; C11 6.5.2.2/2 (a "Constraints" paragraph) requires that "Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter." So this would be an extension, not just defining UB, and seems like the kind of extension we'd normally diagnose under -pedantic-errors, but we could make an exception. I also think it'd be mildly surprising for an explicitly-declared function to allow different argument conversions depending on whether its name is memcpy.

It seems to me that you're not entirely comfortable with making memcpy and __builtin_memcpy differ in this way, and I'm not entirely comfortable with making memcpy's behavior differ from its declared type. Meanwhile, @tstellar's patch wants __builtin_memcpy to be overloaded on address space. I don't see a way to address all three concerns at once.

I think it would be reasonable in general to guarantee that our __builtin_ functions have contracts at least as wide as the underlying C function, but allow them to have extensions, and to keep the plain C functions unextended. I had actually thought we already did that in more cases than we do (but perhaps I was thinking about the LLVM math intrinsics that guarantee to not set errno). That would mean that a C standard library implementation is still free to #define foo(x,y,z) __builtin_foo(x,y,z), but if they do, they may pick up extensions.

jfb added a comment.Aug 11 2020, 11:17 AM

I think it would be reasonable in general to guarantee that our __builtin_ functions have contracts at least as wide as the underlying C function, but allow them to have extensions, and to keep the plain C functions unextended. I had actually thought we already did that in more cases than we do (but perhaps I was thinking about the LLVM math intrinsics that guarantee to not set errno). That would mean that a C standard library implementation is still free to #define foo(x,y,z) __builtin_foo(x,y,z), but if they do, they may pick up extensions.

Alright, how about I go with what you say here, and instead of adding __builtin_*_overloaded versions I just overload the __builtin_* variants? This includes having an optional 4th parameter for access size.
Alternatively, I could overload __builtin_*, but have a separate set of functions (say __builtin_*_sized) for the atomic access size variants.

jfb updated this revision to Diff 285414.Aug 13 2020, 10:03 AM

Update overloading as discussed: on the original builtins, and separate the _sized variant, making its 4th parameter non-optional. If this looks good, I'll need to update codege for a few builtins (to handle volatile), as well as add tests for their codegen and address space (which should already work, but isn't tested).

jfb updated this revision to Diff 285479.Aug 13 2020, 1:11 PM

Fix a test.

jfb added a comment.Aug 13 2020, 2:09 PM

Actually I think any subsequent updates to tests can be done in a follow-up patch, since I'm not changing the status-quo on address space here.

jfb added a comment.Aug 24 2020, 10:59 AM

Ping, I think I've addressed all comments here.

Thanks, I'm happy with this approach.

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

clang/docs/LanguageExtensions.rst
2395–2397

"can also be overloaded" -> "are also overloaded" would be clearer I think. ("Can also be overloaded" would suggest to me that it's the user of the builtin who overloads them, perhaps by declaring overloads.)

2403–2406

Is this true in general, or only for [w]mem{cpy,move}? I thought for the other cases, we required an array of char / wchar_t?

2409

I think this only applies to the above list minus the five functions you added to it. Given this and the previous comment, I'm not sure that merging the documentation on string builtins and memory builtins is working out well -- they seem to have more differences than similarities.

(memset is an outlier here that should be called out -- we don't seem to provide any constant evaluation support for it whatsoever.)

2451
2452
clang/lib/AST/ExprConstant.cpp
8870

getLimitedValue() here seems unnecessary; urem can take an APInt.

8872

Consider using toString instead of truncating each APSInt to uint64_t then to int. The size might reliably fit into uint64_t, but I don't think we can assume that int is large enough.

clang/lib/CodeGen/CGBuiltin.cpp
635

I'm not sure this castAs is necessarily correct. If the operand is C++11 nullptr, we could perform a null-to-pointer implicit conversion, and ArgTy could be NullPtrTy after stripping that back off here.

It seems like maybe what we want to do is strip off implicit conversions until we hit a non-pointer type, and take the pointee type we found immediately before that?

clang/lib/Sema/SemaChecking.cpp
5609

Generally, I'm a little uncomfortable about producing an error if a type is complete but allowing the construct if the type is incomplete -- that seems like a situation where a warning would be more appropriate to me. It's surprising and largely unprecedented that providing more information about a type would change the program from valid to invalid.

Do we really need the protection of an error here rather than an enabled-by-default warning? Moreover, don't we already have a warning for memcpy of a non-trivially-copyable object somewhere? If not, then I think we should add such a thing that also applies to the real memcpy, rather than only warning on the builtin.

5732

It would be more correct from a modules perspective to use

if (isCompleteType(Loc, T) && !T.isTriviallyCopyableType(Context))

That way, if the definition of the type is in some loaded-but-not-imported module file, we'll treat it the same as if the definition of the type is entirely unknown. (That also removes the need to check for the void case.) But given that this only allows us to accept code that is wrong in some sense, I'm not sure it really matters much.

jfb updated this revision to Diff 288131.Aug 26 2020, 4:22 PM
jfb marked 12 inline comments as done.

Address Richard's comments.

jfb added a comment.Aug 26 2020, 4:25 PM

Thanks, I'm happy with this approach.

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

Kinda, that's the motivating from Hans' paper which I'm following. One other use case (and the reason I assume Azul folks want it too) is when there's a GC that looks at objects. With this it knows it won't see torn objects when the GC is concurrent. It's similar, but generally atomic also implies an ordering, and here it's explicitly unordered (i.e. Bring Your Own Fences).

So I don't have a strong opinion on the name, but atomic seem slightly wrong.

Follow-ups I suggest in comments:

  • Make "must be trivially copyable" a warning throughout (not just here, but for atomics too).
  • Implement that diagnostic for mem* functions.
clang/docs/LanguageExtensions.rst
2403–2406

This is just moving documentation that was below. Phab is confused with the diff.

2409

[w]memset are indeed the odd ones, update says so.

clang/lib/AST/ExprConstant.cpp
8870

Their bitwidth doesn't always match, and that asserts out.

8872

OK I updated 2 other places as well.

clang/lib/CodeGen/CGBuiltin.cpp
635

Ah good catch! The new functions I'm adding just disallow nullptr, but the older ones allow it. I've modified the code accordingly and added a test in CodeGen for nullptr.

2707–2708

Yes, I believe that this is a pre-existing inconsistency with the non-__builtin_ variants.

clang/lib/Sema/SemaChecking.cpp
5609

That rationale makes sense, but it's pre-existing behavior for atomic. I can change all of them in a follow-up if that's OK?

We don't have such a check for other builtins. I can do a second follow-up to then adopt these warnings for them too?

jfb added a reviewer: rsmith.Aug 29 2020, 3:52 PM
erichkeane accepted this revision.Wed, Nov 4, 9:34 AM

I went through all the comments here, plus looked at the code myself. I believe all of the comments by other reviewers have been fixed/answered acceptably. I don't have any additional comments to add, therefore I think it is appropriate to accept this revision.

@jfb: Please give this a day or two before committing to give the other reviewers a chance to speak up!

This revision is now accepted and ready to land.Wed, Nov 4, 9:34 AM
rsmith requested changes to this revision.Wed, Nov 4, 5:43 PM

I think the documentation isn't quite right yet, but otherwise I think I'm happy. (With a couple of code change suggestions.)

In D79279#2240487, @jfb wrote:

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

Kinda, that's the motivating from Hans' paper which I'm following. One other use case (and the reason I assume Azul folks want it too) is when there's a GC that looks at objects. With this it knows it won't see torn objects when the GC is concurrent. It's similar, but generally atomic also implies an ordering, and here it's explicitly unordered (i.e. Bring Your Own Fences).

So I don't have a strong opinion on the name, but atomic seem slightly wrong.

I mean, I see your point, but I think not mentioning atomic at all also seems a little surprising, given how tied this feature is to atomic access (eg, rejecting access sizes larger than the inline atomic width). But this is not a hill I have any desire to die on :) If you'd prefer to keep the current name, I can live with it.

clang/docs/LanguageExtensions.rst
2403–2406

The documentation that was moved applied to memcpy, memmove, wmemcpy, and wmemmove. In the new location, it doesn't apply to wmemcpy nor wmemmove but does now apply to memchr, memcmp, ..., for which it is incorrect.

We used to distinguish between "string builtins", which had the restriction to character types, and "memory builtins", which had the restriction to trivially-copyable types. Can you put that distinction back, or otherwise restore the wording to the old / correct state?

2409

This feature test macro doesn't cover memcpy / memmove; this documentation is incorrect for older versions of Clang. Clang 4-7 define this feature test macro but do not support constant evaluation of memcpy / memmove.

clang/lib/CodeGen/CGBuiltin.cpp
636–643

(Just a simplification, NFC.)

clang/lib/Sema/SemaChecking.cpp
5609

The pre-existing behavior for atomic builtins is to reject if the type is incomplete. Eg;

<stdin>:1:33: error: address argument to atomic operation must be a pointer to a trivially-copyable type ('struct A *' invalid)
struct A; void f(struct A *p) { __atomic_store(p, p, 0); }
                                ^              ~

We should do the same here. (Though I'd suggest calling RequireCompleteType instead to get a more meaningful diagnostic.) These days I think we should check for unsized types too, eg:

if (RequireCompleteSizedType(ScrOp->getBeginLoc(), SrcValTy))
  return true;
if (!SrcValTy.isTriviallyCopyableType(Context) && !SrcValTy->isVoidType())
  return ExprError(...);
This revision now requires changes to proceed.Wed, Nov 4, 5:43 PM