This is an archive of the discontinued LLVM Phabricator instance.

Paths

clang/docs/LanguageExtensions.rst
clang/include/clang/Basic/Builtins.def
clang/include/clang/Basic/DiagnosticASTKinds.td
clang/include/clang/Basic/DiagnosticSemaKinds.td
clang/include/clang/Sema/Sema.h
clang/lib/AST/ExprConstant.cpp
clang/lib/CodeGen/CGBuilder.h
clang/lib/CodeGen/CGBuiltin.cpp
clang/lib/CodeGen/CodeGenFunction.h
clang/lib/Sema/SemaChecking.cpp
clang/test/CodeGen/builtin-memfns.c
clang/test/CodeGen/builtin-sized-memfns.c
clang/test/CodeGen/ubsan-builtin-checks.c
clang/test/CodeGen/ubsan-builtin-ctz-clz.c
clang/test/CodeGen/ubsan-builtin-mem_sized.c
clang/test/CodeGenObjC/builtin-memfns.m
clang/test/Sema/builtin-sized-memfns.cpp
clang/test/SemaCXX/constexpr-string.cpp
compiler-rt/lib/ubsan/ubsan_handlers.h
compiler-rt/lib/ubsan/ubsan_handlers.cpp
compiler-rt/test/ubsan/TestCases/Misc/builtins-ctz-clz.cpp
compiler-rt/test/ubsan/TestCases/Misc/builtins-mem_sized.cpp
compiler-rt/test/ubsan/TestCases/Misc/builtins.cpp

Differential D79279

Add overloaded versions of builtin mem* functions
Needs Revision (Public)

Authored by jfb on May 1 2020, 6:01 PM.

Details

Reviewers

tstellar
rsmith
erichkeane

Summary

The mem* builtins are often used (or should be used) in places where time-of-check time-of-use security issues are important (e.g. copying from untrusted buffers), because they prevent multiple reads / multiple writes from occurring at the untrusted memory location. The current builtins don't accept volatile pointee parameters in C++, and merely warn about such parameters in C, which leads to confusion. In these settings, it's useful to overload the builtin and permit volatile pointee parameters. The code generation then directly emits the existing volatile variant of the mem* builtin function call, which ensures that the affected memory location is only accessed once (thereby preventing double-reads under an adversarial memory mapping).
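To make the time-of-check time-of-use concern concrete, here is a minimal sketch in portable C++ that approximates the single-read behavior without the builtins from this patch. The names `Header`, `snapshot_untrusted` and `read_header` are hypothetical and do not appear in the patch, and the byte-wise loop is only an assumption standing in for what a volatile memcpy would emit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical message header living in memory another party can modify.
struct Header {
  std::uint32_t length;
  std::uint32_t kind;
};

// Reads every byte of the untrusted region exactly once through a volatile
// pointer, so the compiler may not re-read the source after validation.
inline void snapshot_untrusted(void* dst, const volatile void* src,
                               std::size_t n) {
  auto* d = static_cast<unsigned char*>(dst);
  auto* s = static_cast<const volatile unsigned char*>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
}

// Validates and uses only the private snapshot, never the shared memory.
inline bool read_header(const volatile void* shared, std::uint32_t max_length,
                        Header& out) {
  Header local;
  snapshot_untrusted(&local, shared, sizeof local);
  if (local.length > max_length)
    return false;  // rejected based on the snapshot
  out = local;     // the value that was checked is the value that is used
  return true;
}
```

The point of the design is that the check and the use both operate on `local`, so a concurrent writer cannot change the value between the two.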

Instead of just handling volatile, this patch handles other overloads when they make sense, specifically around pointer qualification. It also cleans up some of the custom diagnostic handling to reduce duplication.

The patch also happens to be a decent base to implement C's memset_s from Annex K, as well as an alternate (and arguably better) approach than the C++ Committee's proposal for byte-wise atomic memcpy as described in https://wg21.link/P1478 (in particular, it lets developers own the fences, a topic discussed in https://wg21.link/p2061).
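As a rough illustration of why a volatile-aware memset is a plausible base for something like memset_s, the sketch below uses the common idiom of storing through a volatile pointer so that the zeroing stores are not treated as dead. `secure_zero` is a hypothetical name; this is not Annex K's memset_s and not code from the patch, and it relies on compilers honoring volatile accesses to non-volatile objects, which they do in practice.

```cpp
#include <cstddef>

// Hypothetical helper: zeroes n bytes through a volatile pointer so the
// stores are kept even if the buffer is never read again.
inline void secure_zero(void* p, std::size_t n) {
  auto* bytes = static_cast<volatile unsigned char*>(p);
  for (std::size_t i = 0; i < n; ++i)
    bytes[i] = 0;
}
```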

Side-note: yes, ToCToU avoidance is a valid use for volatile https://wg21.link/p1152r0#uses.

RFC for this patch: http://lists.llvm.org/pipermail/cfe-dev/2020-May/065385.html

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	50 ms	linux > Clang.Sema::builtin-sized-memfns.cpp
	60 ms	windows > Clang.Sema::builtin-sized-memfns.cpp

Event Timeline

jfb added inline comments. Jul 2 2020, 5:29 PM

clang/lib/Sema/SemaChecking.cpp
1505	It's to avoid weird corner cases where this check isn't super relevant, but subsequent ones are. It avoids making `isVolatileQualified` below sad because e.g. `void` makes the `QualType` null. That one can't be `_Atomic`, and it can be `volatile` but then the size won't match the `_Atomic`'s size.
1523	I don't think so here either.

Harbormaster failed remote builds in B62786: Diff 275266! Jul 2 2020, 6:53 PM

rjmccall added inline comments. Jul 2 2020, 11:01 PM

clang/lib/Sema/SemaChecking.cpp
1606	I am not a fan of this lambda style, not because I dislike lambdas, but because you've pulled a ton of code that's supporting one or two cases (that could easily be handled together) into a much wider scope. Your helper code is doing a ton of redundant type checks and is probably not as general as you think it is. You need to call `DefaultFunctionArrayLvalueConversion` on the pointer arguments, after which you can just check for a pointer type. You also need to convert the size argument to a `size_t` as if initializing a parameter. If you do these things, the IRGen code will get much simpler because e.g. it will not need to specially handle arrays anymore. You will also start magically doing the right thing w.r.t. ODR-uses of constexpr variables.

gchatelet added inline comments. Jul 3 2020, 1:00 PM

clang/include/clang/Basic/Builtins.def
491	I don't see `memmove_inline` being useful, but memset and memcmp would make sense to add as building blocks for C++ implementations (e.g. libc memcpy). As for this new addition, how about `__builtin_memcpy_honor_qualifiers`? I fear that `__builtin_memcpy_overloaded` is too ambiguous.

Follow John's suggestions

Harbormaster completed remote builds in B65267: Diff 279896. Jul 22 2020, 12:26 PM

You need to add user docs for these builtins.

clang/include/clang/Basic/DiagnosticSemaKinds.td
8961	I don't know why you're adding a bunch of new diagnostics about _Atomic.
clang/lib/CodeGen/CGBuiltin.cpp
636	Since arrays are handled separately now, this is just `getPointeeType()`, but I don't know why you need to support ObjC object pointer types here at all.
clang/lib/CodeGen/CGExpr.cpp
1070 ↗	(On Diff #279896)	Why arrays?
clang/lib/Sema/SemaChecking.cpp
5552	Do you ever write these back into the call?
5574	You already know that DstTy and SrcTy are non-null here. Why do you need to support atomic types for these operations anyway? It just seems treacherous and unnecessary.

Address all but one of John's comments

clang/include/clang/Basic/DiagnosticSemaKinds.td
8961	Maybe the tests clarify this? Here's my rationale for the 3 new atomic diagnostics:
- Don't support mixing `volatile` and `atomic`, because we'd need to add IR support for it. It might be useful, but as a follow-up.
- Overloaded `memcpy` figures out the atomic operation size based on the element's own size. There's a destination and a source pointer, and we can't figure out the expected atomic operation size if they differ. It's likely an unintentional error to have different sizes when doing an atomic `memcpy`, so instead of figuring out the largest common matching size I figure it's better to diagnose.
- Supporting non-lock-free sizes seems fraught with peril, since it's likely unintentional. It's certainly doable (loop call the runtime support), but it's unclear if we should take the lock just once over the entire loop, or once for load+store, or once for load and once for store. I don't see a point in supporting it.
clang/lib/CodeGen/CGBuiltin.cpp
636	I'll remove ObjC handling for now. I added it because of code like what's in clang/test/CodeGenObjC/builtin-memfns.m:

    // PR13697
    void cpy1(int a, id b) {
      // CHECK-LABEL: @cpy1(
      // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8 {{.}}, i8 {{.}}, i64 8, i1 false)
      memcpy(a, b, 8);
    }

Should we support this? It seems to me like yes, but you seem to think otherwise?

On arrays / ObjC being handled now: that's not really true... or rather, it now is for the builtins I'm adding, but not for the previously existing builtins. We can't just get the pointer argument type for this code:

    // <rdar://problem/11314941>
    // Make sure we don't over-estimate the alignment of fields of
    // packed structs.
    struct PS {
      int modes[4];
    } __attribute__((packed));
    struct PS ps;
    void test8(int arg) {
      // CHECK: @test8
      // CHECK: call void @llvm.memcpy{{.}} align 4 {{.}} align 1 {{.*}} 16, i1 false)
      __builtin_memcpy(arg, ps.modes, sizeof(struct PS));
    }

Because `__builtin_memcpy` doesn't perform the conversion. Arguably a pre-existing bug, which I can patch here as I have, or fix in Sema if you'd rather see that? LMK.
clang/lib/Sema/SemaChecking.cpp
5574	Leftover from the refactoring :) It's useful to get atomic memcpy, see https://wg21.link/P1478 It's also part of "support overloaded memcpy" which is what doing more than `volatile` implies.

Harbormaster completed remote builds in B65315: Diff 279984. Jul 22 2020, 6:29 PM

Is there a need for an atomic memcpy at all? Why is it useful to allow this operation to take on "atomic" semantics — which aren't actually atomic because the loads and stores to elements are torn — with hardcoded memory ordering and somewhat arbitrary rules about what the atomic size is?

In D79279#2168479, @rjmccall wrote:

Is there a need for an atomic memcpy at all? Why is it useful to allow this operation to take on "atomic" semantics — which aren't actually atomic because the loads and stores to elements are torn — with hardcoded memory ordering and somewhat arbitrary rules about what the atomic size is?

Hans lays out a rationale for usefulness in his paper, but what I've implemented is more useful: it's *unordered* so you can fence as you desire around it, yet it guarantees a minimum memory access size based on the pointer parameters. For example, copying an atomic int will be 4 byte operations which are single-copy-atomic, but the accesses from one int to the next aren't performed in any guaranteed order (or observable in any guaranteed order either). I talked about this with him a while ago but IIRC he wasn't sure about implementation among other things, so when you asked me to widen my original volatile-only memcpy to also do other qualifiers, I realized that it was a neat way to do atomic as well as other qualifiers. I've talked to a few SG1 folks about this, and I believe (for other reasons too) it's where the design will end up for Hans' paper.
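The shape of the operation described here can be approximated in standard C++ as follows. This is only a sketch: `element_wise_copy` and `copy_then_publish` are hypothetical names, and `std::memory_order_relaxed` is used as the closest standard stand-in for LLVM's *unordered* ordering, which is an assumption rather than what the patch emits.

```cpp
#include <atomic>
#include <cstddef>

// Copies n elements with one relaxed load and one relaxed store per element.
// Each int access is indivisible, but nothing orders one element against
// another.
inline void element_wise_copy(std::atomic<int>* dst,
                              const std::atomic<int>* src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i].store(src[i].load(std::memory_order_relaxed),
                 std::memory_order_relaxed);
}

// The caller owns the fences: a release fence after the copy, followed by a
// relaxed flag store that a reader can pair with an acquire fence.
inline void copy_then_publish(std::atomic<int>* dst,
                              const std::atomic<int>* src, std::size_t n,
                              std::atomic<bool>& ready) {
  element_wise_copy(dst, src, n);
  std::atomic_thread_fence(std::memory_order_release);
  ready.store(true, std::memory_order_relaxed);
}
```

Keeping the copy itself free of ordering and placing the fence in the caller mirrors the "fence as you desire around it" design point.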

I can see the usefulness of this operation, but it seems like an odd semantic mismatch for what is basically just a memcpy where one of the pointers happens to have _Atomic type, like you're shoe-horning it into this builtin just to avoid declaring a different one.

I'm following the discussion we had here regarding overloading:

There are other qualifiers that can meaningfully contribute to the operation here besides volatile, such as restrict and (more importantly) address spaces. And again, for the copy operations these might differ between the two pointer types.

In both cases, I’d say that the logical design is to allow the pointers to be to arbitrarily-qualified types. We can then propagate that information from the builtin into the LLVM intrinsic call as best as we’re allowed. So I think you should make builtins called something like __builtin_overloaded_memcpy (name to be decided) and just have their semantics be type-directed.

Ah yes, I’d like to hear what others think of this. I hadn’t thought about it before you brought it up, and it sounds like a good idea.

As you noted earlier, for memcpy you probably want to express differences in destination and source qualification, even if today IR can't express e.g. volatile source and non-volatile destination. You were talking about volatile, but this applies to the entire combination of dst+src qualified with zero-to-five volatile, _Atomic, __unaligned, restrict, and address space. Pulling the entire combination space out into different functions would create way too many functions. Right now the implementation has a few limitations: it treats both dst and src as volatile if either are, it can't do _Atomic with volatile so we diagnose, and it ignores restrict. Otherwise it supports all combinations.
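A type-directed design of this kind can be sketched with ordinary templates. The names below (`pointee_is_volatile`, `copy_is_volatile`, `overloaded_copy`) are hypothetical and only model the volatile part of the qualifier space; address spaces, `__unaligned` and `restrict` are left out, and the byte loop is an assumption, not the patch's code generation.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// True when the pointee of pointer type P is volatile-qualified.
template <class P>
inline constexpr bool pointee_is_volatile =
    std::is_volatile_v<std::remove_pointer_t<P>>;

// Mirrors the limitation described above: the whole copy is treated as
// volatile if either side is.
template <class Dst, class Src>
inline constexpr bool copy_is_volatile =
    pointee_is_volatile<Dst> || pointee_is_volatile<Src>;

// Hypothetical type-directed copy: the access style follows the pointer types.
template <class D, class S>
void overloaded_copy(D* dst, S* src, std::size_t n) {
  if constexpr (copy_is_volatile<D*, S*>) {
    // Volatile path: one byte read and one byte write per position.
    auto* d = reinterpret_cast<volatile unsigned char*>(dst);
    auto* s = reinterpret_cast<const volatile unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i)
      d[i] = s[i];
  } else {
    std::memcpy(dst, src, n);
  }
}
```

One function name covers every qualifier combination, which is the argument made above against spelling each combination as a separately named function.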

Improve documentation

Harbormaster failed remote builds in B65382: Diff 280135! Jul 23 2020, 8:21 AM

Re-update

Harbormaster completed remote builds in B65383: Diff 280136. Jul 23 2020, 9:26 AM

My point is that this has nothing to do with the ordinary semantics of _Atomic. You're basically just looking at the word "atomic" and saying that, hey, a minimum access size is sort of related to atomicity.

If you want this to be able to control the minimum access size, you should allow that to be passed in as an optional argument instead.

OK so it sounds like you're suggesting *two* versions of the overloaded builtins:

__builtin_memcpy_overloaded which overloads on volatile, restrict, __unaligned, and address spaces, but not on _Atomic qualifiers.
__builtin_atomic_memcpy_overloaded which overloads on volatile (but unsupported for now), restrict, and address spaces, but not on _Atomic qualifiers (because it's implicit), and not on __unaligned because that's a constraint. This takes an extra "element size" parameter, which we hope folks don't confuse with the size parameter (I'd expect a template or macro wrapper to hide that extra parameter when actually using the builtin).

Of course, that's two versions for each of memcpy, memmove, memset, and any other *mem that we decide to add to this list of overloadable functions.

Is that correct?
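The wrapper idea mentioned above (hiding the extra element-size parameter so it cannot be confused with the byte size) might look roughly like this. `element_copy` is a hypothetical stand-in for the proposed builtin, written with plain `std::memcpy` chunks, and `typed_copy` is an assumed wrapper name; neither comes from the patch.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for the hypothetical builtin with an element-size argument: copies
// byte_size bytes in chunks of element_size bytes. Returns false when
// byte_size is not a multiple of element_size.
inline bool element_copy(void* dst, const void* src, std::size_t byte_size,
                         std::size_t element_size) {
  if (element_size == 0 || byte_size % element_size != 0)
    return false;
  auto* d = static_cast<unsigned char*>(dst);
  auto* s = static_cast<const unsigned char*>(src);
  for (std::size_t off = 0; off < byte_size; off += element_size)
    std::memcpy(d + off, s + off, element_size);
  return true;
}

// Wrapper that derives both sizes from T so callers cannot swap them.
template <class T>
bool typed_copy(T* dst, const T* src, std::size_t count) {
  return element_copy(dst, src, count * sizeof(T), sizeof(T));
}
```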

I don't think any of these should allow _Atomic unless we're going to give it some sort of consistent atomic semantics (which is hard to imagine being useful), and I think you should just take an extra argument of the minimum access width on all of them uniformly if you think that's important. Builtins can have optional arguments.

In D79279#2170095, @rjmccall wrote:

I don't think any of these should allow _Atomic unless we're going to give it some sort of consistent atomic semantics (which is hard to imagine being useful), and I think you should just take an extra argument of the minimum access width on all of them uniformly if you think that's important. Builtins can have optional arguments.

OK so: __builtin_memcpy_overloaded which overloads on volatile, restrict, __unaligned, and address spaces, but not on _Atomic qualifiers. Optionally, a 4th integer parameter can be provided to represent element_size. If provided, this becomes an unordered atomic memcpy with element size equal to or greater than the provided element_size. That value must be a power of two, and must be lock-free (what we call maximum atomic inline width in target info). If provided, then __unaligned is invalid, and volatile ought to be valid but is currently unsupported because IR can't do atomic+volatile memcpy (it would be useful, say for shared memory, but Patches Welcome).

Do you think there should be any relationship at all between dst/src pointee type's size and element_size? i.e. if I copy short* using an element size of 1 byte, is that OK? It seems like larger element sizes are always OK, but smaller might be a programmer error? If that's what they wanted, they could have done (char*)my_short. Or is this trying to be too helpful?

I think the argument is treated as if it were 1 if not given. That's all that ordinary memcpy formally guarantees, which seems to work fine (semantically, if not performance-wise) for pretty much everything today. I don't think you need any restrictions on element size. It's probably sensible to require the pointers to be dynamically aligned to a multiple of the access width, but I don't think you can enforce that statically. And of course the length needs to be a multiple of the access size.

Do you think it'd be useful to have different guarantees for different operands? I guess it could come up, but it'd be a whole lot of extra complexity that I can't imagine we'd ever support.

If one of the arguments is volatile, arguably the minimum access width (if given) needs to be exact. If we don't support that right now, it's okay to make it an error, which is basically what you've already done with the _Atomic volatile diagnostic.
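The dynamic preconditions in this comment (pointers aligned to the access width, length a multiple of it) cannot be enforced statically, but a runtime check is easy to write. The sketch below uses hypothetical names (`is_power_of_two`, `access_width_ok`) and is the kind of check a sanitizer or debug assertion could perform, not something the patch is known to contain.

```cpp
#include <cstddef>
#include <cstdint>

inline bool is_power_of_two(std::size_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Returns true when both pointers are aligned to `width` and byte_size is a
// multiple of `width`. A width of 1 accepts any pointers and any length,
// matching the "treated as if it were 1 if not given" default.
inline bool access_width_ok(const void* dst, const void* src,
                            std::size_t byte_size, std::size_t width) {
  if (!is_power_of_two(width))
    return false;
  auto aligned = [width](const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % width == 0;
  };
  return aligned(dst) && aligned(src) && byte_size % width == 0;
}
```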

In D79279#2170157, @rjmccall wrote:

I think the argument is treated as if it were 1 if not given. That's all that ordinary memcpy formally guarantees, which seems to work fine (semantically, if not performance-wise) for pretty much everything today.

I'm not sure that's true: consider a memcpy implementation which copies some bytes twice (at different access size, there's an overlap because somehow it's more efficient). That would probably violate the programmer's expectations, and I don't think volatile nor atomic memcpy allow this (but regular memcpy does).
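For readers unfamiliar with the trick being referred to, here is a small sketch of an overlapping-block copy. `copy_8_to_16` is a hypothetical function written for illustration; real library implementations differ, but the pattern of two fixed-size blocks that overlap in the middle is the relevant part.

```cpp
#include <cstddef>
#include <cstring>

// Copies n bytes, with 8 <= n <= 16, using two 8-byte block copies: one at
// the start and one ending at the last byte. Returns how many bytes were
// accessed twice because the two blocks overlap (16 - n).
inline std::size_t copy_8_to_16(unsigned char* dst, const unsigned char* src,
                                std::size_t n) {
  unsigned char head[8];
  unsigned char tail[8];
  std::memcpy(head, src, 8);          // first block read
  std::memcpy(tail, src + n - 8, 8);  // last block read, may overlap the first
  std::memcpy(dst, head, 8);          // first block write
  std::memcpy(dst + n - 8, tail, 8);  // last block write, may overlap
  return 16 - n;
}
```

For an ordinary memcpy this is fine, but with a volatile or untrusted source the doubly-read bytes are exactly the double-read the patch is trying to rule out.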

I don't think you need any restrictions on element size. It's probably sensible to require the pointers to be dynamically aligned to a multiple of the access width, but I don't think you can enforce that statically.

Agreed, if we're given a short and told to copy 4 bytes at a time then UBSan could find the constraint violation on alignment, but generally the only way we can diagnose is if the parameter is __unaligned (because there you're explicitly telling me it's not aligned, and the constraint is that it has to be).

And of course the length needs to be a multiple of the access size.

Yeah.

Do you think it'd be useful to have different guarantees for different operands? I guess it could come up, but it'd be a whole lot of extra complexity that I can't imagine we'd ever support.

You mean, if element_size is passed then you get different guarantees? I think that's what makes sense: if you're asking for atomic memcpy then you get guarantees. If you're asking for volatile memcpy then you get others. That's why overloading (and multiple parameters) can be confusing, but at the same time I think it's better than having the combinatorial number of named functions instead.

If one of the arguments is volatile, arguably the minimum access width (if given) needs to be exact. If we don't support that right now, it's okay to make it an error, which is basically what you've already done with the _Atomic volatile diagnostic.

Agreed. volatile with size makes a lot of sense, and the IR version of it, once created, ought to not be able to widen accesses. volatile without a size specified makes sense too, because you just want a single read and a single write, don't care about tearing.

In D79279#2170187, @jfb wrote:

In D79279#2170157, @rjmccall wrote:

I think the argument is treated as if it were 1 if not given. That's all that ordinary memcpy formally guarantees, which seems to work fine (semantically, if not performance-wise) for pretty much everything today.

I'm not sure that's true: consider a memcpy implementation which copies some bytes twice (at different access size, there's an overlap because somehow it's more efficient). That would probably violate the programmer's expectations, and I don't think volatile nor atomic memcpy allow this (but regular memcpy does).

Yes, that's true, if you need an only-accessed-once guarantee, that's above and beyond just a minimum access size. I agree that volatile would need to make this guarantee.

Do you think it'd be useful to have different guarantees for different operands? I guess it could come up, but it'd be a whole lot of extra complexity that I can't imagine we'd ever support.

You mean, if element_size is passed then you get different guarantees?

No, sorry, I mean different guarantees for the different pointer operands. In principle, we could allow you to say that the memcpy has to be done with 4-byte accesses from the source and 2-byte accesses to the destination. That's implementable but a lot of work.
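To show what lopsided access widths would mean, the sketch below reads 4 bytes at a time from the source and writes 2 bytes at a time to the destination. `copy_read32_write16` is a hypothetical name, and the accesses are plain (neither volatile nor atomic), so a compiler is free to merge or split them; the code only illustrates the intended shape of the operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `count` 4-byte units. Each unit is read from src as one 32-bit value
// and written to dst as two 16-bit halves. memcpy on the local copy keeps the
// byte order of the object representation intact on any endianness.
inline void copy_read32_write16(std::uint16_t* dst, const std::uint32_t* src,
                                std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    std::uint32_t word = src[i];  // one 4-byte read
    std::uint16_t halves[2];
    std::memcpy(halves, &word, sizeof word);
    dst[2 * i] = halves[0];       // first 2-byte write
    dst[2 * i + 1] = halves[1];   // second 2-byte write
  }
}
```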

If one of the arguments is volatile, arguably the minimum access width (if given) needs to be exact. If we don't support that right now, it's okay to make it an error, which is basically what you've already done with the _Atomic volatile diagnostic.

Agreed. volatile with size makes a lot of sense, and the IR version of it, once created, ought to not be able to widen accesses. volatile without a size specified makes sense too, because you just want a single read and a single write, don't care about tearing.

Right.

Gotcha. Yeah I think it's useful as a niche thing, and if IR supports that for say volatile then we can honor lopsided volatile overloads (instead of treating the entire thing as volatile). I hadn't really thought about lopsided access sizes (since it fell out of _Atomic). Maybe it's useful? I was just banning unequal sizes before because it seemed like a mistake to copy to/from different types. If we wanted to support it, I suppose we could add another optional parameter, so I'm OK not doing it now, and adding later if useful.

Alright, I'll update the patch as discussed, thanks!

These new builtins should ideally support constant evaluation if possible.

clang/docs/LanguageExtensions.rst
2455–2456	This is missing some important details:
- What does the size parameter mean? Is it number of bytes or number of elements? If it's number of bytes, what happens if it's not a multiple of the element size, particularly in the `_Atomic` case?
- What does the value parameter to `memset` mean? Is it splatted to the element width? Does it specify a complete element value?
- For `_Atomic`, what memory order is used?
- For `volatile`, what access size / type is used? Do we want to make any promises?
- Are the loads and stores typed or untyped? (In particular, do we annotate with TBAA metadata?)
- Do we guarantee to copy the object representation or only the value representation? (Do we preserve the values of padding bits in the source, and initialize padding bits in the destination?)
You should also document whether constant evaluation of these builtins is supported.
2462–2464	Mixing those qualifiers doesn't seem like it will work in many cases: we don't allow mixing `volatile` and `_Atomic` (though I'm not sure why; LLVM supports volatile atomic operations), and presumably we shouldn't allow mixing `__unaligned` and `_Atomic` (although I don't see any tests for that, and maybe we should just outright disallow combining `_Atomic` with `__unaligned` in general).
clang/include/clang/Basic/Builtins.def
474	Are these really GCC builtins?
1486	The new builtins probably belong in this section of the file instead.
clang/include/clang/Basic/DiagnosticSemaKinds.td
7979–7981	I'd prefer to keep this diagnostic separate, since it communicates more information than `err_argument_needs_trivial_copy` does: specifically that we need a trivial copy because we're performing an atomic operation.
8953–8961	Please format these diagnostics consistently with the rest of the file: line break after `Error<`, wrap to 80 columns, don't leave blank lines between individual diagnostics in a group of related diagnostics.
clang/lib/Sema/SemaChecking.cpp
1277–1279	There are a bunch of places in this file that do manual argument count checking and could use `checkArgCount` instead (search for `err_typecheck_call_too_` to find them). If you want to clean this up, please do so in a separate change.
5613–5615	Do we need this constraint? If one side is atomic and the other is not, then we can do all of the operations with the atomic width. If both sides are atomic, then one side is 2^N times the size of the other; we can do 2^N operations on one side for each operation on the other side. Maybe the second case is not worth the effort, but permitting (for example) a memcpy from an `_Atomic int` to a `char` seems useful and there doesn't seem to be a good reason to disallow it.

jfb mentioned this in D84666: [NFC] Sema: use checkArgCount instead of custom checking. Jul 27 2020, 8:34 AM

Address comments

I've addressed @rsmith's and @rjmccall's suggestions (unless noted), thanks!
An open question: as of 6e4aa1e48138182685431c76184dfc36e620aea2 @dneilson added an assertion on CreateElementUnorderedAtomicMemCpy to check that the pointer arguments have alignments of at least the element size. That makes sense when the IR is only ever built internally to LLVM, but now that I'm adding a builtin it's more of a dynamic property. Should I also force this in the frontend (understanding that alignment isn't always well known at compile time), or should I simply assume that the alignment is correct because it's a dynamic property?

I left some FIXMEs in the CodeGen test for this.

clang/docs/LanguageExtensions.rst
2455–2456	Most of these are answered in the update. Some of the issue is that the current documentation is silent on these points already, by saying "same as C's `mem*` function". I'm relying on that approach here as well. Size is bytes. `memset` value is an `unsigned char`. Memory order is unordered, and accesses themselves are done in indeterminate order. For `volatile`, it falls out of the new wording that we don't provide access size guarantees. We'd need to nail down IR better to do so, and I don't think it's the salient property (though as discussed above, it might be useful, and the `element_size` parameter makes it easy to do so). Same on TBAA, no mention because "same as C" (no TBAA annotations). Same on copying bits as-is. Good point on constant evaluation. I added support. Note that we don't have `memset` constant evaluation, so I didn't support it. Seems easy, but ought to be a separate patch.
2462–2464	`volatile` and `_Atomic` ought to work... For this code I didn't make it work (even if it might be useful), because we'd need IR support for it. On mixing `_Atomic __unaligned`: I left a FIXME because I'm not 100% sure, given the alignment discussion on atomic in general. Let's see where we settle: if we make it a pure runtime property then `__unaligned` ought to be fine because it's a constraint violation if the actual pointer is truly unaligned.
clang/include/clang/Basic/Builtins.def
474	Oops, I didn't see that comment, was just copying `__builtin_memcpy_inline`. I'll move it too.
clang/lib/Sema/SemaChecking.cpp
1277–1279	D84666
5613–5615	Based on @rjmccall's feedback, I'm disallowing `_Atomic` qualification, and keying off the optional `element_size` parameter to determine atomicity. I'm also only taking in one size, not two, since as discussed it might be useful to allow two but I haven't heard that anyone actually wants it at the moment.

jfb marked an inline comment as done. Jul 27 2020, 5:29 PM

jfb added inline comments.

clang/lib/AST/ExprConstant.cpp
8806–8808	If we end up making alignment a runtime constraint, then I'll need to check it in consteval. Otherwise I don't think we need to check anything since Sema ought to have done all the required checks already.

Harbormaster completed remote builds in B65942: Diff 281087. Jul 27 2020, 7:11 PM

rsmith added inline comments. Jul 27 2020, 7:53 PM

clang/docs/LanguageExtensions.rst
2440–2442	What happens if `byte_element_size` does not divide `byte_size`?
2440–2442	Did you really mean `void` here? I've been pretty confused by some of the stuff happening below that seems to depend on the actual type of the passed pointer, which would make more sense if you meant `QUAL T *` here rather than `QUAL void*`. Do the builtins do different things for different argument pointee types or not?
2449–2450	"the pointer's element size" -- do you mean "the provided element size"? Does the element size need to be a compile-time constant? (Presumably, but you don't say so.)
2451–2452	Presumably this means that it's an error if we don't provide lock-free atomic access for that size. Would be worth saying so.
2455–2456	What happens if they're not? Is it UB, or is it just not guaranteed to be atomic?
2461	*facilities
clang/include/clang/Basic/DiagnosticSemaKinds.td
8959–8960	Given the new documentation, I would expect you don't need this any more.
8963–8965	Presumably the number of bytes need not be a compile-time constant? It's a bit weird to produce an error rather than a warning on a case that would be valid but (perhaps?) UB if the argument were non-constant.
clang/lib/AST/ExprConstant.cpp
8806–8808	I don't see how you can check the alignment at compile time given a `void*` argument. We presumably need to check that the element size (if given) divides the total size, assuming the outcome is UB if not.

jfb mentioned this in rG389f009c5757: [NFC] Sema: use checkArgCount instead of custom checking. Jul 28 2020, 1:41 PM

Address Richard's comments.

This is almost ready I think!
There are a few things still open, I'd love feedback on them.

clang/docs/LanguageExtensions.rst
2440–2442	Runtime constraint violation. constexpr needs to catch this too, added. Though IIUC we can't actually check alignment in constexpr, which makes sense since there's no actual address. Similarly, I think we ought to add UBSan builtin check for this. I think it makes sense to add as an option to `CreateElementUnorderedAtomicMemCpy`: either assert-check at compile-time (the current default, which triggers assertions as I've annotated in the tests' FIXME), or at runtime if the sanitizer is enabled. WDYT? I've added these two to the documentation.
2440–2442	Oh yeah, this should be `T` and `U`. Fixed. They used to key atomicity off of element size, but now that we have the extra parameter we only look at `T` and `U` for correctness (not behavior).
clang/include/clang/Basic/DiagnosticSemaKinds.td
8963–8965	I commented below, indeed it seems like some of this ought to be relaxed.
clang/lib/Sema/SemaChecking.cpp
5667	I'm re-thinking these checks:
  if (ElSz->urem(DstElSz))
    return ExprError(
        Diag(TheCall->getBeginLoc(),
             PDiag(diag::err_atomic_builtin_ext_size_mismatches_el))
        << (int)ElSz->getLimitedValue() << DstElSz << DstValTy
        << DstOp->getSourceRange() << Arg->getSourceRange());
I'm not sure we ought to have them anymore. We know that the types are trivially copyable, so it doesn't really matter if you're copying with operations smaller than the type itself. For example:
  struct Data { int a, b, c, d; };
It ought to be fine to do 4-byte copies of `Data`, if whatever your algorithm is is happy with that. I therefore think I'll remove these checks based on the dst / src element types. The only thing that seems to make sense is making sure that you don't straddle object boundaries with element size. I removed sizeless types: we'll codegen whatever you ask for.
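
The "4-byte copies of `Data`" idea can be pictured with a plain C sketch. `copy_in_chunks` is a hypothetical helper, not the builtin under review: it only models splitting a copy into fixed-size pieces and rejecting a size that the chunk size does not divide. It makes no claim about the width of the actual hardware accesses or about atomicity.

```c
#include <stddef.h>
#include <string.h>

struct Data { int a, b, c, d; };

/* Hypothetical helper: copies `size` bytes in pieces of `access_size`
   bytes. Returns 0 on success, or -1 if `access_size` is zero or does
   not divide `size`. */
int copy_in_chunks(void *dst, const void *src, size_t size,
                   size_t access_size) {
    if (access_size == 0 || size % access_size != 0)
        return -1;
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t off = 0; off < size; off += access_size)
        memcpy(d + off, s + off, access_size);  /* one piece per iteration */
    return 0;
}
```

Copying a `struct Data` with an access size of 4 succeeds because `sizeof(struct Data)` is a multiple of 4 on typical targets, while an access size of 3 is rejected.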

Harbormaster completed remote builds in B66592: Diff 282281. Jul 31 2020, 12:49 PM

rsmith added inline comments. Jul 31 2020, 4:51 PM

clang/docs/LanguageExtensions.rst
2463	From the above description, I think the documentation is unclear what the types `T` and `U` are used for. I think the answer is something like: """ The types `T` and `U` are required to be trivially-copyable types, and `byte_element_size` (if specified) must be a multiple of the size of both types. `dst` and `src` are expected to be suitably aligned for `T` and `U` objects, respectively. """ But... we actually get the alignment information by analyzing pointer argument rather than from the types, just like we do for `memcpy` and `memmove`, so maybe the latter part is not right. (What did you intend regarding alignment for the non-atomic case?) The trivial-copyability and divisibility checks don't seem fundamentally important to the goal of the builtin, so I wonder if we could actually just use `void` here and remove the extra checks. (I don't really have strong views one way or the other on this, except that we should either document what `T` and `U` are used for or make the builtins not care about the pointee type beyond its qualifiers.)
clang/lib/AST/ExprConstant.cpp
8866	We could use the same logic we use in `__builtin_is_aligned` here. For any object whose value the constant evaluator can reason about, we should be able to compute at least a minimal alignment (though the actual runtime alignment might of course be greater).
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708	Looking through implicit conversions in `getPtrArgType` here will change the code we generate for cases like:
  void f(volatile void *p, volatile void *q) { memcpy(p, q, 4); }
... (in C, where we permit such implicit conversions) to use a volatile memcpy intrinsic. Is that an intentional change?
clang/lib/CodeGen/CGExpr.cpp
1167–1168 ↗	(On Diff #282281)	Do we still need this? We should be doing the decay in `Sema`.

Address Richard's comments, add UBSan support.

Herald added a project: Restricted Project. Aug 4 2020, 5:49 PM

Herald added a subscriber: Restricted Project.

Thanks for the detailed comments, I think I've addressed all of them! I also added UBSan support to check the builtin invocation. I think this patch is pretty much ready to go. A follow-up will need to add the support functions to compiler-rt (they're currently optional, as per https://reviews.llvm.org/D33240), and in cases where size is known we can inline them (as we do for memcpy and friends).

clang/docs/LanguageExtensions.rst
2463	You're right. I've removed most treatment of `T` / `U`, and updated the documentation. I left the trivial copy check, but `void*` is a usual escape hatch. Divisibility is now only checked for `size` / `element_size`.
clang/lib/AST/ExprConstant.cpp
8866	I think the runtime alignment is really the only thing that matters here. I played with constexpr checking based on what `__builtin_is_aligned` does, and it's not particularly useful IMO.
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708	I'm confused... what's the difference that this makes for the pre-existing builtins? My intent was to get the `QualType` unconditionally, but I can conditionalize it if needed... However this ought to make no difference:
  static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E,
                                unsigned ArgNo) {
    QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType();
    if (ArgTy->isArrayType())
      return CGM.getContext().getAsArrayType(ArgTy)->getElementType();
    if (ArgTy->isObjCObjectPointerType())
      return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType();
    return ArgTy->castAs<clang::PointerType>()->getPointeeType();
  }
and indeed I can't see the example you provided change in IR from one to the other. The issue I'm working around is that getting it unconditionally would make ObjC code sad when `id` is passed in as I outlined above.
clang/lib/Sema/SemaChecking.cpp
5667	They're gone, we now only check that size and element size match up.
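
The remaining preconditions discussed in these comments (the size is a multiple of the access size, and alignment is a runtime property of the pointers) could be checked along the following lines. This is only a sketch of the kind of test a sanitizer might perform, not the UBSan implementation. Both function names are hypothetical, and the power-of-two requirement on the access size is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: nonzero if `p` is aligned to `align` bytes.
   `align` must be a nonzero power of two. Relies on the usual
   pointer-to-integer mapping of mainstream targets. */
int is_aligned_to(const void *p, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(align - 1)) == 0;
}

/* Hypothetical precondition check for a sized copy: the access size is a
   power of two (assumed here), it divides the byte size, and both
   pointers are aligned to it. Returns 1 if all hold, 0 otherwise. */
int sized_copy_args_ok(const void *dst, const void *src, size_t size,
                       size_t access_size) {
    if (access_size == 0 || (access_size & (access_size - 1)) != 0)
        return 0;
    if (size % access_size != 0)
        return 0;
    return is_aligned_to(dst, access_size) && is_aligned_to(src, access_size);
}
```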

jfb added inline comments. Aug 4 2020, 5:57 PM

compiler-rt/test/ubsan/TestCases/Misc/builtins-ctz-clz.cpp
2	Phab is confused.... I did a git rename of `compiler-rt/test/ubsan/TestCases/Misc/builtins.cpp` and it thinks this is new, and I deleted the other.

Thanks. I'd like @rjmccall to approve too, but the design of these intrinsics and the Sema and constant evaluation parts seem fine. (We don't strictly need constant evaluation to abort on library UB, so I think not catching the misalignment case is OK.)

clang/docs/LanguageExtensions.rst
2456–2458	"runtime constraint violation" is an odd phrase; in C, "constraint violation" means a diagnostic is required. Can we instead say that it results in undefined behavior?
2463	Please document the trivial copy check.
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708	The example I gave should produce a non-volatile memcpy, and used to do so (we passed `false` as the fourth parameter to `CreateMemCpy`). With this patch, `getPtrArgType` will strip off the implicit conversion from `volatile void*` to `void*` in the argument type, so `isVolatile` below will be `true`, so I think it will now create a volatile memcpy for the same testcase. If that's not what's happening, then I'd like to understand why not :) I'm not saying that's necessarily a bad change, but it is a change, and it's one we should make intentionally if we make it at all.

Harbormaster completed remote builds in B67023: Diff 283078. Aug 4 2020, 6:33 PM

Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded memcpys for ordinary __builtin_memcpy.

clang/docs/LanguageExtensions.rst
2455	"The element size must..." But I would suggest using "access size" consistently rather than "element size".

Update docs

In D79279#2195299, @rjmccall wrote:

Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded memcpys for ordinary __builtin_memcpy.

Yeah I think that's a leftover from the first patch. Should I drop it? At the same time, address spaces are currently accidentally "working", should I drop that in a patch before this change? Or leave as-is?

clang/docs/LanguageExtensions.rst
2455	I'm being consistent with the naming for IR, which uses "element" as well. I'm not enamored with the naming, but wanted to point out the purposeful consistency to make sure you preferred "access size". Without precedent I would indeed prefer "access size", but have a slight preference for consistency here. This is extremely weakly held preference. (I fixed "the").
2463	Should I bubble this to the rest of the builtins in a follow-up patch? I know there are cases where that'll cause issues, but I worry that it would be a pretty noisy diagnostic (especially if we bubble it to C's `memcpy` instead).
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708	Oh yes, sorry I thought you were talking about something that `getPtrArgType` did implicitly! Indeed the C code behaves differently in that it doesn't just strip `volatile` anymore. I'm not super thrilled by the default C behavior, and I think this new behavior removes a gotcha, and is in fact what I was going for in the first iteration of the patch. Now that I've separated the builtin I agree that it's a bit odd... but it seems like the right thing to do anyways? But it no longer matches the C library function to do so. FWIW, this currently "works as you'd expect":
  void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) { __builtin_memcpy(dst, src, sz); }
https://godbolt.org/z/dcWxcK and I think that's completely accidental because the C library function doesn't (and, as John pointed out earlier, the builtin is meant to act like the C function).

Harbormaster completed remote builds in B67046: Diff 283121. Aug 4 2020, 10:28 PM

vsk added a subscriber: vsk. Aug 5 2020, 10:46 AM

vsk added inline comments.

compiler-rt/lib/ubsan/ubsan_handlers.cpp
656–657	It looks like `__ubsan_handle_invalid_builtin` is meant to be recoverable, so I think this should be `GET_REPORT_OPTIONS(false)`. Marking this unrecoverable makes it impossible to suppress redundant diagnostics at the same source location. It looks like this isn't code you've added: feel free to punt this to me. If you don't mind folding in a fix, adding a test would be simple (perform UB in a loop and verify only one diagnostic is printed).

rjmccall added inline comments. Aug 5 2020, 11:29 AM

clang/docs/LanguageExtensions.rst
2455	IR naming is generally random fluff plucked from the mind of an inspired compiler engineer. User documentation is the point where we're supposed to put our bad choices behind us and do something that makes sense to users. :)

Two observations that are new to me in this review:

We already treat all builtins as being overloaded on address space.
The revised patch treats __builtin_mem*_overloaded as being overloaded *only* on address space, volatility, and atomicity. (We've tuned the design to a point where the other qualifiers don't matter any more.)

So, we're adding three features here: overloading on (a) address space, (b) volatility, and (c) atomicity. (a) is already available in the non-_overloaded form, and we seem to be approaching agreement that (b) should be available in the non-_overloaded form too. So that only leaves (c), which is really not _overloaded but _atomic.

Based on those observations I'm now thinking that we might prefer a somewhat different approach (but one that should require only minimal changes to the patch in front of us). Specifically:

Stop treating lib builtins (eg, plain memcpy) as overloaded on address space. That's a (pre-existing) conformance bug, at least for the Embedded C TR.
Keep __builtin_ forms of lib builtins overloaded on address space. (No change.)
Also overload __builtin_ forms of lib builtins on volatility where it makes sense, instead of adding new builtin names __builtin_mem*_overloaded.
Add a new name for the builtin for the atomic forms of memcpy and memset (__builtin_memcpy_unordered_atomic maybe?).
Convert the "trivial types" check from an error to a warning and apply it to all the mem* overloads. (Though I think we might actually already have such a check, so this might only require extending it to cover the atomic builtin.)

What do you think?

clang/docs/LanguageExtensions.rst
2443	Does `restrict` really make sense here? It seems like the only difference it could possibly make would be to treat `memcpy` as `memmove` if either operand is marked restrict, but
(a) if the caller wants that, they can just use `memcpy` directly, and
(b) it's not correct to propagate restrict-ness from the caller to the callee anyway, because restrict-ness is really a property of the declaration of the identifier in its scope, not a property of its type:
  void f(void *restrict p) { __builtin_memmove(p, p + 1, 4); }
(c) a restrict qualifier on the pointee type is irrelevant to memcpy and a restrict qualifier on the pointer type isn't part of QUAL.
2444	I don't think `__unaligned` matters any more. We always take the actual alignment inferred from the pointer arguments, just like we do for non-overloaded `memcpy`.
clang/lib/CodeGen/CGBuiltin.cpp
2707–2708	FWIW, this currently "works as you'd expect":
  void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) { __builtin_memcpy(dst, src, sz); }
The same is true even if you remove the `__builtin_` (and add a suitable include), and that seems like a bug to me. It looks like we have special logic that treats all builtins taking pointers as being overloaded on address space, which seems wrong at least for lib builtins. C TR 18037:2008 is quite explicit about this. Section 5.1.4 says: """ The standard C library (ISO/IEC 9899:1999 clause 7 - Libraries) is unchanged; the library's functions and objects continue to be declared only with regard to the generic address space. One consequence is that pointers into named address spaces cannot be passed as arguments to library functions except in the special case that the named address spaces are subsets of the generic address space. """ We could retain that special rule for `__builtin_`-spelled variants of lib builtins. If we do, then maybe we shouldn't be adding `__builtin_memcpy_overloaded` at all and should only extend the behavior of `__builtin_memcpy` to also propagate volatility (and add a new builtin for the atomic case). Regarding volatile, consider:
  void maybe_volatile_memcpy(volatile void *dst, const volatile void *src, int sz, _Bool is_volatile) {
    if (is_volatile) {
  #ifdef __clang__
      __builtin_memcpy_overloaded(dst, src, sz);
  #elif __GNUC__
      // ...
  #else
      // volatile char copy loop
  #endif
    }
    memcpy(dst, src, sz);
  }
With this patch, the above code will always perform a volatile `memcpy`. I think that's a surprise. A call to `memcpy` should follow the C semantics, even if we choose to change the semantics of `__builtin_memcpy`.
clang/lib/Sema/SemaChecking.cpp
5732	You need to call `Sema::isCompleteType` first before asking this question, in order to trigger class instantiation when necessary in C++. (Likewise for the checks in the previous function.)
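
For readers wondering what the "volatile char copy loop" placeholder in the `maybe_volatile_memcpy` example above might stand for, here is a minimal portable sketch. The function name `volatile_byte_copy` is an assumption for illustration. The loop copies one byte at a time through volatile-qualified pointers, so the source is written to read each source byte once and write each destination byte once.

```c
#include <stddef.h>

/* Fallback sketch (hypothetical name): byte-wise copy through
   volatile-qualified pointers. Every access to *dst and *src is a
   volatile access, so the compiler may not merge, duplicate, or drop
   them. */
void volatile_byte_copy(volatile void *dst, const volatile void *src,
                        size_t n) {
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i];
}
```

Such a loop is correct but byte-at-a-time, which is part of why a builtin that can emit a volatile `memcpy` intrinsic is attractive.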

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

If that's not true, or if we're willing to ignore it, I agree that making __builtin_memcpy do the right thing for qualifiers in general is the right thing to do.

Do UBSan change suggested by Vedant.

Add loop test requested by Vedant

compiler-rt/lib/ubsan/ubsan_handlers.cpp
656–657	I folded this into the patch.

Use 'access size' instead of 'element size'.

clang/docs/LanguageExtensions.rst
2455	"access size" it is :)

Remove restrict, update docs, call isCompleteType

In D79279#2197118, @rsmith wrote:

Two observations that are new to me in this review:

We already treat all builtins as being overloaded on address space.

The revised patch treats __builtin_mem*_overloaded as being overloaded *only* on address space, volatility, and atomicity. (We've tuned the design to a point where the other qualifiers don't matter any more.)

So, we're adding three features here: overloading on (a) address space, (b) volatility, and (c) atomicity. (a) is already available in the non-_overloaded form, and we seem to be approaching agreement that (b) should be available in the non-_overloaded form too. So that only leaves (c), which is really not _overloaded but _atomic.

Based on those observations I'm now thinking that we might prefer a somewhat different approach (but one that should require only minimal changes to the patch in front of us). Specifically:

Stop treating lib builtins (eg, plain memcpy) as overloaded on address space. That's a (pre-existing) conformance bug, at least for the Embedded C TR.

Keep __builtin_ forms of lib builtins overloaded on address space. (No change.)

Also overload __builtin_ forms of lib builtins on volatility where it makes sense, instead of adding new builtin names __builtin_mem*_overloaded.

Add a new name for the builtin for the atomic forms of memcpy and memset (__builtin_memcpy_unordered_atomic maybe?).

Convert the "trivial types" check from an error to a warning and apply it to all the mem* overloads. (Though I think we might actually already have such a check, so this might only require extending it to cover the atomic builtin.)

What do you think?

That's fine with me, but as John noted that's inconsistent with what he thought builtins allowed (i.e. #define memcpy(dst, src, sz) __builtin_memcpy(dst, src, sz)). If that's Not A Thing then your plan works. If it is then we need to tune it a bit.

Also note that the _atomic builtin also needs to support some overloading, at least for address spaces (and maybe volatile in the future).

So, let me know what you'd both rather see, so I don't ping-pong code too much.

clang/docs/LanguageExtensions.rst
2443	I dropped `restrict`.
2444	It's still allowed as a qualifier, though.
clang/lib/Sema/SemaChecking.cpp
5732	Before the condition, right? LMK if I added the right thing!

Harbormaster completed remote builds in B67193: Diff 283390. Aug 5 2020, 3:44 PM

Harbormaster completed remote builds in B67199: Diff 283402. Aug 5 2020, 4:13 PM

Harbormaster completed remote builds in B67198: Diff 283400. Aug 5 2020, 4:19 PM

Harbormaster completed remote builds in B67201: Diff 283406. Aug 5 2020, 4:42 PM

In D79279#2197176, @rjmccall wrote:

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless. It doesn't look like glibc does anything like this; do you know of a C standard library implementation that does?

If we want to follow that path, then we'll presumably (eventually) want address-space-_overloaded versions of all lib builtins that take pointers -- looks like that's around 60 functions total. That said, I do wonder how many of the functions in question that we're implicitly overloading on address space actually support such overloading -- certainly any of them that we lower to a call to a library function is going to go wrong at runtime.

+@tstellar, who added this functionality in r233706 -- what was the intent here?

In D79279#2200916, @rsmith wrote:

In D79279#2197176, @rjmccall wrote:

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless. It doesn't look like glibc does anything like this; do you know of a C standard library implementation that does?

If we want to follow that path, then we'll presumably (eventually) want address-space-_overloaded versions of all lib builtins that take pointers -- looks like that's around 60 functions total. That said, I do wonder how many of the functions in question that we're implicitly overloading on address space actually support such overloading -- certainly any of them that we lower to a call to a library function is going to go wrong at runtime.

+@tstellar, who added this functionality in r233706 -- what was the intent here?

The goal of this patch was to avoid having to overload all the builtins with address spaces, which would be a lot of new builtins. This functionality was added for targets that do not have a memcpy lib call, so I didn't consider the case where a libcall would be emitted.

In D79279#2200916, @rsmith wrote:

In D79279#2197176, @rjmccall wrote:

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless.

It wouldn't be pointless; it would enable optimization of direct calls to memcpy (the 99% case) without the compiler having to special-case a function by name. And you don't need an #undef, since &memcpy doesn't trigger the preprocessor when memcpy is a function-like macro. I seem to remember this being widely used in some math libraries, but it's entirely possible that it's never been used for memcpy and the like. It's also entirely possible that I'm passing around folklore.

If we just want memcpy to do the right thing when called directly, that's not ridiculous. I don't think it would actually have any conformance problems: a volatile memcpy is just a less optimizable memcpy, and to the extent that an address-space-aware memcpy is different from the standard definition, it's probably UB to use the standard definition to copy memory in non-generic address spaces.
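
The point that `&memcpy` does not trigger the preprocessor when `memcpy` is a function-like macro can be demonstrated without touching the real library name. In this sketch `my_copy`, `copy_fn`, and `get_copy_fn` are hypothetical names: a function-like macro only expands when the name is immediately followed by `(`, so taking the address (or parenthesizing the name) still reaches the real function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a library function. */
void *my_copy(void *dst, const void *src, size_t n) {
    return memcpy(dst, src, n);
}

/* Function-like macro with the same name. It expands only where the
   identifier `my_copy` is directly followed by '('. */
#define my_copy(d, s, n) memcpy((d), (s), (n))

typedef void *(*copy_fn)(void *, const void *, size_t);

/* `&my_copy` is followed by ';', not '(', so no macro expansion happens
   and the address of the real function is returned. */
copy_fn get_copy_fn(void) {
    return &my_copy;
}
```

Writing `(my_copy)(dst, src, n)` likewise bypasses the macro, so no `#undef` is needed to reach the function.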

In D79279#2201136, @rjmccall wrote:

In D79279#2200916, @rsmith wrote:

In D79279#2197176, @rjmccall wrote:

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

For what it's worth, giving __builtin_memcpy broader semantics than memcpy wouldn't be unprecedented: it's already the case that __builtin_memcpy is usable in constant expressions where plain memcpy is not (but there's no macro risk in that case at least since C++ memcpy can't be a macro, and a call to memcpy is never an ICE in C).

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless.

It wouldn't be pointless; it would enable optimization of direct calls to memcpy (the 99% case) without the compiler having to special-case a function by name.

I mean, yes, in an environment where the compiler didn't special-case the function by name anyway it wouldn't be pointless. I'm not aware of any such environment in popular usage, but that could just be ignorance on my part.

If we just want memcpy to do the right thing when called directly, that's not ridiculous. I don't think it would actually have any conformance problems: a volatile memcpy is just a less optimizable memcpy, and to the extent that an address-space-aware memcpy is different from the standard definition, it's probably UB to use the standard definition to copy memory in non-generic address spaces.

I think this is a constraint violation in C; C11 6.5.2.2/2 (a "Constraints" paragraph) requires that "Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter." So this would be an extension, not just defining UB, and seems like the kind of extension we'd normally diagnose under -pedantic-errors, but we could make an exception. I also think it'd be mildly surprising for an explicitly-declared function to allow different argument conversions depending on whether its name is memcpy.

It seems to me that you're not entirely comfortable with making memcpy and __builtin_memcpy differ in this way, and I'm not entirely comfortable with making memcpy's behavior differ from its declared type. Meanwhile, @tstellar's patch wants __builtin_memcpy to be overloaded on address space. I don't see a way to address all three concerns at once.

I think it would be reasonable in general to guarantee that our __builtin_ functions have contracts at least as wide as the underlying C function, but allow them to have extensions, and to keep the plain C functions unextended. I had actually thought we already did that in more cases than we do (but perhaps I was thinking about the LLVM math intrinsics that guarantee to not set errno). That would mean that a C standard library implementation is still free to #define foo(x,y,z) __builtin_foo(x,y,z), but if they do, they may pick up extensions.

In D79279#2201484, @rsmith wrote:

I think it would be reasonable in general to guarantee that our __builtin_ functions have contracts at least as wide as the underlying C function, but allow them to have extensions, and to keep the plain C functions unextended. I had actually thought we already did that in more cases than we do (but perhaps I was thinking about the LLVM math intrinsics that guarantee to not set errno). That would mean that a C standard library implementation is still free to #define foo(x,y,z) __builtin_foo(x,y,z), but if they do, they may pick up extensions.

Alright, how about I go with what you say here, and instead of adding __builtin_*_overloaded versions I just overload the __builtin_* variants? This includes having an optional 4th parameter for access size.
Alternatively, I could overload __builtin_*, but have a separate set of functions (say __builtin_*_sized) for the atomic access size variants.

Update overloading as discussed: on the original builtins, and separate the _sized variant, making its 4th parameter non-optional. If this looks good, I'll need to update codegen for a few builtins (to handle volatile), as well as add tests for their codegen and address space (which should already work, but isn't tested).

Harbormaster completed remote builds in B68291: Diff 285414. Aug 13 2020, 11:12 AM

Fix a test.

Harbormaster completed remote builds in B68321: Diff 285479. Aug 13 2020, 2:02 PM

Actually I think any subsequent updates to tests can be done in a follow-up patch, since I'm not changing the status-quo on address space here.

Ping, I think I've addressed all comments here.

Thanks, I'm happy with this approach.

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

clang/docs/LanguageExtensions.rst
2395–2397	"can also be overloaded" -> "are also overloaded" would be clearer I think. ("Can also be overloaded" would suggest to me that it's the user of the builtin who overloads them, perhaps by declaring overloads.)
2403–2406	Is this true in general, or only for `[w]mem{cpy,move}`? I thought for the other cases, we required an array of `char` / `wchar_t`?
2409	I think this only applies to the above list minus the five functions you added to it. Given this and the previous comment, I'm not sure that merging the documentation on string builtins and memory builtins is working out well -- they seem to have more differences than similarities. (`memset` is an outlier here that should be called out -- we don't seem to provide any constant evaluation support for it whatsoever.)
clang/lib/AST/ExprConstant.cpp
8870	`getLimitedValue()` here seems unnecessary; `urem` can take an `APInt`.
8872	Consider using `toString` instead of truncating each `APSInt` to `uint64_t` then to `int`. The size might reliably fit into `uint64_t`, but I don't think we can assume that `int` is large enough.
clang/lib/CodeGen/CGBuiltin.cpp
635	I'm not sure this `castAs` is necessarily correct. If the operand is C++11 `nullptr`, we could perform a null-to-pointer implicit conversion, and `ArgTy` could be `NullPtrTy` after stripping that back off here. It seems like maybe what we want to do is strip off implicit conversions until we hit a non-pointer type, and take the pointee type we found immediately before that?
clang/lib/Sema/SemaChecking.cpp
5609	Generally, I'm a little uncomfortable about producing an error if a type is complete but allowing the construct if the type is incomplete -- that seems like a situation where a warning would be more appropriate to me. It's surprising and largely unprecedented that providing more information about a type would change the program from valid to invalid. Do we really need the protection of an error here rather than an enabled-by-default warning? Moreover, don't we already have a warning for `memcpy` of a non-trivially-copyable object somewhere? If not, then I think we should add such a thing that also applies to the real `memcpy`, rather than only warning on the builtin.
5732	It would be more correct from a modules perspective to use if (isCompleteType(Loc, T) && !T.isTriviallyCopyableType(Context)) That way, if the definition of the type is in some loaded-but-not-imported module file, we'll treat it the same as if the definition of the type is entirely unknown. (That also removes the need to check for the `void` case.) But given that this only allows us to accept code that is wrong in some sense, I'm not sure it really matters much.

Address Richard's comments.

In D79279#2235085, @rsmith wrote:

Thanks, I'm happy with this approach.

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

Kinda, that's the motivating use case from Hans' paper, which I'm following. One other use case (and the reason I assume Azul folks want it too) is when there's a GC that looks at objects. With this it knows it won't see torn objects when the GC is concurrent. It's similar, but generally atomic also implies an ordering, and here it's explicitly unordered (i.e. Bring Your Own Fences).

So I don't have a strong opinion on the name, but atomic seems slightly wrong.
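As an illustration of the "Bring Your Own Fences" idea, here is a rough sketch using the existing GCC-compatible `__atomic_*` builtins rather than the proposed `__builtin_memcpy_sized`. It assumes a fixed 4-byte access size and approximates LLVM's "unordered" ordering with `__ATOMIC_RELAXED`, which is the weakest ordering these builtins expose. `relaxed_copy_u32` and `publish_copy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies `count` 32-bit elements. Each element is read
// once and written once with a single non-tearing access, but no ordering
// between elements is requested.
inline void relaxed_copy_u32(std::uint32_t *dst, const std::uint32_t *src,
                             std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    std::uint32_t v = __atomic_load_n(&src[i], __ATOMIC_RELAXED);
    __atomic_store_n(&dst[i], v, __ATOMIC_RELAXED);
  }
}

// Hypothetical caller: the copy itself carries no ordering, so the caller
// adds a release fence before publishing a "ready" flag.
inline void publish_copy(std::uint32_t *dst, const std::uint32_t *src,
                         std::size_t count, int *ready_flag) {
  relaxed_copy_u32(dst, src, count);
  __atomic_thread_fence(__ATOMIC_RELEASE);  // the caller-supplied fence
  __atomic_store_n(ready_flag, 1, __ATOMIC_RELAXED);
}
```

A concurrent reader would pair this with an acquire fence after observing the flag; that side is omitted to keep the sketch short.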

Follow-ups I suggest in comments:

Make "must be trivially copyable" a warning throughout (not just here, but for atomics too).
Implement that diagnostic for mem* functions.

clang/docs/LanguageExtensions.rst
2403–2406	This is just moving documentation that was below. Phab is confused with the diff.
2409	[w]memset are indeed the odd ones, update says so.
clang/lib/AST/ExprConstant.cpp
8870	Their bitwidth doesn't always match, and that asserts out.
8872	OK I updated 2 other places as well.
clang/lib/CodeGen/CGBuiltin.cpp
635	Ah good catch! The new functions I'm adding just disallow nullptr, but the older ones allow it. I've modified the code accordingly and added a test in CodeGen for nullptr.
2707–2708	Yes, I believe that this is a pre-existing inconsistency with the non-`__builtin_` variants.
clang/lib/Sema/SemaChecking.cpp
5609	That rationale makes sense, but it's pre-existing behavior for atomic. I can change all of them in a follow-up if that's OK? We don't have such a check for other builtins. I can do a second follow-up to then adopt these warnings for them too?

Harbormaster completed remote builds in B69689: Diff 288131.Aug 26 2020, 5:34 PM

jfb added a reviewer: rsmith.Aug 29 2020, 3:52 PM

Herald added a subscriber: danielkiss. · View Herald TranscriptAug 29 2020, 3:52 PM

ping

gchatelet mentioned this in D86066: IR: Merge MemCpyInlineInst and MemCpyInst.Sep 11 2020, 1:03 AM

ping

I went through all the comments here, plus looked at the code myself. I believe all of the comments by other reviewers have been fixed/answered acceptably. I don't have any additional comments to add, therefore I think it is appropriate to accept this revision.

@jfb: Please give this a day or two before committing to give the other reviewers a chance to speak up!

This revision is now accepted and ready to land.Nov 4 2020, 9:34 AM

I think the documentation isn't quite right yet, but otherwise I think I'm happy. (With a couple of code change suggestions.)

In D79279#2240487, @jfb wrote:

In D79279#2235085, @rsmith wrote:

If I understand correctly, the primary (perhaps sole) purpose of __builtin_memcpy_sized is to provide a primitive from which an atomic operation can be produced. That being the case, I wonder if the name is emphasizing the wrong thing, and a name that contains atomic would be more obvious. As a concrete alternative, __atomic_unordered_memcpy is not much longer and seems to have the right kinds of implications. WDYT?

Kinda, that's the motivating use case from Hans' paper, which I'm following. One other use case (and the reason I assume Azul folks want it too) is when there's a GC that looks at objects. With this it knows it won't see torn objects when the GC is concurrent. It's similar, but generally atomic also implies an ordering, and here it's explicitly unordered (i.e. Bring Your Own Fences).

So I don't have a strong opinion on the name, but atomic seems slightly wrong.

I mean, I see your point, but I think not mentioning atomic at all also seems a little surprising, given how tied this feature is to atomic access (eg, rejecting access sizes larger than the inline atomic width). But this is not a hill I have any desire to die on :) If you'd prefer to keep the current name, I can live with it.

clang/docs/LanguageExtensions.rst
2403–2406	The documentation that was moved applied to `memcpy`, `memmove`, `wmemcpy`, and `wmemmove`. In the new location, it doesn't apply to `wmemcpy` nor `wmemmove` but does now apply to `memchr`, `memcmp`, ..., for which it is incorrect. We used to distinguish between "string builtins", which had the restriction to character types, and "memory builtins", which had the restriction to trivially-copyable types. Can you put that distinction back, or otherwise restore the wording to the old / correct state?
2409	This feature test macro doesn't cover `memcpy` / `memmove`; this documentation is incorrect for older versions of Clang. Clang 4-7 define this feature test macro but do not support constant evaluation of `memcpy` / `memmove`.
clang/lib/CodeGen/CGBuiltin.cpp
636–643	(Just a simplification, NFC.)
clang/lib/Sema/SemaChecking.cpp
5609	The pre-existing behavior for atomic builtins is to reject if the type is incomplete. E.g., `struct A; void f(struct A *p) { __atomic_store(p, p, 0); }` produces `<stdin>:1:33: error: address argument to atomic operation must be a pointer to a trivially-copyable type ('struct A *' invalid)`. We should do the same here. (Though I'd suggest calling `RequireCompleteType` instead to get a more meaningful diagnostic.) These days I think we should check for unsized types too, e.g.: `if (RequireCompleteSizedType(ScrOp->getBeginLoc(), SrcValTy)) return true; if (!SrcValTy.isTriviallyCopyableType(Context) && !SrcValTy->isVoidType()) return ExprError(...);`

This revision now requires changes to proceed.Nov 4 2020, 5:43 PM

Revision Contents

Path	Size
clang/docs/LanguageExtensions.rst	79 lines
clang/include/clang/Basic/Builtins.def	5 lines
clang/include/clang/Basic/DiagnosticASTKinds.td	3 lines
clang/include/clang/Basic/DiagnosticSemaKinds.td	11 lines
clang/include/clang/Sema/Sema.h	2 lines
clang/lib/AST/ExprConstant.cpp	20 lines
clang/lib/CodeGen/CGBuilder.h	28 lines
clang/lib/CodeGen/CGBuiltin.cpp	162 lines
clang/lib/CodeGen/CodeGenFunction.h	10 lines
clang/lib/Sema/SemaChecking.cpp	210 lines
clang/test/CodeGen/builtin-memfns.c	200 lines
clang/test/CodeGen/builtin-sized-memfns.c	113 lines
clang/test/CodeGen/ubsan-builtin-checks.c	8 lines
	ubsan-builtin-ctz-clz.c
	ubsan-builtin-checks.c
clang/test/CodeGen/ubsan-builtin-mem_sized.c	104 lines
clang/test/CodeGenObjC/builtin-memfns.m	34 lines
clang/test/Sema/builtin-sized-memfns.cpp	258 lines
clang/test/SemaCXX/constexpr-string.cpp	21 lines
compiler-rt/lib/ubsan/ubsan_handlers.h	12 lines
compiler-rt/lib/ubsan/ubsan_handlers.cpp	43 lines
compiler-rt/test/ubsan/TestCases/Misc/builtins-ctz-clz.cpp	44 lines
compiler-rt/test/ubsan/TestCases/Misc/builtins-mem_sized.cpp	93 lines
compiler-rt/test/ubsan/TestCases/Misc/builtins.cpp

Diff 285414

clang/docs/LanguageExtensions.rst


number. This canonicalization is useful for implementing certain
numeric primitives such as frexp. See `LLVM canonicalize intrinsic
<https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic>`_ for
more information on the semantics.

String builtins
---------------

- Clang provides constant expression evaluation support for builtins forms of
- the following functions from the C standard library headers
- ``<string.h>`` and ``<wchar.h>``:
+ Clang provides support for builtins forms of the following functions from the C
+ standard library headers ``<string.h>`` and ``<wchar.h>``:

* ``memchr``
* ``memcmp`` (and its deprecated BSD / POSIX alias ``bcmp``)
+ * ``memcpy``
+ * ``memmove``
+ * ``memset``
* ``strchr``
* ``strcmp``
* ``strlen``
* ``strncmp``
* ``wcschr``
* ``wcscmp``
* ``wcslen``
* ``wcsncmp``
* ``wmemchr``
* ``wmemcmp``
+ * ``wmemcpy``
+ * ``wmemmove``

In each case, the builtin form has the name of the C library function prefixed
by ``__builtin_``. Example:

.. code-block:: c

  void *p = __builtin_memchr("foobar", 'b', 5);

- In addition to the above, one further builtin is provided:
+ These builtins require source and destination to be pointers to trivially
+ copyable types or to ``void`` prior to conversion to the parameter type.
+ Beyond C's specification for these functions, the above builtins can also be
+ overloaded on non-default address spaces. Further, the following builtins can
+ also be overloaded on ``volatile``:

rsmith (Done):

"can also be overloaded" -> "are also overloaded" would be clearer I think. ("Can also be overloaded" would suggest to me that it's the user of the builtin who overloads them, perhaps by declaring overloads.)
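To show what the `volatile` overloads are meant to replace, here is a small sketch of the kind of hand-written loop users otherwise need when `std::memcpy` cannot be called with `volatile`-qualified pointers in C++. It assumes a one-byte access size and is not the builtin's implementation; `volatile_byte_copy` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies n bytes through volatile-qualified pointers.
// Each source byte is read exactly once and each destination byte is written
// exactly once, which is the single-access property the summary asks for.
inline void volatile_byte_copy(volatile void *dst, const volatile void *src,
                               std::size_t n) {
  volatile unsigned char *d = static_cast<volatile unsigned char *>(dst);
  const volatile unsigned char *s =
      static_cast<const volatile unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
}
```

The loop is correct but gives the optimizer no room to use wider accesses, which is part of why a builtin with an explicit access size is attractive.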

+ * ``memcpy``
+ * ``memmove``
+ * ``memset``
+
+ Constant evaluation support is only provided when the source and destination are
+ pointers to arrays with the same trivially copyable element type, and the given
+ size is an exact multiple of the element size that is no greater than the number
+ of elements accessible through the source and destination operands.

rsmith (Done):

Is this true in general, or only for `[w]mem{cpy,move}`? I thought for the other cases, we required an array of `char` / `wchar_t`?

jfb (Author, Done):

This is just moving documentation that was below. Phab is confused with the diff.

rsmith (Not Done):

The documentation that was moved applied to `memcpy`, `memmove`, `wmemcpy`, and `wmemmove`. In the new location, it doesn't apply to `wmemcpy` nor `wmemmove` but does now apply to `memchr`, `memcmp`, ..., for which it is incorrect.

We used to distinguish between "string builtins", which had the restriction to character types, and "memory builtins", which had the restriction to trivially-copyable types. Can you put that distinction back, or otherwise restore the wording to the old / correct state?

+ Support for constant expression evaluation for the above builtins can be detected
+ with ``__has_feature(cxx_constexpr_string_builtins)``.

rsmith (Done):

I think this only applies to the above list minus the five functions you added to it. Given this and the previous comment, I'm not sure that merging the documentation on string builtins and memory builtins is working out well -- they seem to have more differences than similarities.

(`memset` is an outlier here that should be called out -- we don't seem to provide any constant evaluation support for it whatsoever.)

jfb (Author, Done):

[w]memset are indeed the odd ones, update says so.

rsmith (Not Done):

This feature test macro doesn't cover `memcpy` / `memmove`; this documentation is incorrect for older versions of Clang. Clang 4-7 define this feature test macro but do not support constant evaluation of `memcpy` / `memmove`.
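For context, a feature check of the kind the documentation describes might look like the sketch below. It only relies on the macro for `__builtin_strlen`, since the comment above points out that it says nothing reliable about `memcpy` / `memmove`. `HAS_CONSTEXPR_STRING_BUILTINS`, `portable_strlen`, and `const_strlen` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

#ifndef __has_feature
#define __has_feature(x) 0  // compilers without __has_feature
#endif

#if __has_feature(cxx_constexpr_string_builtins)
#define HAS_CONSTEXPR_STRING_BUILTINS 1
#else
#define HAS_CONSTEXPR_STRING_BUILTINS 0
#endif

// C++11-compatible fallback: a single recursive return statement.
constexpr std::size_t portable_strlen(const char *s) {
  return *s ? 1 + portable_strlen(s + 1) : 0;
}

// Uses the builtin when the feature test says constant evaluation is
// supported, and the portable fallback otherwise.
constexpr std::size_t const_strlen(const char *s) {
#if HAS_CONSTEXPR_STRING_BUILTINS
  return __builtin_strlen(s);
#else
  return portable_strlen(s);
#endif
}

static_assert(const_strlen("foobar") == 6, "evaluated at compile time");
```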

+ In addition to the above, the following builtins are provided:

.. code-block:: c

  char *__builtin_char_memchr(const char *haystack, int needle, size_t size);

``__builtin_char_memchr(a, b, c)`` is identical to
``(char*)__builtin_memchr(a, b, c)`` except that its use is permitted within
constant expressions in C++11 onwards (where a cast from ``void*`` to ``char*``
is disallowed in general).

- Constant evaluation support for the ``__builtin_mem*`` functions is provided
- only for arrays of ``char``, ``signed char``, ``unsigned char``, or ``char8_t``,
- despite these functions accepting an argument of type ``const void*``.
-
- Support for constant expression evaluation for the above builtins can be detected
- with ``__has_feature(cxx_constexpr_string_builtins)``.
-
- Memory builtins
- ---------------
-
- * ``__builtin_memcpy_inline``

.. code-block:: c

  void __builtin_memcpy_inline(void *dst, const void *src, size_t size);

``__builtin_memcpy_inline(dst, src, size)`` is identical to
``__builtin_memcpy(dst, src, size)`` except that the generated code is
guaranteed not to call any external functions. See [LLVM IR ‘llvm.memcpy.inline’
Intrinsic](https://llvm.org/docs/LangRef.html#llvm-memcpy-inline-intrinsic) for
more information.

- Note that the `size` argument must be a compile time constant.
-
- Clang provides constant expression evaluation support for builtin forms of the
- following functions from the C standard library headers
- ``<string.h>`` and ``<wchar.h>``:
-
- * ``memcpy``
- * ``memmove``
- * ``wmemcpy``
- * ``wmemmove``
-
- In each case, the builtin form has the name of the C library function prefixed
- by ``__builtin_``.
-
- Constant evaluation support is only provided when the source and destination
- are pointers to arrays with the same trivially copyable element type, and the
- given size is an exact multiple of the element size that is no greater than
- the number of elements accessible through the source and destination operands.
-
- Constant evaluation support is not yet provided for ``__builtin_memcpy_inline``.
+ Note that the `size` argument must be a compile time constant for
+ ``__builtin_memcpy_inline(dst, src, size)``.
+
+ Constant evaluation support is not yet provided for ``__builtin_memcpy_inline``.
+
+ .. code-block:: c
+
+   void* __builtin_memcpy_sized(QUAL0 void *dst, QUAL1 const void *src, size_t byte_size, size_t byte_access_size)

rsmith (Done):

* ``__builtin_memset_overloaded(QUAL T *dst, unsigned char val, size_t byte_size, size_t byte_element_size = <unspecified>)``

- These overloads support destinations and sources which are a mix of the
+ These overloads support destinations and sources which have a mix of the
following qualifiers:

+   void* __builtin_memmove_sized(QUAL0 void *dst, QUAL1 const void *src, size_t byte_size, size_t byte_access_size)
+   void* __builtin_memset_sized(QUAL void *dst, unsigned char val, size_t byte_size, size_t byte_access_size)

rsmith (Done):

What happens if `byte_element_size` does not divide `byte_size`?

jfb (Author, Done):

Runtime constraint violation. constexpr needs to catch this too, added. Though IIUC we can't actually check alignment in constexpr, which makes sense since there's no actual address.

Similarly, I think we ought to add UBSan builtin check for this. I think it makes sense to add as an option to `CreateElementUnorderedAtomicMemCpy`: either assert-check at compile-time (the current default, which triggers assertions as I've annotated in the tests' FIXME), or at runtime if the sanitizer is enabled. WDYT?

I've added these two to the documentation.

rsmith (Done):

Did you really mean `void*` here? I've been pretty confused by some of the stuff happening below that seems to depend on the actual type of the passed pointer, which would make more sense if you meant `QUAL T *` here rather than `QUAL void*`. Do the builtins do different things for different argument pointee types or not?

jfb (Author, Done):

Oh yeah, this should be `T*` and `U*`. Fixed.

They used to key atomicity off of element size, but now that we have the extra parameter we only look at `T` and `U` for correctness (not behavior).
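The runtime conditions discussed in this thread (size divisible by the access size, pointers aligned to the access size) can be written down as a plain predicate. The sketch below is only a guess at what a sanitizer-style check might test, not the UBSan handler added by the patch; `sized_copy_preconditions_hold` is a hypothetical name, and the power-of-two requirement is an assumption made here, not something the review states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical predicate: true when a sized copy of `byte_size` bytes using
// accesses of `byte_access_size` bytes would satisfy the documented rules.
inline bool sized_copy_preconditions_hold(const void *dst, const void *src,
                                          std::size_t byte_size,
                                          std::size_t byte_access_size) {
  // Assumption: access sizes are non-zero powers of two.
  if (byte_access_size == 0 ||
      (byte_access_size & (byte_access_size - 1)) != 0)
    return false;
  // The total size must be a whole number of accesses.
  if (byte_size % byte_access_size != 0)
    return false;
  // Both pointers must be aligned to at least the access size.
  const std::uintptr_t d = reinterpret_cast<std::uintptr_t>(dst);
  const std::uintptr_t s = reinterpret_cast<std::uintptr_t>(src);
  return d % byte_access_size == 0 && s % byte_access_size == 0;
}
```

As noted above, the alignment half of this check has no counterpart during constant evaluation, where there is no concrete address to test.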

+ Access size must be a compile-time constant. The memory will be accessed with a

rsmith (Done):

* ``volatile``
- * ``restrict``
* ``__unaligned``

Does `restrict` really make sense here? It seems like the only difference it could possibly make would be to treat `memcpy` as `memmove` if either operand is marked `restrict`, but
(a) if the caller wants that, they can just use `memcpy` directly, and
(b) it's not correct to propagate `restrict`-ness from the caller to the callee anyway, because `restrict`-ness is really a property of the declaration of the identifier in its scope, not a property of its type:

void f(void *restrict p) {
  __builtin_memmove(p, p + 1, 4);
}

(c) a `restrict` qualifier on the pointee type is irrelevant to `memcpy` and a `restrict` qualifier on the pointer type isn't part of `QUAL`.

jfb (Author, Done):

I dropped `restrict`.
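Point (b) is easier to see with a runnable variant of the snippet above. In this sketch both arguments are derived from the same pointer and overlap, so only `memmove` semantics are correct, regardless of how the caller's parameter is qualified. `shift_left_by_one` is a hypothetical name, and `char *` is used because `restrict` and arithmetic on `void *` are not standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: moves n bytes starting at p + 1 down to p.
// The source range [p + 1, p + 1 + n) overlaps the destination [p, p + n),
// so memmove is required; memcpy would have undefined behavior here.
inline void shift_left_by_one(char *p, std::size_t n) {
  std::memmove(p, p + 1, n);
}
```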

+ sequence of operations of size equal to or a multiple of the requested access

rsmith (Done):

* ``restrict``
- * ``__unaligned``
* non-default address spaces

I don't think `__unaligned` matters any more. We always take the actual alignment inferred from the pointer arguments, just like we do for non-overloaded `memcpy`.

jfb (Author, Done):

It's still allowed as a qualifier, though.

+ size. The order of operations is unspecified, and each access has unordered
+ atomic semantics. This means that reads and writes do not tear at the individual
+ access level, and they each occur exactly once, but the order in which they
+ occur (and in which they are observable) can only be guaranteed using
+ appropriate fences around the function call. The access size must therefore be a
+ lock-free size for the target architecture. It is undefined behavior to provide

rsmith (Done):

"the pointer's element size" -- do you mean "the provided element size"?

Does the element size need to be a compile-time constant? (Presumably, but you don't say so.)

+ a memory locations which is aligned to less than the access size. It is

rsmith (Done):

lock-free size for the target architecture. It is undefined behavior to provide
- a memory locations which is aligned to less than the access size. It is
+ a memory location which is aligned to less than the access size. It is
undefined behavior to provide a size which is not evenly divided by the

+ undefined behavior to provide a size which is not evenly divided by the

rsmith (Done):

Presumably this means that it's an error if we don't provide lock-free atomic access for that size. Would be worth saying so.

rsmith (Done):

a memory locations which is aligned to less than the access size. It is
- undefined behavior to provide a size which is not evenly divided by the
+ undefined behavior to provide a size which is not evenly divisible by the
specified access size.

+ specified access size.

Atomic Min/Max builtins with memory ordering

rjmccall (Done):

"*The* element size must..." But I would suggest using "access size" consistently rather than "element size".

jfb (Author, Done):

I'm being consistent with the naming for IR, which uses "element" as well. I'm not enamored with the naming, but wanted to point out the purposeful consistency to make sure you preferred "access size". Without precedent I would indeed prefer "access size", but have a slight preference for consistency here. This is an extremely weakly held preference.

(I fixed "the").

rjmccall (Done):

IR naming is generally random fluff plucked from the mind of an inspired compiler engineer. User documentation is the point where we're supposed to put our bad choices behind us and do something that makes sense to users. :)

jfb (Author, Done):

"access size" it is :)

--------------------------------------------

rsmith (Done):

This is missing some important details:

* What does the size parameter mean? Is it number of bytes or number of elements? If it's number of bytes, what happens if it's not a multiple of the element size, particularly in the `_Atomic` case?
* What does the value parameter to `memset` mean? Is it splatted to the element width? Does it specify a complete element value?
* For `_Atomic`, what memory order is used?
* For `volatile`, what access size / type is used? Do we want to make any promises?
* Are the loads and stores typed or untyped? (In particular, do we annotate with TBAA metadata?)
* Do we guarantee to copy the object representation or only the value representation? (Do we preserve the values of padding bits in the source, and initialize padding bits in the destination?)

You should also document whether constant evaluation of these builtins is supported.

jfb (Author, Done):

Most of these are answered in the update.

Some of the issue is that the current documentation is silent on these points already, by saying "same as C's `mem*` function". I'm relying on that approach here as well.

Size is bytes.

`memset` value is an `unsigned char`.

Memory order is unordered, and accesses themselves are done in indeterminate order.

For `volatile`, it falls out of the new wording that we don't provide access size guarantees. We'd need to nail down IR better to do so, and I don't think it's the salient property (though as discussed above, it might be useful, and the `element_size` parameter makes it easy to do so).

Same on TBAA, no mention because "same as C" (no TBAA annotations).

Same on copying bits as-is.

Good point on constant evaluation. I added support. Note that we don't have `memset` constant evaluation, so I didn't support it. Seems easy, but ought to be a separate patch.

rsmith (Done):

What happens if they're not? Is it UB, or is it just not guaranteed to be atomic?
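The object-representation question can be made concrete with ordinary `memcpy`, which the reply above says the new builtins follow ("copying bits as-is"). The sketch below copies a struct that typically contains padding and then compares all bytes, padding included. `Padded` and `copy_preserves_object_representation` are hypothetical names, and the presence of padding after `c` is an assumption about common ABIs rather than a guarantee.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical type: on common ABIs there are padding bytes between c and i.
struct Padded {
  char c;
  int i;
};

// Fills every byte of the source (padding included), sets the members, then
// copies the whole object and compares the two object representations.
inline bool copy_preserves_object_representation() {
  Padded src;
  Padded dst;
  std::memset(&src, 0xAB, sizeof src);
  src.c = 'x';
  src.i = 42;
  std::memset(&dst, 0, sizeof dst);
  std::memcpy(&dst, &src, sizeof dst);
  return std::memcmp(&dst, &src, sizeof dst) == 0 && dst.c == 'x' &&
         dst.i == 42;
}
```

A member-wise assignment `dst = src` would only be required to preserve the value representation, which is the distinction the question is drawing.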

There are two atomic builtins with min/max in-memory comparison and swap.

rsmith (Done):

"runtime constraint violation" is an odd phrase; in C, "constraint violation" means a diagnostic is required. Can we instead say that it results in undefined behavior?

The syntax and semantics are similar to GCC-compatible __atomic_* builtins.

* ``__atomic_fetch_min``

rsmith (Done):

*facilities

* ``__atomic_fetch_max``

rsmith (Done):

From the above description, I think the documentation is unclear what the types `T` and `U` are used for. I think the answer is something like:

"""
The types `T` and `U` are required to be trivially-copyable types, and `byte_element_size` (if specified) must be a multiple of the size of both types. `dst` and `src` are expected to be suitably aligned for `T` and `U` objects, respectively.
"""

But... we actually get the alignment information by analyzing pointer argument rather than from the types, just like we do for `memcpy` and `memmove`, so maybe the latter part is not right. (What did you intend regarding alignment for the non-atomic case?) The trivial-copyability and divisibility checks don't seem fundamentally important to the goal of the builtin, so I wonder if we could actually just use `void` here and remove the extra checks. (I don't really have strong views one way or the other on this, except that we should either document what `T` and `U` are used for or make the builtins not care about the pointee type beyond its qualifiers.)

jfb (Author, Done):

You're right. I've removed most treatment of `T` / `U`, and updated the documentation.

I left the trivial copy check, but `void*` is a usual escape hatch.

Divisibility is now only checked for `size` / `element_size`.

rsmith (Done):

Please document the trivial copy check.

jfb (Author, Done):

Should I bubble this to the rest of the builtin in a follow-up patch? I know there are cases where that'll cause issues, but I worry that it would be a pretty noisy diagnostic (especially if we bubble it to C's `memcpy` instead).

The builtins work with signed and unsigned integers and require to specify memory ordering.

rsmith (Done):

Mixing those qualifiers doesn't seem like it will work in many cases: we don't allow mixing `volatile` and `_Atomic` (though I'm not sure why; LLVM supports volatile atomic operations), and presumably we shouldn't allow mixing `__unaligned` and `_Atomic` (although I don't see any tests for that, and maybe we should just outright disallow combining `_Atomic` with `__unaligned` in general).

jfb (Author, Done):

`volatile` and `_Atomic` ought to work...

For this code I didn't make it work (even if it might be useful), because we'd need IR support for it.

On mixing `_Atomic` `__unaligned`: I left a FIXME because I'm not 100% sure, given the alignment discussion on atomic in general. Let's see where we settle: if we make it a pure runtime property then `__unaligned` ought to be fine because it's a constraint violation if the actual pointer is truly unaligned.

The return value is the original value that was stored in memory before comparison.

rsmith (Done):

and might be non-uniform throughout the operation.

- The overloaded builtins expect both destination and source to be trivially
- copyable types.
+ The overloaded builtins require both ``dst`` and ``src`` to be pointers to trivially copyable types or to ``void`` prior to conversion to the parameter type.

The builtins can be used as building blocks for different facilities:

Example:

.. code-block:: c

  unsigned int val = __atomic_fetch_min(unsigned int *pi, unsigned int ui, __ATOMIC_RELAXED);

The third argument is one of the memory ordering specifiers ``__ATOMIC_RELAXED``,


clang/include/clang/Basic/Builtins.def

BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
BUILTIN(__builtin_rotateleft32, "UZiUZiUZi", "nc")
BUILTIN(__builtin_rotateleft64, "UWiUWiUWi", "nc")
BUILTIN(__builtin_rotateright8, "UcUcUc", "nc")
BUILTIN(__builtin_rotateright16, "UsUsUs", "nc")
BUILTIN(__builtin_rotateright32, "UZiUZiUZi", "nc")
BUILTIN(__builtin_rotateright64, "UWiUWiUWi", "nc")

// Random GCC builtins
rsmith (Done):

Are these really GCC builtins?

jfb (Author, Done):

Oops, I didn't see that comment, was just copying `__builtin_memcpy_inline`. I'll move it too.

BUILTIN(__builtin_constant_p, "i.", "nctu")
BUILTIN(__builtin_classify_type, "i.", "nctu")
BUILTIN(__builtin___CFStringMakeConstantString, "FCcC", "nc")
BUILTIN(__builtin___NSStringMakeConstantString, "FCcC", "nc")
BUILTIN(__builtin_va_start, "vA.", "nt")
BUILTIN(__builtin_va_end, "vA", "n")
BUILTIN(__builtin_va_copy, "vAA", "n")
BUILTIN(__builtin_stdarg_start, "vA.", "nt")
BUILTIN(__builtin_assume_aligned, "vvCz.", "nc")
BUILTIN(__builtin_bcmp, "ivCvCz", "Fn")
BUILTIN(__builtin_bcopy, "vvvz", "n")
BUILTIN(__builtin_bzero, "vv*z", "nF")
BUILTIN(__builtin_fprintf, "iPcC.", "Fp:1:")
BUILTIN(__builtin_memchr, "vvCiz", "nF")
BUILTIN(__builtin_memcmp, "ivCvCz", "nF")
BUILTIN(__builtin_memcpy, "vvvC*z", "nF")
- BUILTIN(__builtin_memcpy_inline, "vvvCIz", "nt")
BUILTIN(__builtin_memmove, "vvvC*z", "nF")

gchatelet (Done):

`overloaded` doesn't convey much meaning (says the one who added `__builtin_memcpy_inline`...). Can you come up with something that describes more precisely what the intent is? Also `memset`, `memcmp`, `memcpy`, `memmove` will have their `inline` and `overloaded` versions. This is becoming a crowded space. It may be confusing in the long run. If we want to go in that direction maybe we should come up with a consistent pattern: `__builtin_<memfun>_<feature>`. WDYT?

jfb (Author, Done):

Flipping it around is fine with me, see update (done with `sed`). What's your approach on choosing what gets an `inline` variant and what doesn't? `memcmp` is easy to add, but I wonder how far it's useful to go... I can just wait for requests as well (as I imagine you're doing?).

gchatelet (Done):

I don't see `memmove_inline` being useful but `memset` and `memcmp` would make sense to add as building blocks for C++ implementations (e.g. libc `memcpy`).

As for this new addition, how about `__builtin_memcpy_honor_qualifiers`? I fear that `__builtin_memcpy_overloaded` is too ambiguous.

	BUILTIN(__builtin_mempcpy, "v*v*vC*z", "nF")			BUILTIN(__builtin_mempcpy, "v*v*vC*z", "nF")
	BUILTIN(__builtin_memset, "v*v*iz", "nF")			BUILTIN(__builtin_memset, "v*v*iz", "nF")
	BUILTIN(__builtin_printf, "icC*.", "Fp:0:")			BUILTIN(__builtin_printf, "icC*.", "Fp:0:")
	BUILTIN(__builtin_stpcpy, "c*c*cC*", "nF")			BUILTIN(__builtin_stpcpy, "c*c*cC*", "nF")
	BUILTIN(__builtin_stpncpy, "c*c*cC*z", "nF")			BUILTIN(__builtin_stpncpy, "c*c*cC*z", "nF")
	BUILTIN(__builtin_strcasecmp, "icC*cC*", "nF")			BUILTIN(__builtin_strcasecmp, "icC*cC*", "nF")
	BUILTIN(__builtin_strcat, "c*c*cC*", "nF")			BUILTIN(__builtin_strcat, "c*c*cC*", "nF")
	BUILTIN(__builtin_strchr, "c*cC*i", "nF")			BUILTIN(__builtin_strchr, "c*cC*i", "nF")
	▲ Show 20 Lines • Show All 978 Lines • ▼ Show 20 Lines
	BUILTIN(__builtin_saddll_overflow, "bSLLiCSLLiCSLLi*", "n")			BUILTIN(__builtin_saddll_overflow, "bSLLiCSLLiCSLLi*", "n")
	BUILTIN(__builtin_ssub_overflow, "bSiCSiCSi*", "n")			BUILTIN(__builtin_ssub_overflow, "bSiCSiCSi*", "n")
	BUILTIN(__builtin_ssubl_overflow, "bSLiCSLiCSLi*", "n")			BUILTIN(__builtin_ssubl_overflow, "bSLiCSLiCSLi*", "n")
	BUILTIN(__builtin_ssubll_overflow, "bSLLiCSLLiCSLLi*", "n")			BUILTIN(__builtin_ssubll_overflow, "bSLLiCSLLiCSLLi*", "n")
	BUILTIN(__builtin_smul_overflow, "bSiCSiCSi*", "n")			BUILTIN(__builtin_smul_overflow, "bSiCSiCSi*", "n")
	BUILTIN(__builtin_smull_overflow, "bSLiCSLiCSLi*", "n")			BUILTIN(__builtin_smull_overflow, "bSLiCSLiCSLi*", "n")
	BUILTIN(__builtin_smulll_overflow, "bSLLiCSLLiCSLLi*", "n")			BUILTIN(__builtin_smulll_overflow, "bSLLiCSLLiCSLLi*", "n")

	// Clang builtins (not available in GCC).			// Clang builtins (not available in GCC).
				rsmith: The new builtins probably belong in this section of the file instead.
	BUILTIN(__builtin_addressof, "v*v&", "nct")			BUILTIN(__builtin_addressof, "v*v&", "nct")
	BUILTIN(__builtin_operator_new, "v*z", "tc")			BUILTIN(__builtin_operator_new, "v*z", "tc")
	BUILTIN(__builtin_operator_delete, "vv*", "tn")			BUILTIN(__builtin_operator_delete, "vv*", "tn")
	BUILTIN(__builtin_char_memchr, "c*cC*iz", "n")			BUILTIN(__builtin_char_memchr, "c*cC*iz", "n")
	BUILTIN(__builtin_dump_struct, "ivC*v*", "tn")			BUILTIN(__builtin_dump_struct, "ivC*v*", "tn")
	BUILTIN(__builtin_preserve_access_index, "v.", "t")			BUILTIN(__builtin_preserve_access_index, "v.", "t")
				BUILTIN(__builtin_memcpy_inline, "vv*vC*Iz", "nt")
				BUILTIN(__builtin_memcpy_sized, "v*v*vC*zz", "nt")
				BUILTIN(__builtin_memmove_sized, "v*v*vC*zz", "nt")
				BUILTIN(__builtin_memset_sized, "v*v*izz", "nt")

	// Alignment builtins (uses custom parsing to support pointers and integers)			// Alignment builtins (uses custom parsing to support pointers and integers)
	BUILTIN(__builtin_is_aligned, "bvC*z", "nct")			BUILTIN(__builtin_is_aligned, "bvC*z", "nct")
	BUILTIN(__builtin_align_up, "v*vC*z", "nct")			BUILTIN(__builtin_align_up, "v*vC*z", "nct")
	BUILTIN(__builtin_align_down, "v*vC*z", "nct")			BUILTIN(__builtin_align_down, "v*vC*z", "nct")

	// Safestack builtins			// Safestack builtins
	BUILTIN(__builtin___get_unsafe_stack_start, "v*", "Fn")			BUILTIN(__builtin___get_unsafe_stack_start, "v*", "Fn")

clang/include/clang/Basic/DiagnosticASTKinds.td

	def note_constexpr_memcpy_overlap : Note<			def note_constexpr_memcpy_overlap : Note<
	"'%select{memcpy|wmemcpy}0' between overlapping memory regions">;			"'%select{memcpy|wmemcpy}0' between overlapping memory regions">;
	def note_constexpr_memcpy_unsupported : Note<			def note_constexpr_memcpy_unsupported : Note<
	"'%select{%select{memcpy|wmemcpy}1|%select{memmove|wmemmove}1}0' "			"'%select{%select{memcpy|wmemcpy}1|%select{memmove|wmemmove}1}0' "
	"not supported: %select{"			"not supported: %select{"
	"size to copy (%4) is not a multiple of size of element type %3 (%5)|"			"size to copy (%4) is not a multiple of size of element type %3 (%5)|"
	"source is not a contiguous array of at least %4 elements of type %3|"			"source is not a contiguous array of at least %4 elements of type %3|"
	"destination is not a contiguous array of at least %4 elements of type %3}2">;			"destination is not a contiguous array of at least %4 elements of type %3}2">;
				def note_constexpr_mem_sized_bad_size : Note<
				"size parameter is %0, expected a size that is evenly divisible by "
				"element size %1">;
	def note_constexpr_bit_cast_unsupported_type : Note<			def note_constexpr_bit_cast_unsupported_type : Note<
	"constexpr bit_cast involving type %0 is not yet supported">;			"constexpr bit_cast involving type %0 is not yet supported">;
	def note_constexpr_bit_cast_unsupported_bitfield : Note<			def note_constexpr_bit_cast_unsupported_bitfield : Note<
	"constexpr bit_cast involving bit-field is not yet supported">;			"constexpr bit_cast involving bit-field is not yet supported">;
	def note_constexpr_bit_cast_invalid_type : Note<			def note_constexpr_bit_cast_invalid_type : Note<
	"bit_cast %select{from|to}0 a %select{|type with a }1"			"bit_cast %select{from|to}0 a %select{|type with a }1"
	"%select{union|pointer|member pointer|volatile|reference}2 "			"%select{union|pointer|member pointer|volatile|reference}2 "
	"%select{type|member}1 is not allowed in a constant expression">;			"%select{type|member}1 is not allowed in a constant expression">;

clang/include/clang/Basic/DiagnosticSemaKinds.td

def err_arc_typecheck_convert_incompatible_pointer : Error<		def err_arc_typecheck_convert_incompatible_pointer : Error<
"incompatible pointer types passing retainable parameter of type %0"		"incompatible pointer types passing retainable parameter of type %0"
"to a CF function expecting %1 type">;		"to a CF function expecting %1 type">;

def err_builtin_fn_use : Error<"builtin functions must be directly called">;		def err_builtin_fn_use : Error<"builtin functions must be directly called">;

def warn_call_wrong_number_of_arguments : Warning<		def warn_call_wrong_number_of_arguments : Warning<
"too %select{few|many}0 arguments in call to %1">;		"too %select{few|many}0 arguments in call to %1">;
		def err_atomic_qualifier_invalid : Error<
		"parameter cannot have the _Atomic qualifier (%0 invalid)">;
def err_atomic_builtin_must_be_pointer : Error<		def err_atomic_builtin_must_be_pointer : Error<
"address argument to atomic builtin must be a pointer (%0 invalid)">;		"address argument to atomic builtin must be a pointer (%0 invalid)">;
def err_atomic_builtin_must_be_pointer_intptr : Error<		def err_atomic_builtin_must_be_pointer_intptr : Error<
"address argument to atomic builtin must be a pointer to integer or pointer"		"address argument to atomic builtin must be a pointer to integer or pointer"
" (%0 invalid)">;		" (%0 invalid)">;
def err_atomic_builtin_cannot_be_const : Error<		def err_atomic_builtin_cannot_be_const : Error<
"address argument to atomic builtin cannot be const-qualified (%0 invalid)">;		"address argument to atomic builtin cannot be const-qualified (%0 invalid)">;
def err_atomic_builtin_must_be_pointer_intfltptr : Error<		def err_atomic_builtin_must_be_pointer_intfltptr : Error<
Show All 11 Lines	def err_atomic_op_needs_atomic : Error<
"address argument to atomic operation must be a pointer to _Atomic "		"address argument to atomic operation must be a pointer to _Atomic "
"type (%0 invalid)">;		"type (%0 invalid)">;
def err_atomic_op_needs_non_const_atomic : Error<		def err_atomic_op_needs_non_const_atomic : Error<
"address argument to atomic operation must be a pointer to non-%select{const|constant}0 _Atomic "		"address argument to atomic operation must be a pointer to non-%select{const|constant}0 _Atomic "
"type (%1 invalid)">;		"type (%1 invalid)">;
def err_atomic_op_needs_non_const_pointer : Error<		def err_atomic_op_needs_non_const_pointer : Error<
"address argument to atomic operation must be a pointer to non-const "		"address argument to atomic operation must be a pointer to non-const "
"type (%0 invalid)">;		"type (%0 invalid)">;
def err_atomic_op_needs_trivial_copy : Error<		def err_atomic_op_needs_trivial_copy : Error<
"address argument to atomic operation must be a pointer to a "		"address argument to atomic operation must be a pointer to a "
"trivially-copyable type (%0 invalid)">;		"trivially-copyable type (%0 invalid)">;
rsmith: I'd prefer to keep this diagnostic separate, since it communicates more information than `err_argument_needs_trivial_copy` does: specifically that we need a trivial copy because we're performing an atomic operation.
def err_atomic_op_needs_atomic_int_or_ptr : Error<		def err_atomic_op_needs_atomic_int_or_ptr : Error<
"address argument to atomic operation must be a pointer to %select{|atomic }0"		"address argument to atomic operation must be a pointer to %select{|atomic }0"
"integer or pointer (%1 invalid)">;		"integer or pointer (%1 invalid)">;
def err_atomic_op_needs_atomic_int : Error<		def err_atomic_op_needs_atomic_int : Error<
"address argument to atomic operation must be a pointer to "		"address argument to atomic operation must be a pointer to "
"%select{|atomic }0integer (%1 invalid)">;		"%select{|atomic }0integer (%1 invalid)">;
def warn_atomic_op_has_invalid_memory_order : Warning<		def warn_atomic_op_has_invalid_memory_order : Warning<
"memory order argument to atomic operation is invalid">,		"memory order argument to atomic operation is invalid">,
▲ Show 20 Lines • Show All 953 Lines • ▼ Show 20 Lines

def warn_null_arg : Warning<		def warn_null_arg : Warning<
"null passed to a callee that requires a non-null argument">,		"null passed to a callee that requires a non-null argument">,
InGroup<NonNull>;		InGroup<NonNull>;
def warn_null_ret : Warning<		def warn_null_ret : Warning<
"null returned from %select{function|method}0 that requires a non-null return value">,		"null returned from %select{function|method}0 that requires a non-null return value">,
InGroup<NonNull>;		InGroup<NonNull>;

		def err_const_arg : Error<
		"argument must be non-const, got %0">;

		def err_sized_volatile_unsupported : Error<
		"specifying an access size for volatile memory operations is unsupported "
		"(%0 is volatile)">;
		def err_elsz_must_be_lock_free : Error<
		"element size must be a lock-free size, %0 exceeds %1 bytes">;
		rsmith: Given the new documentation, I would expect you don't need this any more.

		rjmccall: I don't know why you're adding a bunch of new diagnostics about `_Atomic`.
		jfb: Maybe the tests clarify this? Here's my rationale for the 3 new atomic diagnostics:
		* Don't support mixing `volatile` and `atomic`, because we'd need to add IR support for it. It might be useful, but as a follow-up.
		* Overloaded `memcpy` figures out the atomic operation size based on the element's own size. There's a destination and a source pointer, and we can't figure out the expected atomic operation size if they differ. It's likely an unintentional error to have different sizes when doing an atomic `memcpy`, so instead of figuring out the largest common matching size I figure it's better to diagnose.
		* Supporting non-lock-free sizes seems fraught with peril, since it's likely unintentional. It's certainly doable (loop call the runtime support), but it's unclear if we should take the lock just once over the entire loop, or once for load+store, or once for load and once for store. I don't see a point in supporting it.
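The element-wise atomic copy discussed in this thread can be pictured with a small standalone sketch. This is not code from the patch: `elementwise_atomic_copy` is a hypothetical helper, the element type is fixed to `std::uint32_t` for simplicity, and `std::memory_order_relaxed` stands in for LLVM's `unordered` ordering, which has no exact C++ equivalent. It only illustrates the idea that each element is loaded once and stored once with its own atomic access, and that non-lock-free element sizes are refused rather than emulated with a lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only. Copies `count` elements so that
// every element is read exactly once and written exactly once, each access
// being individually atomic. No ordering is implied between elements.
bool elementwise_atomic_copy(std::atomic<std::uint32_t> *dst,
                             const std::atomic<std::uint32_t> *src,
                             std::size_t count) {
  // Mirror the "lock-free sizes only" restriction from the discussion:
  // refuse to copy instead of falling back to a lock-based runtime path.
  std::atomic<std::uint32_t> probe(0);
  if (!probe.is_lock_free())
    return false;

  for (std::size_t i = 0; i < count; ++i) {
    std::uint32_t value = src[i].load(std::memory_order_relaxed);
    dst[i].store(value, std::memory_order_relaxed);
  }
  return true;
}
```

Because source and destination share one element type here, the question of mismatched element sizes does not arise; that is the situation the proposed diagnostic would reject.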
		rsmith: Please format these diagnostics consistently with the rest of the file: line break after `Error<`, wrap to 80 columns, don't leave blank lines between individual diagnostics in a group of related diagnostics.
def err_lifetimebound_no_object_param : Error<		def err_lifetimebound_no_object_param : Error<
"'lifetimebound' attribute cannot be applied; %select{static |non-}0member "		"'lifetimebound' attribute cannot be applied; %select{static |non-}0member "
"function has no implicit object parameter">;		"function has no implicit object parameter">;
def err_lifetimebound_ctor_dtor : Error<		def err_lifetimebound_ctor_dtor : Error<
		rsmith: Presumably the number of bytes need not be a compile-time constant? It's a bit weird to produce an error rather than a warning on a case that would be valid but (perhaps?) UB if the argument were non-constant.
		jfb: I commented below, indeed it seems like some of this ought to be relaxed.
"'lifetimebound' attribute cannot be applied to a "		"'lifetimebound' attribute cannot be applied to a "
"%select{constructor|destructor}0">;		"%select{constructor|destructor}0">;

// CHECK: returning address/reference of stack memory		// CHECK: returning address/reference of stack memory
def warn_ret_stack_addr_ref : Warning<		def warn_ret_stack_addr_ref : Warning<
"%select{address of|reference to}0 stack memory associated with "		"%select{address of|reference to}0 stack memory associated with "
"%select{local variable|parameter}2 %1 returned">,		"%select{local variable|parameter}2 %1 returned">,
InGroup<ReturnStackAddress>;		InGroup<ReturnStackAddress>;

clang/include/clang/Sema/Sema.h

private:		private:
bool SemaBuiltinLongjmp(CallExpr *TheCall);		bool SemaBuiltinLongjmp(CallExpr *TheCall);
bool SemaBuiltinSetjmp(CallExpr *TheCall);		bool SemaBuiltinSetjmp(CallExpr *TheCall);
ExprResult SemaBuiltinAtomicOverloaded(ExprResult TheCallResult);		ExprResult SemaBuiltinAtomicOverloaded(ExprResult TheCallResult);
ExprResult SemaBuiltinNontemporalOverloaded(ExprResult TheCallResult);		ExprResult SemaBuiltinNontemporalOverloaded(ExprResult TheCallResult);
ExprResult SemaAtomicOpsOverloaded(ExprResult TheCallResult,		ExprResult SemaAtomicOpsOverloaded(ExprResult TheCallResult,
AtomicExpr::AtomicOp Op);		AtomicExpr::AtomicOp Op);
ExprResult SemaBuiltinOperatorNewDeleteOverloaded(ExprResult TheCallResult,		ExprResult SemaBuiltinOperatorNewDeleteOverloaded(ExprResult TheCallResult,
bool IsDelete);		bool IsDelete);
		ExprResult SemaBuiltinMemcpySized(ExprResult TheCallResult);
		ExprResult SemaBuiltinMemsetSized(ExprResult TheCallResult);
bool SemaBuiltinConstantArg(CallExpr *TheCall, int ArgNum,		bool SemaBuiltinConstantArg(CallExpr *TheCall, int ArgNum,
llvm::APSInt &Result);		llvm::APSInt &Result);
bool SemaBuiltinConstantArgRange(CallExpr *TheCall, int ArgNum, int Low,		bool SemaBuiltinConstantArgRange(CallExpr *TheCall, int ArgNum, int Low,
int High, bool RangeIsError = true);		int High, bool RangeIsError = true);
bool SemaBuiltinConstantArgMultiple(CallExpr *TheCall, int ArgNum,		bool SemaBuiltinConstantArgMultiple(CallExpr *TheCall, int ArgNum,
unsigned Multiple);		unsigned Multiple);
bool SemaBuiltinConstantArgPower2(CallExpr *TheCall, int ArgNum);		bool SemaBuiltinConstantArgPower2(CallExpr *TheCall, int ArgNum);
bool SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum,		bool SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum,

clang/lib/AST/ExprConstant.cpp

if (Info.getLangOpts().CPlusPlus11)		if (Info.getLangOpts().CPlusPlus11)
Info.CCEDiag(E, diag::note_constexpr_invalid_function)		Info.CCEDiag(E, diag::note_constexpr_invalid_function)
<< /*isConstexpr*/0 << /*isConstructor*/0		<< /*isConstexpr*/0 << /*isConstructor*/0
<< (std::string("'") + Info.Ctx.BuiltinInfo.getName(BuiltinOp) + "'");		<< (std::string("'") + Info.Ctx.BuiltinInfo.getName(BuiltinOp) + "'");
else		else
Info.CCEDiag(E, diag::note_invalid_subexpr_in_const_expr);		Info.CCEDiag(E, diag::note_invalid_subexpr_in_const_expr);
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case Builtin::BI__builtin_memcpy:		case Builtin::BI__builtin_memcpy:
case Builtin::BI__builtin_memmove:		case Builtin::BI__builtin_memmove:
		case Builtin::BI__builtin_memcpy_sized:
		case Builtin::BI__builtin_memmove_sized:
case Builtin::BI__builtin_wmemcpy:		case Builtin::BI__builtin_wmemcpy:
case Builtin::BI__builtin_wmemmove: {		case Builtin::BI__builtin_wmemmove: {
bool WChar = BuiltinOp == Builtin::BIwmemcpy ||		bool WChar = BuiltinOp == Builtin::BIwmemcpy ||
BuiltinOp == Builtin::BIwmemmove ||		BuiltinOp == Builtin::BIwmemmove ||
BuiltinOp == Builtin::BI__builtin_wmemcpy ||		BuiltinOp == Builtin::BI__builtin_wmemcpy ||
BuiltinOp == Builtin::BI__builtin_wmemmove;		BuiltinOp == Builtin::BI__builtin_wmemmove;
bool Move = BuiltinOp == Builtin::BImemmove ||		bool Move = BuiltinOp == Builtin::BImemmove ||
BuiltinOp == Builtin::BIwmemmove ||		BuiltinOp == Builtin::BIwmemmove ||
BuiltinOp == Builtin::BI__builtin_memmove ||		BuiltinOp == Builtin::BI__builtin_memmove ||
		BuiltinOp == Builtin::BI__builtin_memmove_sized ||
BuiltinOp == Builtin::BI__builtin_wmemmove;		BuiltinOp == Builtin::BI__builtin_wmemmove;
		bool Sized = BuiltinOp == Builtin::BI__builtin_memcpy_sized ||
		BuiltinOp == Builtin::BI__builtin_memmove_sized;
		jfb: If we end up making alignment a runtime constraint, then I'll need to check it in consteval. Otherwise I don't think we need to check anything since Sema ought to have done all the required checks already.
		rsmith: I don't see how you can check the alignment at compile time given a `void*` argument. We presumably need to check that the element size (if given) divides the total size, assuming the outcome is UB if not.

// The result of mem* is the first argument.		// The result of mem* is the first argument.
if (!Visit(E->getArg(0)))		if (!Visit(E->getArg(0)))
return false;		return false;
LValue Dest = Result;		LValue Dest = Result;

LValue Src;		LValue Src;
if (!EvaluatePointer(E->getArg(1), Src, Info))		if (!EvaluatePointer(E->getArg(1), Src, Info))
Show All 37 Lines	if (T->isIncompleteType()) {
Info.FFDiag(E, diag::note_constexpr_memcpy_incomplete_type) << Move << T;		Info.FFDiag(E, diag::note_constexpr_memcpy_incomplete_type) << Move << T;
return false;		return false;
}		}
if (!T.isTriviallyCopyableType(Info.Ctx)) {		if (!T.isTriviallyCopyableType(Info.Ctx)) {
Info.FFDiag(E, diag::note_constexpr_memcpy_nontrivial) << Move << T;		Info.FFDiag(E, diag::note_constexpr_memcpy_nontrivial) << Move << T;
return false;		return false;
}		}

		if (Sized) {
		// mem*_sized functions have a 4th parameter which denotes atomic element
		// size in bytes. Constexpr interpretation doesn't care about atomicity,
		// but needs to check runtime constraints on size. We can't check the
		// alignment runtime constraints.
		rsmith: We could use the same logic we use in `__builtin_is_aligned` here. For any object whose value the constant evaluator can reason about, we should be able to compute at least a minimal alignment (though the actual runtime alignment might of course be greater).
		jfb: I think the runtime alignment is really the only thing that matters here. I played with constexpr checking based on what `__builtin_is_aligned` does, and it's not particularly useful IMO.
		APSInt ElSz;
		if (!EvaluateInteger(E->getArg(3), ElSz, Info))
		return false;
		if (N.urem(ElSz.getLimitedValue()) != 0) {
		rsmith: `getLimitedValue()` here seems unnecessary; `urem` can take an `APInt`.
		jfb: Their bitwidth doesn't always match, and that asserts out.
		Info.FFDiag(E, diag::note_constexpr_mem_sized_bad_size)
		<< (int)N.getLimitedValue() << (int)ElSz.getLimitedValue();
		rsmith: Consider using `toString` instead of truncating each `APSInt` to `uint64_t` then to `int`. The size might reliably fit into `uint64_t`, but I don't think we can assume that `int` is large enough.
		jfb: OK I updated 2 other places as well.
		return false;
		}
		}
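The size check in this hunk, together with the point about not narrowing sizes to `int`, can be sketched outside of Clang as follows. This is a simplified stand-in and not the evaluator code: `check_sized_copy` is a hypothetical name, plain `std::uint64_t` replaces `APInt`/`APSInt`, and the zero guard is an added assumption to keep the sketch well defined. The message text follows `note_constexpr_mem_sized_bad_size` above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the constexpr-evaluator check. Returns an empty
// string when `size` is a multiple of `element_size`; otherwise returns the
// diagnostic text. Both values are printed as full 64-bit decimals instead of
// being narrowed to int first.
std::string check_sized_copy(std::uint64_t size, std::uint64_t element_size) {
  // Assumption of this sketch: reject a zero element size up front so the
  // remainder below never divides by zero.
  if (element_size == 0)
    return "element size must be non-zero";

  if (size % element_size != 0)
    return "size parameter is " + std::to_string(size) +
           ", expected a size that is evenly divisible by element size " +
           std::to_string(element_size);

  return "";
}
```

A size such as 5000000000 does not fit in a 32-bit `int`, so formatting the wide value directly is what keeps the diagnostic accurate.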

// Figure out how many T's we're copying.		// Figure out how many T's we're copying.
uint64_t TSize = Info.Ctx.getTypeSizeInChars(T).getQuantity();		uint64_t TSize = Info.Ctx.getTypeSizeInChars(T).getQuantity();
if (!WChar) {		if (!WChar) {
uint64_t Remainder;		uint64_t Remainder;
llvm::APInt OrigN = N;		llvm::APInt OrigN = N;
llvm::APInt::udivrem(OrigN, TSize, N, Remainder);		llvm::APInt::udivrem(OrigN, TSize, N, Remainder);
if (Remainder) {		if (Remainder) {
Info.FFDiag(E, diag::note_constexpr_memcpy_unsupported)		Info.FFDiag(E, diag::note_constexpr_memcpy_unsupported)

clang/lib/CodeGen/CGBuilder.h

public:		public:
}		}
llvm::CallInst *CreateMemCpy(Address Dest, Address Src, uint64_t Size,		llvm::CallInst *CreateMemCpy(Address Dest, Address Src, uint64_t Size,
bool IsVolatile = false) {		bool IsVolatile = false) {
return CreateMemCpy(Dest.getPointer(), Dest.getAlignment().getAsAlign(),		return CreateMemCpy(Dest.getPointer(), Dest.getAlignment().getAsAlign(),
Src.getPointer(), Src.getAlignment().getAsAlign(), Size,		Src.getPointer(), Src.getAlignment().getAsAlign(), Size,
IsVolatile);		IsVolatile);
}		}

		using CGBuilderBaseTy::CreateElementUnorderedAtomicMemCpy;
		llvm::CallInst *CreateElementUnorderedAtomicMemCpy(Address Dest, Address Src,
		llvm::Value *Size,
		CharUnits ElementSize) {
		return CreateElementUnorderedAtomicMemCpy(
		Dest.getPointer(), Dest.getAlignment().getAsAlign(), Src.getPointer(),
		Src.getAlignment().getAsAlign(), Size, ElementSize.getQuantity());
		}

using CGBuilderBaseTy::CreateMemCpyInline;		using CGBuilderBaseTy::CreateMemCpyInline;
llvm::CallInst *CreateMemCpyInline(Address Dest, Address Src, uint64_t Size) {		llvm::CallInst *CreateMemCpyInline(Address Dest, Address Src, uint64_t Size) {
return CreateMemCpyInline(		return CreateMemCpyInline(
Dest.getPointer(), Dest.getAlignment().getAsAlign(), Src.getPointer(),		Dest.getPointer(), Dest.getAlignment().getAsAlign(), Src.getPointer(),
Src.getAlignment().getAsAlign(), getInt64(Size));		Src.getAlignment().getAsAlign(), getInt64(Size));
}		}

using CGBuilderBaseTy::CreateMemMove;		using CGBuilderBaseTy::CreateMemMove;
llvm::CallInst *CreateMemMove(Address Dest, Address Src, llvm::Value *Size,		llvm::CallInst *CreateMemMove(Address Dest, Address Src, llvm::Value *Size,
bool IsVolatile = false) {		bool IsVolatile = false) {
return CreateMemMove(Dest.getPointer(), Dest.getAlignment().getAsAlign(),		return CreateMemMove(Dest.getPointer(), Dest.getAlignment().getAsAlign(),
Src.getPointer(), Src.getAlignment().getAsAlign(),		Src.getPointer(), Src.getAlignment().getAsAlign(),
Size, IsVolatile);		Size, IsVolatile);
}		}

		using CGBuilderBaseTy::CreateElementUnorderedAtomicMemMove;
		llvm::CallInst *CreateElementUnorderedAtomicMemMove(Address Dest, Address Src,
		llvm::Value *Size,
		CharUnits ElementSize) {
		return CreateElementUnorderedAtomicMemMove(
		Dest.getPointer(), Dest.getAlignment().getAsAlign(), Src.getPointer(),
		Src.getAlignment().getAsAlign(), Size, ElementSize.getQuantity());
		}

using CGBuilderBaseTy::CreateMemSet;		using CGBuilderBaseTy::CreateMemSet;
llvm::CallInst *CreateMemSet(Address Dest, llvm::Value *Value,		llvm::CallInst *CreateMemSet(Address Dest, llvm::Value *Value,
llvm::Value *Size, bool IsVolatile = false) {		llvm::Value *Size, bool IsVolatile = false) {
return CreateMemSet(Dest.getPointer(), Value, Size,		return CreateMemSet(Dest.getPointer(), Value, Size,
Dest.getAlignment().getAsAlign(), IsVolatile);		Dest.getAlignment().getAsAlign(), IsVolatile);
}		}

		using CGBuilderBaseTy::CreateElementUnorderedAtomicMemSet;
		llvm::CallInst *CreateElementUnorderedAtomicMemSet(Address Dest,
		llvm::Value *Value,
		llvm::Value *Size,
		CharUnits ElementSize) {
		return CreateElementUnorderedAtomicMemSet(Dest.getPointer(), Value, Size,
		Dest.getAlignment().getAsAlign(),
		ElementSize.getQuantity());
		}

using CGBuilderBaseTy::CreatePreserveStructAccessIndex;		using CGBuilderBaseTy::CreatePreserveStructAccessIndex;
Address CreatePreserveStructAccessIndex(Address Addr,		Address CreatePreserveStructAccessIndex(Address Addr,
unsigned Index,		unsigned Index,
unsigned FieldIndex,		unsigned FieldIndex,
llvm::MDNode *DbgInfo) {		llvm::MDNode *DbgInfo) {
llvm::StructType *ElTy = cast<llvm::StructType>(Addr.getElementType());		llvm::StructType *ElTy = cast<llvm::StructType>(Addr.getElementType());
const llvm::DataLayout &DL = BB->getParent()->getParent()->getDataLayout();		const llvm::DataLayout &DL = BB->getParent()->getParent()->getDataLayout();
const llvm::StructLayout *Layout = DL.getStructLayout(ElTy);		const llvm::StructLayout *Layout = DL.getStructLayout(ElTy);

clang/lib/CodeGen/CGBuiltin.cpp

for (const auto &Type : Types) { for (const auto &Type : Types) {

if (Width < MinWidth) { if (Width < MinWidth) {

Width = MinWidth; Width = MinWidth;

} }

return {Width, Signed}; return {Width, Signed};

} }

static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E,

unsigned ArgNo) {

QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType();

if (ArgTy->isArrayType())

return CGM.getContext().getAsArrayType(ArgTy)->getElementType();

if (ArgTy->isObjCObjectPointerType())

return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType();

return ArgTy->castAs<clang::PointerType>()->getPointeeType();

rsmith: I'm not sure this `castAs` is necessarily correct. If the operand is C++11 `nullptr`, we could perform a null-to-pointer implicit conversion, and `ArgTy` could be `NullPtrTy` after stripping that back off here.

It seems like maybe what we want to do is strip off implicit conversions until we hit a non-pointer type, and take the pointee type we found immediately before that?

jfb: Ah good catch! The new functions I'm adding just disallow `nullptr`, but the older ones allow it. I've modified the code accordingly and added a test in CodeGen for `nullptr`.
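The concern about `nullptr` rests on a language fact that can be checked in isolation: `std::nullptr_t` is not a pointer type, so it has no pointee to extract. The sketch below uses only the standard library; `pointee_or_void` is a hypothetical trait written for this illustration and has no counterpart in Clang's AST API.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical trait, for illustration only: yields the pointee type for real
// pointer types, and `void` for everything else, including std::nullptr_t.
template <typename T> struct pointee_or_void {
  using type = void;
};
template <typename T> struct pointee_or_void<T *> {
  using type = T;
};

// The type of `nullptr` is not a pointer type, which is why an unconditional
// cast to a pointer type is unsafe once the implicit conversion is stripped.
static_assert(!std::is_pointer<std::nullptr_t>::value,
              "std::nullptr_t is not a pointer type");

// A real pointer keeps its pointee qualifiers, e.g. volatile.
static_assert(
    std::is_same<pointee_or_void<volatile int *>::type, volatile int>::value,
    "pointee of 'volatile int *' is 'volatile int'");

// nullptr_t falls through to the primary template instead of failing.
static_assert(std::is_same<pointee_or_void<std::nullptr_t>::type, void>::value,
              "nullptr_t has no pointee type");
```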

}

rjmccall: Since arrays are handled separately now, this is just `getPointeeType()`, but I don't know why you need to support ObjC object pointer types here at all.

jfb: I'll remove ObjC handling for now. I added it because of code like what's in:
clang/test/CodeGenObjC/builtin-memfns.m

// PR13697
void cpy1(int *a, id b) {
  // CHECK-LABEL: @cpy1(
  // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)
  memcpy(a, b, 8);
}

Should we support this? It seems to me like yes, but you seem to think otherwise?

On arrays / ObjC being handled now: that's not really true... or rather, it now is for the builtins I'm adding, but not for the previously existing builtins. We can't just get the pointer argument type for this code:

// <rdar://problem/11314941>
// Make sure we don't over-estimate the alignment of fields of
// packed structs.
struct PS {
  int modes[4];
} __attribute__((packed));
struct PS ps;
void test8(int *arg) {
  // CHECK: @test8
  // CHECK: call void @llvm.memcpy{{.*}} align 4 {{.*}} align 1 {{.*}} 16, i1 false)
  __builtin_memcpy(arg, ps.modes, sizeof(struct PS));
}

Because `__builtin_memcpy` doesn't perform the conversion. Arguably a pre-existing bug, which I can patch here as I have, or fix in Sema if you'd rather see that? LMK.
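The alignment gap behind the `align 4` / `align 1` CHECK line above can be reproduced with a few compile-time checks. This sketch assumes a GCC- or Clang-compatible compiler, since `__attribute__((packed))` is an extension; it reuses the `PS` struct from the quoted test and adds nothing from the patch itself.

```cpp
#include <cstddef>

// Same shape as the struct in the quoted test: packing drops the struct's
// alignment requirement to a single byte.
struct PS {
  int modes[4];
} __attribute__((packed));

// The object may live at any address, so copies from it can only assume
// byte alignment.
static_assert(alignof(PS) == 1, "packed struct is only byte-aligned");

// Packing does not change the size of this struct, only its alignment.
static_assert(sizeof(PS) == 4 * sizeof(int), "no padding inside PS");

// The element type on its own advertises a stricter alignment (typically 4),
// which is the over-estimate that a pointee-type-only lookup would produce.
static_assert(alignof(int) >= alignof(PS),
              "int is at least as strictly aligned as the packed struct");
```

Reading the alignment from the pointee type `int` would therefore claim more than the packed field guarantees, which is why the array lvalue has to be inspected before it decays to a pointer.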


Value *CodeGenFunction::EmitVAStartEnd(Value *ArgValue, bool IsStart) { Value *CodeGenFunction::EmitVAStartEnd(Value *ArgValue, bool IsStart) {

llvm::Type *DestType = Int8PtrTy; llvm::Type *DestType = Int8PtrTy;

if (ArgValue->getType() != DestType) if (ArgValue->getType() != DestType)

ArgValue = ArgValue =

Builder.CreateBitCast(ArgValue, DestType, ArgValue->getName().data()); Builder.CreateBitCast(ArgValue, DestType, ArgValue->getName().data());

rsmith (not done):

.isVolatileQualified();

- if (ArgTy->isObjCObjectPointerType())

- return ArgTy->castAs<clang::ObjCObjectPointerType>()

- ->getPointeeType()

- .isVolatileQualified();

- if (ArgTy->isPointerType())

- return ArgTy->castAs<clang::PointerType>()

- ->getPointeeType()

- .isVolatileQualified();

+ if (ArgTy->isAnyPointerType())

+ return ArgTy->getPointeeType().isVolatileQualified();

return false;

(Just a simplification, NFC.)


Intrinsic::ID inst = IsStart ? Intrinsic::vastart : Intrinsic::vaend; Intrinsic::ID inst = IsStart ? Intrinsic::vastart : Intrinsic::vaend;

return Builder.CreateCall(CGM.getIntrinsic(inst), ArgValue); return Builder.CreateCall(CGM.getIntrinsic(inst), ArgValue);

} }

/// Checks if using the result of __builtin_object_size(p, @p From) in place of /// Checks if using the result of __builtin_object_size(p, @p From) in place of

/// __builtin_object_size(p, @p To) is correct /// __builtin_object_size(p, @p To) is correct

static bool areBOSTypesCompatible(int From, int To) { static bool areBOSTypesCompatible(int From, int To) {

// Note: Our __builtin_object_size implementation currently treats Type=0 and // Note: Our __builtin_object_size implementation currently treats Type=0 and

▲ Show 20 Lines • Show All 540 Lines • ▼ Show 20 Lines struct CallObjCArcUse final : EHScopeStack::Cleanup {

void Emit(CodeGenFunction &CGF, Flags flags) override { void Emit(CodeGenFunction &CGF, Flags flags) override {

CGF.EmitARCIntrinsicUse(object); CGF.EmitARCIntrinsicUse(object);

} }

}; };

} }

Value *CodeGenFunction::EmitCheckedArgForBuiltin(const Expr *E, Value *CodeGenFunction::EmitCheckedArgForBuiltin(const Expr *E,

- BuiltinCheckKind Kind) {
+ BuiltinCheck Kind) {

- assert((Kind == BCK_CLZPassedZero || Kind == BCK_CTZPassedZero)
- && "Unsupported builtin check kind");
+ assert((Kind == BuiltinCheck::CLZPassedZero ||
+ Kind == BuiltinCheck::CTZPassedZero) &&
+ "Unsupported builtin check kind");

Value *ArgValue = EmitScalarExpr(E); Value *ArgValue = EmitScalarExpr(E);

if (!SanOpts.has(SanitizerKind::Builtin) || !getTarget().isCLZForZeroUndef()) if (!SanOpts.has(SanitizerKind::Builtin) || !getTarget().isCLZForZeroUndef())

return ArgValue; return ArgValue;

SanitizerScope SanScope(this); SanitizerScope SanScope(this);

Value *Cond = Builder.CreateICmpNE( Value *Cond = Builder.CreateICmpNE(

ArgValue, llvm::Constant::getNullValue(ArgValue->getType())); ArgValue, llvm::Constant::getNullValue(ArgValue->getType()));

EmitCheck(std::make_pair(Cond, SanitizerKind::Builtin), EmitCheck(std::make_pair(Cond, SanitizerKind::Builtin),

SanitizerHandler::InvalidBuiltin, SanitizerHandler::InvalidBuiltin,

{EmitCheckSourceLocation(E->getExprLoc()), {EmitCheckSourceLocation(E->getExprLoc()),

- llvm::ConstantInt::get(Builder.getInt8Ty(), Kind)},
+ llvm::ConstantInt::get(Builder.getInt8Ty(), (int)Kind)},

None); None);

return ArgValue; return ArgValue;

} }
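The check emitted by `EmitCheckedArgForBuiltin` amounts to "the argument must not be zero", since counting zeros in a zero input is undefined on targets where `isCLZForZeroUndef()` is true. A minimal source-level sketch of the same precondition, assuming a GCC- or Clang-compatible compiler for `__builtin_ctz`; the helper name `checked_ctz` is hypothetical and is not part of this patch:

```cpp
// Hypothetical helper (not from the patch): mirrors the "argument != 0"
// precondition that the sanitizer check enforces for ctz/clz.
// Returns true and stores the result when `value` is non-zero;
// returns false and leaves `count` untouched otherwise.
inline bool checked_ctz(unsigned value, int &count) {
  if (value == 0)
    return false;                // precondition violated: report, do not compute
  count = __builtin_ctz(value);  // GCC/Clang builtin; undefined for a zero input
  return true;
}
```

A caller would test the return value before using `count`, which is roughly the decision UBSan makes at run time before the intrinsic executes.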

/// Get the argument type for arguments to os_log_helper. /// Get the argument type for arguments to os_log_helper.

static CanQualType getOSLogArgType(ASTContext &C, int Size) { static CanQualType getOSLogArgType(ASTContext &C, int Size) {

QualType UnsignedTy = C.getIntTypeForBitwidth(Size * 8, /*Signed=*/false); QualType UnsignedTy = C.getIntTypeForBitwidth(Size * 8, /*Signed=*/false);

return C.getCanonicalType(UnsignedTy); return C.getCanonicalType(UnsignedTy);

▲ Show 20 Lines • Show All 290 Lines • ▼ Show 20 Lines EmitCheckedMixedSignMultiply(CodeGenFunction &CGF, const clang::Expr *Op1,

bool isVolatile = bool isVolatile =

ResultArg->getType()->getPointeeType().isVolatileQualified(); ResultArg->getType()->getPointeeType().isVolatileQualified();

CGF.Builder.CreateStore(CGF.EmitToMemory(Result, ResultQTy), ResultPtr, CGF.Builder.CreateStore(CGF.EmitToMemory(Result, ResultQTy), ResultPtr,

isVolatile); isVolatile);

return RValue::get(Overflow); return RValue::get(Overflow);

} }

static void EmitAtomicMemUBSanCheck(CodeGenFunction &CGF, unsigned BuiltinID,

const CallExpr *Call, Value *Dst,

Value *Src, Value *Size, CharUnits ElSz) {

if (!CGF.SanOpts.has(SanitizerKind::Builtin))

return;

CodeGenFunction::SanitizerScope SanScope(&CGF);

unsigned PtrBits = CGF.IntPtrTy->getIntegerBitWidth();

auto *ElSzI32 = llvm::Constant::getIntegerValue(

CGF.IntPtrTy, APInt(32, ElSz.getQuantity()));

auto *ElSzIPtr = llvm::Constant::getIntegerValue(

CGF.IntPtrTy, APInt(PtrBits, ElSz.getQuantity()));

auto *AlignMask = llvm::Constant::getIntegerValue(

CGF.IntPtrTy, APInt(PtrBits, ElSz.getQuantity() - 1));

auto *Zero = llvm::Constant::getNullValue(CGF.IntPtrTy);

auto *MisalignedFlag = llvm::ConstantInt::get(

CGF.Builder.getInt8Ty(),

(int)CodeGenFunction::BuiltinCheck::AtomicMemMisaligned);

auto *SizeFlag = llvm::ConstantInt::get(

CGF.Builder.getInt8Ty(),

(int)CodeGenFunction::BuiltinCheck::AtomicMemMismatchedSize);

// ((uintptr_t)Dst & (ElSz - 1)) == 0

auto *DstOK = CGF.Builder.CreateICmpEQ(

CGF.Builder.CreateAnd(CGF.Builder.CreatePtrToInt(Dst, CGF.IntPtrTy),

AlignMask),

Zero);

CGF.EmitCheck(std::make_pair(DstOK, SanitizerKind::Builtin),

SanitizerHandler::InvalidBuiltin,

{CGF.EmitCheckSourceLocation(Call->getArg(0)->getExprLoc()),

MisalignedFlag, ElSzI32},

{Dst});

// ((uintptr_t)Src & (ElSz - 1)) == 0

switch (BuiltinID) {

case Builtin::BI__builtin_memcpy_sized:

case Builtin::BI__builtin_memmove_sized: {

auto *SrcOK = CGF.Builder.CreateICmpEQ(

CGF.Builder.CreateAnd(CGF.Builder.CreatePtrToInt(Src, CGF.IntPtrTy),

AlignMask),

Zero);

CGF.EmitCheck(std::make_pair(SrcOK, SanitizerKind::Builtin),

SanitizerHandler::InvalidBuiltin,

{CGF.EmitCheckSourceLocation(Call->getArg(1)->getExprLoc()),

MisalignedFlag, ElSzI32},

{Src});

break;

}

case Builtin::BI__builtin_memset_sized:

// No source buffer on memset.

break;

default:

llvm_unreachable("unknown atomic mem builtin");

}

// (Size % ElSz) == 0

auto *SizeRem = CGF.Builder.CreateURem(Size, ElSzIPtr);

auto *SizeOK = CGF.Builder.CreateICmpEQ(SizeRem, Zero);

CGF.EmitCheck(std::make_pair(SizeOK, SanitizerKind::Builtin),

SanitizerHandler::InvalidBuiltin,

{CGF.EmitCheckSourceLocation(Call->getArg(2)->getExprLoc()),

SizeFlag, ElSzI32},

{Size});

}
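The three conditions that `EmitAtomicMemUBSanCheck` emits can be restated in ordinary C++. This is only a sketch of the arithmetic in the comments above (`(ptr & (ElSz - 1)) == 0` and `Size % ElSz == 0`); the function names are hypothetical, and the mask trick assumes the element size is a power of two:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: ((uintptr_t)addr & (element_size - 1)) == 0.
// Assumes element_size is a power of two.
inline bool is_aligned_address(std::uintptr_t addr, std::size_t element_size) {
  return (addr & (element_size - 1)) == 0;
}

// Hypothetical helper: (size % element_size) == 0.
inline bool is_whole_number_of_elements(std::size_t size,
                                        std::size_t element_size) {
  return size % element_size == 0;
}

// Combines the destination, source, and size checks for the copy and move
// cases. A memset-style call has no source buffer, so it would skip the
// source check, as the switch above does.
inline bool sized_copy_preconditions_hold(std::uintptr_t dst,
                                          std::uintptr_t src,
                                          std::size_t size,
                                          std::size_t element_size) {
  return is_aligned_address(dst, element_size) &&
         is_aligned_address(src, element_size) &&
         is_whole_number_of_elements(size, element_size);
}
```

For example, a 16-byte copy with 4-byte elements passes when both addresses are multiples of 4 and fails if either address is off by 2.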

static llvm::Value *dumpRecord(CodeGenFunction &CGF, QualType RType, static llvm::Value *dumpRecord(CodeGenFunction &CGF, QualType RType,

Value *&RecordPtr, CharUnits Align, Value *&RecordPtr, CharUnits Align,

llvm::FunctionCallee Func, int Lvl) { llvm::FunctionCallee Func, int Lvl) {

ASTContext &Context = CGF.getContext(); ASTContext &Context = CGF.getContext();

RecordDecl *RD = RType->castAs<RecordType>()->getDecl()->getDefinition(); RecordDecl *RD = RType->castAs<RecordType>()->getDecl()->getDefinition();

std::string Pad = std::string(Lvl * 4, ' '); std::string Pad = std::string(Lvl * 4, ' ');

Value *GString = Value *GString =

▲ Show 20 Lines • Show All 553 Lines • ▼ Show 20 Lines case Builtin::BI__builtin_clrsbll: {

Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true, Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,

"cast"); "cast");

return RValue::get(Result); return RValue::get(Result);

} }

case Builtin::BI__builtin_ctzs: case Builtin::BI__builtin_ctzs:

case Builtin::BI__builtin_ctz: case Builtin::BI__builtin_ctz:

case Builtin::BI__builtin_ctzl: case Builtin::BI__builtin_ctzl:

case Builtin::BI__builtin_ctzll: { case Builtin::BI__builtin_ctzll: {

- Value *ArgValue = EmitCheckedArgForBuiltin(E->getArg(0), BCK_CTZPassedZero);
+ Value *ArgValue =
+ EmitCheckedArgForBuiltin(E->getArg(0), BuiltinCheck::CTZPassedZero);

llvm::Type *ArgType = ArgValue->getType(); llvm::Type *ArgType = ArgValue->getType();

Function *F = CGM.getIntrinsic(Intrinsic::cttz, ArgType); Function *F = CGM.getIntrinsic(Intrinsic::cttz, ArgType);

llvm::Type *ResultType = ConvertType(E->getType()); llvm::Type *ResultType = ConvertType(E->getType());

Value *ZeroUndef = Builder.getInt1(getTarget().isCLZForZeroUndef()); Value *ZeroUndef = Builder.getInt1(getTarget().isCLZForZeroUndef());

Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef}); Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});

if (Result->getType() != ResultType) if (Result->getType() != ResultType)

Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true, Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,

"cast"); "cast");

return RValue::get(Result); return RValue::get(Result);

} }

case Builtin::BI__builtin_clzs: case Builtin::BI__builtin_clzs:

case Builtin::BI__builtin_clz: case Builtin::BI__builtin_clz:

case Builtin::BI__builtin_clzl: case Builtin::BI__builtin_clzl:

case Builtin::BI__builtin_clzll: { case Builtin::BI__builtin_clzll: {

- Value *ArgValue = EmitCheckedArgForBuiltin(E->getArg(0), BCK_CLZPassedZero);
+ Value *ArgValue =
+ EmitCheckedArgForBuiltin(E->getArg(0), BuiltinCheck::CLZPassedZero);

llvm::Type *ArgType = ArgValue->getType(); llvm::Type *ArgType = ArgValue->getType();

Function *F = CGM.getIntrinsic(Intrinsic::ctlz, ArgType); Function *F = CGM.getIntrinsic(Intrinsic::ctlz, ArgType);

llvm::Type *ResultType = ConvertType(E->getType()); llvm::Type *ResultType = ConvertType(E->getType());

Value *ZeroUndef = Builder.getInt1(getTarget().isCLZForZeroUndef()); Value *ZeroUndef = Builder.getInt1(getTarget().isCLZForZeroUndef());

Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef}); Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});

if (Result->getType() != ResultType) if (Result->getType() != ResultType)

▲ Show 20 Lines • Show All 512 Lines • ▼ Show 20 Lines case Builtin::BI__builtin_bzero: {

Value *SizeVal = EmitScalarExpr(E->getArg(1)); Value *SizeVal = EmitScalarExpr(E->getArg(1));

EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(), EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(),

E->getArg(0)->getExprLoc(), FD, 0); E->getArg(0)->getExprLoc(), FD, 0);

Builder.CreateMemSet(Dest, Builder.getInt8(0), SizeVal, false); Builder.CreateMemSet(Dest, Builder.getInt8(0), SizeVal, false);

return RValue::get(nullptr); return RValue::get(nullptr);

} }

case Builtin::BImemcpy: case Builtin::BImemcpy:

case Builtin::BI__builtin_memcpy: case Builtin::BI__builtin_memcpy:

case Builtin::BI__builtin_memcpy_sized:

case Builtin::BImempcpy: case Builtin::BImempcpy:

case Builtin::BI__builtin_mempcpy: { case Builtin::BI__builtin_mempcpy: {

QualType DestTy = getPtrArgType(CGM, E, 0);

QualType SrcTy = getPtrArgType(CGM, E, 1);

rsmith (Unsubmitted)

Done

Looking through implicit conversions in getPtrArgType here will change the code we generate for cases like:

void f(volatile void *p, volatile void *q) {
  memcpy(p, q, 4);
}

... (in C, where we permit such implicit conversions) to use a volatile memcpy intrinsic. Is that an intentional change?


jfb (Author, Unsubmitted)

Done

I'm confused... what's the difference that this makes for the pre-existing builtins? My intent was to get the QualType unconditionally, but I can conditionalize it if needed... However this ought to make no difference:

static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E,
                              unsigned ArgNo) {
  QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType();
  if (ArgTy->isArrayType())
    return CGM.getContext().getAsArrayType(ArgTy)->getElementType();
  if (ArgTy->isObjCObjectPointerType())
    return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType();
  return ArgTy->castAs<clang::PointerType>()->getPointeeType();
}

and indeed I can't see the example you provided change in IR from one to the other. The issue I'm working around is that getting it unconditionally would make ObjC code sad when id is passed in as I outlined above.


rsmith (Unsubmitted)

Done

The example I gave should produce a non-volatile memcpy, and used to do so (we passed false as the fourth parameter to CreateMemCpy). With this patch, getPtrArgType will strip off the implicit conversion from volatile void* to void* in the argument type, so isVolatile below will be true, so I think it will now create a volatile memcpy for the same testcase. If that's not what's happening, then I'd like to understand why not :)

I'm not saying that's necessarily a bad change, but it is a change, and it's one we should make intentionally if we make it at all.
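As a rough analogy for the change described here (not Clang's implementation): before the patch the decision ignored the argument's own type, whereas `getPtrArgType` now looks at the type of the argument as written, before any implicit conversion. In C++ a template that deduces the argument type sees the same information. The names `points_to_volatile` and `copy_would_be_volatile` are hypothetical:

```cpp
#include <type_traits>

// Hypothetical trait: true when T is a pointer type whose pointee is
// volatile-qualified (for example `volatile void *`).
template <typename T>
struct points_to_volatile
    : std::is_volatile<typename std::remove_pointer<T>::type> {};

// Hypothetical stand-in for the `isVolatile` computation in the patch:
// the copy is treated as volatile when either argument, as written at the
// call site, points to volatile-qualified memory.
template <typename Dst, typename Src>
constexpr bool copy_would_be_volatile(Dst, Src) {
  return points_to_volatile<Dst>::value || points_to_volatile<Src>::value;
}
```

Under this model, passing a `volatile void *` destination yields `true` even if the callee's declared parameter type is plain `void *`, which is the behavior change being discussed for the C case.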


jfb (Author, Unsubmitted)

Done

Oh yes, sorry I thought you were talking about something that getPtrArgType did implicitly! Indeed the C code behaves differently in that it doesn't just strip volatile anymore.

I'm not super thrilled by the default C behavior, and I think this new behavior removes a gotcha, and is in fact what I was going for in the first iteration of the patch. Now that I've separated the builtin I agree that it's a bit odd... but it seems like the right thing to do anyways? But it no longer matches the C library function to do so.

FWIW, this currently "works as you'd expect":

void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}

https://godbolt.org/z/dcWxcK

and I think that's completely accidental because the C library function doesn't (and, as John pointed out earlier, the builtin is meant to act like the C function).


rsmith (Unsubmitted)

Done

FWIW, this currently "works as you'd expect":

void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}

The same is true even if you remove the __builtin_ (and add a suitable include), and that seems like a bug to me. It looks like we have special logic that treats *all* builtins taking pointers as being overloaded on address space, which seems wrong at least for lib builtins. C TR 18037:2008 is quite explicit about this. Section 5.1.4 says:

"""
The standard C library (ISO/IEC 9899:1999 clause 7 - Libraries) is unchanged; the library's
functions and objects continue to be declared only with regard to the generic address space. One
consequence is that pointers into named address spaces cannot be passed as arguments to library
functions except in the special case that the named address spaces are subsets of the generic
address space.
"""

We could retain that special rule for __builtin_-spelled variants of lib builtins. If we do, then maybe we shouldn't be adding __builtin_memcpy_overloaded at all and should only extend the behavior of __builtin_memcpy to also propagate volatility (and add a new builtin for the atomic case).

Regarding volatile, consider:

void maybe_volatile_memcpy(volatile void *dst, const volatile void *src, int sz, _Bool is_volatile) {
  if (is_volatile) {
#ifdef __clang__
    __builtin_memcpy_overloaded(dst, src, sz);
#elif __GNUC__
    // ...
#else
    // volatile char copy loop
#endif
  }
  memcpy(dst, src, sz);
}

With this patch, the above code will always perform a volatile memcpy. I think that's a surprise. A call to memcpy should follow the C semantics, even if we choose to change the semantics of __builtin_memcpy.


jfb (Author, Unsubmitted)

Done

Yes, I believe that this is a pre-existing inconsistency with the non-__builtin_ variants.


Address Dest = EmitPointerWithAlignment(E->getArg(0)); Address Dest = EmitPointerWithAlignment(E->getArg(0));

Address Src = EmitPointerWithAlignment(E->getArg(1)); Address Src = EmitPointerWithAlignment(E->getArg(1));

bool isVolatile =

DestTy.isVolatileQualified() || SrcTy.isVolatileQualified();

bool isAtomic = BuiltinID == Builtin::BI__builtin_memcpy_sized;

Value *SizeVal = EmitScalarExpr(E->getArg(2)); Value *SizeVal = EmitScalarExpr(E->getArg(2));

EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(), EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(),

E->getArg(0)->getExprLoc(), FD, 0); E->getArg(0)->getExprLoc(), FD, 0);

EmitNonNullArgCheck(RValue::get(Src.getPointer()), E->getArg(1)->getType(), EmitNonNullArgCheck(RValue::get(Src.getPointer()), E->getArg(1)->getType(),

E->getArg(1)->getExprLoc(), FD, 1); E->getArg(1)->getExprLoc(), FD, 1);

- Builder.CreateMemCpy(Dest, Src, SizeVal, false);
+ if (isAtomic) {

auto ElSz =

CharUnits::fromQuantity(E->getArg(3)

->getIntegerConstantExpr(CGM.getContext())

->getLimitedValue());

EmitAtomicMemUBSanCheck(*this, BuiltinID, E, Dest.getPointer(),

Src.getPointer(), SizeVal, ElSz);

// Element unordered atomic memcpy requires aligned pointers. That's also

// a precondition of this builtin, which we optionally check with UBSan

// and then assume with the following adjustments.

if (Dest.getAlignment() < ElSz)

Dest = Address(Dest.getPointer(), ElSz);

if (Src.getAlignment() < ElSz)

Src = Address(Src.getPointer(), ElSz);

Builder.CreateElementUnorderedAtomicMemCpy(Dest, Src, SizeVal, ElSz);

} else

Builder.CreateMemCpy(Dest, Src, SizeVal, isVolatile);

if (BuiltinID == Builtin::BImempcpy || if (BuiltinID == Builtin::BImempcpy ||

BuiltinID == Builtin::BI__builtin_mempcpy) BuiltinID == Builtin::BI__builtin_mempcpy)

return RValue::get(Builder.CreateInBoundsGEP(Dest.getPointer(), SizeVal)); return RValue::get(Builder.CreateInBoundsGEP(Dest.getPointer(), SizeVal));

else else

return RValue::get(Dest.getPointer()); return RValue::get(Dest.getPointer());

} }

case Builtin::BI__builtin_memcpy_inline: { case Builtin::BI__builtin_memcpy_inline: {

▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines case Builtin::BI__builtin___memmove_chk: {

Address Dest = EmitPointerWithAlignment(E->getArg(0)); Address Dest = EmitPointerWithAlignment(E->getArg(0));

Address Src = EmitPointerWithAlignment(E->getArg(1)); Address Src = EmitPointerWithAlignment(E->getArg(1));

Value *SizeVal = llvm::ConstantInt::get(Builder.getContext(), Size); Value *SizeVal = llvm::ConstantInt::get(Builder.getContext(), Size);

Builder.CreateMemMove(Dest, Src, SizeVal, false); Builder.CreateMemMove(Dest, Src, SizeVal, false);

return RValue::get(Dest.getPointer()); return RValue::get(Dest.getPointer());

} }

case Builtin::BImemmove: case Builtin::BImemmove:

- case Builtin::BI__builtin_memmove: {
+ case Builtin::BI__builtin_memmove:

case Builtin::BI__builtin_memmove_sized: {

QualType DestTy = getPtrArgType(CGM, E, 0);

QualType SrcTy = getPtrArgType(CGM, E, 1);

Address Dest = EmitPointerWithAlignment(E->getArg(0)); Address Dest = EmitPointerWithAlignment(E->getArg(0));

Address Src = EmitPointerWithAlignment(E->getArg(1)); Address Src = EmitPointerWithAlignment(E->getArg(1));

bool isVolatile =

DestTy.isVolatileQualified() || SrcTy.isVolatileQualified();

bool isAtomic = BuiltinID == Builtin::BI__builtin_memmove_sized;

Value *SizeVal = EmitScalarExpr(E->getArg(2)); Value *SizeVal = EmitScalarExpr(E->getArg(2));

EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(), EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(),

E->getArg(0)->getExprLoc(), FD, 0); E->getArg(0)->getExprLoc(), FD, 0);

EmitNonNullArgCheck(RValue::get(Src.getPointer()), E->getArg(1)->getType(), EmitNonNullArgCheck(RValue::get(Src.getPointer()), E->getArg(1)->getType(),

E->getArg(1)->getExprLoc(), FD, 1); E->getArg(1)->getExprLoc(), FD, 1);

- Builder.CreateMemMove(Dest, Src, SizeVal, false);
+ if (isAtomic) {

auto ElSz =

CharUnits::fromQuantity(E->getArg(3)

->getIntegerConstantExpr(CGM.getContext())

->getLimitedValue());

EmitAtomicMemUBSanCheck(*this, BuiltinID, E, Dest.getPointer(),

Src.getPointer(), SizeVal, ElSz);

// Element unordered atomic memcpy requires aligned pointers. That's also

// a precondition of this builtin, which we optionally check with UBSan

// and then assume with the following adjustments.

if (Dest.getAlignment() < ElSz)

Dest = Address(Dest.getPointer(), ElSz);

if (Src.getAlignment() < ElSz)

Src = Address(Src.getPointer(), ElSz);

Builder.CreateElementUnorderedAtomicMemMove(Dest, Src, SizeVal, ElSz);

} else

Builder.CreateMemMove(Dest, Src, SizeVal, isVolatile);

return RValue::get(Dest.getPointer()); return RValue::get(Dest.getPointer());

} }

case Builtin::BImemset: case Builtin::BImemset:

- case Builtin::BI__builtin_memset: {
+ case Builtin::BI__builtin_memset:

case Builtin::BI__builtin_memset_sized: {

QualType DestTy = getPtrArgType(CGM, E, 0);

Address Dest = EmitPointerWithAlignment(E->getArg(0)); Address Dest = EmitPointerWithAlignment(E->getArg(0));

Value *ByteVal = Builder.CreateTrunc(EmitScalarExpr(E->getArg(1)), Value *ByteVal = Builder.CreateTrunc(EmitScalarExpr(E->getArg(1)),

Builder.getInt8Ty()); Builder.getInt8Ty());

bool isVolatile = DestTy.isVolatileQualified();

bool isAtomic = BuiltinID == Builtin::BI__builtin_memset_sized;

Value *SizeVal = EmitScalarExpr(E->getArg(2)); Value *SizeVal = EmitScalarExpr(E->getArg(2));

EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(), EmitNonNullArgCheck(RValue::get(Dest.getPointer()), E->getArg(0)->getType(),

E->getArg(0)->getExprLoc(), FD, 0); E->getArg(0)->getExprLoc(), FD, 0);

- Builder.CreateMemSet(Dest, ByteVal, SizeVal, false);
+ if (isAtomic) {

auto ElSz =

CharUnits::fromQuantity(E->getArg(3)

->getIntegerConstantExpr(CGM.getContext())

->getLimitedValue());

EmitAtomicMemUBSanCheck(*this, BuiltinID, E, Dest.getPointer(),

/*Src=*/nullptr, SizeVal, ElSz);

// Element unordered atomic memcpy requires aligned pointers. That's also

// a precondition of this builtin, which we optionally check with UBSan

// and then assume with the following adjustments.

if (Dest.getAlignment() < ElSz)

Dest = Address(Dest.getPointer(), ElSz);

Builder.CreateElementUnorderedAtomicMemSet(Dest, ByteVal, SizeVal, ElSz);

} else

Builder.CreateMemSet(Dest, ByteVal, SizeVal, isVolatile);

return RValue::get(Dest.getPointer()); return RValue::get(Dest.getPointer());

} }

case Builtin::BI__builtin___memset_chk: { case Builtin::BI__builtin___memset_chk: {

// fold __builtin_memset_chk(x, y, cst1, cst2) to memset iff cst1<=cst2. // fold __builtin_memset_chk(x, y, cst1, cst2) to memset iff cst1<=cst2.

Expr::EvalResult SizeResult, DstSizeResult; Expr::EvalResult SizeResult, DstSizeResult;

if (!E->getArg(2)->EvaluateAsInt(SizeResult, CGM.getContext()) || if (!E->getArg(2)->EvaluateAsInt(SizeResult, CGM.getContext()) ||

!E->getArg(3)->EvaluateAsInt(DstSizeResult, CGM.getContext())) !E->getArg(3)->EvaluateAsInt(DstSizeResult, CGM.getContext()))

break; break;

▲ Show 20 Lines • Show All 14,043 Lines • Show Last 20 Lines
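For readers unfamiliar with element-wise atomic memory intrinsics, the `_sized` paths in the hunks above copy memory in units of `ElSz` bytes, with each unit transferred as a single atomic access. A minimal sketch of that idea for 4-byte elements, assuming a GCC- or Clang-compatible compiler for the `__atomic_*` builtins. The function name is hypothetical, and relaxed ordering is used here only because C++ has no direct spelling of LLVM's weaker "unordered" ordering:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch (not the patch's lowering): copy `byte_count` bytes as
// a sequence of 4-byte elements, each loaded and stored with one atomic
// access. Assumes both pointers are 4-byte aligned and that byte_count is a
// multiple of 4, which are the preconditions the UBSan check verifies.
inline void copy_u32_elements_atomically(std::uint32_t *dst,
                                         const std::uint32_t *src,
                                         std::size_t byte_count) {
  const std::size_t element_count = byte_count / sizeof(std::uint32_t);
  for (std::size_t i = 0; i < element_count; ++i) {
    // Each element is read once and written once; no element is torn.
    const std::uint32_t value = __atomic_load_n(&src[i], __ATOMIC_RELAXED);
    __atomic_store_n(&dst[i], value, __ATOMIC_RELAXED);
  }
}
```

The copy as a whole is not atomic; only the individual element accesses are, which is why the alignment and size preconditions matter.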

clang/lib/CodeGen/CodeGenFunction.h

Show First 20 Lines • Show All 4,378 Lines • ▼ Show 20 Lines	llvm::Value *EmitCheckedInBoundsGEP(llvm::Value *Ptr,
ArrayRef<llvm::Value *> IdxList,		ArrayRef<llvm::Value *> IdxList,
bool SignedIndices,		bool SignedIndices,
bool IsSubtraction,		bool IsSubtraction,
SourceLocation Loc,		SourceLocation Loc,
const Twine &Name = "");		const Twine &Name = "");

/// Specifies which type of sanitizer check to apply when handling a		/// Specifies which type of sanitizer check to apply when handling a
/// particular builtin.		/// particular builtin.
- enum BuiltinCheckKind {
- BCK_CTZPassedZero,
- BCK_CLZPassedZero,
- };
+ enum class BuiltinCheck : unsigned char {
+ CTZPassedZero,
+ CLZPassedZero,
+ AtomicMemMisaligned,
+ AtomicMemMismatchedSize,
+ };

/// Emits an argument for a call to a builtin. If the builtin sanitizer is		/// Emits an argument for a call to a builtin. If the builtin sanitizer is
/// enabled, a runtime check specified by \p Kind is also emitted.		/// enabled, a runtime check specified by \p Kind is also emitted.
- llvm::Value *EmitCheckedArgForBuiltin(const Expr *E, BuiltinCheckKind Kind);
+ llvm::Value *EmitCheckedArgForBuiltin(const Expr *E, BuiltinCheck Kind);

/// Emit a description of a type in a format suitable for passing to		/// Emit a description of a type in a format suitable for passing to
/// a runtime sanitizer handler.		/// a runtime sanitizer handler.
llvm::Constant *EmitCheckTypeDescriptor(QualType T);		llvm::Constant *EmitCheckTypeDescriptor(QualType T);

/// Convert a value into a format suitable for passing to a runtime		/// Convert a value into a format suitable for passing to a runtime
/// sanitizer handler.		/// sanitizer handler.
llvm::Value *EmitCheckValue(llvm::Value *V);		llvm::Value *EmitCheckValue(llvm::Value *V);
▲ Show 20 Lines • Show All 308 Lines • Show Last 20 Lines
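The switch from a plain `enum` to `enum class BuiltinCheck : unsigned char` is why the CodeGen hunks now write `(int)Kind` when building the constant passed to the sanitizer handler: a scoped enumeration does not convert to an integer implicitly. A small sketch of that language rule, using a hypothetical stand-in enum and helper rather than the real Clang declarations:

```cpp
#include <cstdint>

// Hypothetical stand-in for the scoped enum in the diff; same enumerators,
// same one-byte underlying type.
enum class Check : unsigned char {
  CTZPassedZero,
  CLZPassedZero,
  AtomicMemMisaligned,
  AtomicMemMismatchedSize,
};

// With a scoped enum, `std::uint8_t x = kind;` would not compile, so the
// value has to be converted explicitly before it is encoded as an i8.
inline std::uint8_t encode_check(Check kind) {
  return static_cast<std::uint8_t>(kind);
}
```

The explicit underlying type also pins the size to one byte, which matches the `Builder.getInt8Ty()` constant used in the handler arguments.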

clang/lib/Sema/SemaChecking.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 1,268 Lines • ▼ Show 20 Lines
// OpenCL v2.0 s6.13.9 - Address space qualifier functions.		// OpenCL v2.0 s6.13.9 - Address space qualifier functions.
// Performs semantic analysis for the to_global/local/private call.		// Performs semantic analysis for the to_global/local/private call.
// \param S Reference to the semantic analyzer.		// \param S Reference to the semantic analyzer.
// \param BuiltinID ID of the builtin function.		// \param BuiltinID ID of the builtin function.
// \param Call A pointer to the builtin call.		// \param Call A pointer to the builtin call.
// \return True if a semantic error has been found, false otherwise.		// \return True if a semantic error has been found, false otherwise.
static bool SemaOpenCLBuiltinToAddr(Sema &S, unsigned BuiltinID,		static bool SemaOpenCLBuiltinToAddr(Sema &S, unsigned BuiltinID,
CallExpr *Call) {		CallExpr *Call) {
if (checkArgCount(S, Call, 1))		if (checkArgCount(S, Call, 1))
return true;		return true;

		rsmith (Unsubmitted, Done): There are a bunch of places in this file that do manual argument count checking and could use `checkArgCount` instead (search for `err_typecheck_call_too_` to find them). If you want to clean this up, please do so in a separate change.
		jfb (Author, Unsubmitted, Done): D84666
auto RT = Call->getArg(0)->getType();		auto RT = Call->getArg(0)->getType();
if (!RT->isPointerType() || RT->getPointeeType()		if (!RT->isPointerType() || RT->getPointeeType()
.getAddressSpace() == LangAS::opencl_constant) {		.getAddressSpace() == LangAS::opencl_constant) {
S.Diag(Call->getBeginLoc(), diag::err_opencl_builtin_to_addr_invalid_arg)		S.Diag(Call->getBeginLoc(), diag::err_opencl_builtin_to_addr_invalid_arg)
<< Call->getArg(0) << Call->getDirectCallee() << Call->getSourceRange();		<< Call->getArg(0) << Call->getDirectCallee() << Call->getSourceRange();
return true;		return true;
}		}

▲ Show 20 Lines • Show All 142 Lines • ▼ Show 20 Lines	bool Sema::CheckTSBuiltinFunctionCall(const TargetInfo &TI, unsigned BuiltinID,
}		}
}		}

ExprResult		ExprResult
Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,		Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
CallExpr *TheCall) {		CallExpr *TheCall) {
ExprResult TheCallResult(TheCall);		ExprResult TheCallResult(TheCall);

// Find out if any arguments are required to be integer constant expressions.		// Find out if any arguments are required to be integer constant expressions.
		erichkeane (Unsubmitted, Done): Oh boy... all these lambdas are making me squeamish.
		jfb (Author, Unsubmitted, Done): C++14 🎉
unsigned ICEArguments = 0;		unsigned ICEArguments = 0;
ASTContext::GetBuiltinTypeError Error;		ASTContext::GetBuiltinTypeError Error;
		erichkeane (Unsubmitted, Done): What is wrong with CheckArgCount (static function at the top of the file)? It seems to do some nice additions here too.
		jfb (Author, Unsubmitted, Done): It is most wonderful and has now taken over for valiant `CheckArityIs`. I'd somehow not seen that! I had grepped for another error message and figured this was what I needed.
Context.GetBuiltinType(BuiltinID, Error, &ICEArguments);		Context.GetBuiltinType(BuiltinID, Error, &ICEArguments);
if (Error != ASTContext::GE_None)		if (Error != ASTContext::GE_None)
ICEArguments = 0; // Don't diagnose previously diagnosed errors.		ICEArguments = 0; // Don't diagnose previously diagnosed errors.

// If any arguments are required to be ICE's, check and diagnose.		// If any arguments are required to be ICE's, check and diagnose.
for (unsigned ArgNo = 0; ICEArguments != 0; ++ArgNo) {		for (unsigned ArgNo = 0; ICEArguments != 0; ++ArgNo) {
// Skip arguments not required to be ICE's.		// Skip arguments not required to be ICE's.
if ((ICEArguments & (1 << ArgNo)) == 0) continue;		if ((ICEArguments & (1 << ArgNo)) == 0) continue;
		erichkeane (Unsubmitted, Done): What is this doing that ->getType()->getPointeeOrArrayElementType() doesn't do?
		jfb (Author, Unsubmitted, Done): It keeps the qualifiers 🙂 Maybe I can make a separate `QualType` helper that does this?

llvm::APSInt Result;		llvm::APSInt Result;
if (SemaBuiltinConstantArg(TheCall, ArgNo, Result))		if (SemaBuiltinConstantArg(TheCall, ArgNo, Result))
return true;		return true;
ICEArguments &= ~(1 << ArgNo);		ICEArguments &= ~(1 << ArgNo);
		erichkeane (Unsubmitted, Done): Typically 'check' functions have the bool logic reversed.
}		}

switch (BuiltinID) {		switch (BuiltinID) {
case Builtin::BI__builtin___CFStringMakeConstantString:		case Builtin::BI__builtin___CFStringMakeConstantString:
assert(TheCall->getNumArgs() == 1 &&		assert(TheCall->getNumArgs() == 1 &&
"Wrong # arguments to builtin CFStringMakeConstantString");		"Wrong # arguments to builtin CFStringMakeConstantString");
if (CheckObjCString(TheCall->getArg(0)))		if (CheckObjCString(TheCall->getArg(0)))
return ExprError();		return ExprError();
		erichkeane (Unsubmitted, Done): Why is a value-dependent expression not OK? Typically you'd want to not bother with dependence in the checking, and just check it during instantiation (the 2nd time this is called). Because it seems to me that this will error during phase 1 when an integer template parameter (or 'auto' parameter?) would be fine later.
		jfb (Author, Unsubmitted, Done):

bool Expr::EvaluateAsInt(EvalResult &Result, const ASTContext &Ctx,
                         SideEffectsKind AllowSideEffects,
                         bool InConstantContext) const {
  assert(!isValueDependent() &&
         "Expression evaluator can't be called on a dependent expression.");
  EvalInfo Info(Ctx, Result, EvalInfo::EM_IgnoreSideEffects);
  Info.InConstantContext = InConstantContext;
  return ::EvaluateAsInt(this, Result, Ctx, AllowSideEffects, Info);
}

		😊 It seems pretty common to use that check when trying to get a value out.
break;		break;
case Builtin::BI__builtin_ms_va_start:		case Builtin::BI__builtin_ms_va_start:
case Builtin::BI__builtin_stdarg_start:		case Builtin::BI__builtin_stdarg_start:
case Builtin::BI__builtin_va_start:		case Builtin::BI__builtin_va_start:
if (SemaBuiltinVAStart(BuiltinID, TheCall))		if (SemaBuiltinVAStart(BuiltinID, TheCall))
return ExprError();		return ExprError();
break;		break;
		erichkeane (Unsubmitted, Done): Put a comment in the message with some sort of hint as to what the '0' does. Typically a `<< /*Whatever*/ 0 <<`.
case Builtin::BI__va_start: {		case Builtin::BI__va_start: {
switch (Context.getTargetInfo().getTriple().getArch()) {		switch (Context.getTargetInfo().getTriple().getArch()) {
case llvm::Triple::aarch64:		case llvm::Triple::aarch64:
case llvm::Triple::arm:		case llvm::Triple::arm:
case llvm::Triple::thumb:		case llvm::Triple::thumb:
if (SemaBuiltinVAStartARMMicrosoft(TheCall))		if (SemaBuiltinVAStartARMMicrosoft(TheCall))
return ExprError();		return ExprError();
break;		break;
default:		default:
if (SemaBuiltinVAStart(BuiltinID, TheCall))		if (SemaBuiltinVAStart(BuiltinID, TheCall))
return ExprError();		return ExprError();
break;		break;
}		}
break;		break;
}		}

// The acquire, release, and no fence variants are ARM and AArch64 only.		// The acquire, release, and no fence variants are ARM and AArch64 only.
case Builtin::BI_interlockedbittestandset_acq:		case Builtin::BI_interlockedbittestandset_acq:
case Builtin::BI_interlockedbittestandset_rel:		case Builtin::BI_interlockedbittestandset_rel:
case Builtin::BI_interlockedbittestandset_nf:		case Builtin::BI_interlockedbittestandset_nf:
case Builtin::BI_interlockedbittestandreset_acq:		case Builtin::BI_interlockedbittestandreset_acq:
case Builtin::BI_interlockedbittestandreset_rel:		case Builtin::BI_interlockedbittestandreset_rel:
		erichkeane: Can you invert this logic? `!Ptr && !Array`?
case Builtin::BI_interlockedbittestandreset_nf:		case Builtin::BI_interlockedbittestandreset_nf:
if (CheckBuiltinTargetSupport(		if (CheckBuiltinTargetSupport(
*this, BuiltinID, TheCall,		*this, BuiltinID, TheCall,
{llvm::Triple::arm, llvm::Triple::thumb, llvm::Triple::aarch64}))		{llvm::Triple::arm, llvm::Triple::thumb, llvm::Triple::aarch64}))
return ExprError();		return ExprError();
break;		break;

// The 64-bit bittest variants are x64, ARM, and AArch64 only.		// The 64-bit bittest variants are x64, ARM, and AArch64 only.
case Builtin::BI_bittest64:		case Builtin::BI_bittest64:
case Builtin::BI_bittestandcomplement64:		case Builtin::BI_bittestandcomplement64:
case Builtin::BI_bittestandreset64:		case Builtin::BI_bittestandreset64:
case Builtin::BI_bittestandset64:		case Builtin::BI_bittestandset64:
case Builtin::BI_interlockedbittestandreset64:		case Builtin::BI_interlockedbittestandreset64:
case Builtin::BI_interlockedbittestandset64:		case Builtin::BI_interlockedbittestandset64:
if (CheckBuiltinTargetSupport(*this, BuiltinID, TheCall,		if (CheckBuiltinTargetSupport(*this, BuiltinID, TheCall,
		erichkeane: What if 1 of them is of these types? Is that OK?
		jfb: It's to avoid weird corner cases where this check isn't super relevant, but subsequent ones are. It avoids making `isVolatileQualified` below sad because e.g. `void` makes the `QualType` null. That one can't be `_Atomic`, and it can be `volatile` but then the size won't match the `_Atomic`'s size.
{llvm::Triple::x86_64, llvm::Triple::arm,		{llvm::Triple::x86_64, llvm::Triple::arm,
llvm::Triple::thumb, llvm::Triple::aarch64}))		llvm::Triple::thumb, llvm::Triple::aarch64}))
return ExprError();		return ExprError();
break;		break;

case Builtin::BI__builtin_isgreater:		case Builtin::BI__builtin_isgreater:
case Builtin::BI__builtin_isgreaterequal:		case Builtin::BI__builtin_isgreaterequal:
case Builtin::BI__builtin_isless:		case Builtin::BI__builtin_isless:
case Builtin::BI__builtin_islessequal:		case Builtin::BI__builtin_islessequal:
case Builtin::BI__builtin_islessgreater:		case Builtin::BI__builtin_islessgreater:
case Builtin::BI__builtin_isunordered:		case Builtin::BI__builtin_isunordered:
if (SemaBuiltinUnorderedCompare(TheCall))		if (SemaBuiltinUnorderedCompare(TheCall))
return ExprError();		return ExprError();
break;		break;
case Builtin::BI__builtin_fpclassify:		case Builtin::BI__builtin_fpclassify:
if (SemaBuiltinFPClassification(TheCall, 6))		if (SemaBuiltinFPClassification(TheCall, 6))
return ExprError();		return ExprError();
break;		break;
		erichkeane: Same question as above. Are there other checks that need to happen here? Also, is there any ability to reuse some of the logic between these functions?
		jfb: I don't think so here either.
case Builtin::BI__builtin_isfinite:		case Builtin::BI__builtin_isfinite:
case Builtin::BI__builtin_isinf:		case Builtin::BI__builtin_isinf:
case Builtin::BI__builtin_isinf_sign:		case Builtin::BI__builtin_isinf_sign:
case Builtin::BI__builtin_isnan:		case Builtin::BI__builtin_isnan:
case Builtin::BI__builtin_isnormal:		case Builtin::BI__builtin_isnormal:
case Builtin::BI__builtin_signbit:		case Builtin::BI__builtin_signbit:
		erichkeane: Invert these please.
case Builtin::BI__builtin_signbitf:		case Builtin::BI__builtin_signbitf:
case Builtin::BI__builtin_signbitl:		case Builtin::BI__builtin_signbitl:
if (SemaBuiltinFPClassification(TheCall, 1))		if (SemaBuiltinFPClassification(TheCall, 1))
return ExprError();		return ExprError();
break;		break;
case Builtin::BI__builtin_shufflevector:		case Builtin::BI__builtin_shufflevector:
return SemaBuiltinShuffleVector(TheCall);		return SemaBuiltinShuffleVector(TheCall);
// TheCall will be freed by the smart pointer here, but that's fine, since		// TheCall will be freed by the smart pointer here, but that's fine, since
[... 60 lines not shown ...]	Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
case Builtin::BI__sync_fetch_and_add_2:		case Builtin::BI__sync_fetch_and_add_2:
case Builtin::BI__sync_fetch_and_add_4:		case Builtin::BI__sync_fetch_and_add_4:
case Builtin::BI__sync_fetch_and_add_8:		case Builtin::BI__sync_fetch_and_add_8:
case Builtin::BI__sync_fetch_and_add_16:		case Builtin::BI__sync_fetch_and_add_16:
case Builtin::BI__sync_fetch_and_sub:		case Builtin::BI__sync_fetch_and_sub:
case Builtin::BI__sync_fetch_and_sub_1:		case Builtin::BI__sync_fetch_and_sub_1:
case Builtin::BI__sync_fetch_and_sub_2:		case Builtin::BI__sync_fetch_and_sub_2:
case Builtin::BI__sync_fetch_and_sub_4:		case Builtin::BI__sync_fetch_and_sub_4:
case Builtin::BI__sync_fetch_and_sub_8:		case Builtin::BI__sync_fetch_and_sub_8:
		rjmccall: I am not a fan of this lambda style, not because I dislike lambdas, but because you've pulled a ton of code that's supporting one or two cases (that could easily be handled together) into a much wider scope. Your helper code is doing a ton of redundant type checks and is probably not as general as you think it is. You need to call `DefaultFunctionArrayLvalueConversion` on the pointer arguments, after which you can just check for a pointer type. You also need to convert the size argument to a `size_t` as if initializing a parameter. If you do these things, the IRGen code will get much simpler because e.g. it will not need to specially handle arrays anymore. You will also start magically doing the right thing w.r.t. ODR-uses of constexpr variables.
case Builtin::BI__sync_fetch_and_sub_16:		case Builtin::BI__sync_fetch_and_sub_16:
case Builtin::BI__sync_fetch_and_or:		case Builtin::BI__sync_fetch_and_or:
case Builtin::BI__sync_fetch_and_or_1:		case Builtin::BI__sync_fetch_and_or_1:
case Builtin::BI__sync_fetch_and_or_2:		case Builtin::BI__sync_fetch_and_or_2:
case Builtin::BI__sync_fetch_and_or_4:		case Builtin::BI__sync_fetch_and_or_4:
case Builtin::BI__sync_fetch_and_or_8:		case Builtin::BI__sync_fetch_and_or_8:
case Builtin::BI__sync_fetch_and_or_16:		case Builtin::BI__sync_fetch_and_or_16:
case Builtin::BI__sync_fetch_and_and:		case Builtin::BI__sync_fetch_and_and:
[... 96 lines not shown ...]	case Builtin::BI__builtin_memcpy_inline: {
if (SizeOp->isValueDependent())		if (SizeOp->isValueDependent())
break;		break;
if (!SizeOp->EvaluateKnownConstInt(Context).isNullValue()) {		if (!SizeOp->EvaluateKnownConstInt(Context).isNullValue()) {
CheckNonNullArgument(*this, TheCall->getArg(0), TheCall->getExprLoc());		CheckNonNullArgument(*this, TheCall->getArg(0), TheCall->getExprLoc());
CheckNonNullArgument(*this, TheCall->getArg(1), TheCall->getExprLoc());		CheckNonNullArgument(*this, TheCall->getArg(1), TheCall->getExprLoc());
}		}
break;		break;
}		}
		case Builtin::BI__builtin_memcpy_sized:
		case Builtin::BI__builtin_memmove_sized:
		return SemaBuiltinMemcpySized(TheCallResult);
		case Builtin::BI__builtin_memset_sized:
		return SemaBuiltinMemsetSized(TheCallResult);
#define BUILTIN(ID, TYPE, ATTRS)		#define BUILTIN(ID, TYPE, ATTRS)
#define ATOMIC_BUILTIN(ID, TYPE, ATTRS) \		#define ATOMIC_BUILTIN(ID, TYPE, ATTRS) \
case Builtin::BI##ID: \		case Builtin::BI##ID: \
return SemaAtomicOpsOverloaded(TheCallResult, AtomicExpr::AO##ID);		return SemaAtomicOpsOverloaded(TheCallResult, AtomicExpr::AO##ID);
#include "clang/Basic/Builtins.def"		#include "clang/Basic/Builtins.def"
case Builtin::BI__annotation:		case Builtin::BI__annotation:
if (SemaBuiltinMSVCAnnotation(*this, TheCall))		if (SemaBuiltinMSVCAnnotation(*this, TheCall))
return ExprError();		return ExprError();
[... 3,795 lines not shown ...]	ExprResult Sema::SemaBuiltinNontemporalOverloaded(ExprResult TheCallResult) {
if (ValArg.isInvalid())		if (ValArg.isInvalid())
return ExprError();		return ExprError();

TheCall->setArg(0, ValArg.get());		TheCall->setArg(0, ValArg.get());
TheCall->setType(Context.VoidTy);		TheCall->setType(Context.VoidTy);
return TheCallResult;		return TheCallResult;
}		}
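As background for the functions that follow: they run `DefaultFunctionArrayLvalueConversion` on each pointer argument first, so arrays have already decayed and a single pointer-type test is enough afterwards. A rough standalone analogy at the C++ type level, assuming `std::decay` as a stand-in for that Sema conversion (the `IsPointerAfterDecay` trait is hypothetical and not part of Clang):

```cpp
#include <type_traits>

// Hypothetical trait, for illustration only: true when T is a pointer type
// after the standard array-to-pointer and function-to-pointer conversions.
template <typename T>
struct IsPointerAfterDecay
    : std::is_pointer<typename std::decay<T>::type> {};

// Arrays (sized or unsized) and pointers all pass the single pointer check.
static_assert(IsPointerAfterDecay<char[42]>::value,
              "sized array decays to a pointer");
static_assert(IsPointerAfterDecay<volatile char[]>::value,
              "unsized array decays to a pointer");
static_assert(IsPointerAfterDecay<const int *>::value,
              "pointers are unchanged by decay");
// Anything else is rejected without a separate array case.
static_assert(!IsPointerAfterDecay<int>::value,
              "a plain int is not a pointer");
```

This only mirrors the shape of the check; the real conversion also handles lvalue-to-rvalue conversion and operates on expressions rather than types.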

		/// Perform semantic checking for __builtin_memcpy_sized and
		/// __builtin_memmove_sized, which are overloaded on the pointer types of the
		/// destination and source arguments, and have an extra 4th parameter for access
		/// size.
		ExprResult Sema::SemaBuiltinMemcpySized(ExprResult TheCallResult) {
		CallExpr *TheCall = (CallExpr *)TheCallResult.get();

		if (checkArgCount(*this, TheCall, 4))
		return ExprError();

		ExprResult DstPtr = DefaultFunctionArrayLvalueConversion(TheCall->getArg(0));
		if (DstPtr.isInvalid())
		return ExprError();
		clang::Expr *DstOp = DstPtr.get();
		TheCall->setArg(0, DstOp);

		ExprResult SrcPtr = DefaultFunctionArrayLvalueConversion(TheCall->getArg(1));
		if (SrcPtr.isInvalid())
		rjmccall: Do you ever write these back into the call?
		return ExprError();
		clang::Expr *SrcOp = SrcPtr.get();
		TheCall->setArg(1, SrcOp);

		const PointerType *DstTy = DstOp->getType()->getAs<PointerType>();
		const PointerType *SrcTy = SrcOp->getType()->getAs<PointerType>();
		if (!DstTy)
		return ExprError(
		Diag(TheCall->getBeginLoc(), diag::err_init_conversion_failed)
		<< InitializedEntity::EK_Parameter << Context.VoidPtrTy
		<< DstOp->isLValue() << DstOp->getType() << /*no difference*/ 0
		<< DstOp->getSourceRange());
		if (!SrcTy)
		return ExprError(
		Diag(TheCall->getBeginLoc(), diag::err_init_conversion_failed)
		<< InitializedEntity::EK_Parameter << Context.VoidPtrTy
		<< SrcOp->isLValue() << SrcOp->getType() << /*no difference*/ 0
		<< SrcOp->getSourceRange());

		QualType DstValTy = DstTy->getPointeeType();
		QualType SrcValTy = SrcTy->getPointeeType();

		rjmccall: You already know that DstTy and SrcTy are non-null here. Why do you need to support atomic types for these operations anyway? It just seems treacherous and unnecessary.
		jfb: Leftover from the refactoring :) It's useful to get atomic memcpy, see https://wg21.link/P1478. It's also part of "support overloaded memcpy" which is what doing more than `volatile` implies.
		if (DstValTy.isConstQualified())
		return ExprError(Diag(TheCall->getBeginLoc(), PDiag(diag::err_const_arg))
		<< DstValTy << DstOp->getSourceRange());
		if (DstValTy->isAtomicType())
		return ExprError(
		Diag(TheCall->getBeginLoc(), PDiag(diag::err_atomic_qualifier_invalid))
		<< DstValTy << DstOp->getSourceRange());
		if (SrcValTy->isAtomicType())
		return ExprError(
		Diag(TheCall->getBeginLoc(), PDiag(diag::err_atomic_qualifier_invalid))
		<< SrcValTy << SrcOp->getSourceRange());

		ExprResult SizeRes(TheCall->getArg(2));
		InitializedEntity SizeEntity = InitializedEntity::InitializeParameter(
		Context, Context.getSizeType(), false);
		SizeRes = PerformCopyInitialization(SizeEntity, SourceLocation(), SizeRes);
		if (SizeRes.isInvalid())
		return ExprError();
		TheCall->setArg(2, SizeRes.get());

		bool IsNonZero;
		if (!SizeRes.get()->isValueDependent() &&
		SizeRes.get()->EvaluateAsBooleanCondition(IsNonZero, Context) &&
		IsNonZero) {
		CheckNonNullArgument(*this, DstOp, TheCall->getExprLoc());
		CheckNonNullArgument(*this, SrcOp, TheCall->getExprLoc());
		}

		clang::Expr *ElSzArg = TheCall->getArg(3);

		(void)isCompleteType(DstOp->getBeginLoc(), DstValTy);
		(void)isCompleteType(SrcOp->getBeginLoc(), SrcValTy);
		if (!DstValTy.isTriviallyCopyableType(Context) && !DstValTy->isVoidType())
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_atomic_op_needs_trivial_copy))
		rsmith: Generally, I'm a little uncomfortable about producing an error if a type is complete but allowing the construct if the type is incomplete; that seems like a situation where a warning would be more appropriate to me. It's surprising and largely unprecedented that providing more information about a type would change the program from valid to invalid. Do we really need the protection of an error here rather than an enabled-by-default warning? Moreover, don't we already have a warning for `memcpy` of a non-trivially-copyable object somewhere? If not, then I think we should add such a thing that also applies to the real `memcpy`, rather than only warning on the builtin.
		jfb: That rationale makes sense, but it's pre-existing behavior for atomic. I can change all of them in a follow-up if that's OK? We don't have such a check for other builtins. I can do a second follow-up to then adopt these warnings for them too?
		rsmith (Not Done): The pre-existing behavior for atomic builtins is to reject if the type is incomplete. E.g.:
		```
		<stdin>:1:33: error: address argument to atomic operation must be a pointer to a trivially-copyable type ('struct A *' invalid)
		struct A; void f(struct A *p) { __atomic_store(p, p, 0); }
		                                ^              ~
		```
		We should do the same here. (Though I'd suggest calling `RequireCompleteType` instead to get a more meaningful diagnostic.) These days I think we should check for unsized types too, e.g.:
		```
		if (RequireCompleteSizedType(SrcOp->getBeginLoc(), SrcValTy))
		  return true;
		if (!SrcValTy.isTriviallyCopyableType(Context) && !SrcValTy->isVoidType())
		  return ExprError(...);
		```
		<< DstValTy << DstOp->getSourceRange());
		if (!SrcValTy.isTriviallyCopyableType(Context) && !SrcValTy->isVoidType())
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_atomic_op_needs_trivial_copy))
		<< SrcValTy << SrcOp->getSourceRange());
		if (DstValTy.isVolatileQualified())
		rsmith: Do we need this constraint?
		- If one side is atomic and the other is not, then we can do all of the operations with the atomic width.
		- If both sides are atomic, then one side is 2^N times the size of the other; we can do 2^N operations on one side for each operation on the other side.
		Maybe the second case is not worth the effort, but permitting (for example) a memcpy from an `_Atomic int` to a `char` seems useful and there doesn't seem to be a good reason to disallow it.
		jfb: Based on @rjmccall's feedback, I'm disallowing `_Atomic` qualification, and keying off the optional `element_size` parameter to determine atomicity. I'm also only taking in one size, not two, since as discussed it might be useful to allow two but I haven't heard that anyone actually wants it at the moment.
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_sized_volatile_unsupported))
		<< DstValTy << DstOp->getSourceRange());
		if (SrcValTy.isVolatileQualified())
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_sized_volatile_unsupported))
		<< SrcValTy << SrcOp->getSourceRange());

		if (!ElSzArg->isValueDependent()) {
		llvm::APSInt ElSz;
		ExprResult ElSzRes(VerifyIntegerConstantExpression(ElSzArg, &ElSz));
		if (ElSzRes.isInvalid())
		return ExprError();
		TheCall->setArg(3, ElSzRes.get());

		if (!ElSz.isStrictlyPositive() || !ElSz.isPowerOf2())
		return ExprError(
		Diag(TheCall->getBeginLoc(), diag::err_argument_not_power_of_2)
		<< ElSzArg->getSourceRange());
		int InlineWidth = Context
		.toCharUnitsFromBits(
		Context.getTargetInfo().getMaxAtomicInlineWidth())
		.getQuantity();
		if (ElSz.ugt(InlineWidth))
		return ExprError(
		Diag(TheCall->getBeginLoc(), PDiag(diag::err_elsz_must_be_lock_free))
		<< (int)ElSz.getLimitedValue() << InlineWidth
		<< ElSzArg->getSourceRange());
		}

		return TheCallResult;
		}
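The element-size constraints enforced at the end of the function above (strictly positive, a power of two, and no larger than the target's maximum inline atomic width) can be restated as a standalone predicate. This is a minimal sketch and not Clang's code: `isValidElementSize` is a hypothetical name, and `maxInlineWidthBytes` is an assumed stand-in for `getMaxAtomicInlineWidth()` converted from bits to bytes.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only. Mirrors the three checks
// applied to the 4th (element size) argument once it is known to be an
// integer constant expression.
constexpr bool isValidElementSize(std::int64_t elSz,
                                  std::uint64_t maxInlineWidthBytes) {
  return elSz > 0 &&                                       // strictly positive
         (static_cast<std::uint64_t>(elSz) &
          (static_cast<std::uint64_t>(elSz) - 1)) == 0 &&  // power of two
         static_cast<std::uint64_t>(elSz) <=
             maxInlineWidthBytes;                          // fits lock-free width
}

// Assuming a target whose widest inline atomic is 8 bytes:
static_assert(isValidElementSize(4, 8), "4-byte elements are accepted");
static_assert(!isValidElementSize(3, 8), "3 is not a power of two");
static_assert(!isValidElementSize(0, 8), "zero is rejected");
static_assert(!isValidElementSize(16, 8), "16 exceeds the inline width");
```

In the patch the first two failures share `err_argument_not_power_of_2`, while the last one reports `err_elsz_must_be_lock_free`.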

		/// Perform semantic checking for __builtin_memset_sized, which is overloaded
		/// based on the pointer type of the destination argument, and has an extra 4th
		/// parameter for access size.
		ExprResult Sema::SemaBuiltinMemsetSized(ExprResult TheCallResult) {
		CallExpr *TheCall = (CallExpr *)TheCallResult.get();

		if (checkArgCount(*this, TheCall, 4))
		return ExprError();

		ExprResult DstPtr = DefaultFunctionArrayLvalueConversion(TheCall->getArg(0));
		if (DstPtr.isInvalid())
		return ExprError();
		clang::Expr *DstOp = DstPtr.get();
		TheCall->setArg(0, DstOp);

		const PointerType *DstTy = DstOp->getType()->getAs<PointerType>();
		if (!DstTy)
		return ExprError(
		Diag(TheCall->getBeginLoc(), diag::err_init_conversion_failed)
		jfb: I'm re-thinking these checks:
		```
		if (ElSz->urem(DstElSz))
		  return ExprError(
		      Diag(TheCall->getBeginLoc(),
		           PDiag(diag::err_atomic_builtin_ext_size_mismatches_el))
		      << (int)ElSz->getLimitedValue() << DstElSz << DstValTy
		      << DstOp->getSourceRange() << Arg->getSourceRange());
		```
		I'm not sure we ought to have them anymore. We know that the types are trivially copyable, it therefore doesn't really matter if you're copying with operations smaller than the type itself. For example: `struct Data { int a, b, c, d; };` It ought to be fine to do 4-byte copies of `Data`, if whatever your algorithm is is happy with that. I therefore think I'll remove these checks based on the dst / src element types. The only thing that seems to make sense is making sure that you don't straddle object boundaries with element size. I removed sizeless types: we'll codegen whatever you ask for.
		jfb: They're gone, we now only check that size and element size match up.
		<< InitializedEntity::EK_Parameter << Context.VoidPtrTy
		<< DstOp->isLValue() << DstOp->getType() << /*no difference*/ 0
		<< DstOp->getSourceRange());

		QualType DstValTy = DstTy->getPointeeType();
		if (DstValTy.isConstQualified())
		return ExprError(Diag(TheCall->getBeginLoc(), PDiag(diag::err_const_arg))
		<< DstValTy << DstOp->getSourceRange());
		if (DstValTy->isAtomicType())
		return ExprError(
		Diag(TheCall->getBeginLoc(), PDiag(diag::err_atomic_qualifier_invalid))
		<< DstValTy << DstOp->getSourceRange());

		ExprResult ValRes(TheCall->getArg(1));
		InitializedEntity ValEntity = InitializedEntity::InitializeParameter(
		Context, Context.UnsignedCharTy, false);
		ValRes = PerformCopyInitialization(ValEntity, SourceLocation(), ValRes);
		if (ValRes.isInvalid())
		return ExprError();
		TheCall->setArg(1, ValRes.get());

		ExprResult SizeRes(TheCall->getArg(2));
		InitializedEntity SizeEntity = InitializedEntity::InitializeParameter(
		Context, Context.getSizeType(), false);
		SizeRes = PerformCopyInitialization(SizeEntity, SourceLocation(), SizeRes);
		if (SizeRes.isInvalid())
		return ExprError();
		TheCall->setArg(2, SizeRes.get());

		bool IsNonZero;
		if (!SizeRes.get()->isValueDependent() &&
		SizeRes.get()->EvaluateAsBooleanCondition(IsNonZero, Context) &&
		IsNonZero)
		CheckNonNullArgument(*this, DstOp, TheCall->getExprLoc());

		clang::Expr *ElSzArg = TheCall->getArg(3);

		(void)isCompleteType(DstOp->getBeginLoc(), DstValTy);
		if (!DstValTy.isTriviallyCopyableType(Context) && !DstValTy->isVoidType())
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_atomic_op_needs_trivial_copy))
		<< DstValTy << DstOp->getSourceRange());
		if (DstValTy.isVolatileQualified())
		return ExprError(Diag(TheCall->getBeginLoc(),
		PDiag(diag::err_sized_volatile_unsupported))
		<< DstValTy << DstOp->getSourceRange());

		if (!ElSzArg->isValueDependent()) {
		llvm::APSInt ElSz;
		ExprResult ElSzRes(VerifyIntegerConstantExpression(ElSzArg, &ElSz));
		if (ElSzRes.isInvalid())
		return ExprError();
		TheCall->setArg(3, ElSzRes.get());

		if (!ElSz.isStrictlyPositive() || !ElSz.isPowerOf2())
		return ExprError(
		Diag(TheCall->getBeginLoc(), diag::err_argument_not_power_of_2)
		<< ElSzArg->getSourceRange());
		int InlineWidth = Context
		.toCharUnitsFromBits(
		Context.getTargetInfo().getMaxAtomicInlineWidth())
		.getQuantity();
		if (ElSz.ugt(InlineWidth))
		return ExprError(
		Diag(TheCall->getBeginLoc(), PDiag(diag::err_elsz_must_be_lock_free))
		rsmith: You need to call `Sema::isCompleteType` first before asking this question, in order to trigger class instantiation when necessary in C++. (Likewise for the checks in the previous function.)
		jfb: Before the condition, right? LMK if I added the right thing!
		rsmith: It would be more correct from a modules perspective to use
		```
		if (isCompleteType(Loc, T) && !T.isTriviallyCopyableType(Context))
		```
		That way, if the definition of the type is in some loaded-but-not-imported module file, we'll treat it the same as if the definition of the type is entirely unknown. (That also removes the need to check for the `void` case.) But given that this only allows us to accept code that is wrong in some sense, I'm not sure it really matters much.
		<< (int)ElSz.getLimitedValue() << InlineWidth
		<< ElSzArg->getSourceRange());
		}

		return TheCallResult;
		}
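The point made in the review thread, that a trivially copyable type such as `Data` can be copied with operations narrower than the type itself, can be illustrated in plain C++. The sketch below rests on assumptions: `copyInElements` is a hypothetical helper, not the builtin, and it uses an ordinary `std::memcpy` per element instead of the atomic accesses the patch is meant to emit. It also shows the remaining constraint mentioned in the thread, that the total size must be a multiple of the element size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct Data { int a, b, c, d; };
static_assert(std::is_trivially_copyable<Data>::value,
              "element-wise copying is only meaningful for trivially "
              "copyable types");

// Hypothetical helper, for illustration only: copies `size` bytes from src
// to dst in chunks of `elSz` bytes. Returns false (and copies nothing) when
// `size` is not a whole number of elements.
inline bool copyInElements(void *dst, const void *src, std::size_t size,
                           std::size_t elSz) {
  if (elSz == 0 || size % elSz != 0)
    return false;
  unsigned char *d = static_cast<unsigned char *>(dst);
  const unsigned char *s = static_cast<const unsigned char *>(src);
  for (std::size_t off = 0; off < size; off += elSz)
    std::memcpy(d + off, s + off, elSz); // one element-sized access
  return true;
}

// Copies a Data object one int-sized element at a time.
inline bool demoCopyData() {
  Data src{1, 2, 3, 4};
  Data dst{0, 0, 0, 0};
  bool ok = copyInElements(&dst, &src, sizeof(Data), sizeof(int));
  assert(ok);
  return ok && dst.a == 1 && dst.b == 2 && dst.c == 3 && dst.d == 4;
}
```

The element size here is smaller than `sizeof(Data)`, which is exactly the case the removed `urem` check would have rejected.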

/// CheckObjCString - Checks that the argument to the builtin		/// CheckObjCString - Checks that the argument to the builtin
/// CFString constructor is correct		/// CFString constructor is correct
/// Note: It might also make sense to do the UTF-16 conversion here (would		/// Note: It might also make sense to do the UTF-16 conversion here (would
/// simplify the backend).		/// simplify the backend).
bool Sema::CheckObjCString(Expr *Arg) {		bool Sema::CheckObjCString(Expr *Arg) {
Arg = Arg->IgnoreParenCasts();		Arg = Arg->IgnoreParenCasts();
StringLiteral *Literal = dyn_cast<StringLiteral>(Arg);		StringLiteral *Literal = dyn_cast<StringLiteral>(Arg);

[... 10,190 lines not shown ...]

clang/test/CodeGen/builtin-memfns.c

[... 51 lines not shown ...]
// CHECK: @test6		// CHECK: @test6
// CHECK: call void @llvm.memcpy		// CHECK: call void @llvm.memcpy
int test6(char *X) {		int test6(char *X) {
return __builtin___memcpy_chk(X, X, 42, 42) != 0;		return __builtin___memcpy_chk(X, X, 42, 42) != 0;
}		}

// CHECK: @test7		// CHECK: @test7
// PR12094		// PR12094
int test7(int *p) {		void test7(int *p) {
struct snd_pcm_hw_params_t* hwparams; // incomplete type.		struct snd_pcm_hw_params_t* hwparams; // incomplete type.

// CHECK: call void @llvm.memset{{.*}} align 4 {{.*}}256, i1 false)		// CHECK: call void @llvm.memset{{.*}} align 4 {{.*}}256, i1 false)
__builtin_memset(p, 0, 256); // Should be alignment = 4		__builtin_memset(p, 0, 256); // Should be alignment = 4

// CHECK: call void @llvm.memset{{.*}} align 1 {{.*}}256, i1 false)		// CHECK: call void @llvm.memset{{.*}} align 1 {{.*}}256, i1 false)
__builtin_memset((char*)p, 0, 256); // Should be alignment = 1		__builtin_memset((char*)p, 0, 256); // Should be alignment = 1

[... 51 lines not shown ...]	void test12() {
memcpy(&dest_array, &dest_array, 2);		memcpy(&dest_array, &dest_array, 2);
}		}

// CHECK-LABEL: @test13		// CHECK-LABEL: @test13
void test13(char *d, char *s, int c, size_t n) {		void test13(char *d, char *s, int c, size_t n) {
// CHECK: call i8* @memccpy		// CHECK: call i8* @memccpy
memccpy(d, s, c, n);		memccpy(d, s, c, n);
}		}

		// CHECK-LABEL: volatile_dst_cpy_void(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_cpy_void(volatile void *dst, const void *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: volatile_dst_move_void(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_move_void(volatile void *dst, const void *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: volatile_dst_set_void(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_set_void(volatile void *dst, size_t size) { __builtin_memset(dst, 0, size); }

		// CHECK-LABEL: volatile_src_cpy_void(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_src_cpy_void(void *dst, volatile const void *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: volatile_src_move_void(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_src_move_void(void *dst, volatile const void *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: volatile_dstsrc_cpy_void(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dstsrc_cpy_void(volatile void *dst, volatile const void *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: volatile_dstsrc_move_void(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dstsrc_move_void(volatile void *dst, volatile const void *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: volatile_dst_cpy_char(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_cpy_char(volatile char *dst, const char *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: volatile_dst_move_char(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_move_char(volatile char *dst, const char *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: volatile_dst_set_char(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_set_char(volatile char *dst, size_t size) { __builtin_memset(dst, 0, size); }

		// CHECK-LABEL: volatile_dst_cpy_int(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_cpy_int(volatile int *dst, const int *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: volatile_dst_move_int(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_move_int(volatile int *dst, const int *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: volatile_dst_set_int(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 4 %{{[0-9]*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void volatile_dst_set_int(volatile int *dst, size_t size) { __builtin_memset(dst, 0, size); }

		// CHECK-LABEL: addrspace_srcdst_cpy_char(
		// CHECK: call void @llvm.memcpy.p32i8.p32i8.i32(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 addrspace(32)* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 false)
		void addrspace_srcdst_cpy_char(__attribute__((address_space(32))) char *dst, __attribute__((address_space(32))) const char *src, size_t size) { __builtin_memcpy(dst, src, size); }

		// CHECK-LABEL: addrspace_srcdst_move_char(
		// CHECK: call void @llvm.memmove.p32i8.p32i8.i32(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 addrspace(32)* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 false)
		void addrspace_srcdst_move_char(__attribute__((address_space(32))) char *dst, __attribute__((address_space(32))) const char *src, size_t size) { __builtin_memmove(dst, src, size); }

		// CHECK-LABEL: addrspace_dst_set_char(
		// CHECK: call void @llvm.memset.p32i8.i32(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 0, i32 %{{[0-9]*}}, i1 false)
		void addrspace_dst_set_char(__attribute__((address_space(32))) char *dst, size_t size) { __builtin_memset(dst, 0, size); }

		// CHECK-LABEL: vla_srcdst_cpy_char(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9a-z]*}}, i8* align 1 %{{[0-9a-z]*}}, i32 %{{[0-9]*}}, i1 true)
		void vla_srcdst_cpy_char(size_t size) {
		volatile char dst[size];
		const volatile char src[size];
		__builtin_memcpy(dst, src, size);
		}

		// CHECK-LABEL: vla_srcdst_move_char(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9a-z]*}}, i8* align 1 %{{[0-9a-z]*}}, i32 %{{[0-9]*}}, i1 true)
		void vla_srcdst_move_char(size_t size) {
		volatile char dst[size];
		const volatile char src[size];
		__builtin_memmove(dst, src, size);
		}

		// CHECK-LABEL: vla_dst_set_char(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 1 %{{[0-9a-z]*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void vla_dst_set_char(size_t size) {
		volatile char dst[size];
		__builtin_memset(dst, 0, size);
		}

		// CHECK-LABEL: static_srcdst_cpy_char(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 false)
		void static_srcdst_cpy_char(char dst[static 42], const char src[static 42], size_t size) {
		__builtin_memcpy(dst, src, size);
		}

		// CHECK-LABEL: static_srcdst_move_char(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i32 %{{[0-9]*}}, i1 false)
		void static_srcdst_move_char(char dst[static 42], const char src[static 42], size_t size) {
		__builtin_memmove(dst, src, size);
		}

		// CHECK-LABEL: static_dst_set_char(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 1 %{{[0-9]*}}, i8 0, i32 %{{[0-9]*}}, i1 false)
		void static_dst_set_char(char dst[static 42], size_t size) {
		__builtin_memset(dst, 0, size);
		}

		extern char dst_unsized[];
		extern volatile char dst_vunsized[];
		extern const char src_cunsized[];
		extern const volatile char src_cvunsized[];

		// CHECK-LABEL: array_volatile_unsized_dst_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_dst_cpy(size_t size) { __builtin_memcpy(dst_vunsized, src_cunsized, size); }

		// CHECK-LABEL: array_volatile_unsized_dst_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_dst_move(size_t size) { __builtin_memmove(dst_vunsized, src_cunsized, size); }

		// CHECK-LABEL: array_volatile_unsized_dst_set(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_dst_set(size_t size) { __builtin_memset(dst_vunsized, 0, size); }

		// CHECK-LABEL: array_volatile_unsized_src_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_src_cpy(size_t size) { __builtin_memcpy(dst_unsized, src_cvunsized, size); }

		// CHECK-LABEL: array_volatile_unsized_src_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_src_move(size_t size) { __builtin_memmove(dst_unsized, src_cvunsized, size); }

		// CHECK-LABEL: array_volatile_unsized_dstsrc_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_dstsrc_cpy(size_t size) { __builtin_memcpy(dst_vunsized, src_cvunsized, size); }

		// CHECK-LABEL: array_volatile_unsized_dstsrc_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_unsized_dstsrc_move(size_t size) { __builtin_memmove(dst_vunsized, src_cvunsized, size); }

		extern __attribute__((aligned(128))) char dst_512[512];
		extern __attribute__((aligned(128))) volatile char dst_v512[512];
		extern __attribute__((aligned(128))) const char src_c512[512];
		extern __attribute__((aligned(128))) const volatile char src_cv512[512];

		// CHECK-LABEL: array_volatile_dst_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_dst_cpy(size_t size) { __builtin_memcpy(dst_v512, src_c512, size); }

		// CHECK-LABEL: array_volatile_dst_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_dst_move(size_t size) { __builtin_memmove(dst_v512, src_c512, size); }

		// CHECK-LABEL: array_volatile_dst_set(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_dst_set(size_t size) { __builtin_memset(dst_v512, 0, size); }

		// CHECK-LABEL: array_volatile_src_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_src_cpy(size_t size) { __builtin_memcpy(dst_512, src_cv512, size); }

		// CHECK-LABEL: array_volatile_src_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_src_move(size_t size) { __builtin_memmove(dst_512, src_cv512, size); }

		// CHECK-LABEL: array_volatile_dstsrc_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_dstsrc_cpy(size_t size) { __builtin_memcpy(dst_v512, src_cv512, size); }

		// CHECK-LABEL: array_volatile_dstsrc_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void array_volatile_dstsrc_move(size_t size) { __builtin_memmove(dst_v512, src_cv512, size); }

		extern __attribute__((aligned(128))) volatile char dst_v512_32[512][32];
		extern __attribute__((aligned(128))) const volatile char src_cv512_32[512][32];

		// CHECK-LABEL: multiarray_volatile_dstsrc_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void multiarray_volatile_dstsrc_cpy(size_t size) { __builtin_memcpy(dst_v512_32, src_cv512_32, size); }

		// CHECK-LABEL: multiarray_volatile_dstsrc_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8* align 128 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void multiarray_volatile_dstsrc_move(size_t size) { __builtin_memmove(dst_v512_32, src_cv512_32, size); }

		// CHECK-LABEL: multiarray_volatile_dst_set(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 128 getelementptr {{.*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void multiarray_volatile_dst_set(size_t size) { __builtin_memset(dst_v512_32, 0, size); }

		// CHECK-LABEL: multiarray_idx_volatile_dstsrc_cpy(
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 32 getelementptr {{.*}}, i8* align 32 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void multiarray_idx_volatile_dstsrc_cpy(size_t size) { __builtin_memcpy(dst_v512_32[1], src_cv512_32[1], size); }

		// CHECK-LABEL: multiarray_idx_volatile_dstsrc_move(
		// CHECK: call void @llvm.memmove.p0i8.p0i8.i32(i8* align 32 getelementptr {{.*}}, i8* align 32 getelementptr {{.*}}, i32 %{{[0-9]*}}, i1 true)
		void multiarray_idx_volatile_dstsrc_move(size_t size) { __builtin_memmove(dst_v512_32[1], src_cv512_32[1], size); }

		// CHECK-LABEL: multiarray_idx_volatile_dst_set(
		// CHECK: call void @llvm.memset.p0i8.i32(i8* align 32 getelementptr {{.*}}, i8 0, i32 %{{[0-9]*}}, i1 true)
		void multiarray_idx_volatile_dst_set(size_t size) { __builtin_memset(dst_v512_32[1], 0, size); }

clang/test/CodeGen/builtin-sized-memfns.c

This file was added.

				// RUN: %clang_cc1 -triple arm64-unknown-unknown -fms-extensions -emit-llvm < %s | FileCheck %s

				typedef __SIZE_TYPE__ size_t;

				// CHECK-LABEL: addrspace_srcdst_cpy_char(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p32i8.p32i8.i64(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 addrspace(32)* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void addrspace_srcdst_cpy_char(__attribute__((address_space(32))) char *dst, __attribute__((address_space(32))) const char *src, size_t size) { __builtin_memcpy_sized(dst, src, size, 1); }

				// CHECK-LABEL: addrspace_srcdst_move_char(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p32i8.p32i8.i64(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 addrspace(32)* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void addrspace_srcdst_move_char(__attribute__((address_space(32))) char *dst, __attribute__((address_space(32))) const char *src, size_t size) { __builtin_memmove_sized(dst, src, size, 1); }

				// CHECK-LABEL: addrspace_dst_set_char(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p32i8.i64(i8 addrspace(32)* align 1 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 1)
				void addrspace_dst_set_char(__attribute__((address_space(32))) char *dst, size_t size) { __builtin_memset_sized(dst, 0, size, 1); }

				// CHECK-LABEL: srcdst_cpy_char(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void srcdst_cpy_char(char *dst, const char *src, size_t size) { __builtin_memcpy_sized(dst, src, size, 1); }

				// CHECK-LABEL: srcdst_cpy_char_big(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 16 %{{[0-9]*}}, i8* align 16 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 16)
				void srcdst_cpy_char_big(char *dst, const char *src, size_t size) { __builtin_memcpy_sized(dst, src, size, 16); }

				// CHECK-LABEL: srcdst_move_char(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void srcdst_move_char(char *dst, const char *src, size_t size) { __builtin_memmove_sized(dst, src, size, 1); }

				// CHECK-LABEL: srcdst_move_char_big(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 16 %{{[0-9]*}}, i8* align 16 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 16)
				void srcdst_move_char_big(char *dst, const char *src, size_t size) { __builtin_memmove_sized(dst, src, size, 16); }

				// CHECK-LABEL: dst_set_char(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 1)
				void dst_set_char(char *dst, size_t size) { __builtin_memset_sized(dst, 0, size, 1); }

				// CHECK-LABEL: dst_set_char_big(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 16 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 16)
				void dst_set_char_big(char *dst, size_t size) { __builtin_memset_sized(dst, 0, size, 16); }

				// CHECK-LABEL: srcdst_cpy_int(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 4)
				void srcdst_cpy_int(int *dst, const int *src, size_t size) { __builtin_memcpy_sized(dst, src, size, 4); }

				// CHECK-LABEL: srcdst_move_int(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 4)
				void srcdst_move_int(int *dst, const int *src, size_t size) { __builtin_memmove_sized(dst, src, size, 4); }

				// CHECK-LABEL: dst_set_int(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 4)
				void dst_set_int(int *dst, size_t size) { __builtin_memset_sized(dst, 0, size, 4); }

				// CHECK-LABEL: srcdst_cpy_longlong(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 8 %{{[0-9]*}}, i8* align 8 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 8)
				void srcdst_cpy_longlong(long long *dst, const long long *src, size_t size) { __builtin_memcpy_sized(dst, src, size, sizeof(long long)); }

				// CHECK-LABEL: srcdst_move_longlong(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 8 %{{[0-9]*}}, i8* align 8 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 8)
				void srcdst_move_longlong(long long *dst, const long long *src, size_t size) { __builtin_memmove_sized(dst, src, size, sizeof(long long)); }

				// CHECK-LABEL: dst_set_longlong(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 8 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 8)
				void dst_set_longlong(long long *dst, size_t size) { __builtin_memset_sized(dst, 0, size, sizeof(long long)); }

				// CHECK-LABEL: static_srcdst_cpy_char(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void static_srcdst_cpy_char(char dst[static 2], const char src[2], size_t size) { __builtin_memcpy_sized(dst, src, size, 1); }

				// CHECK-LABEL: static_srcdst_move_char(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8* align 1 %{{[0-9]*}}, i64 %{{[0-9]*}}, i32 1)
				void static_srcdst_move_char(char dst[static 2], const char src[2], size_t size) { __builtin_memmove_sized(dst, src, size, 1); }

				// CHECK-LABEL: static_dst_set_char(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 1 %{{[0-9]*}}, i8 0, i64 %{{[0-9]*}}, i32 1)
				void static_dst_set_char(char dst[static 2], size_t size) { __builtin_memset_sized(dst, 0, size, 1); }

				extern char dst_atomic[2];
				extern const char src_atomic[2];

				// CHECK-LABEL: array_srcdst_cpy_char(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i64 %{{[0-9]*}}, i32 1)
				void array_srcdst_cpy_char(size_t size) { __builtin_memcpy_sized(dst_atomic, src_atomic, size, 1); }

				// CHECK-LABEL: array_srcdst_move_char(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 1 getelementptr {{.*}}, i8* align 1 getelementptr {{.*}}, i64 %{{[0-9]*}}, i32 1)
				void array_srcdst_move_char(size_t size) { __builtin_memmove_sized(dst_atomic, src_atomic, size, 1); }

				// CHECK-LABEL: array_dst_set_char(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 1 getelementptr {{.*}}, i8 0, i64 %{{[0-9]*}}, i32 1)
				void array_dst_set_char(size_t size) { __builtin_memset_sized(dst_atomic, 0, size, 1); }

				// CHECK-LABEL: local_srcdst_cpy_char(
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i64 4, i32 4)
				void local_srcdst_cpy_char(size_t size) {
				int dst;
				const int src;
				__builtin_memcpy_sized(&dst, &src, sizeof(dst), sizeof(dst));
				}

				// CHECK-LABEL: local_srcdst_move_char(
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8* align 4 %{{[0-9]*}}, i64 4, i32 4)
				void local_srcdst_move_char(size_t size) {
				int dst;
				const int src;
				__builtin_memmove_sized(&dst, &src, sizeof(dst), sizeof(dst));
				}

				// CHECK-LABEL: local_dst_set_char(
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 4 %{{[0-9]*}}, i8 0, i64 4, i32 4)
				void local_dst_set_char(size_t size) {
				int dst;
				__builtin_memset_sized(&dst, 0, sizeof(dst), sizeof(dst));
				}

clang/test/CodeGen/ubsan-builtin-checks.c

This file was moved to clang/test/CodeGen/ubsan-builtin-ctz-clz.c.

clang/test/CodeGen/ubsan-builtin-ctz-clz.c

This file was moved from clang/test/CodeGen/ubsan-builtin-checks.c.

	// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s			// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s --enable-var-scope
	// RUN: %clang_cc1 -triple arm64-none-linux-gnu -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s --check-prefix=NOT-UB			// RUN: %clang_cc1 -triple arm64-none-linux-gnu -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s --enable-var-scope --check-prefix=NOT-UB

	// NOT-UB-NOT: __ubsan_handle_invalid_builtin			// NOT-UB-NOT: __ubsan_handle_invalid_builtin

	// CHECK: define void @check_ctz			// CHECK-LABEL: define void @check_ctz(
	void check_ctz(int n) {			void check_ctz(int n) {
	// CHECK: [[NOT_ZERO:%.*]] = icmp ne i32 [[N:%.*]], 0, !nosanitize			// CHECK: [[NOT_ZERO:%.*]] = icmp ne i32 [[N:%.*]], 0, !nosanitize
	// CHECK-NEXT: br i1 [[NOT_ZERO]]			// CHECK-NEXT: br i1 [[NOT_ZERO]]
	//			//
	// Handler block:			// Handler block:
	// CHECK: call void @__ubsan_handle_invalid_builtin			// CHECK: call void @__ubsan_handle_invalid_builtin
	// CHECK-NEXT: unreachable			// CHECK-NEXT: unreachable
	//			//
	// Continuation block:			// Continuation block:
	// CHECK: call i32 @llvm.cttz.i32(i32 [[N]], i1 true)			// CHECK: call i32 @llvm.cttz.i32(i32 [[N]], i1 true)
	__builtin_ctz(n);			__builtin_ctz(n);

	// CHECK: call void @__ubsan_handle_invalid_builtin			// CHECK: call void @__ubsan_handle_invalid_builtin
	__builtin_ctzl(n);			__builtin_ctzl(n);

	// CHECK: call void @__ubsan_handle_invalid_builtin			// CHECK: call void @__ubsan_handle_invalid_builtin
	__builtin_ctzll(n);			__builtin_ctzll(n);
	}			}

	// CHECK: define void @check_clz			// CHECK-LABEL: define void @check_clz(
	void check_clz(int n) {			void check_clz(int n) {
	// CHECK: [[NOT_ZERO:%.*]] = icmp ne i32 [[N:%.*]], 0, !nosanitize			// CHECK: [[NOT_ZERO:%.*]] = icmp ne i32 [[N:%.*]], 0, !nosanitize
	// CHECK-NEXT: br i1 [[NOT_ZERO]]			// CHECK-NEXT: br i1 [[NOT_ZERO]]
	//			//
	// Handler block:			// Handler block:
	// CHECK: call void @__ubsan_handle_invalid_builtin			// CHECK: call void @__ubsan_handle_invalid_builtin
	// CHECK-NEXT: unreachable			// CHECK-NEXT: unreachable
	//			//

clang/test/CodeGen/ubsan-builtin-mem_sized.c

This file was added.

				// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s --enable-var-scope
				// RUN: %clang_cc1 -triple arm64-none-linux-gnu -w -emit-llvm -o - %s -fsanitize=builtin | FileCheck %s --enable-var-scope

				typedef __SIZE_TYPE__ size_t;

				// CHECK-LABEL: define void @check_memcpy(
				void check_memcpy(char *dst, const char *src, size_t sz) {
				// CHECK: [[DSTINT:%.*]] = ptrtoint i8* [[DST:%.*]] to i64, !nosanitize
				// CHECK: [[DSTMASK:%.*]] = and i64 [[DSTINT]], 3, !nosanitize
				// CHECK: [[DSTOK:%.*]] = icmp eq i64 [[DSTMASK]], 0, !nosanitize
				// CHECK: br i1 [[DSTOK]], label %[[CONT0:.*]], label %[[DSTFAILED:[^,]*]]

				// CHECK: [[DSTFAILED]]:
				// CHECK: [[DSTINT2:%.*]] = ptrtoint i8* [[DST]] to i64, !nosanitize
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[DSTINT2]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT0]]:
				// CHECK: [[SRCINT:%.*]] = ptrtoint i8* [[SRC:%.*]] to i64, !nosanitize
				// CHECK: [[SRCMASK:%.*]] = and i64 [[SRCINT]], 3, !nosanitize
				// CHECK: [[SRCOK:%.*]] = icmp eq i64 [[SRCMASK]], 0, !nosanitize
				// CHECK: br i1 [[SRCOK]], label %[[CONT1:.*]], label %[[SRCFAILED:[^,]*]]

				// CHECK: [[SRCFAILED]]:
				// CHECK: [[SRCINT2:%.*]] = ptrtoint i8* [[SRC]] to i64, !nosanitize
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[SRCINT2]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT1]]:
				// CHECK: [[SZREM:%.*]] = urem i64 [[SZ:%.*]], 4, !nosanitize
				// CHECK: [[SZOK:%.*]] = icmp eq i64 [[SZREM]], 0, !nosanitize
				// CHECK: br i1 [[SZOK]], label %[[CONT2:.*]], label %[[SZFAILED:[^,]*]]

				// CHECK: [[SZFAILED]]:
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[SZ]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT2]]:
				// CHECK: call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 [[DST]], i8* align 4 [[SRC]], i64 [[SZ]], i32 4)
				__builtin_memcpy_sized(dst, src, sz, 4);
				}

				// CHECK-LABEL: define void @check_memmove(
				void check_memmove(char *dst, const char *src, size_t sz) {
				// CHECK: [[DSTINT:%.*]] = ptrtoint i8* [[DST:%.*]] to i64, !nosanitize
				// CHECK: [[DSTMASK:%.*]] = and i64 [[DSTINT]], 3, !nosanitize
				// CHECK: [[DSTOK:%.*]] = icmp eq i64 [[DSTMASK]], 0, !nosanitize
				// CHECK: br i1 [[DSTOK]], label %[[CONT0:.*]], label %[[DSTFAILED:[^,]*]]

				// CHECK: [[DSTFAILED]]:
				// CHECK: [[DSTINT2:%.*]] = ptrtoint i8* [[DST]] to i64, !nosanitize
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[DSTINT2]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT0]]:
				// CHECK: [[SRCINT:%.*]] = ptrtoint i8* [[SRC:%.*]] to i64, !nosanitize
				// CHECK: [[SRCMASK:%.*]] = and i64 [[SRCINT]], 3, !nosanitize
				// CHECK: [[SRCOK:%.*]] = icmp eq i64 [[SRCMASK]], 0, !nosanitize
				// CHECK: br i1 [[SRCOK]], label %[[CONT1:.*]], label %[[SRCFAILED:[^,]*]]

				// CHECK: [[SRCFAILED]]:
				// CHECK: [[SRCINT2:%.*]] = ptrtoint i8* [[SRC]] to i64, !nosanitize
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[SRCINT2]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT1]]:
				// CHECK: [[SZREM:%.*]] = urem i64 [[SZ:%.*]], 4, !nosanitize
				// CHECK: [[SZOK:%.*]] = icmp eq i64 [[SZREM]], 0, !nosanitize
				// CHECK: br i1 [[SZOK]], label %[[CONT2:.*]], label %[[SZFAILED:[^,]*]]

				// CHECK: [[SZFAILED]]:
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[SZ]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT2]]:
				// CHECK: call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 4 [[DST]], i8* align 4 [[SRC]], i64 [[SZ]], i32 4)
				__builtin_memmove_sized(dst, src, sz, 4);
				}

				// CHECK-LABEL: define void @check_memset(
				void check_memset(char *dst, size_t sz) {
				// CHECK: [[DSTINT:%.*]] = ptrtoint i8* [[DST:%.*]] to i64, !nosanitize
				// CHECK: [[DSTMASK:%.*]] = and i64 [[DSTINT]], 3, !nosanitize
				// CHECK: [[DSTOK:%.*]] = icmp eq i64 [[DSTMASK]], 0, !nosanitize
				// CHECK: br i1 [[DSTOK]], label %[[CONT0:.*]], label %[[DSTFAILED:[^,]*]]

				// CHECK: [[DSTFAILED]]:
				// CHECK: [[DSTINT2:%.*]] = ptrtoint i8* [[DST]] to i64, !nosanitize
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[DSTINT2]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT0]]:
				// CHECK: [[SZREM:%.*]] = urem i64 [[SZ:%.*]], 4, !nosanitize
				// CHECK: [[SZOK:%.*]] = icmp eq i64 [[SZREM]], 0, !nosanitize
				// CHECK: br i1 [[SZOK]], label %[[CONT1:.*]], label %[[SZFAILED:[^,]*]]

				// CHECK: [[SZFAILED]]:
				// CHECK: call void @__ubsan_handle_invalid_builtin_abort({{.*}}, i64 [[SZ]])
				// CHECK: unreachable, !nosanitize

				// CHECK: [[CONT1]]:
				// CHECK: call void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* align 4 [[DST]], i8 42, i64 [[SZ]], i32 4)
				__builtin_memset_sized(dst, 42, sz, 4);
				}
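
The CHECK lines in this file verify three runtime conditions before the element-wise atomic intrinsic is emitted: the destination address masked with 3 is zero, the source address masked with 3 is zero, and the size is a multiple of 4. A minimal C sketch of the same predicate follows. The names `is_aligned_to` and `sized_mem_args_ok` are hypothetical, introduced only for illustration; they are not part of Clang or compiler-rt, and the sketch assumes the element size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): true when p is a multiple of
 * elem_size. Assumes elem_size is a non-zero power of two, so that
 * (elem_size - 1) can be used as a bit mask. */
static bool is_aligned_to(const void *p, size_t elem_size) {
    assert(elem_size != 0 && (elem_size & (elem_size - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(elem_size - 1)) == 0;
}

/* Hypothetical helper (illustration only): mirrors the order of the checks
 * in the test, destination alignment, then source alignment, then that the
 * byte count is a whole number of elements. */
static bool sized_mem_args_ok(const void *dst, const void *src,
                              size_t size, size_t elem_size) {
    return is_aligned_to(dst, elem_size) &&
           is_aligned_to(src, elem_size) &&
           size % elem_size == 0;
}
```

With an element size of 4 the mask is 3 and the remainder check is `size % 4`, which matches the constants in the CHECK lines. The memset variant has no source pointer, so only the destination and size checks apply there.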

clang/test/CodeGenObjC/builtin-memfns.m

	// RUN: %clang_cc1 -triple x86_64-apple-macosx10.8.0 -emit-llvm -o - %s | FileCheck %s			// RUN: %clang_cc1 -triple x86_64-apple-macosx10.8.0 -emit-llvm -o - %s | FileCheck %s

	void *memcpy(void *restrict s1, const void *restrict s2, unsigned long n);			typedef __SIZE_TYPE__ size_t;

				void *memcpy(void *restrict s1, const void *restrict s2, size_t n);
				void *memmove(void *restrict s1, const void *restrict s2, size_t n);
				void *memset(void *s1, int v, size_t n);

	// PR13697			// PR13697
	void test1(int *a, id b) {			void cpy1(int *a, id b) {
	// CHECK: @test1			// CHECK-LABEL: @cpy1(
				// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)
				memcpy(a, b, 8);
				}

				void cpy2(id a, int *b) {
				// CHECK-LABEL: @cpy2(
	// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)			// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)
	memcpy(a, b, 8);			memcpy(a, b, 8);
	}			}

				void move1(int *a, id b) {
				// CHECK-LABEL: @move1(
				// CHECK: call void @llvm.memmove.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)
				memmove(a, b, 8);
				}

				void move2(id a, int *b) {
				// CHECK-LABEL: @move2(
				// CHECK: call void @llvm.memmove.p0i8.p0i8.i64(i8* {{.*}}, i8* {{.*}}, i64 8, i1 false)
				memmove(a, b, 8);
				}

				void set(id a) {
				// CHECK-LABEL: @set(
				// CHECK: call void @llvm.memset.p0i8.i64(i8* {{.*}}, i8 42, i64 8, i1 false)
				memset(a, 42, 8);
				}

clang/test/Sema/builtin-sized-memfns.cpp

This file was added.

				// RUN: %clang_cc1 %s -verify -fsyntax-only -triple=arm64-unknown-unknown -fms-extensions -DCPY=1
				// RUN: %clang_cc1 %s -verify -fsyntax-only -triple=arm64-unknown-unknown -fms-extensions -DCPY=0

				// Test memcpy and memmove with the same code, since they're basically the same constraints.
				#if CPY
				#define MEM(...) __builtin_memcpy_sized(__VA_ARGS__)
				#else
				#define MEM(...) __builtin_memmove_sized(__VA_ARGS__)
				#endif

				#define NULL (void *)0
				#define nullptr __nullptr
				using size_t = __SIZE_TYPE__;
				using sizeless_t = __SVInt8_t;
				using float4 = float __attribute__((ext_vector_type(4)));
				struct Intish {
				int i;
				};
				struct NotLockFree {
				char buf[512];
				};
				struct TrivialCpy {
				char buf[8];
				TrivialCpy();
				TrivialCpy(const TrivialCpy &) = default;
				};
				struct NotTrivialCpy {
				char buf[8];
				NotTrivialCpy();
				NotTrivialCpy(const NotTrivialCpy &);
				};

				constexpr int CONSTEXPR_ONE = 1;

				void arg_count() {
				MEM(); // expected-error {{too few arguments to function call, expected 3, have 0}}
				MEM(0); // expected-error {{too few arguments to function call, expected 3, have 1}}
				MEM(0, 0); // expected-error {{too few arguments to function call, expected 3, have 2}}
				MEM(0, 0, 0, 0, 0); // expected-error {{too many arguments to function call, expected 4, have 5}}
				__builtin_memset_sized(); // expected-error {{too few arguments to function call, expected 3, have 0}}
				__builtin_memset_sized(0); // expected-error {{too few arguments to function call, expected 3, have 1}}
				__builtin_memset_sized(0, 0); // expected-error {{too few arguments to function call, expected 3, have 2}}
				__builtin_memset_sized(0, 0, 0, 0, 0); // expected-error {{too many arguments to function call, expected 4, have 5}}
				}

				void null(char *dst, const char *src, size_t size) {
				MEM(0, src, 0); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(0, src, size); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, 0, 0); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, 0, size); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				__builtin_memset_sized(0, 0, 0); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				__builtin_memset_sized(0, 0, size); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, 0, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, 0, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, NULL, 42); // expected-warning {{null passed to a callee that requires a non-null argument}}
				MEM(dst, nullptr, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'nullptr_t'}}
				MEM(0, src, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(NULL, src, 42); // expected-warning {{null passed to a callee that requires a non-null argument}}
				MEM(nullptr, src, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'nullptr_t'}}
				__builtin_memset_sized(0, 0, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				__builtin_memset_sized(NULL, 0, 42); // expected-warning {{null passed to a callee that requires a non-null argument}}
				__builtin_memset_sized(nullptr, 0, 42); // expected-error{{cannot initialize a parameter of type 'void *' with an rvalue of type 'nullptr_t'}}
				}

				void good_arg_types(char *dst, const char *src, size_t size) {
				MEM(dst, src, 0);
				MEM(dst, dst, ~(size_t)0);
				MEM(dst, src, 42);
				MEM(dst, src, size);
				MEM(dst, (char *)src, size);
				MEM(dst, (const void *)src, size);
				MEM((void *)dst, src, size);
				MEM(dst, (volatile const char *)src, size);
				MEM((volatile char *)dst, src, size);
				MEM(dst, (__unaligned const char *)src, size);
				MEM((__unaligned char *)dst, src, size);

				MEM(dst, (const __attribute__((address_space(32))) char *)src, size);
				MEM((__attribute__((address_space(32))) char *)dst, src, size);
				MEM((__attribute__((address_space(32))) char *)dst, (const __attribute__((address_space(64))) char *)src, size);
				MEM(dst, (__attribute__((address_space(32))) __unaligned const volatile void *)src, size);
				MEM((__attribute__((address_space(32))) __unaligned volatile void *)dst, src, size);

				MEM(dst, (const char *)src, size, 1);
				MEM(dst, (const char *)src, size, 2);
				MEM(dst, (const char *)src, size, 4);
				MEM(dst, (const char *)src, size, 8);
				MEM(dst, (const char *)src, size, 16);
				MEM((void *)dst, src, size, 1);
				MEM(dst, (const void *)src, size, 1);
				MEM((void *)dst, src, size, 4);
				MEM(dst, (const void *)src, size, 4);
				MEM((int *)dst, (const Intish *)src, size, 4);
				MEM((Intish *)dst, (const int *)src, size, 4);
				MEM((int *)dst, src, size, 1);
				MEM(dst, (const int *)src, size, 1);
				MEM((int *)dst, src, size, 2);
				MEM(dst, (const int *)src, size, 2);
				MEM((int *)dst, src, size, 8);
				MEM(dst, (const int *)src, size, 8);
				MEM(dst, src, size, CONSTEXPR_ONE);

				__builtin_memset_sized(dst, 0, 0);
				__builtin_memset_sized(dst, 0, ~(size_t)0);
				__builtin_memset_sized(dst, 0, 42);
				__builtin_memset_sized(dst, 0, size);
				__builtin_memset_sized((void *)dst, 0, size);
				__builtin_memset_sized((volatile char *)dst, 0, size);
				__builtin_memset_sized((__unaligned char *)dst, 0, size);
				__builtin_memset_sized((int *)dst, 0, size);
				__builtin_memset_sized((__attribute__((address_space(32))) char *)dst, 0, size);
				__builtin_memset_sized((__attribute__((address_space(32))) __unaligned volatile void *)dst, 0, size);

				__builtin_memset_sized((char *)dst, 0, size, 1);
				__builtin_memset_sized((char *)dst, 0, size, 2);
				__builtin_memset_sized((char *)dst, 0, size, 4);
				__builtin_memset_sized((char *)dst, 0, size, 8);
				__builtin_memset_sized((char *)dst, 0, size, 16);
				__builtin_memset_sized((void *)dst, 0, size, 1);
				__builtin_memset_sized((void *)dst, 0, size, 4);
				__builtin_memset_sized((Intish *)dst, 0, size, 4);
				__builtin_memset_sized((int *)dst, 0, size, 1);
				__builtin_memset_sized((int *)dst, 0, size, 2);
				__builtin_memset_sized((int *)dst, 0, size, 8);
				__builtin_memset_sized(dst, 0, size, CONSTEXPR_ONE);
				}

				// expected-note@+1 2 {{declared here}}
				void bad_arg_types(char *dst, const char *src, size_t size) {
				MEM(dst, 42, size); // expected-error {{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(42, src, size); // expected-error {{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				MEM(dst, src, dst); // expected-error {{cannot initialize a parameter of type 'unsigned long' with an lvalue of type 'char *'}}
				MEM((const char *)dst, src, size); // expected-error {{argument must be non-const, got 'const char'}}
				MEM((__attribute__((address_space(32))) __unaligned const volatile char *)dst, src, size); // expected-error {{argument must be non-const, got 'const volatile __unaligned __attribute__((address_space(32))) char'}}

				MEM(dst, (_Atomic const char *)src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('const _Atomic(char)' invalid)}}
				MEM((_Atomic char *)dst, src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('_Atomic(char)' invalid)}}
				MEM((int *)dst, (_Atomic const Intish *)src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('const _Atomic(Intish)' invalid)}}
				MEM((_Atomic Intish *)dst, (const int *)src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('_Atomic(Intish)' invalid)}}
				MEM((void *)dst, (_Atomic const int *)src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('const _Atomic(int)' invalid)}}
				MEM((_Atomic int *)dst, (const void *)src, size); // expected-error{{parameter cannot have the _Atomic qualifier ('_Atomic(int)' invalid)}}

				// expected-note@+1 {{read of non-const variable 'size' is not allowed in a constant expression}}
				MEM(dst, src, size, size); // expected-error{{expression is not an integral constant expression}}
				MEM(dst, src, size, -1); // expected-error{{argument should be a power of 2}}
				MEM(dst, src, size, 0); // expected-error{{argument should be a power of 2}}
				MEM(dst, src, size, 3); // expected-error{{argument should be a power of 2}}
				MEM(dst, src, size, 32); // expected-error{{lock-free}}
				MEM((NotLockFree *)dst, (const NotLockFree *)src, size, sizeof(NotLockFree)); // expected-error{{element size must be a lock-free size, 512 exceeds 16 bytes}}
				MEM(dst, (volatile const char *)src, size, 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('const volatile char' is volatile)}}
				MEM((volatile char *)dst, src, size, 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('volatile char' is volatile)}}

				__builtin_memset_sized(42, 0, size); // expected-error {{cannot initialize a parameter of type 'void *' with an rvalue of type 'int'}}
				__builtin_memset_sized((const char *)dst, 0, size); // expected-error {{argument must be non-const, got 'const char'}}
				__builtin_memset_sized((__attribute__((address_space(32))) __unaligned const volatile char *)dst, 0, size); // expected-error {{argument must be non-const, got 'const volatile __unaligned __attribute__((address_space(32))) char'}}
				__builtin_memset_sized((_Atomic char *)dst, 0, size); // expected-error{{parameter cannot have the _Atomic qualifier ('_Atomic(char)' invalid)}}
				__builtin_memset_sized((_Atomic Intish *)dst, 0, size); // expected-error{{parameter cannot have the _Atomic qualifier ('_Atomic(Intish)' invalid)}}

				// expected-note@+1 {{read of non-const variable 'size' is not allowed in a constant expression}}
				__builtin_memset_sized(dst, 0, size, size); // expected-error{{expression is not an integral constant expression}}
				__builtin_memset_sized(dst, 0, size, -1); // expected-error{{argument should be a power of 2}}
				__builtin_memset_sized(dst, 0, size, 0); // expected-error{{argument should be a power of 2}}
				__builtin_memset_sized(dst, 0, size, 3); // expected-error{{argument should be a power of 2}}
				__builtin_memset_sized(dst, 0, size, 32); // expected-error{{lock-free}}
				__builtin_memset_sized((volatile char *)dst, 0, size, 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('volatile char' is volatile)}}
				__builtin_memset_sized((NotLockFree *)dst, 0, size, sizeof(NotLockFree)); // expected-error{{element size must be a lock-free size, 512 exceeds 16 bytes}}
				}

				void array_arg_types() {
				extern char adst[512];
				extern volatile char avdst[512];
				extern const char asrc[512];
				extern const volatile char avsrc[512];

				MEM(adst, asrc, sizeof(adst));
				MEM(avdst, avsrc, sizeof(avdst));
				MEM(asrc, asrc, sizeof(adst)); // expected-error {{argument must be non-const, got 'const char'}}
				MEM(adst, asrc, sizeof(adst) + 1); // TODO diagnose size overflow?
				__builtin_memset_sized(adst, 0, sizeof(adst));
				__builtin_memset_sized(avdst, 0, sizeof(avdst));
				__builtin_memset_sized(asrc, 0, sizeof(asrc)); // expected-error {{argument must be non-const, got 'const char'}}
				__builtin_memset_sized(adst, 0, sizeof(adst) + 1); // TODO diagnose size overflow?
				}

				void atomic_array_arg_types() {
				extern char adst[512];
				extern volatile char avdst[512];
				extern const char asrc[512];
				extern const volatile char avsrc[512];

				MEM(adst, asrc, sizeof(adst), 1);
				MEM(avdst, asrc, sizeof(adst), 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('volatile char' is volatile)}}
				MEM(adst, avsrc, sizeof(adst), 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('const volatile char' is volatile)}}
				__builtin_memset_sized(adst, 0, sizeof(adst), 1);
				__builtin_memset_sized(avdst, 0, sizeof(avdst), 1); // expected-error{{specifying an access size for volatile memory operations is unsupported ('volatile char' is volatile)}}
				}

				void trivial_arg_types() {
				TrivialCpy trivialDst;
				const TrivialCpy trivialSrc;
				MEM(&trivialDst, &trivialSrc, sizeof(TrivialCpy));
				MEM((__attribute__((address_space(32))) __unaligned volatile TrivialCpy *)&trivialDst, (__attribute__((address_space(64))) __unaligned const volatile TrivialCpy *)&trivialSrc, sizeof(TrivialCpy));
				__builtin_memset_sized(&trivialDst, 0, sizeof(trivialDst));
				__builtin_memset_sized((__attribute__((address_space(32))) __unaligned volatile TrivialCpy *)&trivialDst, 0, sizeof(trivialDst));

				TrivialCpy trivialDstArr[2];
				const TrivialCpy trivialSrcArr[2];
				MEM(trivialDstArr, trivialSrcArr, sizeof(TrivialCpy) * 2);
				__builtin_memset_sized(trivialDstArr, 0, sizeof(TrivialCpy) * 2);
				}

				void nontrivial_arg_types() {
				NotTrivialCpy notTrivialDst;
				const NotTrivialCpy notTrivialSrc;
				MEM(&notTrivialDst, &notTrivialSrc, sizeof(NotTrivialCpy), sizeof(NotTrivialCpy)); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('NotTrivialCpy' invalid)}}
				__builtin_memset_sized(&notTrivialDst, 0, sizeof(NotTrivialCpy), sizeof(NotTrivialCpy)); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('NotTrivialCpy' invalid)}}

				NotTrivialCpy notTrivialDstArr[2];
				const NotTrivialCpy notTrivialSrcArr[2];
				MEM(notTrivialDstArr, notTrivialSrcArr, sizeof(NotTrivialCpy) * 2, sizeof(NotTrivialCpy)); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('NotTrivialCpy' invalid)}}
				__builtin_memset_sized(notTrivialDstArr, 0, sizeof(NotTrivialCpy) * 2, sizeof(NotTrivialCpy)); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('NotTrivialCpy' invalid)}}
				}

				class Incomplete;
				void inclomplete_arg_types(char *dst, const char *src, size_t size) {
				MEM((Incomplete *)dst, src, size, 1); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('Incomplete' invalid)}}
				MEM(dst, (const Incomplete *)src, size, 1); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('const Incomplete' invalid)}}
				__builtin_memset_sized((Incomplete *)dst, 0, size, 1); // expected-error{{address argument to atomic operation must be a pointer to a trivially-copyable type ('Incomplete' invalid)}}
				}

				void sizeless_arg_types(char *dst, const char *src, size_t size) {
				MEM((sizeless_t *)dst, src, size);
				MEM(dst, (const sizeless_t *)src, size);
				__builtin_memset_sized((sizeless_t *)dst, 0, size);

				MEM((sizeless_t *)dst, src, size, 1);
				MEM(dst, (const sizeless_t *)src, size, 1);
				__builtin_memset_sized((sizeless_t *)dst, 0, size, 1);
				}

				void vector_arg_types(char *dst, const char *src, size_t size) {
				MEM((float4 *)dst, src, size);
				MEM(dst, (const float4 *)src, size);
				__builtin_memset_sized((float4 *)dst, 0, size);

				MEM((float4 *)dst, (const float4 *)src, size, sizeof(float4));
				MEM((float4 *)dst, (const float4 *)src, size, sizeof(float4));
				__builtin_memset_sized((float4 *)dst, 0, size, sizeof(float4));
				}

				void extint_arg_types(char *dst, const char *src, size_t size) {
				MEM((_ExtInt(2) *)dst, src, size);
				MEM(dst, (const _ExtInt(2) *)src, size);
				__builtin_memset_sized((_ExtInt(2) *)dst, 0, size);

				MEM((_ExtInt(8) *)dst, (const _ExtInt(8) *)src, size, 1);
				__builtin_memset_sized((_ExtInt(8) *)dst, 0, size, 1);
				}
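
The element-size diagnostics exercised in this test (not a power of 2, not a lock-free size) can be summarised with a small model. This is a hypothetical sketch, not code from the patch: the names `check_element_size` and `kMaxLockFreeBytes` are invented here, and the 16-byte limit is simply the value quoted in the "512 exceeds 16 bytes" diagnostic, which is presumably target-dependent.

```cpp
#include <cstddef>

// Hypothetical model of the element-size rules the Sema test exercises.
// Assumption: 16 bytes is the largest lock-free size, as quoted in the
// "512 exceeds 16 bytes" diagnostic; the real limit depends on the target.
constexpr std::size_t kMaxLockFreeBytes = 16;

enum class ElementSizeCheck { Ok, NotPowerOfTwo, NotLockFree };

// A value is a power of two when it is positive and has exactly one bit set.
constexpr bool is_power_of_two(long long v) {
  return v > 0 && (v & (v - 1)) == 0;
}

// Classifies an element size in the same order the expected errors suggest:
// the power-of-2 rule first, then the lock-free size limit.
constexpr ElementSizeCheck check_element_size(long long element_size) {
  if (!is_power_of_two(element_size))
    return ElementSizeCheck::NotPowerOfTwo;
  if (static_cast<unsigned long long>(element_size) > kMaxLockFreeBytes)
    return ElementSizeCheck::NotLockFree;
  return ElementSizeCheck::Ok;
}

// The same element sizes the test passes as the fourth argument.
static_assert(check_element_size(-1) == ElementSizeCheck::NotPowerOfTwo, "");
static_assert(check_element_size(0) == ElementSizeCheck::NotPowerOfTwo, "");
static_assert(check_element_size(3) == ElementSizeCheck::NotPowerOfTwo, "");
static_assert(check_element_size(32) == ElementSizeCheck::NotLockFree, "");
static_assert(check_element_size(1) == ElementSizeCheck::Ok, "");
```

The model leaves out the other rejections covered above (non-constant element size, volatile pointees, `_Atomic` pointees, non-trivially-copyable types), which depend on type information rather than on the integer value.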

clang/test/SemaCXX/constexpr-string.cpp

#define fold(x) (__builtin_constant_p(0) ? (x) : (x))
constexpr bool test_address_of_incomplete_struct_type() { // expected-error {{never produces a constant}}		constexpr bool test_address_of_incomplete_struct_type() { // expected-error {{never produces a constant}}
struct Incomplete;		struct Incomplete;
extern Incomplete x, y;		extern Incomplete x, y;
__builtin_memcpy(&x, &x, 4);		__builtin_memcpy(&x, &x, 4);
// expected-note@-1 2{{cannot constant evaluate 'memcpy' between objects of incomplete type 'Incomplete'}}		// expected-note@-1 2{{cannot constant evaluate 'memcpy' between objects of incomplete type 'Incomplete'}}
return true;		return true;
}		}
static_assert(test_address_of_incomplete_struct_type()); // expected-error {{constant}} expected-note {{in call}}		static_assert(test_address_of_incomplete_struct_type()); // expected-error {{constant}} expected-note {{in call}}

		template <typename T, int ElNum>
		constexpr auto test_memcpy_sized(int dst_off, int src_off, int num) {
		T dst[4] = {0, 0, 0, 0};
		const T src[4] = {1, 2, 3, 4};
		// expected-note@+2 {{size parameter is 12, expected a size that is evenly divisible by element size 8}}
		// expected-note@+1 {{size parameter is 4, expected a size that is evenly divisible by element size 8}}
		__builtin_memcpy_sized(dst + dst_off, src + src_off, num * sizeof(T), ElNum * sizeof(T));
		return result(dst);
		}

		static_assert(test_memcpy_sized<int, 1>(0, 0, 1) == 1000);
		static_assert(test_memcpy_sized<int, 1>(0, 0, 2) == 1200);
		static_assert(test_memcpy_sized<int, 1>(0, 0, 3) == 1230);
		static_assert(test_memcpy_sized<int, 1>(0, 0, 4) == 1234);
		static_assert(test_memcpy_sized<int, 2>(0, 0, 4) == 1234);

		// expected-error@+1 {{static_assert expression is not an integral constant expression}}
		static_assert(test_memcpy_sized<int, 2>(0, 0, 3) == 1234); // expected-note {{in call to 'test_memcpy_sized(0, 0, 3)'}}
		// expected-error@+1 {{static_assert expression is not an integral constant expression}}
		static_assert(test_memcpy_sized<int, 2>(0, 0, 1) == 1234); // expected-note {{in call to 'test_memcpy_sized(0, 0, 1)'}}
}		}
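
The `test_memcpy_sized` cases above hinge on one rule: the byte count must be evenly divisible by the element size, otherwise constant evaluation fails. A minimal sketch of that rule in standard C++ follows. It is only a model, not the builtin: `model_memcpy_sized`, `Outcome` and `pack_digits` are hypothetical names, and `pack_digits` is an assumed stand-in for the test's `result` helper, which is not shown in this excerpt.

```cpp
#include <cstddef>

// Hypothetical result type: `ok` is false where the real builtin would
// refuse to constant-evaluate.
struct Outcome {
  bool ok;
  int packed;
};

// Assumed stand-in for the test's `result` helper (not shown in the diff):
// packs four single-digit elements into one decimal number.
constexpr int pack_digits(const int *a) {
  return a[0] * 1000 + a[1] * 100 + a[2] * 10 + a[3];
}

// Models the divisibility rule from the expected notes: the size in bytes
// must be a whole multiple of the element size in bytes.
template <int ElNum>
constexpr Outcome model_memcpy_sized(int num) {
  int dst[4] = {0, 0, 0, 0};
  const int src[4] = {1, 2, 3, 4};
  const std::size_t size = static_cast<std::size_t>(num) * sizeof(int);
  const std::size_t element_size = static_cast<std::size_t>(ElNum) * sizeof(int);
  if (size % element_size != 0)
    return {false, 0};
  for (int i = 0; i < num; ++i)
    dst[i] = src[i];
  return {true, pack_digits(dst)};
}

static_assert(model_memcpy_sized<1>(3).packed == 1230, "");
static_assert(model_memcpy_sized<2>(4).packed == 1234, "");
static_assert(!model_memcpy_sized<2>(3).ok, "");
```

With a 4-byte `int`, the two rejected calls correspond to the notes above: sizes 12 and 4 against an element size of 8.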

compiler-rt/lib/ubsan/ubsan_handlers.h

	};			};

	/// \brief Implict conversion that changed the value.			/// \brief Implict conversion that changed the value.
	RECOVERABLE(implicit_conversion, ImplicitConversionData *Data, ValueHandle Src,			RECOVERABLE(implicit_conversion, ImplicitConversionData *Data, ValueHandle Src,
	ValueHandle Dst)			ValueHandle Dst)

	/// Known builtin check kinds.			/// Known builtin check kinds.
	/// Keep in sync with the enum of the same name in CodeGenFunction.h			/// Keep in sync with the enum of the same name in CodeGenFunction.h
	enum BuiltinCheckKind : unsigned char {			enum BuiltinCheck : unsigned char {
	BCK_CTZPassedZero,			CTZPassedZero,
	BCK_CLZPassedZero,			CLZPassedZero,
				AtomicMemMisaligned,
				AtomicMemMismatchedSize,
	};			};

	struct InvalidBuiltinData {			struct InvalidBuiltinData {
	SourceLocation Loc;			SourceLocation Loc;
	unsigned char Kind;			unsigned char Kind;
				unsigned char UnusedPadding;
				unsigned ElementSize;
	};			};

	/// Handle a builtin called in an invalid way.			/// Handle a builtin called in an invalid way.
	RECOVERABLE(invalid_builtin, InvalidBuiltinData *Data)			RECOVERABLE(invalid_builtin, InvalidBuiltinData *Data, ValueHandle PtrOrSize)

	struct InvalidObjCCast {			struct InvalidObjCCast {
	SourceLocation Loc;			SourceLocation Loc;
	const TypeDescriptor &ExpectedType;			const TypeDescriptor &ExpectedType;
	};			};

	/// Handle an invalid ObjC cast.			/// Handle an invalid ObjC cast.
	RECOVERABLE(invalid_objc_cast, InvalidObjCCast *Data, ValueHandle Pointer)			RECOVERABLE(invalid_objc_cast, InvalidObjCCast *Data, ValueHandle Pointer)

compiler-rt/lib/ubsan/ubsan_handlers.cpp

	}			}
	void __ubsan::__ubsan_handle_implicit_conversion_abort(			void __ubsan::__ubsan_handle_implicit_conversion_abort(
	ImplicitConversionData *Data, ValueHandle Src, ValueHandle Dst) {			ImplicitConversionData *Data, ValueHandle Src, ValueHandle Dst) {
	GET_REPORT_OPTIONS(true);			GET_REPORT_OPTIONS(true);
	handleImplicitConversion(Data, Opts, Src, Dst);			handleImplicitConversion(Data, Opts, Src, Dst);
	Die();			Die();
	}			}

	static void handleInvalidBuiltin(InvalidBuiltinData *Data, ReportOptions Opts) {			static void handleInvalidBuiltin(InvalidBuiltinData *Data, ReportOptions Opts,
				uptr PtrOrSize) {
	SourceLocation Loc = Data->Loc.acquire();			SourceLocation Loc = Data->Loc.acquire();
	ErrorType ET = ErrorType::InvalidBuiltin;			ErrorType ET = ErrorType::InvalidBuiltin;

	if (ignoreReport(Loc, Opts, ET))			if (ignoreReport(Loc, Opts, ET))
	return;			return;

	ScopedReport R(Opts, Loc, ET);			ScopedReport R(Opts, Loc, ET);

				switch (Data->Kind) {
				case BuiltinCheck::CTZPassedZero:
				Diag(Loc, DL_Error, ET,
				"passing zero to ctz(), which is not a valid argument");
				break;
				case BuiltinCheck::CLZPassedZero:
				Diag(Loc, DL_Error, ET,
				"passing zero to clz(), which is not a valid argument");
				break;
				case BuiltinCheck::AtomicMemMisaligned:
				Diag(Loc, DL_Error, ET,
				"passing pointer %0 with invalid alignment %1 into "
				"__builtin_mem*_sized, element size %2")
				<< (void *)PtrOrSize << ((Data->ElementSize - 1) & PtrOrSize)
				<< Data->ElementSize;
				break;
				case BuiltinCheck::AtomicMemMismatchedSize:
	Diag(Loc, DL_Error, ET,			Diag(Loc, DL_Error, ET,
	"passing zero to %0, which is not a valid argument")			"passing an invalid size %0 with element size %1 to "
	<< ((Data->Kind == BCK_CTZPassedZero) ? "ctz()" : "clz()");			"__builtin_mem*_sized")
				<< PtrOrSize << Data->ElementSize;
				break;
				default:
				UNREACHABLE("unexpected builtin kind!");
				}
	}			}

	void __ubsan::__ubsan_handle_invalid_builtin(InvalidBuiltinData *Data) {			void __ubsan::__ubsan_handle_invalid_builtin(InvalidBuiltinData *Data,
				vsk: It looks like `__ubsan_handle_invalid_builtin` is meant to be recoverable, so I think this should be `GET_REPORT_OPTIONS(false)`. Marking this unrecoverable makes it impossible to suppress redundant diagnostics at the same source location. It looks like this isn't code you've added: feel free to punt this to me. If you don't mind folding in a fix, adding a test would be simple (perform UB in a loop and verify only one diagnostic is printed).
				jfb: I folded this into the patch.
	GET_REPORT_OPTIONS(true);			uptr PtrOrSize) {
	handleInvalidBuiltin(Data, Opts);			GET_REPORT_OPTIONS(false);
				handleInvalidBuiltin(Data, Opts, PtrOrSize);
	}			}
	void __ubsan::__ubsan_handle_invalid_builtin_abort(InvalidBuiltinData *Data) {			void __ubsan::__ubsan_handle_invalid_builtin_abort(InvalidBuiltinData *Data,
				uptr PtrOrSize) {
	GET_REPORT_OPTIONS(true);			GET_REPORT_OPTIONS(true);
	handleInvalidBuiltin(Data, Opts);			handleInvalidBuiltin(Data, Opts, PtrOrSize);
	Die();			Die();
	}			}

	static void handleInvalidObjCCast(InvalidObjCCast *Data, ValueHandle Pointer,			static void handleInvalidObjCCast(InvalidObjCCast *Data, ValueHandle Pointer,
	ReportOptions Opts) {			ReportOptions Opts) {
	SourceLocation Loc = Data->Loc.acquire();			SourceLocation Loc = Data->Loc.acquire();
	ErrorType ET = ErrorType::InvalidObjCCast;			ErrorType ET = ErrorType::InvalidObjCCast;

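
The two new diagnostics in `handleInvalidBuiltin` report a misaligned pointer and a mismatched size. The arithmetic behind them can be sketched as below. The misalignment expression is taken from the handler itself (`(Data->ElementSize - 1) & PtrOrSize`). The size predicate is an assumption inferred from the test expectations (32, 2 and 0 accepted, 1 and 43 rejected for element size 2), since the check that CodeGen emits is not shown in this excerpt. Both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Low address bits that violate alignment to element_size, which is expected
// to be a power of two. Same expression the handler prints as the
// "invalid alignment" value.
constexpr std::uintptr_t misaligned_bits(std::uintptr_t ptr,
                                         unsigned element_size) {
  return (element_size - 1) & ptr;
}

// Assumption: a size is acceptable when it is a whole multiple of the
// element size. Inferred from the runtime test, not from the CodeGen change.
constexpr bool size_matches_element(std::size_t size, unsigned element_size) {
  return size % element_size == 0;
}

static_assert(misaligned_bits(0x1000, 2) == 0, "");
static_assert(misaligned_bits(0x1001, 2) == 1, "");
static_assert(size_matches_element(32, 2), "");
static_assert(!size_matches_element(43, 2), "");
```

A non-zero result from `misaligned_bits` corresponds to the `AtomicMemMisaligned` kind, and a failing `size_matches_element` to `AtomicMemMismatchedSize`.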

compiler-rt/test/ubsan/TestCases/Misc/builtins-ctz-clz.cpp

This file was added.

				// REQUIRES: arch=x86_64
				//
				jfb: Phab is confused.... I did a git rename of `compiler-rt/test/ubsan/TestCases/Misc/builtins.cpp` and it thinks this is new, and I deleted the other.
				// RUN: %clangxx -fsanitize=builtin -w %s -O3 -o %t
				// RUN: %run %t 2>&1 | FileCheck %s --check-prefix=RECOVER
				// RUN: %clangxx -fsanitize=builtin -fno-sanitize-recover=builtin -w %s -O3 -o %t.abort
				// RUN: not %run %t.abort 2>&1 | FileCheck %s --check-prefix=ABORT

				void check_ctz(int n) {
				// ABORT: builtins-ctz-clz.cpp:[[@LINE+2]]:17: runtime error: passing zero to ctz(), which is not a valid argument
				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:17: runtime error: passing zero to ctz(), which is not a valid argument
				__builtin_ctz(n);

				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:18: runtime error: passing zero to ctz(), which is not a valid argument
				__builtin_ctzl(n);

				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:19: runtime error: passing zero to ctz(), which is not a valid argument
				__builtin_ctzll(n);
				}

				void check_clz(int n) {
				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:17: runtime error: passing zero to clz(), which is not a valid argument
				__builtin_clz(n);

				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:18: runtime error: passing zero to clz(), which is not a valid argument
				__builtin_clzl(n);

				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:19: runtime error: passing zero to clz(), which is not a valid argument
				__builtin_clzll(n);
				}

				void check_recoverable(int n) {
				// Print out only once.
				for (int i = 0; i != 100; ++i) {
				// RECOVER: builtins-ctz-clz.cpp:[[@LINE+1]]:19: runtime error: passing zero to clz(), which is not a valid argument
				__builtin_clz(n);
				}
				}

				int main() {
				check_ctz(0);
				check_clz(0);
				check_recoverable(0);
				return 0;
				}

compiler-rt/test/ubsan/TestCases/Misc/builtins-mem_sized.cpp

This file was added.

				// RUN: %clangxx -fsanitize=builtin -w %s -O3 -o %t
				// RUN: %run %t 2>&1 | FileCheck %s

				using uintptr_t = __UINTPTR_TYPE__;
				using size_t = __SIZE_TYPE__;

				void check_memcpy_align(char *dst_aligned, char *dst_misaligned, const char *src_aligned, const char *src_misaligned, size_t sz) {
				// OK.
				__builtin_memcpy_sized(dst_aligned, src_aligned, sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:26: runtime error: passing pointer 0x{{[0-9a-f]*}} with invalid alignment 1 into __builtin_mem*_sized, element size 2
				__builtin_memcpy_sized(dst_misaligned, src_aligned, sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:39: runtime error: passing pointer 0x{{[0-9a-f]*}} with invalid alignment 1 into __builtin_mem*_sized, element size 2
				__builtin_memcpy_sized(dst_aligned, src_misaligned, sz, 2);
				}

				void check_memmove_align(char *dst_aligned, char *dst_misaligned, const char *src_aligned, const char *src_misaligned, size_t sz) {
				// OK.
				__builtin_memmove_sized(dst_aligned, src_aligned, sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:27: runtime error: passing pointer 0x{{[0-9a-f]*}} with invalid alignment 1 into __builtin_mem*_sized, element size 2
				__builtin_memmove_sized(dst_misaligned, src_aligned, sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:40: runtime error: passing pointer 0x{{[0-9a-f]*}} with invalid alignment 1 into __builtin_mem*_sized, element size 2
				__builtin_memmove_sized(dst_aligned, src_misaligned, sz, 2);
				}

				void check_memset_align(char *dst_aligned, char *dst_misaligned, size_t sz) {
				// OK.
				__builtin_memset_sized(dst_aligned, 0, sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:26: runtime error: passing pointer 0x{{[0-9a-f]*}} with invalid alignment 1 into __builtin_mem*_sized, element size 2
				__builtin_memset_sized(dst_misaligned, 0, sz, 2);
				}

				void check_memcpy_size(char *dst, char *src) {
				// OK.
				__builtin_memcpy_sized(dst, src, 32, 2);
				__builtin_memcpy_sized(dst, src, 2, 2);
				__builtin_memcpy_sized(dst, src, 0, 2);
				volatile size_t small_bad_sz = 1;
				volatile size_t big_bad_sz = 43;
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:36: runtime error: passing an invalid size 1 with element size 2 to __builtin_mem*_sized
				__builtin_memcpy_sized(dst, src, small_bad_sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:36: runtime error: passing an invalid size 43 with element size 2 to __builtin_mem*_sized
				__builtin_memcpy_sized(dst, src, big_bad_sz, 2);
				}

				void check_memmove_size(char *dst, char *src) {
				// OK.
				__builtin_memmove_sized(dst, src, 32, 2);
				__builtin_memmove_sized(dst, src, 2, 2);
				__builtin_memmove_sized(dst, src, 0, 2);
				volatile size_t small_bad_sz = 1;
				volatile size_t big_bad_sz = 43;
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:37: runtime error: passing an invalid size 1 with element size 2 to __builtin_mem*_sized
				__builtin_memmove_sized(dst, src, small_bad_sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:37: runtime error: passing an invalid size 43 with element size 2 to __builtin_mem*_sized
				__builtin_memmove_sized(dst, src, big_bad_sz, 2);
				}

				void check_memset_size(char *dst) {
				// OK.
				__builtin_memset_sized(dst, 0, 32, 2);
				__builtin_memset_sized(dst, 0, 2, 2);
				__builtin_memset_sized(dst, 0, 0, 2);
				volatile size_t small_bad_sz = 1;
				volatile size_t big_bad_sz = 43;
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:34: runtime error: passing an invalid size 1 with element size 2 to __builtin_mem*_sized
				__builtin_memset_sized(dst, 0, small_bad_sz, 2);
				// CHECK: builtins-mem_sized.cpp:[[@LINE+1]]:34: runtime error: passing an invalid size 43 with element size 2 to __builtin_mem*_sized
				__builtin_memset_sized(dst, 0, big_bad_sz, 2);
				}

				int main() {
				char dst0[128] = {0};
				char dst1[128] = {0};
				char src0[128] = {0};
				char src1[128] = {0};
				char *dst_aligned = ((uintptr_t)dst0 & 0x1) ? (dst0 + 1) : dst0;
				char *dst_misaligned = ((uintptr_t)dst1 & 0x1) ? dst1 : (dst1 + 1);
				char *src_aligned = ((uintptr_t)src0 & 0x1) ? (src0 + 1) : src0;
				char *src_misaligned = ((uintptr_t)src1 & 0x1) ? src1 : (src1 + 1);
				check_memcpy_align(dst_aligned, dst_misaligned, src_aligned, src_misaligned, 32);
				check_memmove_align(dst_aligned, dst_misaligned, src_aligned, src_misaligned, 32);
				check_memset_align(dst_aligned, dst_misaligned, 32);
				check_memcpy_size(dst_aligned, src_aligned);
				check_memmove_size(dst_aligned, src_aligned);
				check_memset_size(dst_aligned);

				return 0;
				}

				// The stubs don't actually need to do anything since we're not checking their behavior.
				extern "C" void *__llvm_memcpy_element_unordered_atomic_2(void *dst, void *src, size_t sz) { return nullptr; }
				extern "C" void *__llvm_memmove_element_unordered_atomic_2(void *, void *, size_t) { return nullptr; }
				extern "C" void *__llvm_memset_element_unordered_atomic_2(volatile short *, int, size_t) { return nullptr; }
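
The `main` function of this test builds its aligned and misaligned pointers by inspecting the low address bit of each buffer and stepping one byte forward when needed. A rough standalone sketch of that selection follows. The helper names `even_address` and `odd_address` are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Returns p or p + 1, whichever has an even address (2-byte aligned).
// The caller must ensure p points into a buffer of at least two bytes.
inline char *even_address(char *p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 0x1) ? (p + 1) : p;
}

// Returns p or p + 1, whichever has an odd address (misaligned for 2 bytes).
// The caller must ensure p points into a buffer of at least two bytes.
inline char *odd_address(char *p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 0x1) ? p : (p + 1);
}
```

Because the choice is made at run time from the actual address, the test does not depend on how the compiler happens to align the stack arrays.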

compiler-rt/test/ubsan/TestCases/Misc/builtins.cpp

This file was deleted.

	// REQUIRES: arch=x86_64
	//
	// RUN: %clangxx -fsanitize=builtin -w %s -O3 -o %t
	// RUN: %run %t 2>&1 | FileCheck %s --check-prefix=RECOVER
	// RUN: %clangxx -fsanitize=builtin -fno-sanitize-recover=builtin -w %s -O3 -o %t.abort
	// RUN: not %run %t.abort 2>&1 | FileCheck %s --check-prefix=ABORT

	void check_ctz(int n) {
	// ABORT: builtins.cpp:[[@LINE+2]]:17: runtime error: passing zero to ctz(), which is not a valid argument
	// RECOVER: builtins.cpp:[[@LINE+1]]:17: runtime error: passing zero to ctz(), which is not a valid argument
	__builtin_ctz(n);

	// RECOVER: builtins.cpp:[[@LINE+1]]:18: runtime error: passing zero to ctz(), which is not a valid argument
	__builtin_ctzl(n);

	// RECOVER: builtins.cpp:[[@LINE+1]]:19: runtime error: passing zero to ctz(), which is not a valid argument
	__builtin_ctzll(n);
	}

	void check_clz(int n) {
	// RECOVER: builtins.cpp:[[@LINE+1]]:17: runtime error: passing zero to clz(), which is not a valid argument
	__builtin_clz(n);

	// RECOVER: builtins.cpp:[[@LINE+1]]:18: runtime error: passing zero to clz(), which is not a valid argument
	__builtin_clzl(n);

	// RECOVER: builtins.cpp:[[@LINE+1]]:19: runtime error: passing zero to clz(), which is not a valid argument
	__builtin_clzll(n);
	}

	int main() {
	check_ctz(0);
	check_clz(0);
	return 0;
	}

Diff Detail

Unit Tests: Failed

Revision Contents

Diff 285414