Yesterday

rjmccall added a comment to D75574: RFC: Implement objc_direct_protocol attribute to remove protocol metadata.

This feature looks generally useful. A few small suggestions:

  • This is really a way of transforming a formal protocol into an informal protocol. Objective-C has had a convention of informal protocols since the '80s, but they're implemented as categories on the root class with no @implementation. I'd suggest that __attribute__((objc_informal_protocol)) or similar might be a better spelling for this, explicitly bringing the informal notion into the language. A lot of the informal protocols in Cocoa could be better expressed using this and @optional methods than as categories on NSObject.
Fri, Aug 7, 10:02 AM · Restricted Project

Thu, Aug 6

rjmccall added a comment to D82317: [Clang/Test]: Update tests where `noundef` attribute is necessary.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

This adds an attribute to every argument and return value where the language rules denote it cannot be undef (which is ideally in most places, but definitely not everywhere).

Thu, Aug 6, 2:52 PM · Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

They would have to declare it as well (because C code can #undef memcpy and expect to then be able to call a real function), so the #define would be pointless.
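To make the #undef point concrete, here is a minimal sketch. It uses a hypothetical function name, lib_copy, rather than memcpy itself, so that it does not redefine a standard library name; everything except __builtin_memcpy is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library function standing in for memcpy. The name is an
// assumption chosen so the sketch does not touch a standard library name.
void *lib_copy(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
  return dst;
}

// Header-style fast path: route ordinary calls to the compiler builtin
// (GCC and Clang both provide __builtin_memcpy).
#define lib_copy(d, s, n) __builtin_memcpy((d), (s), (n))

inline void copyViaMacro(int *dst, const int *src) {
  lib_copy(dst, src, sizeof(int)); // expands to __builtin_memcpy
}

inline void copyViaFunction(int *dst, const int *src) {
  // A parenthesized name is not followed by '(', so the macro is not expanded.
  (lib_copy)(dst, src, sizeof(int));
}

#undef lib_copy // user code may do this and still expect a callable function

inline void copyAfterUndef(int *dst, const int *src) {
  assert(dst && src);
  lib_copy(dst, src, sizeof(int)); // now resolves to the real declaration
}
```

All three helpers copy the same bytes; the difference is only in whether the call goes through the macro or the declared function, which is why the declaration has to exist either way.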

Thu, Aug 6, 2:11 PM · Restricted Project, Restricted Project
rjmccall added a comment to D82317: [Clang/Test]: Update tests where `noundef` attribute is necessary.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

Thu, Aug 6, 12:29 PM · Restricted Project
rjmccall accepted D79744: clang: Use byref for aggregate kernel arguments.

Thanks, LGTM.

Thu, Aug 6, 12:26 PM
rjmccall added a reviewer for D75574: RFC: Implement objc_direct_protocol attribute to remove protocol metadata: theraven.

One thing that's come up so far: you generally need to be looking through non-runtime protocols, not ignoring them. This matters when non-runtime protocols inherit from ordinary protocols. It may be useful to provide a generic function that walks an array of protocols and calls a callback with the unique ordinary protocols it implies.
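A rough sketch of what such a generic walker might look like, using a toy Protocol struct rather than Clang's real declarations. The struct, the isNonRuntime flag, and the function name are all hypothetical, and the sketch assumes that reporting an ordinary protocol is enough to cover the protocols it inherits from.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a protocol declaration; not Clang's AST node.
struct Protocol {
  std::string name;
  bool isNonRuntime = false;               // assumed flag for the new attribute
  std::vector<const Protocol *> inherited; // protocols this one inherits from
};

// Calls `callback` once for each distinct ordinary (runtime) protocol implied
// by `protocols`. Non-runtime protocols are looked through, not ignored.
void forEachImpliedRuntimeProtocol(
    const std::vector<const Protocol *> &protocols,
    const std::function<void(const Protocol &)> &callback) {
  std::unordered_set<const Protocol *> visited;
  // Use the vector as a stack; push in reverse to visit in source order.
  std::vector<const Protocol *> worklist(protocols.rbegin(), protocols.rend());
  while (!worklist.empty()) {
    const Protocol *p = worklist.back();
    worklist.pop_back();
    assert(p && "null protocol in list");
    if (!visited.insert(p).second)
      continue; // already handled, keeps the reported set unique
    if (!p->isNonRuntime) {
      // Ordinary protocol: report it. Assumption: its own metadata already
      // references its parents, so the walk stops here.
      callback(*p);
      continue;
    }
    // Non-runtime protocol: emit nothing for it, but visit what it inherits.
    for (auto it = p->inherited.rbegin(); it != p->inherited.rend(); ++it)
      worklist.push_back(*it);
  }
}
```

With a non-runtime protocol that inherits from ordinary protocols A and B, passing it in a list alongside B reports A and B once each.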

Thu, Aug 6, 10:53 AM · Restricted Project

Wed, Aug 5

rjmccall accepted D85113: [ABI][NFC] Fix the confusion of ByVal and ByRef argument names.

LGTM

Wed, Aug 5, 12:09 PM · Restricted Project
rjmccall added a comment to D83325: [Sema] Be more thorough when unpacking the AS-qualified pointee for a pointer conversion..

There's no way to do that, no. Stripping sugar down to the point where you don't have that qualifier anymore is the best we can do.

Wed, Aug 5, 12:08 PM · Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

I thought part of the point of __builtin_memcpy was so that C library headers could do #define memcpy(x, y, z) __builtin_memcpy(x, y, z). If so, the conformance issue touches __builtin_memcpy as well, not just calls to the library builtin.

Wed, Aug 5, 12:01 PM · Restricted Project, Restricted Project
rjmccall added a comment to D85319: [analyzer][RFC] Get info from the LLVM IR for precision.

Thanks. It'd be a good idea to mention that this is contingent on that discussion in the patch summary.

Wed, Aug 5, 11:59 AM · Restricted Project
rjmccall added inline comments to D79744: clang: Use byref for aggregate kernel arguments.
Wed, Aug 5, 11:46 AM
rjmccall requested changes to D85319: [analyzer][RFC] Get info from the LLVM IR for precision.

This seems like a huge architectural change that we need to talk about.

Wed, Aug 5, 11:36 AM · Restricted Project
rjmccall added inline comments to D79279: Add overloaded versions of builtin mem* functions.
Wed, Aug 5, 11:30 AM · Restricted Project, Restricted Project
rjmccall added inline comments to D79744: clang: Use byref for aggregate kernel arguments.
Wed, Aug 5, 11:25 AM

Tue, Aug 4

rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

Patch looks basically okay to me, although I'll second Richard's concern that we shouldn't absent-mindedly start producing overloaded memcpys for ordinary __builtin_memcpy.

Tue, Aug 4, 8:57 PM · Restricted Project, Restricted Project
rjmccall added a comment to D75574: RFC: Implement objc_direct_protocol attribute to remove protocol metadata.

Sorry, this slipped out of my mind. I've started the process internally.

Tue, Aug 4, 10:53 AM · Restricted Project

Mon, Aug 3

rjmccall added a comment to D44536: Avoid segfault when destructor is not yet known.

Would you like to pick it back up? We laid out an implementation path: we need to track the fact that a delete was of an incomplete class type in the AST and then unconditionally treat such operations as trivial to destroy in IRGen.

Mon, Aug 3, 10:40 AM
rjmccall accepted D84540: [CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as notail on x86-64.

LGTM

Mon, Aug 3, 10:33 AM · Restricted Project
rjmccall added inline comments to D79744: clang: Use byref for aggregate kernel arguments.
Mon, Aug 3, 10:31 AM
rjmccall accepted D84878: [FPEnv] IRBuilder fails to add strictfp attribute.

LGTM.

Mon, Aug 3, 9:56 AM · Restricted Project

Tue, Jul 28

rjmccall added a comment to D83997: [os_log] Improve the way we extend the lifetime of objects passed to __builtin_os_log_format.

The use case for this is a macro in which the call to __builtin_os_log_format that writes to the buffer and the call that uses the buffer appear in two different statements. For example:

__builtin_os_log_format(buf, "%@", getObj());
...
use_buffer(buf);

The object returned by the call to getObj has to be kept alive until use_buffer is called, but currently it gets destructed at the end of the full expression. I think an alternate solution would be to provide users a means to tell ARC optimizer not to move the release call for a local variable past any calls, i.e., something that is stricter than NS_VALID_UNTIL_END_OF_SCOPE, but that places more burden on the users.

In the os_log macro, the result of the call to __builtin_os_log_format is passed directly to the call that uses the buffer, so it doesn't require any lifetime extension as you pointed out.

Tue, Jul 28, 11:27 AM · Restricted Project

Mon, Jul 27

rjmccall added a comment to D84602: [MSP430] Expose msp430_builtin calling convention to C code.

Is there only one special calling convention, or is there any chance that different builtin functions would use different conventions?

Mon, Jul 27, 11:27 AM · Restricted Project, Restricted Project

Fri, Jul 24

rjmccall added inline comments to D79744: clang: Use byref for aggregate kernel arguments.
Fri, Jul 24, 9:30 PM

Thu, Jul 23

rjmccall added a comment to D82999: [CodeGen] Check the cleanup flag before destructing lifetime-extended temporaries created in conditional expressions.

I agree, that can be done separately.

Thu, Jul 23, 2:55 PM · Restricted Project
rjmccall added a comment to D83325: [Sema] Be more thorough when unpacking the AS-qualified pointee for a pointer conversion..

removeAddrSpaceQualType should guarantee that it removes the address space qualifier; you shouldn't need to do something special here. That means it needs to iteratively desugar and collect qualifiers as long as the type is still address-space-qualified.
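As an illustration of the loop being described, here is a sketch on a toy type model. QualType and Type below are simplified, hypothetical stand-ins, not Clang's classes: a type is "sugar" if it has an underlying type, and qualifiers are a plain bitmask.

```cpp
#include <cassert>

// Toy qualifier bits; the names are assumptions for this sketch.
enum Qual : unsigned { Const = 1, Volatile = 2, AddrSpace = 4 };

struct Type;
struct QualType {
  const Type *type; // unqualified type node
  unsigned quals;   // qualifiers applied at this level
};
struct Type {
  const QualType *underlying; // non-null if this node is sugar (e.g. a typedef)
};

// True if an address space qualifier appears at any level of sugar.
bool hasAddrSpace(QualType t) {
  for (;;) {
    if (t.quals & AddrSpace)
      return true;
    if (!t.type->underlying)
      return false;
    t = *t.type->underlying;
  }
}

// Desugars one step at a time, collecting qualifiers, until the remaining
// type node no longer carries an address space; then drops that qualifier.
QualType removeAddrSpace(QualType t) {
  if (!hasAddrSpace(t))
    return t; // nothing to strip, keep all sugar
  unsigned collected = 0;
  const Type *node = t.type;
  while (hasAddrSpace(t)) {
    collected |= t.quals;
    node = t.type;
    if (!hasAddrSpace(QualType{node, 0}))
      break; // the qualifier was at this level, not under more sugar
    assert(node->underlying && "address space must come from the sugar");
    t = *node->underlying; // sugar in the way: strip it and try again
  }
  return QualType{node, collected & ~unsigned(AddrSpace)};
}
```

For a const-qualified typedef of an address-space-qualified int, this returns plain const int: the typedef sugar is lost, but the const collected along the way is kept.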

Thu, Jul 23, 2:54 PM · Restricted Project
rjmccall accepted D84343: [AST] Keep FP options in trailing storage of CallExpr.

LGTM.

Thu, Jul 23, 2:46 PM · Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.
In D79279#2170187, @jfb wrote:

I think the argument is treated as if it were 1 if not given. That's all that ordinary memcpy formally guarantees, which seems to work fine (semantically, if not performance-wise) for pretty much everything today.

I'm not sure that's true: consider a memcpy implementation which copies some bytes twice (at a different access size, there's an overlap because somehow it's more efficient). That would probably violate the programmer's expectations, and I don't think either volatile or atomic memcpy allows this (but regular memcpy does).

Thu, Jul 23, 11:25 AM · Restricted Project, Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

I think the argument is treated as if it were 1 if not given. That's all that ordinary memcpy formally guarantees, which seems to work fine (semantically, if not performance-wise) for pretty much everything today. I don't think you need any restrictions on element size. It's probably sensible to require the pointers to be dynamically aligned to a multiple of the access width, but I don't think you can enforce that statically. And of course the length needs to be a multiple of the access size.
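The contract described here could be modeled roughly as follows. copyWithAccessWidth is a hypothetical helper, not the builtin from the patch; it only models the rules (default width of 1, dynamically checked alignment, length a multiple of the width) and makes no claim about the access sizes the compiler actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of a copy with an optional minimum access width.
// The width defaults to 1, which is all ordinary memcpy formally guarantees.
void copyWithAccessWidth(void *dst, const void *src, std::size_t len,
                         std::size_t width = 1) {
  // Assumption for this sketch: only power-of-two widths up to 8 bytes.
  assert(width == 1 || width == 2 || width == 4 || width == 8);
  // The length must be a multiple of the access size.
  assert(len % width == 0);
  // Alignment cannot be enforced statically, so it is checked dynamically.
  assert(reinterpret_cast<std::uintptr_t>(dst) % width == 0);
  assert(reinterpret_cast<std::uintptr_t>(src) % width == 0);

  auto *d = static_cast<unsigned char *>(dst);
  auto *s = static_cast<const unsigned char *>(src);
  // Copy one width-sized chunk per step; no byte is copied twice.
  for (std::size_t i = 0; i < len; i += width)
    std::memcpy(d + i, s + i, width);
}
```

Note that there is no restriction on element size: a caller can copy an array of 4-byte elements with a width of 4, or any buffer with the default width of 1.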

Thu, Jul 23, 10:43 AM · Restricted Project, Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

I don't think any of these should allow _Atomic unless we're going to give it some sort of consistent atomic semantics (which is hard to imagine being useful), and I think you should just take an extra argument of the minimum access width on all of them uniformly if you think that's important. Builtins can have optional arguments.

Thu, Jul 23, 10:21 AM · Restricted Project, Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.
In D79279#2169522, @jfb wrote:
In D79279#2168533, @jfb wrote:

Is there a need for an atomic memcpy at all? Why is it useful to allow this operation to take on "atomic" semantics — which aren't actually atomic because the loads and stores to elements are torn — with hardcoded memory ordering and somewhat arbitrary rules about what the atomic size is?

Hans lays out a rationale for usefulness in his paper, but what I've implemented is more useful: it's *unordered* so you can fence as you desire around it, yet it guarantees a minimum memory access size based on the pointer parameters. For example, copying an atomic int will be 4 byte operations which are single-copy-atomic, but the accesses from one int to the next aren't performed in any guaranteed order (or observable in any guaranteed order either). I talked about this with him a while ago but IIRC he wasn't sure about implementation among other things, so when you asked me to widen my original volatile-only memcpy to also do other qualifiers, I realized that it was a neat way to do atomic as well as other qualifiers. I've talked to a few SG1 folks about this, and I believe (for other reasons too) it's where the design will end up for Hans' paper.

I can see the usefulness of this operation, but it seems like an odd semantic mismatch for what is basically just a memcpy where one of the pointers happens to have _Atomic type, like you're shoe-horning it into this builtin just to avoid declaring a different one.

I'm following the discussion we had here regarding overloading:

There are other qualifiers that can meaningfully contribute to the operation here besides volatile, such as restrict and (more importantly) address spaces. And again, for the copy operations these might differ between the two pointer types.

In both cases, I’d say that the logical design is to allow the pointers to be to arbitrarily-qualified types. We can then propagate that information from the builtin into the LLVM intrinsic call as best as we’re allowed. So I think you should make builtins called something like __builtin_overloaded_memcpy (name to be decided) and just have their semantics be type-directed.

Ah yes, I’d like to hear what others think of this. I hadn’t thought about it before you brought it up, and it sounds like a good idea.

As you noted earlier, for memcpy you probably want to express differences in destination and source qualification, even if today IR can't express e.g. volatile source and non-volatile destination. You were talking about volatile, but this applies to the entire combination of dst+src qualified with zero-to-five volatile, _Atomic, __unaligned, restrict, and address space. Pulling the entire combination space out into different functions would create way too many functions. Right now the implementation has a few limitations: it treats both dst and src as volatile if either is, it can't do _Atomic with volatile so we diagnose, and it ignores restrict. Otherwise it supports all combinations.

Thu, Jul 23, 10:08 AM · Restricted Project, Restricted Project

Wed, Jul 22

rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.
In D79279#2168533, @jfb wrote:

Is there a need for an atomic memcpy at all? Why is it useful to allow this operation to take on "atomic" semantics — which aren't actually atomic because the loads and stores to elements are torn — with hardcoded memory ordering and somewhat arbitrary rules about what the atomic size is?

Hans lays out a rationale for usefulness in his paper, but what I've implemented is more useful: it's *unordered* so you can fence as you desire around it, yet it guarantees a minimum memory access size based on the pointer parameters. For example, copying an atomic int will be 4 byte operations which are single-copy-atomic, but the accesses from one int to the next aren't performed in any guaranteed order (or observable in any guaranteed order either). I talked about this with him a while ago but IIRC he wasn't sure about implementation among other things, so when you asked me to widen my original volatile-only memcpy to also do other qualifiers, I realized that it was a neat way to do atomic as well as other qualifiers. I've talked to a few SG1 folks about this, and I believe (for other reasons too) it's where the design will end up for Hans' paper.

Wed, Jul 22, 11:11 PM · Restricted Project, Restricted Project
rjmccall added a comment to D83997: [os_log] Improve the way we extend the lifetime of objects passed to __builtin_os_log_format.

Why is the lifetime extended to the enclosing block scope anyway? I understand why we need a clang.arc.use — the optimizer can't reasonably understand that the object has to live within the buffer — but isn't the buffer only used for the duration of the call? Why is extension necessary?

Wed, Jul 22, 8:05 PM · Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

Is there a need for an atomic memcpy at all? Why is it useful to allow this operation to take on "atomic" semantics — which aren't actually atomic because the loads and stores to elements are torn — with hardcoded memory ordering and somewhat arbitrary rules about what the atomic size is?

Wed, Jul 22, 7:47 PM · Restricted Project, Restricted Project
rjmccall added a comment to D79279: Add overloaded versions of builtin mem* functions.

You need to add user docs for these builtins.

Wed, Jul 22, 1:01 PM · Restricted Project, Restricted Project

Tue, Jul 21

rjmccall added a comment to D79744: clang: Use byref for aggregate kernel arguments.

Arguably we should add this attribute to all indirect arguments. I can understand not wanting to update all the test cases, but you could probably avoid adding a new IndirectByRef kind of ABIArgInfo by treating kernels specially in ConstructAttributeList.

Tue, Jul 21, 10:03 PM
rjmccall accepted D83812: [clang][RelativeVTablesABI] Do not emit stubs for architectures that support a PLT relocation.

Thanks, LGTM.

Tue, Jul 21, 10:31 AM · Restricted Project

Mon, Jul 20

rjmccall accepted D84147: Use typedef to represent storage type in FPOption and FPOptionsOverride.

LGTM.

Mon, Jul 20, 9:16 PM · Restricted Project

Fri, Jul 17

rjmccall added inline comments to D83812: [clang][RelativeVTablesABI] Do not emit stubs for architectures that support a PLT relocation.
Fri, Jul 17, 11:07 AM · Restricted Project

Wed, Jul 15

rjmccall added a comment to D82663: [CodeGen] Have CodeGen for fixed-point unsigned with padding emit signed operations..

Would it be sensible to use a technical design more like what the matrix folks are doing, where LLVM provides a small interface for emitting operations with various semantics? FixedPointSemantics would move to that header, and Clang would just call into it. That way you get a lot more flexibility in how you generate code, and the Clang IRGen logic is still transparently correct. If you want to add intrinsics or otherwise change the IR patterns used for various operations, you don't have to rewrite a bunch of Clang IRGen logic every time, you just have to update the tests. It'd then be pretty straightforward to have internal helper functions in that interface for computing things like whether you should use signed or unsigned intrinsics given the desired FixedPointSemantics.

This seems like a reasonable thing to do for other reasons as well. Also moving the actual APFixedPoint class to LLVM would make it easier to reuse the fixedpoint calculation code for constant folding in LLVM, for example.

Just to say "I told you so", I'm pretty sure I told people this would happen. :)

Well, transferring the fixed point concept over to LLVM felt like it would happen sooner or later, for the reasons we've discussed here as well as for other reasons. I'm not sure that the discrepancies between the Clang and LLVM semantics were predicted to be the driving factor behind the move, though.

My interest here is mainly in (1) keeping IRGen's logic as obviously correct as possible, (2) not hard-coding a bunch of things that really feel like workarounds for backend limitations, and (3) not complicating core abstractions like FixedPointSemantics with unnecessary extra rules for appropriate use, like having to pass an extra "for codegen" flag to get optimal codegen. If IRGen can just pass down the high-level semantics it wants to some library that will make intelligent decisions about how to emit IR, that seems best.

Just to clarify something here; would the interface in LLVM still emit signed operations for unsigned with padding?

If that's the best IR pattern to emit, yes.

If so, why does dealing with the padding bit detail in LLVM rather than Clang make more sense?

Because frontends should be able to just say "I have a value of a type with these semantics, I need you to do these operations, go do them". The whole purpose of this interface would be to go down a level of abstraction by picking the best IR to represent those operations.

Maybe we're not in agreement about what this interface looks like — I'm imagining something like

struct FixedPointEmitter {
  IRBuilder &B;
  FixedPointEmitter(IRBuilder &B) : B(B) {}

  Value *convert(Value *src, FixedPointSemantics srcSemantics, FixedPointSemantics destSemantics);
  Value *add(Value *lhs, FixedPointSemantics lhsSemantics, Value *rhs, FixedPointSemantics rhsSemantics);
};

I've spent some time going over this and trying to figure out how it would work. I think the interface seems fine on the surface, but I don't see how it directly solves the issues at hand. Regardless of whether this is factored out to LLVM, we still have the issue that we have to massage the semantic somewhere in order to get different behavior for certain kinds of semantics during binop codegen.

Since the binop functions take two different semantics, it must perform conversions internally to get the values to match up before the operation. This would probably just be to the common semantic between the two, and it would then return the Value in the common semantic (since we don't know what to convert back to).

In order for the binop functions to have special behavior for padded unsigned, they would need to modify the common semantic internally in order to get the conversion right. This means that the semantic of the returned Value will not be what you would normally get from getCommonSemantic, so the caller of the function will have no idea what the semantic of the returned value is.

Even if we only treat it as an internal detail of the binop functions and never expose this 'modified' semantic externally, this means we might end up with superfluous operations since (for padded saturating unsigned) we will be forced to trunc the result by one bit to match the real common semantic before we return.

The only solution I can think of is to also return the semantic of the result Value, which feels like it makes the interface pretty bulky.

Wed, Jul 15, 10:57 AM · Restricted Project

Tue, Jul 14

GitHub <noreply@github.com> committed rG8c7c96c078f5: Merge pull request #1413 from zoecarver/apple/stable/20200108 (authored by rjmccall).
Merge pull request #1413 from zoecarver/apple/stable/20200108
Tue, Jul 14, 4:56 PM
rjmccall committed rG695048428034: Expose IRGen API to add the default IR attributes to a function definition. (authored by rjmccall).
Expose IRGen API to add the default IR attributes to a function definition.
Tue, Jul 14, 4:51 PM
GitHub <noreply@github.com> committed rG504402111f1b: Merge pull request #1245 from rjmccall/ir-atttributes-api (authored by rjmccall).
Merge pull request #1245 from rjmccall/ir-atttributes-api
Tue, Jul 14, 4:51 PM
GitHub <noreply@github.com> committed rG018c4f966dfe: Merge pull request #1240 from martinboehme/cherry-pick-4c09289 (authored by rjmccall).
Merge pull request #1240 from martinboehme/cherry-pick-4c09289
Tue, Jul 14, 4:51 PM
GitHub <noreply@github.com> committed rG2b2bff348b6e: Merge pull request #981 from rjmccall/continuation-alignment-5.2 (authored by rjmccall).
Merge pull request #981 from rjmccall/continuation-alignment-5.2
Tue, Jul 14, 4:37 PM
rjmccall committed rG5830dc52cd5e: Use optimal layout and preserve alloca alignment in coroutine frames. (authored by rjmccall).
Use optimal layout and preserve alloca alignment in coroutine frames.
Tue, Jul 14, 4:37 PM
rjmccall committed rGab18ed9d0b3c: Add an algorithm for performing "optimal" layout of a struct. (authored by rjmccall).
Add an algorithm for performing "optimal" layout of a struct.
Tue, Jul 14, 4:37 PM
GitHub <noreply@github.com> committed rGd343c8b88cd4: Merge pull request #778 from rjmccall/fix-objc-type-param-reentrance-bug (authored by rjmccall).
Merge pull request #778 from rjmccall/fix-objc-type-param-reentrance-bug
Tue, Jul 14, 4:23 PM
rjmccall committed rG1a5e28fe0296: Fix a reentrance bug with deserializing ObjC type parameters. (authored by rjmccall).
Fix a reentrance bug with deserializing ObjC type parameters.
Tue, Jul 14, 4:23 PM

Fri, Jul 10

rjmccall accepted D82513: [CodeGen] Store the return value of the target function call to the thunk's return value slot directly when the return type is an aggregate instead of doing so via a temporary.

Thanks! LGTM.

Fri, Jul 10, 4:51 PM · Restricted Project
rjmccall accepted D83502: Change behavior with zero-sized static array extents.

Thanks, LGTM.

Fri, Jul 10, 12:47 PM
rjmccall added inline comments to D83502: Change behavior with zero-sized static array extents.
Fri, Jul 10, 10:22 AM
rjmccall added a comment to D82663: [CodeGen] Have CodeGen for fixed-point unsigned with padding emit signed operations..

Would it be sensible to use a technical design more like what the matrix folks are doing, where LLVM provides a small interface for emitting operations with various semantics? FixedPointSemantics would move to that header, and Clang would just call into it. That way you get a lot more flexibility in how you generate code, and the Clang IRGen logic is still transparently correct. If you want to add intrinsics or otherwise change the IR patterns used for various operations, you don't have to rewrite a bunch of Clang IRGen logic every time, you just have to update the tests. It'd then be pretty straightforward to have internal helper functions in that interface for computing things like whether you should use signed or unsigned intrinsics given the desired FixedPointSemantics.

This seems like a reasonable thing to do for other reasons as well. Also moving the actual APFixedPoint class to LLVM would make it easier to reuse the fixedpoint calculation code for constant folding in LLVM, for example.

Fri, Jul 10, 10:04 AM · Restricted Project

Jul 9 2020

rjmccall added a comment to D82513: [CodeGen] Store the return value of the target function call to the thunk's return value slot directly when the return type is an aggregate instead of doing so via a temporary.

I agree that avoiding the copy is best. However, at the very least, if that function isn't going to handle the aggregate case correctly, it should assert that it isn't in it.

Jul 9 2020, 7:32 PM · Restricted Project
rjmccall added inline comments to D83502: Change behavior with zero-sized static array extents.
Jul 9 2020, 2:12 PM
rjmccall added inline comments to D80858: [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc.
Jul 9 2020, 2:05 PM · Restricted Project
rjmccall added a comment to D82513: [CodeGen] Store the return value of the target function call to the thunk's return value slot directly when the return type is an aggregate instead of doing so via a temporary.

This seems fine. I do wonder if the "real" bug is that this ought to be handled properly in EmitReturnFromThunk, but regardless, the fix seems acceptable.

Jul 9 2020, 2:00 PM · Restricted Project
rjmccall added a comment to D79730: [NFCi] Switch ordering of ParseLangArgs and ParseCodeGenArgs..

Either way, I think.

Jul 9 2020, 12:54 PM · Restricted Project
rjmccall added a comment to D82663: [CodeGen] Have CodeGen for fixed-point unsigned with padding emit signed operations..

Would it be sensible to use a technical design more like what the matrix folks are doing, where LLVM provides a small interface for emitting operations with various semantics? FixedPointSemantics would move to that header, and Clang would just call into it. That way you get a lot more flexibility in how you generate code, and the Clang IRGen logic is still transparently correct. If you want to add intrinsics or otherwise change the IR patterns used for various operations, you don't have to rewrite a bunch of Clang IRGen logic every time, you just have to update the tests. It'd then be pretty straightforward to have internal helper functions in that interface for computing things like whether you should use signed or unsigned intrinsics given the desired FixedPointSemantics.

Jul 9 2020, 12:53 PM · Restricted Project

Jul 8 2020

rjmccall added a comment to D81583: Update SystemZ ABI to handle C++20 [[no_unique_address]] attribute.

I agree with Eli that this should be considered a bugfix in the implementation of a recent language change and should just be rolled out consistently for all targets.

Jul 8 2020, 12:30 PM · Restricted Project

Jul 7 2020

rjmccall accepted D83317: [Sema] Teach -Wcast-align to compute alignment of CXXThisExpr.

LGTM

Jul 7 2020, 9:54 AM · Restricted Project

Jul 2 2020

rjmccall added inline comments to D79279: Add overloaded versions of builtin mem* functions.
Jul 2 2020, 11:10 PM · Restricted Project, Restricted Project
rjmccall added a comment to D82999: [CodeGen] Check the cleanup flag before destructing lifetime-extended temporaries created in conditional expressions.

In test case test13 in clang/test/CodeGenCXX/exceptions.cpp, I think you can turn invoke void @_ZN6test131AC1Ev into call void @_ZN6test131AC1Ev, no? If the false expression throws, there is nothing to clean up in the false expression and also nothing in the true expression has to be cleaned up.

Jul 2 2020, 11:10 PM · Restricted Project
rjmccall added a comment to D82999: [CodeGen] Check the cleanup flag before destructing lifetime-extended temporaries created in conditional expressions.

After-full-expression cleanup looks fine to me. pushCleanupAfterFullExpr sets the flags and saves the values when it's in a conditional branch.

I think ideally we shouldn't have to clean up anything in the true expression when the false expression throws, but I wasn't able to come up with an easy way to make IRGen avoid that.

Jul 2 2020, 2:03 PM · Restricted Project
rjmccall added a comment to D82999: [CodeGen] Check the cleanup flag before destructing lifetime-extended temporaries created in conditional expressions.

Please adjust the commit message to be clear that this is about lifetime-extended temporaries; it's not like we got this wrong for all temporaries.

Jul 2 2020, 10:47 AM · Restricted Project

Jul 1 2020

rjmccall added a comment to D82781: [OpenCL] Fix missing address space deduction in template variables.

Seems like you shouldn't do it earlier if the type is dependent.

Jul 1 2020, 11:20 AM · Restricted Project
rjmccall added a comment to D82663: [CodeGen] Have CodeGen for fixed-point unsigned with padding emit signed operations..

Can the missing bit just be added? It seems to me that frontends ought to be able to emit the obvious intrinsic for the semantic operation here rather than having to second-guess the backend.

Jul 1 2020, 11:20 AM · Restricted Project
rjmccall accepted D82392: [CodeGen] Add public function to emit C++ destructor call..

LGTM

Jul 1 2020, 10:48 AM · Restricted Project

Jun 29 2020

rjmccall added a comment to D82392: [CodeGen] Add public function to emit C++ destructor call..

Can we do a design more like what we did with constructors?

Jun 29 2020, 9:28 PM · Restricted Project
rjmccall added a comment to D72770: Add matrix types extension tests ..

This looks reasonable to me, but I don't know much about this test suite.

Jun 29 2020, 11:53 AM · Restricted Project

Jun 26 2020

rjmccall added a comment to D82663: [CodeGen] Have CodeGen for fixed-point unsigned with padding emit signed operations..

Why not legalize to the signed operation?

Jun 26 2020, 10:55 AM · Restricted Project
rjmccall added a comment to D81166: [Matrix] Use nuw/nsw operand bundles for matrix.multiply..

Seems reasonable to me.

Jun 26 2020, 10:22 AM · Restricted Project

Jun 24 2020

rjmccall added a comment to D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems.

I decided that I shouldn't mark float options that define a macro, like -ffast-math, as BENIGN_LANGOPT. I made ffp-contract=, fp-exception-behavior, and rounding-mode BENIGN.

Jun 24 2020, 2:07 PM · Restricted Project, Restricted Project
rjmccall accepted D82473: [Matrix] Use 1st/2nd instead of first/second in matrix diags..

LGTM

Jun 24 2020, 2:07 PM · Restricted Project

Jun 23 2020

rjmccall added a comment to D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems.

Just a bunch of minor suggestions. LGTM if you get all the tests worked out and it actually works the way you want on the tests.

Jun 23 2020, 10:39 PM · Restricted Project, Restricted Project

Jun 22 2020

rjmccall added inline comments to D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems.
Jun 22 2020, 8:25 PM · Restricted Project, Restricted Project

Jun 18 2020

rjmccall accepted D81311: [RFC] LangRef: Define byref parameter attribute.

This LGTM.

Jun 18 2020, 10:54 AM · Restricted Project

Jun 17 2020

rjmccall accepted D72782: [Matrix] Add __builtin_matrix_column_store to Clang..

LGTM

Jun 17 2020, 11:18 AM · Restricted Project
rjmccall added a comment to D81960: [Matrix] Use alignment info when lowering loads/stores..

Thanks, LGTM.

Jun 17 2020, 9:40 AM · Restricted Project
rjmccall accepted D81960: [Matrix] Use alignment info when lowering loads/stores..
Jun 17 2020, 9:40 AM · Restricted Project

Jun 16 2020

rjmccall added a comment to D81960: [Matrix] Use alignment info when lowering loads/stores..

Otherwise LGTM

Jun 16 2020, 10:23 PM · Restricted Project
rjmccall added inline comments to D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems.
Jun 16 2020, 2:18 PM · Restricted Project, Restricted Project
rjmccall accepted D81795: [clang] Enable -mms-bitfields by default for mingw targets.

Okay, thanks. LGTM, then.

Jun 16 2020, 11:00 AM · Restricted Project

Jun 15 2020

rjmccall accepted D81857: Fix ConstantAggregateBuilderBase::getRelativeOffset.
Jun 15 2020, 11:31 AM · Restricted Project
rjmccall added a comment to D81857: Fix ConstantAggregateBuilderBase::getRelativeOffset.

LGTM.

Jun 15 2020, 11:31 AM · Restricted Project
rjmccall added a comment to D81795: [clang] Enable -mms-bitfields by default for mingw targets.

Seems reasonable; GCC is the "system compiler" for this platform. Does isWindowsGNUEnvironment exactly track the condition that GCC uses? It's just MinGW, not Cygwin?

Jun 15 2020, 8:40 AM · Restricted Project

Jun 12 2020

rjmccall accepted D81420: Fix size for _ExtInt types with builtins.

LGTM, thanks!

Jun 12 2020, 2:14 PM · Restricted Project

Jun 11 2020

rjmccall added inline comments to D81420: Fix size for _ExtInt types with builtins.
Jun 11 2020, 11:01 PM · Restricted Project
rjmccall committed rG7fac1acc6171: Set the LLVM FP optimization flags conservatively. (authored by rjmccall).
Set the LLVM FP optimization flags conservatively.
Jun 11 2020, 3:28 PM
rjmccall closed D80462: Fix floating point math function attributes definition..

To ssh://github.com/llvm/llvm-project

a98d618f6e5f..7fac1acc6171  master -> master
Jun 11 2020, 3:26 PM · Restricted Project
rjmccall added a comment to D81311: [RFC] LangRef: Define byref parameter attribute.

Do we allow inmem to be used for other purposes? I would assume the answer is yes, as we do not forbid it.

I don't know what else we might use it for off-hand, but yes, I think the frontend could put this down on all value arguments that are actually passed indirectly.

Where does it say it is limited to indirectly passed arguments?

The argument does have to be a pointer. And passes aren't allowed to infer this or it becomes useless for the original purpose.

That is what I'm trying to get at. As of right now, I don't see any reason a pass could not add this, or a front-end for that matter, for any call, assuming they know it won't mess with the ABI for the target. We might want to add language to this end?

Jun 11 2020, 3:26 PM · Restricted Project
rjmccall added a comment to D81472: [Matrix] Update load/store intrinsics..

My immediate concern is just that I think the memory layout of the matrix should be orthogonal to the component layout of the vector. If you want the matrix intrinsics to support a variety of vector layouts, you should pass down the expected layout as a constant argument to the intrinsic rather than picking it up purely from whether the matrix is being loaded from a row-major or column-major layout in memory. I would guess that making that constant an i32 is probably sufficiently future-proof; if you ever need more structure than that, you'd probably be better off biting the bullet and adding an llvm::MatrixType.

Hm, I understand the appeal of having a single very powerful intrinsic. Selecting the different variants by a single parameter is convenient in terms of maintaining backwards compatibility, but personally I find it more readable to include some of the variant information in the name. Of course there's a limit to the number of variants for which that approach is feasible. I think it is important to have this discussion, but I am not sure if it is in scope for this patch (which only adds a few smallish improvements to the naming/arguments of the intrinsics), and it might be better to discuss that once work on row-major versions of the intrinsics starts?

Jun 11 2020, 11:32 AM · Restricted Project, Restricted Project
rjmccall added a comment to D81311: [RFC] LangRef: Define byref parameter attribute.

Do we allow inmem to be used for other purposes? I would assume the answer is yes, as we do not forbid it.

I don't know what else we might use it for off-hand, but yes, I think the frontend could put this down on all value arguments that are actually passed indirectly.

Where does it say it is limited to indirectly passed arguments?

Jun 11 2020, 11:32 AM · Restricted Project

Jun 10 2020

rjmccall accepted D81624: [CodeGen] Simplify the way lifetime of block captures is extended.

This is a great improvement, thanks!

Jun 10 2020, 9:34 PM · Restricted Project
rjmccall added inline comments to D81420: Fix size for _ExtInt types with builtins.
Jun 10 2020, 8:30 PM · Restricted Project
rjmccall added a comment to D81472: [Matrix] Update load/store intrinsics..

My immediate concern is just that I think the memory layout of the matrix should be orthogonal to the component layout of the vector. If you want the matrix intrinsics to support a variety of vector layouts, you should pass down the expected layout as a constant argument to the intrinsic rather than picking it up purely from whether the matrix is being loaded from a row-major or column-major layout in memory. I would guess that making that constant an i32 is probably sufficiently future-proof; if you ever need more structure than that, you'd probably be better off biting the bullet and adding an llvm::MatrixType.
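A minimal sketch of the separation described above, written as plain C++ rather than as an LLVM intrinsic. The names `MemoryLayout`, `VectorLayout`, and `loadMatrix`, and the integer encodings of the enums, are hypothetical and chosen only for illustration; the point is that the in-vector layout is passed explicitly instead of being inferred from the memory layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: how the matrix is laid out in memory.
enum class MemoryLayout : std::int32_t { ColumnMajor = 0, RowMajor = 1 };

// Hypothetical: how elements are ordered in the flattened vector.
// This stands in for the constant i32 argument suggested in the comment.
enum class VectorLayout : std::int32_t { ColumnMajor = 0, RowMajor = 1 };

// Load a rows x cols matrix from p. `stride` is the distance in elements
// between consecutive rows (row-major memory) or columns (column-major
// memory). The result ordering is chosen by `vec`, independently of `mem`.
std::vector<double> loadMatrix(const double *p, std::size_t rows,
                               std::size_t cols, std::size_t stride,
                               MemoryLayout mem, VectorLayout vec) {
  std::vector<double> flat(rows * cols);
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      // Offset of M[i][j] in memory.
      std::size_t src =
          mem == MemoryLayout::RowMajor ? i * stride + j : j * stride + i;
      // Position of M[i][j] in the flattened vector.
      std::size_t dst =
          vec == VectorLayout::RowMajor ? i * cols + j : j * rows + i;
      flat[dst] = p[src];
    }
  }
  return flat;
}
```

With this shape, loading row-major memory into a column-major vector is one call with two different enum values, and no variant needs to be encoded in the function name.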

Jun 10 2020, 7:57 PM · Restricted Project, Restricted Project
rjmccall added a comment to D81311: [RFC] LangRef: Define byref parameter attribute.

Do we allow inmem to be used for other purposes? I would assume the answer is yes, as we do not forbid it.

Jun 10 2020, 4:08 PM · Restricted Project
rjmccall added inline comments to D81420: Fix size for _ExtInt types with builtins.
Jun 10 2020, 4:08 PM · Restricted Project
rjmccall added a comment to D81472: [Matrix] Update load/store intrinsics..

[snip]

Oh right, the 'special value' to indicate row/column major would be setting either stride to 1. As long as exactly one of those is 1, the layout of the result/operand should be clear. Personally I find including the layout in the name a bit easier to follow, as it is more explicit. But it might be preferable to have a single variant that handles row/column major depending on the strides (as long as we enforce that exactly one stride has to be 1), once we add those variants.

Why have the restriction that exactly one stride has to be 1? If you can optimize that as a constant, great, do it, but otherwise just do the separate loads/stores, and impose a UB restriction that the strides have to make them non-overlapping.

Besides making assumptions about the layout of the accessed memory, the intrinsic also specifies the layout of the loaded/stored values (= layout in the flattened vector). If either stride is constant 1, we could use that to determine the layout of the loaded/stored value. I may be missing something, but if both are != 1 or arbitrary values, it is not clear what we should pick for the in-vector layout.

Jun 10 2020, 3:38 PM · Restricted Project, Restricted Project
rjmccall added a comment to D81472: [Matrix] Update load/store intrinsics..

I like the name change, although I wonder if you could just have a single intrinsic that takes both a row stride and a column stride and recognizes the common patterns. Presumably even with column-major ordering you already want to optimize the case where the stride is a constant equal to the row count, so this would just be a generalization of that.

I am not sure about having a single intrinsic. The column.major part in the name signifies that the resulting matrix is in column-major layout (whic, but is then used internally during the lowering). I suppose it would be possible to have both row & column strides and use a special value to indicate what the leading dimension is, but it seems to me that having dedicated intrinsics would be more explicit.

This is just Fortran array slices. You don't need a special value, the two strides are sufficient. M[i][j] is at p[i * rowStride + j * columnStride]. To not have overlapping storage, you need either rowStride * rowCount <= columnStride or vice-versa. Row-major means rowStride >= columnCount && columnStride == 1; column-major means rowStride == 1 && columnStride >= rowCount. You get better locality from doing the smaller stride in the inner loop (which may not actually be a loop, of course), but it's not wrong to do either way.

Anyway, it's up to you, but I think the two-stride representation is more flexible and avoids ultimately needing three separate intrinsics with optimizations that turn the general one into the more specific ones. And it may have benefits for frontends like Flang that have to support these strided multi-dimensional slices.
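The two-stride addressing rules above can be sketched in C++ as follows. The struct `StridedMatrix` and the helper names are hypothetical and are not part of any LLVM API; the conditions are transcribed directly from the comment, with the non-overlap test being the one stated there.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a matrix view over a flat buffer p.
struct StridedMatrix {
  std::size_t rowCount, columnCount;
  std::size_t rowStride, columnStride; // in elements
};

// M[i][j] lives at p[i * rowStride + j * columnStride].
constexpr std::size_t elementOffset(const StridedMatrix &m, std::size_t i,
                                    std::size_t j) {
  return i * m.rowStride + j * m.columnStride;
}

// Non-overlap condition as stated in the comment: either
// rowStride * rowCount <= columnStride, or vice versa.
constexpr bool isNonOverlapping(const StridedMatrix &m) {
  return m.rowStride * m.rowCount <= m.columnStride ||
         m.columnStride * m.columnCount <= m.rowStride;
}

// Row-major: rowStride >= columnCount && columnStride == 1.
constexpr bool isRowMajor(const StridedMatrix &m) {
  return m.rowStride >= m.columnCount && m.columnStride == 1;
}

// Column-major: rowStride == 1 && columnStride >= rowCount.
constexpr bool isColumnMajor(const StridedMatrix &m) {
  return m.rowStride == 1 && m.columnStride >= m.rowCount;
}
```

For example, a 3x4 matrix with `rowStride = 1` and `columnStride = 3` is classified as column-major, while strides of 2 and 2 match neither pattern and fail the non-overlap test.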

Oh right, the 'special value' to indicate row/column major would be setting either stride to 1. As long as exactly one of those is 1, the layout of the result/operand should be clear. Personally I find including the layout in the name a bit easier to follow, as it is more explicit. But it might be preferable to have a single variant that handles row/column major depending on the strides (as long as we enforce that exactly one stride has to be 1), once we add those variants.

Jun 10 2020, 1:55 PM · Restricted Project, Restricted Project
rjmccall added a comment to D81472: [Matrix] Update load/store intrinsics..

I like the name change, although I wonder if you could just have a single intrinsic that takes both a row stride and a column stride and recognizes the common patterns. Presumably even with column-major ordering you already want to optimize the case where the stride is a constant equal to the row count, so this would just be a generalization of that.

I am not sure about having a single intrinsic. The column.major part in the name signifies that the resulting matrix is in column-major layout (whic, but is then used internally during the lowering). I suppose it would be possible to have both row & column strides and use a special value to indicate what the leading dimension is, but it seems to me that having dedicated intrinsics would be more explicit.

Jun 10 2020, 11:41 AM · Restricted Project, Restricted Project