- User Since
- Jan 18 2013, 11:30 AM (439 w, 3 d)
Mon, Jun 14
When __block variables get moved to the heap, they're supposed to be moved if possible, not copied. It looks like PerformMoveOrCopyInitialization requires NRVO info to be passed in to ever do a move. Maybe it's being passed in wrong when building __block copy expressions in some situation.
Sat, Jun 12
Sun, Jun 6
Fri, Jun 4
Thu, Jun 3
If you want a way to say "some 64-bit type" in the builtins file, we could add that.
Wed, Jun 2
Hmm. CC'ing Akira. Akira, do you know why we're building a fake FunctionDecl as part of emitting these helper functions? Is it something we can avoid?
Fri, May 28
Thu, May 27
I would be uncomfortable with an invariant that all calls from swifttailcc to swifttailcc must be musttail because we might well want to make non-tail calls in some circumstances. However, I think it would be fine to annotate all such calls as either musttail or a hypothetical notail — at least, I think we could achieve that coming out of the frontend, and we ought to be able to maintain it in IR transformations. To do that, you'd need to actually add a notail, though.
Wed, May 26
There's no reliable way to report this with UBSan because in general we can't ever know that a loop will not terminate. That said, we could report *obviously* trivial loops, either with UBSan or just with a diagnostic. If the body of your loop really is empty, and the conditions are trivial, that ought to be enough to catch it.
Thanks, that seems to work out cleanly.
Tue, May 25
Do you want to generate these even at -O0?
Alright, that's fine with me, too.
Mon, May 24
Top-level qualifiers aren't normally meaningful on pr-values.
The Clang part of this looks fine to me; I can't review the backend changes, but you have partial approval.
May 20 2021
If these builtins are generally supposed to be accessed through a compiler-provided header, you could make that header define these functions using underlying builtins that have a __builtin prefix. That's not completely necessary, though; we do support some other non-library builtins that lack the __builtin prefix. And it sounds like maybe there isn't such a header anyway.
May 14 2021
May 12 2021
May 11 2021
May 10 2021
May 6 2021
How is the situation with byval different from the situation with indirect arguments? It sounds like we generally need to perform this extra copy into the coroutine frame once it's allocated, and yes, that's something we need to emit in the frontend because it might be non-trivial.
Objective-C object types also have aggregate evaluation kind. Those can only be used as value types in an old, fragile ObjC ABI, but I believe we haven't yet formally decided to remove support for that. Fortunately, however, there's a much better simplification available: this hasAggregateEvaluationKind(Ty) call is literally doing nothing that couldn't just be a getAs<RecordType>() call, and then we don't need to worry about it.
LGTM, except that the commit message should say "Lift", not "Lyft". ;)
May 4 2021
Can you test this with some code that doesn't bitcast the zero-sized alloca before its use? Just slightly worried that we might assert because of the type difference.
May 3 2021
Don't mind JF, he's just forgotten that C++ lacks the ability to turn the typically constant value arguments passed to functions like std::atomic::compare_exchange_strong into actual compile-time constants to the builtin used within those functions, so his advice would mean emitting all atomics in C++ as sequentially consistent.
Your change is correct, but I think there's a deeper flaw in this code.
Apr 30 2021
Apr 29 2021
Seems conceptually okay, given that we have an option to control it.
Apr 28 2021
What's the crash exactly/ Is IRGen just unhappy about processing the user definition when we've generated a declaration with a different type? Because we're already supposed to be being cautious about this.
I think this is intentional; requiring the indirect-result parameter to be in the alloca address space would prevent direct initialization of non-temporary memory, which is an important optimization in C++.
Apr 27 2021
Apr 26 2021
Here are the options I think the committee might take:
Yes, if you can dynamically choose to use an aligned allocator, that's clearly just much better.
Apr 23 2021
Apr 22 2021
What is the purpose of the builtin? Where is it being used? Typically you *can't* change the signature of a builtin because the builtin is itself a language feature that's documented to have a particular signature. If you've made a builtin purely for use in generated AST, that's pretty unfortunate, and you should consider whether you actually have to do that instead of e.g. synthesizing a call to an allocation function the same way that we do in new expressions.
Test looks good, thanks.
Apr 20 2021
Please addd tests, including tests that we suppress this assumption under e.g. -fno-builtin-malloc
Apr 19 2021
MLIR is an in-tree project that can be updated.
Apr 16 2021
Sorry for the delay. That seems like a reasonable summary of our discussion. Let me try to lay out the right architecture for this as I see it.
I agree. The arrangement logic is assuming that the exact pointer type doesn't matter because it's all just a pointer in the end, but of course that breaks down when we start inferring attributes.
Apr 15 2021
Hmm. I think the right thing to do here is to recognize generally that we're emitting a mandatory tail call, and so suppress *all* the normal transformations on the return value. The conditions on mandatory tail calls should make that possible, and it seems like it would be necessary for a lot of types. Aggregates especially come to mind — if an aggregate is returned in registers, we're probably going to generate code like
Apr 13 2021
That something is an (unlowered) coroutine is an important semantic property of the function that should be represented more explicitly in IR than just whether it contains a call to an intrinsic in the llvm.coro.id family. Coroutines have somewhat different structural rules from ordinary functions, as is natural for a somewhat different high-level concept. noinline is an independent attribute that should affect whether the ramp function is inlinable after coroutine lowering is complete.
Apr 11 2021
We don't have so many coroutine-emitting frontends that it's unreasonable to modify them all. Swift can make this change; it's not on you to do it. (I also work on the Swift frontend, so don't worry, I'm signing my own teammates up for work here.)
Why does this pass even exist? We should just expect the frontend to set the attribute. It's not like frontends don't have to otherwise know that they're emitting a coroutine; a ton of things about the expected entire IR pattern are different.
You should probably talk it over with AArch64 backend folks, but yes, abstractly it seems reasonable. CC'ing Tim Northover.
AArch64 only has 64-bit integer registers, so of course the algorithm is specified that way. LLVM could nonetheless choose to return an i32.
Why does the ABI "require" this to be returned as an i64 if some of the bits are undefined?
Apr 10 2021
Sorry, this got lost.
Your use of CoerceAndExpand seems fine; thanks for pinging me on it
Apr 8 2021
Patch functionally LGTM. Not sure what you're asking about the test script, but I don't know the libc++ test suite.
Apr 7 2021
Wow, yes, you're absolutely right. I think we can probably get away with fixing that without worrying about ODR problems because type_info ordered comparison is so uncommon.
Yes. We now have a motivational reason to do so, it wouldn't be "just expose the UB for the sake of miscompiling the program".
Apr 6 2021
Normally we don't want passes making educated guesses about things and codifying that into debug info, but in this specific instance i don't think it's unreasonable, since the debug frame is an artificial type. Ideally you would find the debug info that's present and use that to describe the frame layout as much as possible. The way frame layout works should allow that kind of reversal well enough.
Our builtin logic is capable of recognizing user declarations as builtins without any sort of explicit annotation in source. That's how we recognize C library builtins, for example. As Richard points out, that logic isn't really set up to do the right thing for namespaced declarations (or template ones, for that matter), but that seems fixable.
NUW_RN isn't a good name for the variable in this alternative since it's not readnone anymore. With an appropriate rename, LGTM.
Apr 5 2021
The last major conversation we had about this was this RFC I sent out about five years ago:
Alright, mostly looks good.
Patch LGTM. If this is what GCC is doing on Linux, then we should match it.
I think a pattern-match in IRGen is probably a much more reasonable alternative if the goal is primarily to avoid the code-generation overheads at -O0 / optimization costs at -O<n>. You'd still have template instantiation overheads; I don't know how significant those really are.