- User Since
- Jan 30 2014, 6:27 AM (385 w, 10 h)
Go higher for GPUs, 128 or so. And we really need a metadata check to avoid looking for kernels.
Rebase + address comment
Improve according to feedback
LGTM, some nits. Thanks for fixing this up so quickly.
Introduc a helper
I don't understand why we have an event in the async_info object at all. I would have assumed only to have some plugin specific "lock" in the table.
Can you add a test case with a callback and one with an unknown function pointer call. Maybe also one with a function pointer and callee metadata + a FIXME.
More comments, also, please fix the clang tidy warnings, they make it hard to read.
(answered now, I forgot to submit this yesterday -.- )
Mon, Jun 14
This makes sense to me, wording seems to be alright, though it's late. Thx for working on this!
I thought the problem was that a block with an assume caused us to miss out on some backed optimization.
The proposed solution was to eliminate blocks that only contain an assume as side effect early as middle-end
optimizations would also benefit from a simplified CFG.
Sun, Jun 13
Sat, Jun 12
Fri, Jun 11
Thu, Jun 10
Can you please reupload after clang formatting the code :)
// Loads/stores with an invariant.group metadata are ok to hoist/sink.
FWIW, if someone adds documentation what #pragma clang loop vectorize(assume_safety) actually means (somewhere), we can go back and make it imply if-conversion. However, we should not conflate it with the access groups/parallel metadata. Instead, annotate loads as "speculatable" (or sth. similar) and we go with that.
This fixes a bug and people are in favor. If we need to add a new mechanism for this we can afterwards still. LGTM
Wed, Jun 9
This starts to sound an awful lot like convergent to me, basically: Do not change the control conditions of this call.
Still unsure, maybe you can add a LangRef draft so we know what you try to do, or a nice example what you don't want to happen.
Please bear with me, I'm updating examples and documentation. I didn't think anybody would look at this while [WIP]. :-)
LG, two nits
WIP here is fine, would help to include a test that shows/explains what problem is actually solved though. I don't understand form this patch alone.
Thanks for splitting this. I quickly went over it only.
Tue, Jun 8
LG, one nit. Once the nit is resolved I can commit this for you.
LG, one nit
squash the commits so that we see the diff not from the last change but the entire change set
Mon, Jun 7
Put in a little more to reduce the follow up size
Does this by any chance also fix https://bugs.llvm.org/show_bug.cgi?id=49650 ?
Sun, Jun 6
Some minor thing but overall this looks sensible.
Fri, Jun 4
LG, one nit below.
LG, one nit
LG, one nit
Thu, Jun 3
Do we have to update OpenMPIRBuilder::createCancel too?
Wed, Jun 2
I'd change the default and pass a non-default value in getAAFor
Tue, Jun 1
Fri, May 28
Wed, May 26
We need a test. A callback call should do it.
Sun, May 23
This allows us to remove the switch in the device runtime, right?
Yep, thanks to both of you to solving the problem with general solutions, will help in the long run for sure :)
Fri, May 21
Thu, May 20
I think this is generally reasonable, haven't looked at all benchmarks and I would prefer some other opinions.
Wed, May 19
I looked over this again because I think we will later move the logic into an AbstractAttribute. The reason is that we want to reuse some of the analysis in AAKernelInfo to determine if we can use SPMD mode. Basically, if the shared memory is only used in a nice way and fees into an actual parallel51, it is compatible with SPMD mode. If we go to SPMD mode we have (or better should) eliminate the shared memory indirection. Maybe we move all of this into AAHeapToStack and make a more general AAAllocationInfo out of it. Let's see.
[Wrong review -.- ]
Also moved logic from the AAKernelInfoFunction::updateImpl to
AAKernelInfoCallSite::initialize as it looks cleaner to fix the
state of runtime calls right away.
LGTM, two minor nits.
Drop the weak attribute, will solve the problem differently
This doesn't work properly yet :(
When I run this on the OpenMP tests I get sometimes a total order where there should be none.
Accidentally deleted too many check lines this time ...
Nit: Commit subject (not message) is old, maybe change prior to push locally
Make the script remove leftover CHECK: lines that would now trigger given
the absence of a specific check prefix for the -fopenmp-simd invocations.
LGTM, thanks for fixing this now and for the future!
Diagnostic Info parts seems very mechanical and ok to me.
OpenMPOpt parts make sense.
Test seem to be always tied to an analysis (e.g., D29023) so LGTM