LGTM after you address David's comment -- A post-dominating B doesn't guarantee that an execution reaching B will arrive at A.
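One way this can happen, sketched under the assumption that the call at B is emitted as a plain call that does not end its basic block (as LLVM models calls that may unwind when no cleanup is needed). All names below are hypothetical and are not taken from the patch under review.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names for illustration only; none come from the patch.
static std::vector<std::string> trace;

// "B": records that it ran, then may leave the caller by throwing.
void pointB(bool leaveEarly) {
  trace.push_back("B");
  if (leaveEarly)
    throw std::runtime_error("unwound past A");
}

// Assumption: with no cleanups in this function, the call to pointB can sit in
// the same basic block as "A", so A post-dominates B in the CFG.
void region(bool leaveEarly) {
  pointB(leaveEarly);    // B
  trace.push_back("A");  // A
}

// Returns true if an execution that reached B also arrived at A.
bool reachedA(bool leaveEarly) {
  trace.clear();
  try {
    region(leaveEarly);
  } catch (const std::runtime_error &) {
    // The exception skipped A entirely.
  }
  return !trace.empty() && trace.back() == "A";
}
```

Calling `reachedA(true)` reaches B but never A, even though nothing in the control-flow graph of `region` shows a path around A. An infinite loop or a call to `exit` inside B would have the same effect.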
Nov 30 2016
Aug 22 2016
Aug 18 2016
Yay! Thanks for testing this!
Jul 15 2016
Jul 11 2016
Jul 9 2016
Jul 8 2016
Jun 24 2016
Jun 15 2016
May 25 2016
Have you considered letting Clang (instead of a late-stage IR pass) add these ranges? These ranges are useful for some target-independent IR passes, e.g. those using ValueTracking (D4150).
May 4 2016
Apr 29 2016
Apr 28 2016
Looks great! Thanks.
Thanks! I completely missed this case :(
Awesome! Thanks Justin.
Apr 26 2016
Oh, I see what you mean now.
So NVPTXPeephole should be guarded by CodeGenOpt::None and thus is safe to skip.
The optnone-llc.ll test verifies that no pass that runs at -O0 is skipped, so as long as this pass runs at -O0 we can't add the call to skip it for "optnone" functions and bisection. If the target machine code were updated to skip the peephole pass at -O0, then the skip check could be added.
Maybe change the title to be "... instead of default for pure arithmetic instructions". Otherwise, LGTM!
Apr 12 2016
Apr 4 2016
Should we land this? It will fix PR26185.
Mar 31 2016
Other than that FIXME, pretty straightforward. LGTM
Mar 29 2016
Mar 22 2016
D18168 duplicates this and is submitted.
Thanks for working on this long overdue feature!
Mar 20 2016
All comments addressed. Submitting...
Some more minor changes.
Mar 18 2016
Mar 16 2016
Thanks a lot for the review! I understand it's a lot of work :) I answered one of your high-level questions. I'm still OOO and will get to other comments later.
Mar 15 2016
I am not clear on this part either. jholewinski@, can you comment on this?
Sorry, I pressed the wrong button. I meant to say "needs revision". Feel free to reclaim this patch.
Pressed the wrong button. Meant to say "needs revision". Feel free to reclaim it.
Oops... I pressed the wrong button. I meant to say "needs revision". jmolly@, feel free to reclaim it.
It doesn't work with the new alias analysis infrastructure.
Is this patch obsolete? Are you still trying to push it in?
Mar 11 2016
I am OOO and may be unable to review this until next week.
Mar 8 2016
Mar 3 2016
Feb 23 2016
Feb 20 2016
Feb 16 2016
Feb 12 2016
Feb 11 2016
I'll defer to Justin's approval.
Feb 9 2016
Just a reminder, if you haven't done that already: double-check how the web page looks before you commit.
Feb 8 2016
Feb 5 2016
I was referring to this paragraph
Barriers are executed on a per-warp basis as if all the threads in a warp are active. Thus, if any thread in a warp executes a bar instruction, it is as if all the threads in the warp have executed the bar instruction. All threads in the warp are stalled until the barrier completes, and the arrival count for the barrier is incremented by the warp size (not the number of active threads in the warp). In conditionally executed code, a bar instruction should only be used if it is known that all threads evaluate the condition identically (the warp does not diverge). Since barriers are executed on a per-warp basis, the optional thread count must be a multiple of the warp size.
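The arrival-count arithmetic in that paragraph can be sketched as follows. This is only an illustration of the counting rule as quoted, not how the hardware implements it. The function names are hypothetical, and the warp size of 32 is an assumption that matches current NVIDIA GPUs.

```cpp
#include <vector>

// Assumption: a warp holds 32 threads.
constexpr unsigned kWarpSize = 32;

// activeThreadsPerWarp[i] is the number of threads in warp i that execute the
// bar instruction. Per the quoted paragraph, a warp in which any thread
// executes the bar contributes the full warp size to the arrival count.
unsigned arrivalCount(const std::vector<unsigned> &activeThreadsPerWarp) {
  unsigned count = 0;
  for (unsigned active : activeThreadsPerWarp)
    if (active > 0)
      count += kWarpSize;  // whole warp counts, not just the active threads
  return count;
}

// The optional thread count of a barrier must be a multiple of the warp size.
bool isValidThreadCount(unsigned threadCount) {
  return threadCount % kWarpSize == 0;
}
```

For example, with one fully active warp, one warp where a single thread reaches the bar, and one warp where none does, `arrivalCount({32, 1, 0})` counts two whole warps. This is why a bar inside divergent code can complete a barrier earlier than the programmer intended.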
LGTM, but do you have a test where LLVM generates wrong code if __syncthreads is not marked convergent?
Committed in r254408.
Feb 4 2016
Feb 3 2016
Jan 30 2016
Jan 22 2016
Dec 18 2015
Dec 17 2015
Nov 29 2015
LGTM with some minor comments.
Nov 25 2015
Nov 18 2015
Nov 17 2015
Nov 10 2015
Replace the link to the raw diff with more instructions.
Simplify the command lines and header file inclusion
Nov 6 2015
My biggest concern is to avoid giving users the false impression that what is described here is an officially supported long-term interface from clang. Would it be accurate to say that this document is meant for "LLVM developers" (or otherwise people working inside LLVM)?
Other than clarifying that, the content LGTM.