- User Since
- Oct 9 2015, 4:06 AM (101 w, 5 d)
Thu, Sep 14
Updated commit message: the highest value that can be represented with
immediate + inline is 4156 = 4092 + 64.
Wed, Sep 6
How about in unittests/ADT/TripleTest.cpp?
Mon, Sep 4
TripleTest.cpp has some minimally meaningful tests though. So a test of amdgcn-amd-amdpal should be added there. Apart from that the patch looks good.
Mon, Aug 28
Aug 8 2017
Aug 4 2017
Some comments on what I noticed. Probably best if somebody else has a look as well.
Aug 3 2017
Aug 2 2017
One comment, apart from that LGTM.
One remaining style nitpick. LGTM with that fixed.
Thanks. Assuming we agree on the derivative calculations logic I added in a comment, this LGTM.
Aug 1 2017
Jul 21 2017
Looks mostly good, but I do have a couple of comments.
Jul 20 2017
One question, apart from that looks good.
One minor comment. Either way, LGTM.
Some minor comments. In addition, I think it can be simplified, and we probably want the intrinsic to be convergent, because sinking WQM computations into a non-uniform branch could mean that the computation becomes non-WQM for practical purposes.
Jul 14 2017
This was submitted long time ago.
Jun 28 2017
So, one thing that's not clear to me with is the semantics of how the update.dpp intrinsic is supposed to enable WQM or WWM. In your sequence of instructions, if you just put a WQM/WWM flag on the update.dpp intrinsic, how does LLVM know whether the regular ALU intrinsics in between should run in WQM/WWM or not?
May 18 2017
May 8 2017
Low & high bits and use cases aside, isn't this something that could be covered by adding support for PC in the read_register intrinsic?
Apr 24 2017
Address review comments and apply some more formatting fixes.
Address review comments.
I don't see anything wrong with the code.
Apr 21 2017
Apr 19 2017
There might be a benefit to choosing a larger alignment, like 64 bytes, due to cache line alignment.
Apr 8 2017
When applied to current trunk, this code can get into an infinite loop. Please see the sample LLVM IR at https://paste.debian.net/926533/
Feb 15 2017
Feb 10 2017
Feb 1 2017
I haven't looked in too much detail yet. I assume getelementptr doesn't work with these pointers, so it would be good to have a negative test which ensures that GEP use fails.
I see the same problem as Tom here. Do those shaders use read-only SSBOs? If so, this could perhaps be done at the Mesa level. But even then, there'd be a problem if the same memory is bound to two different SSBOs, and one of them is written to, unless the SSBO is marked 'restrict'.
Jan 30 2017
Backing out the effectively dead FMF code, but leave the corresponding
FIXMEs in place.
Jan 27 2017
Yeah, I personally feel more comfortable having the exact same change on the branch.
Drop unnecessary getNode()s.
But you can't because they'll crash [...]
@arsenm, unfortunately I can't add flag-based tests, since as @hfinkel points out, you can't actually get the flag on an FMA or FMAD in the DAG. @hfinkel, the idea was to put the check in already anyway, since by the discussion with @spatel, the plan is to add FMF for all nodes eventually. That seems reasonable to me.
Formatting of FIXMEs
Jan 23 2017
Jan 20 2017
Added the DefaultFlags. It does fix @spatel's example for me. Anything else?
Jan 19 2017
Obviously it'd be even easier to skip the getFlags() thing entirely now, since it can never evaluate to true. But the right way forward is to extend support for FMF, so...
Add FIXMEs. @hfinkel, do I understand your comment correctly that this patch is
good to go?
Jan 17 2017
- Take individual UnsafeAlgebra flags into account.
Jan 13 2017
Thanks! We generally use CHECK-LABEL only for the start of a function, not for labels inside a function. The idea is that the label helps FileCheck get less confused when there are many sub-tests in a single test file.
The approach is unfortunately not correct in general as far as I can tell. Say you have
Jan 12 2017
I don't see the change that merges the FP_EXTEND / FP_ROUND cases. But the end result LGTM.
Jan 11 2017
This looks like it's missing a rebase onto the rest of the series.
A test case would be good. Apart from that, LGTM.
One minor comment, LGTM otherwise.
Dec 16 2016
Dec 12 2016
Thank you for taking a look!
Dec 9 2016
Dec 8 2016
Thank you for taking a look.
Dec 7 2016
- Use an enum instead of OnlyLegalOrCustom
- Handle a corner case where shift amounts are too large for the native shift type
- Test case using alloca
- Unify getting the MUBUF instruction offset
Obsoleted by changes to D27346.
Don't split the files, just double the RUN lines in the existing files.
Dec 2 2016
Dec 1 2016
Rearrange the logic. It looks quite readable to me this way, and
clang-format-diff agrees with the formatting.
Nov 30 2016
Rebased on D27260.