User Details
- User Since
- May 3 2021, 10:02 AM (66 w, 1 d)
Yesterday
Ok, looks like this wasn't the bug.
Thu, Aug 4
Added test
Tue, Aug 2
(Also , @ThomasRaoux - should this be added to the NVVM conversion? I'm not doing it right now because I can't test the change, but I thought I'd raise the possibility)
Adress review comments
Mon, Aug 1
Added comments, fixed bug, updated tests
Thu, Jul 28
Mon, Jul 25
Thu, Jul 14
@kerbowa I'd like to still land this because the lds_barrier is a useful abstraction over whatever combination of (fence + barrier)/inline assembly/... you end up needing to use to implement it.
Tue, Jul 12
Re the above, you mentioned gfx90a+ ... what about gfx908?
Mon, Jul 11
This is a single op that expands to both a waitcnt on LDS and a barrier. I
can go digging tomorrow for what we used to lower this to some sort of
fence and a barrier, but I recall ( @whchung who may have more detail) that
using this bit of inline assembly to work around what I think was the lack
of an LDS-only fence gave a noticable performance increase
And I do have the bits, so landed. Thanks for the external pair of eyes!
Jul 7 2022
@eric-k256 Would you be willing to approve here?
Jul 6 2022
Jul 5 2022
Since @Mogball is handling this in his dataflow analysis rework, abandoning.
Jul 4 2022
Thanks for doing all the research on this!
Ah, I missed that the one with 128 was laneId, not any of the other ones - my bad for not reading carefully.
I agree that getting anything more than very conservative bounds in an outlined kernel will be tricky and so it might not be worth doing (right now).
Jul 2 2022
Thanks for digging up the sources on those bounds!
Two initial comments;
- Do we actually know that the number of blocks/threads is bounded above by 2^31? For one thing, the HIP (and presumably CUDA) launch APIs take uint32_t and so the bound could be 2^32
- Could you find a way to propagate these into (outlined) kernels, so that we get bounds on gpu.block_id and friends?
Jun 30 2022
LGTM, at least as far as the integer analysis parts go.
Jun 29 2022
So, overall comments - especially since it looks like all the details are basically fine ... wouldn't it still be a good idea to have a wrapper around IntRangeAnalysis that hides the fact that it's a dataflow analysis? Or are we abandoning that wrapper here?
Jun 28 2022
@jpienaar Would you be willing to approve this, or do you not make sense as a reviewer?
Jun 27 2022
Thanks!
In "things I'm just now noticing now that we're trying to send this upstream", should we add a test?
Jun 21 2022
Jun 20 2022
Review comments
Jun 17 2022
Per feedback here, I'm abandoning this revision in favor of
- A new revision that adds the new ROCDL intrinsics from LLVM but doesn't touch AMDGPU
- Going back downstream to design a better mfma operation that looks like something like mfma {k = K, m = M, n = N, ...} %c = %a * %b.
Jun 16 2022
Jun 15 2022
Should cleaning this up be a new revision or is there a way to re-open this one?
Jun 14 2022
Rebase
unsigned int -> unsigned throughought
Apply style comments, send off last build pre-land
Jun 10 2022
The only test I can think of that could catch this is an integration test, and our buildbot isn't testing against pre-release versions of the backend
@Mogball , I know you LGTM'd this a while back, but it got more-or-less rewritten since then, so it could probably use another style/sanity check
Jun 8 2022
Add a test closer in intent to the original while loop test
Rebase onto main, fix tests to account for arith.constant inference
So yeah, from the perspective of the integer range stuff, feel free to land
this.
I think the semantics of DataFlowAnalysis::getSuccessorsForOperands() could be clarified ... but I'm not sure it matters, since there're plans to scrap that entire framework.
Jun 7 2022
(Submitting inline comments)
Add block comments, remove IntPair
Jun 6 2022
To re-raise things, what's the right venue for discussing bf16 emulation in LLVM?
Jun 3 2022
I don't think we can delete old SCCP - range analysis doesn't handle vectors, for example
r
Update commit message
Jun 2 2022
I did get an email but the email only mentioned the warning, not the error.
Abandoning, I thought the warnings were just warnings, there was also a linker error - there'll be a new revision to re-land this