User Details
- User Since: Jan 30 2014, 6:27 AM (488 w, 8 h)
Yesterday
I have been on this the last two days, will need till the end of the week for patches and proper eval across CTMark.
Tue, Jun 6
I browsed it, looks fine.
Mon, Jun 5
I saw this, will comment tomorrow, hopefully with actual results.
LG, all I tried to say is that we should be consistent across env vars. Accepting whitespace around values is probably the right choice.
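To make the whitespace point concrete, here is a minimal sketch of trimming an env var value before it is interpreted. The helper name `trimWhitespace` is hypothetical and not taken from the patch under review; it only illustrates applying the same rule to every variable.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): strip leading and trailing
// whitespace from an environment variable value before parsing it.
inline std::string trimWhitespace(const std::string &Text) {
  std::size_t Begin = 0;
  std::size_t End = Text.size();
  // Skip whitespace at the front.
  while (Begin < End && std::isspace(static_cast<unsigned char>(Text[Begin])))
    ++Begin;
  // Skip whitespace at the back.
  while (End > Begin && std::isspace(static_cast<unsigned char>(Text[End - 1])))
    --End;
  assert(Begin <= End && "trimmed range must be well-formed");
  return Text.substr(Begin, End - Begin);
}
```

Running every env var through one such helper is what keeps the behavior consistent; whitespace inside the value is deliberately left alone.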
Fri, Jun 2
I'll fix LLVM.tools/UpdateTestChecks/update_test_checks::check_attrs.test before committing.
rebase
Honestly, I feel this is too pessimistic and short-sighted. I understand we likely don't want everything on a GPU, maybe even can't, but I would not be so eager to guess for some of the things mentioned here. File I/O, for one, is on my TODO list. The entire "we do not want dependencies" is nice, but also somewhat irrelevant given that the vendor libs are linked in (to this day) anyway. Sure, we do not link them into libcgpu, but that is not because of some grand design, IMHO, but rather because we didn't need to (yet). Once libmgpu arrives, we will have to (or link libdevice later), so we will do what the motivation now states we don't want to.
FWIW, currently talking to @jplehr and @mhalk about the queue <-> stream mapping. Might be a follow up.
Force a power of two for the "middle" case, ensure thread_limit is honored.
Ensure thread_limit is honored.
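A rough sketch of what "force a power of two, honor thread_limit" could look like. The names `floorPowerOfTwo` and `pickNumThreads`, and the convention that a limit of 0 means "no limit", are assumptions for illustration and not the plugin's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest power of two that is <= Value.
inline uint32_t floorPowerOfTwo(uint32_t Value) {
  assert(Value > 0 && "expected a positive value");
  uint32_t Result = 1;
  // Comparing against Value / 2 avoids overflow when doubling.
  while (Result <= Value / 2)
    Result *= 2;
  return Result;
}

// Hypothetical policy: cap the preferred count by thread_limit first
// (assumption: 0 means no limit), then round down to a power of two.
// The result is a power of two and never exceeds the limit.
inline uint32_t pickNumThreads(uint32_t Preferred, uint32_t ThreadLimit) {
  uint32_t Capped = std::max<uint32_t>(Preferred, 1);
  if (ThreadLimit != 0)
    Capped = std::min(Capped, ThreadLimit);
  return floorPowerOfTwo(Capped);
}
```

Capping before rounding is the design choice here: rounding first and clamping afterwards could yield a value that is no longer a power of two.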
Do we really need to support "old" CUDA style offloading/linking? If so, we also need HIP, SYCL, ...
I would suggest only to support the "new" offload linker. That's where Clang (driver) is headed anyway, I think.
Mon, May 29
ICMP, in general, captures. Check the tests for the capture tracker for examples.
Mon, May 22
Thu, May 18
Mon, May 15
update according to comments, more tests (consolidated)
LG, assuming these are stable now.
Termination is only a side-effect if mustprogress is set, otherwise it is not.
I will rebase this and hope to land it soon after.
Did you manually update these tests?
Looks reasonable, @jhuber6 , @tianshilei1992, wdyt?
This works now? What are the leftover problems?
May 5 2023
May 4 2023
Do we trim any other option? I am confused by this.
@tianshilei1992 @jhuber6, works for me, does anything jump out on your end?
May 3 2023
LG
May 2 2023
Generating functions and then deleting them is costly and will likely not work. We need to not emit them in the first place.
May 1 2023
LG
LG, with assertion
LG, though I'd prefer if we could change the signed bit stuff such that we don't need "is first", hence use 4 states and define |= properly.
Add this to the commit message as it is the important part:
The fix was to only look for alignments that are powers of 2.
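For the commit message, the power-of-two restriction can be illustrated with a small sketch. Both function names below are hypothetical and the candidate list is an assumption; the actual patch may search for alignments differently.

```cpp
#include <cstdint>
#include <vector>

// A value is a power of two iff it is non-zero and has exactly one bit set.
constexpr bool isPowerOfTwo(uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}

// Hypothetical helper: among candidate alignments, keep only powers of
// two and return the largest one, or 0 if none qualifies.
inline uint64_t largestPowerOfTwoAlignment(
    const std::vector<uint64_t> &Candidates) {
  uint64_t Best = 0;
  for (uint64_t Candidate : Candidates)
    if (isPowerOfTwo(Candidate) && Candidate > Best)
      Best = Candidate;
  return Best;
}
```

In-tree code would more likely use `llvm::isPowerOf2_64` from `llvm/Support/MathExtras.h` instead of a local check.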
Glad you could at least hack this. If NVIDIA ever fixed their tools we can get rid of this stuff.
Apr 27 2023
So we have different schemes for AMD and NVIDIA? That does not sound good.
Is this still supposed to land?
Does this work for non-AMD hardware?
Apr 25 2023
Test case missing.
LG
Generally, I don't believe we should run elimination in all passes.
Instead, use the dominance tree to walk the blocks of a function, or walk it by manually following the control edges.
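A sketch of the "manually follow the control edges" option on a toy CFG. `ToyBlock` and `reachableBlocks` are made-up stand-ins for illustration, not LLVM API; real code would operate on `BasicBlock` successors or the dominator tree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a basic block: only its successor indices are stored.
struct ToyBlock {
  std::vector<std::size_t> Successors;
};

// Visit every block reachable from Entry by following control edges,
// each block exactly once. Unreachable blocks are never visited.
inline std::vector<std::size_t>
reachableBlocks(const std::vector<ToyBlock> &Blocks, std::size_t Entry) {
  std::vector<std::size_t> Order;
  if (Entry >= Blocks.size())
    return Order;
  std::vector<bool> Visited(Blocks.size(), false);
  std::vector<std::size_t> Worklist{Entry};
  Visited[Entry] = true;
  while (!Worklist.empty()) {
    std::size_t Current = Worklist.back();
    Worklist.pop_back();
    Order.push_back(Current);
    for (std::size_t Succ : Blocks[Current].Successors) {
      assert(Succ < Blocks.size() && "successor index out of range");
      if (!Visited[Succ]) {
        Visited[Succ] = true;
        Worklist.push_back(Succ);
      }
    }
  }
  return Order;
}
```

The walk touches each reachable block once, which is the point of the suggestion: the work is done in a single traversal rather than repeated in every pass.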
Apr 21 2023
Drive
Apr 20 2023
Make a test for the attributor/openmp-opt, also don't use O2 in tests, the IR only test is sufficient.
Just remove this part to make it consistent while it will still avoid the overflow:
if (static_cast<int32_t>(ThreadLimitClause[0]) <= 0)
  ThreadLimitClause[0] = PreferredNumThreads;
else
I don't understand the concerns. An array-to-pointer decay, by itself, does not capture the pointer such that multiple threads can access it. You need to check the uses of the decay and decide based on them. Thus, I think this is fine.
Apr 19 2023
We can reduce the bits from 4 to 2 later if that results in less complexity or better performance. I don't think that is in itself a blocker. When we see the patch to change it we can probably easily tell if it is simpler, and we hopefully have more stress testing then to trust whatever changes we make.
LG
Apologies for the delay.
Apr 18 2023
I'm fine with this. Any objections?
LG
I'm not too happy with the name but the concept seems like something we need (to duplicate or move) here.
Works for me if we cannot find a better name(ing scheme).
Apr 17 2023
Subsumed by D148576
Last real issue I have is the try open loop, see below.
Some comments, waiting on the next version now.
LG, one nit.
Apr 13 2023
Apr 11 2023
TL;DR, we should try to converge to one impl. in OpenMPIRBuilder but that might take a while and we should not force it where it doesn't make sense (yet).
LG
Just browsed the change, seems solid. LG