User Details
- User Since: Aug 29 2017, 10:07 AM (316 w, 5 d)
May 25 2023
@Munesanz once you add that test case, I will land it.
Ping.
May 19 2023
May 10 2023
Apr 20 2023
Other than the comment this looks good to me, cleaner than before. Thanks Kevin.
Apr 3 2023
Mar 28 2023
Mar 24 2023
Mar 23 2023
Just a suggestion, I think this works, although it may be best to use OpenMPIRBuilder.
Mar 22 2023
Adding some initial comments
Mar 20 2023
Hi Ravi,
Adding another limitation to this approach. In the following code:
Mar 14 2023
Will this solve the problem or mitigate the error?
Feb 7 2023
Thanks for the clarification, Shilei.
I'm confused: Shilei said in the GitHub issue that this is already supported, but the front end still throws an error message.
Jan 14 2023
Minor things
Jan 7 2023
Jan 6 2023
Jan 5 2023
Jan 4 2023
Dec 20 2022
Oct 17 2022
Thanks for implementing this. I have added an inline comment.
Sep 21 2022
Sep 13 2022
This is a good idea. Thanks Joseph.
Aug 31 2022
Aug 20 2022
That is a good point. Let us revise that, since the effect we actually want is for the task graph to be created sequentially.
I agree on 2. Any recommendations? We can move some of the logic there.
CUDA streams are FIFO queues, but it is true that this will not work if the device queue is not FIFO. With CUDA, this works for the case you described even though it is not a read-after-write dependency.
It is only valid assuming non-unified shared memory. State is only visible across host and device during data movements, so changes to host or device data are not reflected on the other side until then. Assuming there are no external runtimes, it is possible to synchronize only on data movements while preserving the data dependencies between host and device.
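To make the argument above concrete, here is a minimal toy model in Python, not the actual runtime code. The class `FifoDeviceQueue` and the helpers `copy_to_device`, `scale_kernel`, and `add_kernel` are hypothetical names made up for this sketch. It assumes an in-order (FIFO) device queue and separate host and device memories, and it synchronizes only at the device-to-host data movement.

```python
from collections import deque


class FifoDeviceQueue:
    """Toy model of an in-order device queue (assumption: FIFO, like a CUDA stream)."""

    def __init__(self):
        self._pending = deque()   # operations waiting to run, oldest first
        self.device_mem = {}      # device memory, separate from host memory

    def enqueue(self, op):
        # Enqueueing does not run anything; work is deferred until a sync.
        self._pending.append(op)

    def synchronize(self):
        # Drain the queue strictly in submission order.
        while self._pending:
            op = self._pending.popleft()
            op(self.device_mem)


def copy_to_device(name, value):
    # Host-to-device data movement: the value is captured at enqueue time.
    def op(mem):
        mem[name] = value
    return op


def scale_kernel(name, factor):
    def op(mem):
        mem[name] = mem[name] * factor
    return op


def add_kernel(name, amount):
    def op(mem):
        mem[name] = mem[name] + amount
    return op


def run_example():
    host = {"x": 1}
    queue = FifoDeviceQueue()

    queue.enqueue(copy_to_device("x", host["x"]))
    queue.enqueue(scale_kernel("x", 10))  # depends on the copy
    queue.enqueue(add_kernel("x", 5))     # depends on the scale kernel

    # No synchronization between the kernels: FIFO order alone
    # preserves their dependencies. The host still sees the old value.
    host_before = host["x"]

    # Device-to-host data movement is the only synchronization point.
    queue.synchronize()
    host["x"] = queue.device_mem["x"]
    return host_before, host["x"]
```

In this model the host value only changes at the final data movement, which is the point where device state becomes visible. If the queue were not FIFO, the two kernels could be reordered and the result would no longer be guaranteed.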
Jun 29 2022
Rebasing to master
Jun 22 2022
I'm not sure what to do about the buildbot errors I am getting. I can access them, but there is no information that tells me what is going on.
Fixing var name to coding standard
Replacing omp_get_num_devices() with omp_get_initial_device()
Ah! I missed that. Thanks
Jun 9 2022
Fixing the warning issue on the switch statement.
Changes are fine. I am not familiar with the progress in C++20 and 23, but I trust your judgement here.
Ah! Sorry about this. Quick fix. Working on it.
Jun 8 2022
Running clang-format
Thanks Jon for all the comments.
Jun 7 2022
Cleaning up code to remove unused enums. Removed the commented-out code with cache_iteration, and all related functions.
Jun 3 2022
I missed those. I will check all of them carefully.
Jun 2 2022
I have added the HSA_ISA_INFO_NAME, made the function static, and removed the unused elements in the enums.
@saiislam I will look into that.
Jun 1 2022
Oct 21 2021
lgtm
Sep 19 2021
Thanks for adding this Michael.
Jul 30 2021
Jul 29 2021
Jul 27 2021
Sorry for the delay. Working on this
Removing branch dependency
Rebasing to main this time for real
Resync again
Rebase to main
Rebase to main
Sync to main
Fixing tests
Addressing @tianshilei1992's comments.
Jul 26 2021
Changing name and adding cstdio instead of iostream. Adding license headers. Other minor changes
Updating minor comments. Major re-design of a less verbose solution will be added later
Rebasing to main
Rebasing to main
Fixing final comments from @jdoerfert
Jul 25 2021
I thought about the -omp- in the name too, but I remembered that the ACC folks wanted to use the same runtime. I like both llvm-omp-device-info and llvm-omp-deviceinfo. Or we could drop the llvm- as they do in the mlir- tools.
My test is still failing, but it fails on an assertion in the changeToSPMD method:
Jul 24 2021
Jul 23 2021
Format
- Created a single function receiving a string instead of one per attribute: foldKernelFnAttribute
- Removed the pessimistic fixpoint if no kernel is found.
- Added a check for the ReachingKernelEntries valid state
- Removed unnecessary comments