Working on ROCm at AMD
User Details
- User Since
- Oct 17 2012, 11:25 AM (506 w, 1 d)
Tue, Jun 21
Tue, Jun 14
Same typo in the change description, last paragraph ... middle "of" the block.
Looks straightforward enough to me. If the value is a PHI, it must be at the beginning of the current block. If it is outside this block, then whoever registered it is confident that it dominates this block. Is it correct that the SSAUpdater will not end up in a state where it tries to use a phi to produce another phi in the same block? Because that might race against the assumption that the used phi is available earlier in the block.
Mon, Jun 13
The change itself looks okay to me, to the extent that it does not seem to break the structurizer. Just by putting nodes in the right order, it is still able to generate predicates that preserve the original control flow. I think that's good enough to boldly make this change. I do have some comments/questions about the tests.
May 30 2022
The code looks okay to me, but still need @arsenm to check the effect on the AMDGPU lit test. There seem to be superficial changes to the control flow, but Matt might be able to say whether the test still serves its original purpose after this change.
May 29 2022
Requesting @arsenm to take a look at the AMDGPU test.
May 23 2022
LGTM! My sincere apologies for the delay ... I have no excuse :(
May 17 2022
Sorry I missed this earlier. The name of the new test "issue55099.ll" doesn't really convey much information. It should be renamed to something more informative, like "preserve-child-loops.ll"
I think the change description is missing the point and needs to be rephrased. The modified function was incorrectly (not unnecessarily) ignoring grandchild loops, and this change fixes the bug. In particular, this fixes the handling of the loop { inner, body }. The TODO in the same function is talking about the b1 self loop, which may be "unnecessarily" lost, but that is a different issue.
May 5 2022
May 4 2022
I am not much familiar with DirectX and HLSL, so unable to review this patch. It might help to post on the Discourse under Clang Frontend:
Apr 20 2022
The changes in GenericCycleInfo look good to me. But please do wait for reviewers looking at MachineSink.
Apr 19 2022
Do note that the cycle hierarchy is implementation-defined, and hence the reported cycle depth is not an invariant property of the program being compiled. For example, in the following CFG:
Apr 14 2022
Apr 12 2022
Apr 11 2022
This patch clearly follows the same pattern as other attributes. At least I didn't spot anything different. Please give it a day for others to comment, otherwise good to go.
Needs more reviewers ... I would suggest authors of previous non-trivial changes to the structurizer, and their reviewers.
Apr 10 2022
Apr 9 2022
Apr 7 2022
Mar 29 2022
Mar 24 2022
Mar 18 2022
The check for "__ockl_hostcall_internal" is not longer relevant with the new hostcall attribute. Can we simply remove this check? What is the situation where the warning is still useful?
Mar 16 2022
Mar 11 2022
Please remove the JIRA ticket number from the commit description.
Mar 7 2022
Please wait for a day to give other reviewers a chance to respond.
Feb 25 2022
Feb 21 2022
Please fix the commit description so that the first line is self-contained and separated from the rest by a blank line. This matters a lot when looking at the output of "git log ---oneline". In particular, the start with "in this work we implement" is pretty much unnecessary.
Please fix the commit description so that the first line is self-contained and separated from the rest by a blank line. This matters a lot when looking at the output of "git log ---oneline".
Feb 20 2022
Feb 17 2022
Feb 11 2022
Feb 10 2022
use AAPointerInfo; add more tests to demonstrate the same
Feb 9 2022
Feb 8 2022
added tests for i128 load. hostcall position is now independent of subtarget.
further simplified the callback return value; moved hostcall info out of the subtarget
eliminated NeedsHostcall by simply checking the return value of the walk over all calls
Feb 7 2022
Fixed stale comments that mention the attribute amdgpu-queue-ptr in the positive sense.
Feb 6 2022
Jan 31 2022
Dec 10 2021
Dec 8 2021
- rebase
- The opaque member class Compute broke module builds.
- Moved compute to an extern template class in namespace llvm.
- Cleaned out a spurious class declaration that was not used anywhere.
Reopening revision to help with pre-submit checks, before reapplying the reverted commit.
Dec 6 2021
Nov 28 2021
- rebase
- addressed review comments
Nov 23 2021
- rebase
- removed the "vacuous root" from the cycle description
Nov 22 2021
- rebase
- replace Dfg and Ssa with uppercase versions
- eliminate any interpretation of nullptr as root cycle
- move two functions from CycleInfo to Cycle, just like LoopInfo
- small cleanup to the cycle terminology, pointed out by @nhaehnle
Nov 19 2021
Nov 18 2021
Not related to the changes proposed in this review, but it seems the PDT is not used by the implementation. It might be good to take out that dependency.
Nov 12 2021
rebase
Nov 11 2021
- Fixed some doxygen comments.
- Eliminated the "root" cycle.
- Cycle should not point to CycleInfo; it's hard to update that reference. Fixed the side-effects of eliminating that reference.
- Fixed names mentioned in review comments.
- Extracted two helper functions: getTopLevelParent() and moveToNewParent(). Both very helpful in improving readability.
- SsaContext now keeps a reference to the Function so that user analyses can retrieve it. Forward looking change, found to be useful in later analyses not introduced yet.
Nov 5 2021
Nov 3 2021
"uint" broke the Windows build
Nov 2 2021
- added specific wording from paper to description of cycle tree
- added LLVM_DEBUG to implementation
- fixed typos
Oct 29 2021
Addressed review comments