Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/CodeGen/AMDGPU/convergent_flag.ll | ||
---|---|---|
1 ↗ | (On Diff #60285) | This is missing a run line. You should run instnamer on the test. A better name would also be convergent-inlineasm.ll |
3 ↗ | (On Diff #60285) | Space after ; |
4 ↗ | (On Diff #60285) | Unused arguments cal also be removed. The align 2 should be removed. |
6–13 ↗ | (On Diff #60285) | All of this can be replaced with something simpler, like a single argument use |
I am able to compile the convergent_flag.ll text using llc without any flags. Say if I insert the following line onto the top of the file, it doesn't work. So what's the correct "-march"? Any ideas? (I will add reference assembly later)
; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=SI %s
The patch summary should reflect that this is a target independent change, so something like:
Target: Add convergent flag to INLINEASM instruction
test/CodeGen/AMDGPU/convergent_flag.ll | ||
---|---|---|
4–28 ↗ | (On Diff #60285) | It would be great if you could make this test case smaller. I think you could remove most of the instructions from the entry block. |
Even if it compiles without flags, it's not running llc without the RUN line.
We usually prefer to not have the triple in the test itself and specify that in the llc run line so multiple run lines can use different ones. You need to add -mtriple=amdgcn--amdhsa for the intrinsic to work. You won't need that if you simplify the testcase to not use it.
The check prefix should also be GCN in case VI differences show up.
opt -strip -instnamer will clean up the names of the values so it will be easier to make future changes to the test.
include/llvm/Target/Target.td | ||
---|---|---|
792 ↗ | (On Diff #60285) | I am pretty sure this is not what you want. AIUI this makes *all* inline asm, on all platforms, as convergent. This will prevent certain optimizations on blocks that contain inline asm, which will be a regression on basically every platform other than GPUs. What we do when compiling CUDA device code is, in clang, mark all *calls* to inline asm as convergent. Call instructions can already individually be marked as convergent or not. This is conservative: The frontend or backend could in theory analyze the inline asm and, if it doesn't contain convergent instructions, not add (or remove) the attribute. Adding the IR attribute to the call should cover all IR optimizations. But then we still need to mark the inline asm machine instruction as convergent. We're not doing this yet in NVPTX, and it's a bug. Currently, a machine instruction either is or isn't convergent. The way we handle this for NVPTX call instructions is we have two instructions, a convergent one and a non-convergent one, and when we lower a call, we choose the one or the other. That may be tricky to do for inline asm, in which case conservatively saying that the relevant machine instruction is always convergent probably wouldn't be a big deal, at least for us. |
Modified LIT test for adding convergent flag to INLINEASM instruction based on Matt's and Tom's comments.
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
1 | There is no need for a check-prefix. You can get rid of that and s/GCN/CHECK/ everywhere in this test. I would also suggest changing this test so that the CHECKs immediately precede the relevant lines of IR. But more to the point, does this test fail without your change and pass with your change? I don't see how this is checking for convergence at all. I would recommend trying to write the minimal test case that checks for the thing you're changing -- this does not seem to be it, to me. (In fact I'm surprised this works at all, since the IR doesn't define the attributes #1 and #3...) |
include/llvm/Target/Target.td | ||
---|---|---|
792 ↗ | (On Diff #60368) | We are also marking calls to inline assembly as convergent, but the problem with inline assembly is that once the IR gets converted to a MachineInstr, the information about convergence is lost. Because INLINEASM is a target independent instruction, we don't really have the same flexibility when lowering a regular call, because there is not a target equivalent for INLINEASM. I think INLINEASM also had a similar issue with the hasSideEffects flag, and the solution was to encode that information as one of the operands to INLINEASM, maybe we should encode convergence information in the same way. Does that sound like a good solution? |
I think INLINEASM also had a similar issue with the hasSideEffects flag, and the solution was to encode that information as one of the operands to INLINEASM, maybe we should encode convergence information in the same way. Does that sound like a good solution?
Something like the mayLoad property on MachineInstr? I think that would work.
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
1 | If there is a convergence flag, we need to check that the inline assembly is still in the entry block: ; BB#0: v_mov_b32_e32 v1, 1 ;;#ASMSTART v_cmp_ne_i32_e64 s[2:3], 0, v1 ;;#ASMEND v_cmp_eq_i32_e32 vcc, 8, v0 s_and_saveexec_b64 s[0:1], vcc s_xor_b64 s[0:1], exec, s[0:1] ; BB#1: . . . ; BB0_2: . . . That's why GCN CHECK is used in the reference assembly. |
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
1 | OK, thanks for the explanation. As described, this is an end-to-end test, of the sort that we usually do not write in llvm. (Or at least, we do not rely on them as the exclusive means for checking a patch's correctness.) The problem is, what you've described is not strictly about testing the convergence of an instruction, but rather about checking that convergence prevents a certain optimization that would otherwise run. But if we change llvm so that this optimization no longer runs (seems reasonable, particularly since we're not running llc -O2 or anything), then your test will always pass, even without the fix you're making in this patch. At the very least, there needs to be a comment explaining what the test is checking. But again, it's fragile as written, and that imposes a cost on all maintainers. So if there is a simpler way to check that your change does the right thing, I would very much prefer that. I guess the good news is, to do what Tom suggested in his last comment, you're probably going to want a different set of tests anyway. :) |
What method are you using to upload this patch? Phabricator is only showing me the changes since the previous revision and not the changes when compared to trunk. This doesn't seem to happen for other patches.
include/llvm/IR/InlineAsm.h | ||
---|---|---|
226 | The indentation looks wrong here. | |
lib/CodeGen/MachineInstr.cpp | ||
1757–1758 | Indentation again. | |
lib/CodeGen/MachineVerifier.cpp | ||
818–820 | This line is more than 80 characters. | |
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
6738 | Indentation. | |
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
15 | There should be an explicit name on this block. I would recommend running this whole test through opt -metarenamer. | |
25 | This can be dropped. |
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
25 | I mean you can drop the #0 attributes. I think the rest are OK, but I can't see the whole patch. |
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
6 | function name should be changed to something about what it is testing |
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
8 | The call site attributes can be removed here |
Don't we also need to change MachineInstr::isConvergent() to make it check the ExtraInfo (like MachineInstr::mayLoad/mayStore)? If possible please add a test that fails without that change and passes with that change.
"That change" referred to the change you've now made to MachineInstr::isConvergent().
I think you misunderstood what Justin was asking for. The original test convergent-inlineasm.ll is a test case that reproduces the bug we are trying to fix. That test case will fail without your patch, but it should pass with your patch. So you don't need to add any additional test cases.
The reason Justin made those comments was because your original patch didn't actually fix the bug, so it appeared as though the test case you had added would pass even without your fix.
So, you don't need to add any more tests. Please drop the nonconvergent-inlineasm.ll because it's not really testing anything.
A test with nonconvergent asm in the entry block that still sinks out of the entry block would show the original patch which always is isconvergent on inlineasm would break with it. It doesn't need to be in its own file though
Added a LIT test based on Matt's comments.
Dropped an unrelated test for the previous commit.
LGTM with trailing whitespace fixed
test/CodeGen/AMDGPU/convergent-inlineasm.ll | ||
---|---|---|
30–37 | I think the test has some trailing whitespace |
The indentation looks wrong here.