These don't really have function bodies to try to eliminate. This also
has a good chance of just producing invalid IR since intrinsics can
have special operand constraints (e.g. metadata arguments aren't valid
for an arbitrary call). This was wasting quite a bit of time producing
and failing on invalid IR when replacing dbg.values with undefs.
Details
- Reviewers
lebedev.ri fhahn Tyker diegotf ychen
Diff Detail
Event Timeline
what sort of problems manifest from this issue? Does it lead to slower reductions because they spend more time creating a lot of broken reductions?
I worry a bit about using heuristics "some intrinsics have these properties, so let's ignore all intrinsics" - other intrinsics represent normal instruction operations, right? (for target intrinsic instructions, for instance) so some cases might benefit from such reductions.
I think the general strategy of hack on the IR and if it breaks, skip it is a bit insane to begin with, so from that front this reduces the number of times that triggers
I worry a bit about using heuristics "some intrinsics have these properties, so let's ignore all intrinsics" - other intrinsics represent normal instruction operations, right? (for target intrinsic instructions, for instance) so some cases might benefit from such reductions.
I think removing intrinsic calls is an unproductive reduction strategy in general. I think this will harm reducing any codegen issues since it changes the generated code so radically. It also assumes a target supports arbitrary call targets, which may not be the case. e.g. before AMDGPU supported calls, every testcase with an intrinsic would reduce to a call to undef
+1.
There are plenty of other similar cases, like llvm.used metadata which we should probably handle in a special way.
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
33–39 | I think a comment here would be good explaining why we exclude intrinsics. Also, should we also remove intrinsics from chunks to start with, to keep the numbering/chunk management consistent? |
IIUC the main benefit from this optimization is removing unrelated function definitions all together so we don't need to continue processing them and the fact that this also applies to declaration is a side effect of the implementation? The call sites themselves should get removed by the remove instruction part, if they are 'uninteresting', so we probably do not loose too much in terms of reductions?
The test tieing to AMDGPU seems less ideal for general functionality. I've seen llvm.dbg.value caused the same problem, maybe we could use that?
The debug intrinsics require a *lot* of metadata overhead simply to satsify the operations. The other metadata using intrinsics are experimental, so this seemed like the least fragile way to test it
Given the functionality isn't implemented in terms of metadata-using-intrinsics in particular, a test using some non-metadata-using intrinsic would be fine.
How about this one? Adapted and reduced from llvm/test/Verifier/llvm.dbg.intrinsic-dbg-attachment.ll.
define void @foo() { entry: call void @llvm.dbg.value( metadata i8* undef, metadata !DILocalVariable(scope: !3), metadata !DIExpression()), !dbg !DILocation(scope: !3) ret void } declare void @llvm.dbg.value(metadata, metadata, metadata) !llvm.module.flags = !{!0} !0 = !{i32 2, !"Debug Info Version", i32 3} !1 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !2) !2 = !DIFile(filename: "f2", directory: "") !3 = distinct !DISubprogram(name: "foo", unit: !1)
I think only the callee matters here. If anyone makes changes to llvm-reduce and not build AMDGPU, there is a chance (should be small I guess) to regress this.
I'm specifically checking for the invalid intrinsic error produced. I think another intrinsic would be less clear for what is actually happening
Also, building the target isn't actually necessary for the test. The intrinsic declaration works as is anyway
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
33–39 | This comment still needs addressing. |
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
33–39 | I added the comment in the last revision |
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
33–39 | The second part of it - Also, should we also remove intrinsics from chunks to start with, to keep the numbering/chunk management consistent? |
Not necessarily - https://llvm.org/docs/GettingStarted.html suggests narrowing the target list in a few places, to speed up iteration time. So making tests more portable across targets is valuable - also means buildbots will get to your tests sooner (as the test will be run on more configurations, increasing the chances it'll be run on a buildbot that's running faster/picking up your changes sooner).
Ah, please tweak the test to test the general behavior (not modifying intrinsic calls), not the invalidity/second order that lead to the root cause/change in behavior (that some of those modified intrinsic calls then become invalid).
I disagree with this (maybe it's OK for frontend work), and you should run all target tests before committing backend changes. For this test it doesn't actually require the target so I've just dropped the REQUIRES
Ah, please tweak the test to test the general behavior (not modifying intrinsic calls), not the invalidity/second order that lead to the root cause/change in behavior (that some of those modified intrinsic calls then become invalid).
I don't see how else I would tell from the output that the intrinsic call itself is why it was skipped. The messages printed don't directly indicate why anything was skipped. The interestingness script needs to keep the call around
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
38 | Much like in ReduceAttributes, i don't think we should rely on the @llvm. function name prefix, |
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
38 | The special operand properties are based on the llvm. prefix in the verifier, so I think this is the correct check here |
Since we should catch these cases later on in reduceInstructionsDeltaPass() (please add a test), this should be fine i guess.
llvm/tools/llvm-reduce/deltas/ReduceFunctions.cpp | ||
---|---|---|
36–37 | Please drop and changing to an arbitrary call may radically change codegen., |
That if you don't include ; CHECK-INTERESTINGNESS: call, there won't be a call i32 @llvm.amdgcn.reloc.constant(metadata !"arst") in the end.
Ah, please tweak the test to test the general behavior (not modifying intrinsic calls), not the invalidity/second order that lead to the root cause/change in behavior (that some of those modified intrinsic calls then become invalid).
I don't see how else I would tell from the output that the intrinsic call itself is why it was skipped. The messages printed don't directly indicate why anything was skipped. The interestingness script needs to keep the call around
Sorry, not sure I follow - You have two calls, the difference is one is an intrinsic and one is not and the reduction continues until there's only one call & then the final check ensures the remaining call is the intrinsic you're looking for.
For instance, this modification of the test seems to fail without the patch and pass with it, but doesn't rely on a target-specific intrinsic or metadata parameters:
; RUN: llvm-reduce --test FileCheck --test-arg --check-prefixes=ALL,CHECK-INTERESTINGNESS --test-arg %s --test-arg --input-file %s -o %t 2> %t.log ; RUN: cat %t | FileCheck -implicit-check-not=uninteresting --check-prefixes=ALL,CHECK-FINAL %s ; Check that the call is removed by instruction reduction passes ; RUN: llvm-reduce --test FileCheck --test-arg --check-prefixes=ALL,CHECK-FUNC --test-arg %s --test-arg --input-file %s -o %t ; RUN: cat %t | FileCheck -implicit-check-not=uninteresting --check-prefixes=ALL,CHECK-NOCALL %s declare i8* @llvm.sponentry.p0i8() declare void @uninteresting() ; ALL-LABEL: define i8* @interesting( define i8* @interesting() { entry: ; CHECK-INTERESTINGNESS: call ; CHECK-NOCALL-NOT: call ; CHECK-FINAL: %call = call i8* @llvm.sponentry.p0i8() ; CHECK-FINAL-NEXT: ret i8* %call %call = call i8* @llvm.sponentry.p0i8() call void @uninteresting() ret i8* %call }
Might be able to be simplified further - test seems to work even with a void return from @interesting and the "CHECK-FUNC" on the second reduce run doesn't appear to be used (there are no "CHECK-FUNC" check lines).
Could you check in something like this, a simplified/reduced version of the test?
This test works, you can just replace the existing test with this one. Do you want to push it, or should I?
Please drop and changing to an arbitrary call may radically change codegen.,
that's interestingness's test's problem.