This is an archive of the discontinued LLVM Phabricator instance.

[llvm-reduce] Reduce function calling convention
Needs ReviewPublic

Authored by aeubanks on Jul 18 2023, 1:36 PM.

Details

Diff Detail

Event Timeline

aeubanks created this revision.Jul 18 2023, 1:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2023, 1:36 PM
aeubanks requested review of this revision.Jul 18 2023, 1:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2023, 1:36 PM
aeubanks added inline comments.Jul 18 2023, 1:38 PM
llvm/tools/llvm-reduce/deltas/ReduceFunctionBodies.cpp
46

I think this should be a list of CCs to opt-out, rather than opt-in to calling convention reduction since it really should be fine to reduce them to CCC in terms of legality.

For the amdgpu entry-point concerns, that's the point of the interestingness test?

aeubanks added inline comments.Jul 18 2023, 2:01 PM
llvm/tools/llvm-reduce/deltas/ReduceFunctionBodies.cpp
46

oh I forgot to ask the initial question, if you do want to opt out some CCs, which ones do you want to opt out?

arsenm added inline comments.Jul 18 2023, 2:07 PM
llvm/tools/llvm-reduce/deltas/ReduceFunctionBodies.cpp
46

I think opt-in is basically always a better system, especially for target stuff (which is what any unlisted CC is going to be)

aeubanks added inline comments.Jul 18 2023, 2:11 PM
llvm/tools/llvm-reduce/deltas/ReduceFunctionBodies.cpp
46

llvm-reduce is designed to reduce IR as much as possible and I don't think this is an exception, like I said the interestingness test should take care of if you really don't want the CC reduced

(adding more people for more opinions)

arsenm added inline comments.Jul 18 2023, 3:11 PM
llvm/tools/llvm-reduce/deltas/ReduceFunctionBodies.cpp
46

You're asserting that cc is reduced or simpler and other calling conventions are not, and I do not think you can view them this way. From an IR perspective it's not really conceptually simpler, it just removes one token from the printed IR. With the possible exception of fastcc, it's not really an optimizing modifier

What it does do is change codegen in arbitrary ways, and I don't think that provides any reduction value on the IR side.

FWIW, I'd consider this a simplification - if I saw a test case that used a novel/explicit calling convention, I'd probably assume/guess that the CC was load bearing/required/interesting/relevant to the bug. So a test case that doesn't use a novel CC is "simpler" in that sense, to me at least. (it's not a guarantee, it's possible the bug is only reproducible with the default CC, but more often than not that's not going to be the case, I think?)

FWIW, I'd consider this a simplification - if I saw a test case that used a novel/explicit calling convention, I'd probably assume/guess that the CC was load bearing/required/interesting/relevant to the bug. So a test case that doesn't use a novel CC is "simpler" in that sense, to me at least. (it's not a guarantee, it's possible the bug is only reproducible with the default CC, but more often than not that's not going to be the case, I think?)

I agree. There are some special cases for some CCs in the codebase, so this definitely helps identify that behavioural difference.

FWIW, I'd consider this a simplification - if I saw a test case that used a novel/explicit calling convention, I'd probably assume/guess that the CC was load bearing/required/interesting/relevant to the bug. So a test case that doesn't use a novel CC is "simpler" in that sense, to me at least. (it's not a guarantee, it's possible the bug is only reproducible with the default CC, but more often than not that's not going to be the case, I think?)

This is a very x86 centric view. I might go so far as to consider CCC to be the most complex of the calling conventions, others add restrictions

FWIW, I'd consider this a simplification - if I saw a test case that used a novel/explicit calling convention, I'd probably assume/guess that the CC was load bearing/required/interesting/relevant to the bug. So a test case that doesn't use a novel CC is "simpler" in that sense, to me at least. (it's not a guarantee, it's possible the bug is only reproducible with the default CC, but more often than not that's not going to be the case, I think?)

This is a very x86 centric view. I might go so far as to consider CCC to be the most complex of the calling conventions, others add restrictions

It's an IR centric view. Calling convention is generally irrelevant for middle-end tests and should be omitted from them.

For codegen tests, if the calling convention is relevant to reproduce an issue, that should be determined by the interestingness test.

It's an IR centric view. Calling convention is generally irrelevant for middle-end tests and should be omitted from them.

This isn't a middle end test, it's a general IR tool

For codegen tests, if the calling convention is relevant to reproduce an issue, that should be determined by the interestingness test.

Requiring additional complexity in interestingness tests is not good. The ideal interestingness test is just run the crashing program, I shouldn't have to hack around odd tooling decisions. The calling convention can and does place IR constraints which this is not considering, and we should try to avoid failing the verifier in any reduction.

Basically I'm fine with reducing fastcc, coldcc, and tailcc to c and opposed to "reducing" any other arbitrary CC. I don't see arbitrarily changing the CC as opening new reduction opportunities, and is easy to manually remove as the final step

nikic added a comment.Jul 20 2023, 1:32 AM

It's an IR centric view. Calling convention is generally irrelevant for middle-end tests and should be omitted from them.

This isn't a middle end test, it's a general IR tool

Sure, but it works on IR, so it should do an IR focused reduction.

I think we have a lot of precedent for this. For example, we reduce away the "dso_local" flag, which is a simplification on the IR level, but likely makes codegen more complex. Similarly, we reduce globals to external linkage -- again, this likely makes codegen more complex. Etc.

llvm-reduce at least right now favors producing minimal IR, not producing IR that gives you the minimum number of asm instructions after codegen.

If we wanted llvm-reduce to optimize for codegen tests, it should probably do things like add nounwind attributes to functions to avoid CFI, convert globals to internal linkage to avoid relocations, etc.

Requiring additional complexity in interestingness tests is not good. The ideal interestingness test is just run the crashing program, I shouldn't have to hack around odd tooling decisions.

You don't need to explicitly add something to the interestingness test. If the crash reproduces only with a specific CC, then you'll get that CC. If it doesn't require a specific CC, then it's okay to remove it.

The calling convention can and does place IR constraints which this is not considering, and we should try to avoid failing the verifier in any reduction.

I do agree with this part -- do you have any example where removing the calling convention would cause a verifier error? I would have assumed that the default cc doesn't have any special restrictions.

If we wanted llvm-reduce to optimize for codegen tests, it should probably do things like add nounwind attributes to functions to avoid CFI, convert globals to internal linkage to avoid relocations, etc.

In the past I've suggested a llvm-reduce flag that would change the reduction behavior to do this since I can definitely see it being useful for debugging codegen issues, but llvm-reduce is a general IR reduction tool which I mostly view as providing nice minimal test cases (text-wise) to put into llvm tests.

I think we have a lot of precedent for this. For example, we reduce away the "dso_local" flag, which is a simplification on the IR level, but likely makes codegen more complex. Similarly, we reduce globals to external linkage -- again, this likely makes codegen more complex. Etc.

This isn't a great analog. dso_local at least fits the mold of an optimization flag. It's not something mandatory. A target should have the right to not implement CCC at all and only use custom calling conventions (this was the case for AMDGPU at the very beginning for example).

llvm-reduce at least right now favors producing minimal IR, not producing IR that gives you the minimum number of asm instructions after codegen.

If we wanted llvm-reduce to optimize for codegen tests, it should probably do things like add nounwind attributes to functions to avoid CFI, convert globals to internal linkage to avoid relocations, etc.

I've thought about doing this, but it is a different thing

nikic added a comment.Jul 25 2023, 4:09 AM

I think we have a lot of precedent for this. For example, we reduce away the "dso_local" flag, which is a simplification on the IR level, but likely makes codegen more complex. Similarly, we reduce globals to external linkage -- again, this likely makes codegen more complex. Etc.

This isn't a great analog. dso_local at least fits the mold of an optimization flag. It's not something mandatory. A target should have the right to not implement CCC at all and only use custom calling conventions (this was the case for AMDGPU at the very beginning for example).

So the concern here is that if a target doesn't implement it, then it's likely that we'll get a false positive interesting test when run against something like ! llc? I think for that particular case (CCC not supported) it would be fine to opt-out the relevant calling convention.

So the concern here is that if a target doesn't implement it, then it's likely that we'll get a false positive interesting test when run against something like ! llc? I think for that particular case (CCC not supported) it would be fine to opt-out the relevant calling convention.

Opt out target stuff has been a constant source of pain in llvm. Opt in is a much more manageable system. I don’t see a practical plus to handling anything other than the handful of common CCs (fast, cold, tail)

So the concern here is that if a target doesn't implement it, then it's likely that we'll get a false positive interesting test when run against something like ! llc?

Personally, I don't think this is a use case worth optimizing for. A match that open ended is bound to trip over other issues - reduce to the wrong crasher, etc. Anything vaguely precise enough to be useful probably isn't going to get tripped up by an attempt to use a calling convention incompatible with the target, I'd have thought?

What's the failure look like when you use a cc incompatible with the target? Does it fail in IR parsing?