This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Set GenCrashDialog=false for various report_fatal_error calls in lib/Target/RISCV
Needs ReviewPublic

Authored by asb on May 16 2023, 7:55 AM.

Details

Summary

The GenCrashDialog parameter to report_fatal_error controls whether a backtrace and an invitation to submit a bug report are printed. As a number of our users of report_fatal_error really indicate an exit with an error message for improper input, it's more appropriate to just print the diagnostic and exit rather than suggest a bug report (especially true for handling of errors from RISCVISAInfo). D62236 is one patch I remember making a similar change.

This patch changes most report_fatal_error calls within lib/Target/RISCV to disable the backtrace and bug report suggestion. I've left it as the default (GenCrashDialog=true) for e.g. getRegisterByName, movImm as I think these are likely to represent an error on the LLVM side that deserves a bug report.

This came up in https://github.com/llvm/llvm-project/issues/62729. Happy to split out or change if people feel different criteria should be used. See also: D150669 which is needed for some of these changes to have the desired effect.

Diff Detail

Unit TestsFailed

TimeTest
30 msLinux x64 > LLVM.CodeGen/RISCV::ghccc-nest.ll
Exit Code: 1 Command Output (stderr):
30 msLinux x64 > LLVM.CodeGen/RISCV::interrupt-attr-args-error.ll
Exit Code: 1 Command Output (stderr):
30 msLinux x64 > LLVM.CodeGen/RISCV::interrupt-attr-invalid.ll
Exit Code: 1 Command Output (stderr):
20 msLinux x64 > LLVM.CodeGen/RISCV::interrupt-attr-ret-error.ll
Exit Code: 1 Command Output (stderr):
70 msLinux x64 > LLVM.CodeGen/RISCV::module-target-abi.ll
Exit Code: 1 Command Output (stderr):
View Full Test Results (18 Failed)

Event Timeline

asb created this revision.May 16 2023, 7:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2023, 7:55 AM
asb requested review of this revision.May 16 2023, 7:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2023, 7:55 AM
asb edited the summary of this revision. (Show Details)May 16 2023, 7:57 AM

report_fatal_error is a lazy error reporting mechanism that should be avoided. Adding false (exit(1) instead of abort()) just papers over the issue...

I guess the description should read GenCrashDiag=false?

asb retitled this revision from [RISCV] Set GenCrashDialog=true for various report_fatal_error calls in lib/Target/RISCV to [RISCV] Set GenCrashDialog=false for various report_fatal_error calls in lib/Target/RISCV.May 16 2023, 10:11 AM
asb added a comment.May 16 2023, 10:14 AM

report_fatal_error is a lazy error reporting mechanism that should be avoided. Adding false (exit(1) instead of abort()) just papers over the issue...

Yes, it would be good to have a better error reporting mechanism in many of these cases and excise these uses of report_fatal_error - but AFAIK the infrastructure just isn't there for many of these to report a proper error with context (corrections welcome!). I think we're in agreement this patch isn't fundamentally improving the situation, just making some minor incremental improvements on what we have today.

arsenm added a subscriber: arsenm.May 16 2023, 10:16 AM

This should be the universal default. report_fatal_error should not emit crash dialogs

report_fatal_error is a lazy error reporting mechanism that should be avoided. Adding false (exit(1) instead of abort()) just papers over the issue...

It's appropriate for certain things. This isn't for issues end users should encounter. There are plenty of situations that need to be treated as not-an-assert and exit, such as situations only reachable through handwritten MIR

asb added a comment.May 17 2023, 3:42 AM

This should be the universal default. report_fatal_error should not emit crash dialogs

Thanks for taking a look. Just for the sake of clarity, is this a general comment or a request to invert the default behaviour of report_fatal_error rather than landing this patch? If the latter - I'd say I'm happy to help improve things here and even post an RFC on the issue, but I'm also well aware it's likely to take a long time to land (not least as it will re-raise the issue, as this patch has, of inadequacies in LLVM's error handling infrastructure). So my preference would be to land this incremental improvement and separate that from work to flip the default or make other broader changes.

asb added a comment.Jun 28 2023, 7:14 AM

This should be the universal default. report_fatal_error should not emit crash dialogs

Thanks for taking a look. Just for the sake of clarity, is this a general comment or a request to invert the default behaviour of report_fatal_error rather than landing this patch? If the latter - I'd say I'm happy to help improve things here and even post an RFC on the issue, but I'm also well aware it's likely to take a long time to land (not least as it will re-raise the issue, as this patch has, of inadequacies in LLVM's error handling infrastructure). So my preference would be to land this incremental improvement and separate that from work to flip the default or make other broader changes.

This needs rebasing, but before doing that I wanted to ping again (@arsenm) on this question. It does make sense to me to improve the usage of report_fatal_error within lib/Target/RISCV according to its current interface, separate to any discussion about changing that interface. But let me know if you feel differently.

I agree that in lib/Target and lib/CodeGen, there are many report_fatal_error places that are difficult to switch to a better error reporting mechanism.
If lib/Target/RISCV does this, should other targets do similar things?

Perhaps you can start a thread on https://discourse.llvm.org/ whether GenCrashDiag should default to false, as I think it is worth bringing more attention. FWIW I support defaulting to false.

This should be the universal default. report_fatal_error should not emit crash dialogs

Thanks for taking a look. Just for the sake of clarity, is this a general comment or a request to invert the default behaviour of report_fatal_error rather than landing this patch? If the latter - I'd say I'm happy to help improve things here and even post an RFC on the issue, but I'm also well aware it's likely to take a long time to land (not least as it will re-raise the issue, as this patch has, of inadequacies in LLVM's error handling infrastructure). So my preference would be to land this incremental improvement and separate that from work to flip the default or make other broader changes.

This needs rebasing, but before doing that I wanted to ping again (@arsenm) on this question. It does make sense to me to improve the usage of report_fatal_error within lib/Target/RISCV according to its current interface, separate to any discussion about changing that interface. But let me know if you feel differently.

Yes. I really think the default needs to change, this has been majorly annoying for over 3 years now. Incrementally improving is fine

asb updated this revision to Diff 557114.Sep 20 2023, 7:32 AM

Rebase. I held off on this due to the branch, but would like to progress now. I'll kick off a discourse thread now on the default - but I'm hopeful to get a review in order to land this as an incremental improvement.

asb added a comment.Sep 27 2023, 4:12 AM

Ping? I don't think anything in this patch is blocked on the RFC thread.

As a number of our users of report_fatal_error really indicate an exit with an error message for improper input

report_fatal_error is a lazy error reporting mechanism that should be avoided. Adding false (exit(1) instead of abort()) just papers over the issue...

Yes, it would be good to have a better error reporting mechanism in many of these cases and excise these uses of report_fatal_error - but AFAIK the infrastructure just isn't there for many of these to report a proper error with context (corrections welcome!).

I'm not really buying the infrastructure argument: seems like a cultural issue in not spending enough design effort for each of these.
For example many of these seems to me like invariants of these components, and as such verification should happen upstream of getting through these code-paths (and thus not validating before become a bug in the tool...)