Follow-up to D50405: replace the fatal error with a remark and replace the offending instruction with a trap.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Converting the memory operation to an explicit trap seems overly complicated, as opposed to just fixing whatever isel pattern is assuming the immediate is divisible by 4. But I don't know the Hexagon backend that well.
Could you use the optimization remark emission infrastructure for this, instead of defining your own DiagnosticInfo? SelectionDAG::getORE() returns an optimization remark emitter. Granted, that wouldn't do exactly the same thing, since optimization remarks are only emitted when requested.
Fix the pattern to generate a load with the misaligned address? That already happens. The problem is that we have passes that make changes based on the assumption that the instructions are valid. We make no effort to gracefully handle obviously invalid code, and if we detect it, it is usually in an assert. I think that the trap is a compromise where it won't trigger any further problems during compilation and will still fail for the user.
I'd really like to print the message without users specifically requesting it. The reason being that there is no intuitive connection between a runtime fault and the optimization remark emitter. The AMDGPU isel lowering already uses the same diagnostic infrastructure (except that they don't define their own kind). Do you have a strong preference to use the ORE instead?
Fix the pattern to generate a load with the misaligned address? That already happens. The problem is that we have passes that make changes based on the assumption that the instructions are valid.
I mean, fix the patterns so that you materialize the immediate into a register. That should be enough unless some later pass tries to fold an immediate followed by a load... in which case that pass is still broken because some cases of immediate+load won't show up until after isel (for example, tail-dup can eliminate a PHI node).
I am much more inclined to detect and handle invalid code early than to teach the whole backend to propagate it to the end. If a user decides to use an external assembler, code like that can trigger an error, while using internal assembler will simply hide it.
There's nothing wrong with simplifying code that you've proven has undefined behavior; we do that all over the place in LLVM. We already have code that specifically detects invalid loads/stores in instcombine (although it currently doesn't try to check alignment). That said, the Hexagon backend still has to be prepared for the possibility that some later pass will introduce the sequence "r0 = ##74565; r0 = memw(r0+#0)", because each of those instructions is independently valid. So this is just papering over the underlying compiler crash.
If a user decides to use an external assembler, code like that can trigger an error
If the Hexagon backend is generating assembly that isn't valid according to the assembler, that has to be fixed in the asmprinter, or some very late MIR pass; isel is too early to catch all cases.
Example of what I mean; build with clang --target=hexagon -S -O2:
int a(int f(void)) { int *p = (int*)0x123450; if (f()) { f(); p = (int*)0x123456; } return *p; }
Produces:
{ r0 = ##1193046 r17:16 = memd(r29+#0) } // 8-byte Folded Reload { r0 = memw(r0+#0) dealloc_return }
I understand your point, which I believe is that the user can always sneak in some invalid code using valid instructions, which then some pass will transform into an invalid instruction. This is actually motivated by such a case (an assert cause by user program). The problem is that the assert also helped find compiler bugs, so in that sense it served its purpose. This patch is meant to address the problem in the user code before it gets to that pass, or essentially to filter out user problems before they can no longer be differentiated from internal compiler errors. It isn't intended to catch all possible forms of invalid input, but I believe it covers the only way that the user's program can trigger that particular assertion.
The point about internal vs. external assembler was that if we accept and propagate invalid instructions down the transformation sequence, the integrated assembler will eventually need to generate some code (since it shouldn't report errors). When generating the textual assembly, a wrong instruction can easily be printed, but then the external assembler would pick it up. Patching up invalid instructions in the assembly printer is a bit much, besides, this was really a hypothetical situation (for the purpose of illustrating my point).
It might be worth discussing on llvmdev the question of whether it makes sense to add a flag to clang to control general undefined-behavior diagnostics, as a sort of best-effort thing when the optimizer finds it. We have lib/Analysis/Lint.cpp, but there isn't any clang flag to turn it on. Undefined behavior diagnostics are not something that would ever be reliable in the sense of consistently printing the same diagnostics across compiler versions/optimization levels, and it would always have false positives, so I don't think we would ever turn it on by default. But it could be useful to help users figure out "why did my code disappear" when the compiler generates something unexpected.
Rebased the patch.
This patch happens to fix https://bugs.llvm.org/show_bug.cgi?id=50838.
Whether the remark should be emitted or not may still need to be decided.
llvm/lib/Target/Hexagon/HexagonISelLowering.cpp | ||
---|---|---|
1968 | Do the tests cover all of the new paths defined here? |
llvm/lib/Target/Hexagon/HexagonISelLowering.cpp | ||
---|---|---|
1968 | Turns out it's impossible to get an indexed load/store to an immediate address, since DAG combiner will simply generate the updated address directly, instead of relying on pre- or post-increment. |
Do the tests cover all of the new paths defined here?