When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can can introduce UB, if the function is marked as
writeonly.
Fixes #51248.
Differential D123473
[LICM] Only create load in pre-header when promoting load. fhahn on Apr 10 2022, 2:26 PM. Authored by
Details When only a store is sunk, there is no need to create a load in the The dead load can can introduce UB, if the function is marked as Fixes #51248.
Diff Detail
Unit Tests
Event Timeline
Comment Actions I think the patch is not the full fix, as far as I understand. It's a good improvement, perf & correctness wise, though, but doesn't fix #51248. There's one missing case: the load is needed and the function can't load memory (writeonly, readnone, etc?). In that case the optimization should bail out. Comment Actions I think at the moment, the load will only be generated if the source already had a load that is guaranteed to execute. So hoisting that in a write only function should be fine, as it is already UB.
Comment Actions After this got recommitted, our bot has issue during stage2 with llvm-tablgen crashing, have you seen this elsewhere? Comment Actions In case it helps, I was diagnosing a regression in Rust that bisected to the recommit of this when the re-revert landed. https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/9758 has the failure, though I imagine you'd want some un-optimized IR for the test case or something. If you'd like that please let me know and I can get something together. Thanks! Comment Actions If you're looking for a testcase for the latest failure, this change appears to cause a miscompile in llvm::VersionTuple::tryParse, at least in our internal bootstrap builds. Comment Actions I think this needs another look. AFAICT the original version was causing some mis-compiles, because *if* the object we store to is function local (alloca), we sink the store even if the store is not guarnateed to execute (tests added in 5e54a413de1f803). I updated the patch to introduce the load, unless the the store is guaranteed to execute. If the load is introduced, the writeonly attribute is removed. This isn't ideal, because the attribute could have knock-on effects for functions calling the modified function. Maybe a better solution would be to clarify the wording for writeonly in langref to be clear that the attribute only considers writes to memory visible to its callers. Then we can retain writeonly even if loads to local objects are introduced. I doubt considering reads to function-local objects for writeonly adds any value. Comment Actions
These are already accepted semantics for memory attributes, see e.g. the code in checkFunctionMemoryAccess(), which ignores local memory. The wording in LangRef is "If a writeonly function reads memory visible to the program", which is a bit unclear, but probably the intent of "visible to the program" here is "visible to the caller". Comment Actions Right, Alive2 implements that semantics. Memory accesses to local allocas/mallocs are ok in functions that restrict loads and/or stores.
Comment Actions Update the patch to not remove the writeonly attribute, but instead verify that the writeonly semantics are not violated if we need to introduce a load, as clarified in D123473. If the store is not guaranteed to execute, a new load should only be introduced for not escaping noalias calls or allocas. Comment Actions Yeah, I couldn't find an example where this would trigger. The hope is that if it triggers we will soon have a test case. Comment Actions @regehr If you have time, it would be nice to fuzz LICM to check if this assertion can be triggered (after the patch lands). Comment Actions Coming into this quite late, but I'm concerned about this patch and the reasoning discussed in this thread. Consider the fact that the original program may have contained a load along a dynamically dead path inside a loop contained by a writeonly function. Even if we interpret the load as being UB when executed, the original program is well defined. If we then speculatively insert a load into the preheader - which we do based on speculation safety, even with this patch - then the load will execute in a writeonly function. Before: loop { v = 0 if (never_taken) { v = load addr } store v, addr } After: v1 = load addr loop { v = 0 if (never_taken) { v = v1 } store v, addr } If we consider such a load immediate UB, then this is not a valid transform. However, consider that this transform is the same as performed by LICM (e.g. not store promotion.)( If we actually want to prevent this transform, I think we'd have to consider loads in writeonly functions not safe to speculate. I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case. If we accept that semantic, then this patch becomes completely unnecessary. Comment Actions You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly. Not true. The loop may not execute and thus you need the previous value. If the loaded value is poison, you still can't do the transformation, Comment Actions If there is a transformation that turns load in a writeonly function into unreachable, defining it as UB would be necessary. Comment Actions Something can be surprising, and yet a natural consequence of other reasonable choices. I think this is one of them. On the topic of volatile, you raise an interesting point. We don't seem to be suppressing speculation of volatile loads in the common utility. That sounds like a bug, and likely worth a patch of it's own. It's much broader than LICM though.
Yes, you can. That's my point. If the code which branches on it executes, then you have UB. Not before.
Hm, this a minor argument for folding load of caller accessible memory in writeonly to poison as an instcombine rule, but is pretty unconvincing here. If you find it, let me know. We can debate removing it. Comment Actions I run LLVM's test suite under Alive2 and found no issues with the load-is-poison semantics for writeonly & argmemonly functions. Giving it a second thought, I agree with you. These attributes are not there to prevent optimizations, but to enable them. The compiler can do whatever it wants as long as it respects the memory model. The kernel may have a different idea about how the memory model looks like, but for now we don't support such special casing like gcc does (or used to). So I think we can change it. It's generally better to keep UB usage as low as possible, and yield poison whenever possible. So I favor the change. As for this patch, it optimizes the case where the loop & store are guaranteed to execute, so no load is needed. This is independent of writeonly. define void @test(i8 %var, ptr %ptr) { entry: br label %for.cond for.cond: %i = phi i64 [ 0, %entry ], [ %add, %cond.end ] %cmp = icmp ult i64 %i, 2 br i1 %cmp, label %for.body39, label %for.end for.body39: %div = sdiv i8 %var, 3 %cmp2 = icmp slt i8 %div, 0 br i1 %cmp2, label %cond.true, label %cond.false cond.true: br label %cond.end cond.false: br label %cond.end cond.end: %merge = phi i8 [ %div, %cond.true ], [ 0, %cond.false ] store i8 %merge, ptr %ptr, align 1 %add = add i64 %i, 4 br label %for.cond for.end: ret void } LICM currently produces a load that is replaced with poison with this patch: define void @test(i8 %var, ptr %ptr) { entry: %div = sdiv i8 %var, 3 %cmp2 = icmp slt i8 %div, 0 %ptr.promoted = load i8, ptr %ptr, align 1 br label %for.cond ... Comment Actions While I think that's a great first data point, the coverage of writeonly functions is probably not too extensive unfortunately. Another interesting data point may be to (randomly?) add writeonly attributes to the existing tests and verify that.
I think it makes sense in isolation. But I think we should be careful to consider consistency with other/related attributes. At the moment, violations of a set of related memory attributes all have the same effect, UB. If we decide to change that, I think we should update at least all related attributes (readnone, inaccessiblememonly, inaccessiblemem_or_argmemonly, argmemonly, !dereferenceable metadata,...). For attributes that forbid writing to memory, we would need to retain UB? Changing the langref is straight-forward, but enforcing it is difficult. Is there anything we could do to be more confident when adjusting the semantics? I added a test that triggers the assertion in the current version of the patch (3497a4f39601).
Yeah, this should be an independent improvement. I am leaning towards landing the improvement without the assertion. And leave resolution of #51248 until we either adjust LangRef or add a bailout. I probably won't have time in the immediate future to work on updating the langref and possibly audit existing code. Comment Actions Agreed. There are very few tests for writeonly. The good news is that it isn't heavily used either.
This kind of attributes are usually added by LLVM itself, they are not specified by hand by users. Hence we are more free to select the semantics.
Yes, as we use the "nowrite" attributes to determine whether a function can be hoisted and etc, so we can't introduce a store. Also, introducing stores is already forbidden in many cases.
The best thing we have today is fuzzing + Alive2. @regehr has been working on the fuzzing side of things. This strategy has been reasonably successful in testing semantics. The only caveat is that Alive2 doesn't support IPO, so we disable it for the Attributor and related things to avoid false-positives. For attributes that's not ideal. Comment Actions I went ahead and recommitted the patch as 0776c48f9b7e. As per earlier discussions, the committed version doesn't fix the Github issue. |
we can't remove the writeonly attribute locally. We would need to remove it transitively as it's illegal for a writeonly function to call a non-writeonly function.
We need to bail out if a non-local store is needed.