memset_chk may not write the number of bytes specified by the third
argument, if it is larger than the destination size (specified as 4th
argument).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
A couple thoughts:
- We need to be careful using the exact memset size. We could end up with an "impossible" memory location: for example, setting 100 bytes of a 50-byte buffer. That's probably okay here? But maybe not in other contexts; we need to be sure we don't treat memset_chk as UB. Maybe worth a comment describing what, exactly, we're modeling.
- In the case where the bounds check in memset_chk fails, it just aborts the process, and there isn't any synchronization point before it does that. So I'm not sure why it matters if the pointer is visible to other threads. Another thread can't read the contents before the process abort because it would be unsynchronized, and the memory doesn't exist after the process aborts.
Thanks for taking a look!
- We need to be careful using the exact memset size. We could end up with an "impossible" memory location: for example, setting 100 bytes of a 50-byte buffer. That's probably okay here? But maybe not in other contexts; we need to be sure we don't treat memset_chk as UB. Maybe worth a comment describing what, exactly, we're modeling.
I think using an 'impossible' memory location should be fine here, we use the location to remove stores based on the fact that the memset_chk is executed and impossible locations will cause an abort. It's a bit of a shame that after all the work of moving most location reasoning from DSE to MemoryLocation we still need special logic here that doesn't fit into MemoryLocation. But as you said, it is not safe in general.
- In the case where the bounds check in memset_chk fails, it just aborts the process, and there isn't any synchronization point before it does that. So I'm not sure why it matters if the pointer is visible to other threads. Another thread can't read the contents before the process abort because it would be unsynchronized, and the memory doesn't exist after the process aborts.
It might have been overly conservative on my side. I was thinking about cases where a failing memset_chk aborts the current thread/process but not other threads or processes that may access the same (shared) memory. But if that's not an issue the restriction can be removed.
memset_chk aborts the whole process. Throwing an exception on an out-of-bounds access is a popular model in other contexts, but memset_chk is not defined that way. It's meant purely as a hardening measure.
memset_chk should not be used on memory shared with other processes. Accesses to such memory should be volatile or atomic; otherwise, weird stuff can happen for a variety of reasons...
Sounds good, thanks! I remove the local-object restriction and added a comment explaining why using the precise bound is fine here.
| llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
|---|---|---|
| 1101 | Hm, I am not sure. For AA it should be fine for the same reasons it's fine for DSE, but I am not sure about some of the other users? Grepping surfaces a couple of additional ones. | |
| llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
|---|---|---|
| 1101 | Looking through other MemoryLocation::getForDest() users, all of them seem to be limited to handling intrinsics, not libcalls, so I don't expect issues with existing callers. | |
Move logic to MemoryLocation, add comment about correctness limited to contexts where the call executes.
| llvm/lib/Analysis/MemoryLocation.cpp | ||
|---|---|---|
| 256 ↗ | (On Diff #394287) | I now moved the code, but I'm not sure how to adjust/frame the comment here. | 
| llvm/lib/Analysis/MemoryLocation.cpp | ||
|---|---|---|
| 256 ↗ | (On Diff #394287) | memset_chk will either write exactly Len bytes or abort the process. If memset_chk aborts, then the partial write cannot be observed. For analysis purposes we can assume that exactly Len bytes are written. | 
| llvm/lib/Analysis/MemoryLocation.cpp | ||
|---|---|---|
| 256 ↗ | (On Diff #394287) | Okay, I'm having second thoughts here, maybe we should go back to your previous version. The main danger is that something decides that because the precise location is larger than the underlying object, the access must be UB. And indeed, this is exactly what we're doing in https://github.com/llvm/llvm-project/blob/15c8b8ad85c196df25aa3c22bca3d8458f86525a/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1604-L1613. My earlier statement that this would be fine for alias analysis was wrong. Though I think that this may be problematic for DSE for the same reason. We pass the location to AA and it could determine that it thus doesn't alias. The saving grace here is probably that https://github.com/llvm/llvm-project/blob/3d595eccc7d5b20d9f202492bf48394ac7c078c6/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp#L926-L935 only uses the NoAlias result if the underlying objects differ, but one could probably still get an incorrect NoAlias result here if the underlying object is hidden in a way that BasicAA understands but DSE does not (e.g. through a phi or select). | 
Finally had some time to look at this again. I updated the patch to move the logic back to DSE and just use it for the sizes used to determine whether an object is overwritten. That should avoid any issues with AA queries.
| llvm/lib/Analysis/MemoryLocation.cpp | ||
|---|---|---|
| 256 ↗ | (On Diff #394287) | Great point, thanks for checking and sorry for the long delay. Finally had some time to look at this again. I updated the patch to move the logic back to DSE and just use it for the sizes used to determine whether an object is overwritten. That should avoid any issues with AA queries. | 
nit: remove 'during'