This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Fix up debug loc for simplified memset in processMemSetMemCpyDependence
ClosedPublic

Authored by bjope on Oct 10 2022, 6:17 AM.

Details

Summary

Make sure the code comments in processMemSetMemCpyDependence match
with the actual transform. They indicated that the memset being
rewritten was sunk to after a memcpy, while it actually is inserted
just before the memcpy.

Also make sure we use the debug location of the original memset
when creating the new simplified memset. In the past we've been
using the debug location for the memcpy which could be a bit
confusing.

Diff Detail

Event Timeline

bjope created this revision.Oct 10 2022, 6:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2022, 6:17 AM
bjope requested review of this revision.Oct 10 2022, 6:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2022, 6:17 AM
bjope added inline comments.Oct 10 2022, 6:23 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1281–1284

Side note: This can be replaced by an usub_sat. I guess it would be safe to do that directly here instead of letting InstCombine detect the pattern later?

1295–1296

@fhahn : I think you added the MSSA updating here. But the code comments did not match up with the actual movement of the memset. But I think the actual MSSA update seem to match the implementation, so it was only code comment that was wrong. Right?

And as I hinted elsewhere, maybe we want this transform to move the memset below the memcpy. But that seem to make the MSSA update a bit trickier. I kind of assumed that one could use MSSAU->moveAfter/moveBefore in a situation like this? But maybe that require that the memset instruction only is moved and not replaced (as we need to adjust the arguments here) in some way?

bjope added a comment.Oct 11 2022, 2:28 AM

Alternative patch that aligns the code with the comments instead of updating the comments: D135653

Not sure which one is best or how to motivate the reordering (for out OOT target it might be better to do the memcpy first sometimes, but that is probably a more general scheduling thing rather than something this pass should need to think about).

bjope updated this revision to Diff 521254.May 11 2023, 4:32 AM

Change the focus of the patch to be around correcting the debug loc of the new
memset. And then cleaning up incorrect code comments is an extra bonus.

bjope retitled this revision from [MemCpyOpt] Update code comments for processMemSetMemCpyDependence. NFC to [MemCpyOpt] Fix up debug loc for simplified memset in processMemSetMemCpyDependence.May 11 2023, 4:33 AM
bjope edited the summary of this revision. (Show Details)

Adding some debug info people as reviewers.

I originally added @fhahn as reviewer, to consider the alternative solution D135653 involving changing the implementation to match with the old code comments instead. But since this has been stalled since October I thought that I could shape up this patch and promote this solution in favor of D135653 (as it is a smaller change to only impact debug info and not codegen).

Makes sense to me, the new comments appear to reflect what the code is doing better (insert the memset before the memcpy). The source locations look good too (added one nit comment). Added Orlando as a reviewer as peeling apart memcpys is his favourite thing.

I've a small concern over whether the new memset is going to re-order source locations -- as defined in https://llvm.org/docs/HowToUpdateDebugInfo.html we try to avoid source locations appearing outside of their original block or out of the original program order. It looks like the only caller to processMemSetMemCpyDependence enforces that the memset/memcpy are in the same block which means this patch is fine, but this is liable to become untrue in a future refactor. Would you be able to put a MemSet->getParent() == MemCpy->getParent() assertion next to the call to SetCurrentDebugLocation plus a comment about avoiding the source location appearing out of order? It feels a bit pointless, but IMO it'll guard against future refactors making a mistake.

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1296
llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
26–29

I'd suggest dropping the scope metadata node as it's liable to be re-numbered by output changes in the future, and in this context I don't think it's testing anything meaningful.

bjope updated this revision to Diff 522112.May 15 2023, 3:49 AM

Updated according to review feedback.

jmorse accepted this revision.May 16 2023, 5:38 AM

LGTM, cheers

This revision is now accepted and ready to land.May 16 2023, 5:38 AM
This revision was automatically updated to reflect the committed changes.