This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnswitch] Don't remove instructions with side effects after folding them
ClosedPublic

Authored by davide on Apr 28 2017, 4:48 PM.

Details

Summary

This started at

commit 724ecb159395015309d73655641dc56e0063765d
Author: David Majnemer <david.majnemer@gmail.com>
Date:   Tue Jul 19 18:50:26 2016 +0000

    [FunctionAttrs] Correct the safety analysis for inference of 'returned'

    We skipped over ReturnInsts which didn't return an argument which would
    lead us to incorrectly conclude that an argument returned by another
    ReturnInst was 'returned'.

    This reverts commit r275756.

    This fixes PR28610.

The commit exposes, I think, what's a latent bug in LoopUnswitch.
When functionattrs run, we mark the first argument of @fn5 with the returned attribute.

; Function Attrs: noinline nounwind uwtable
define internal fastcc i32 @fn5(i32 returned %ch) unnamed_addr #2 {
entry:

%call = tail call fastcc signext i8 @fn4()
store i8 %call, i8* @c, align 1
tail call fastcc void @fn3()
ret i32 %ch

}

Later on, when LoopUnswitch runs, it calls SimplifyInstruction & if that returns a value, attemps RAUW + eraseFromParent on the instructions in the following block:

for.end: ; preds = %for.cond.for.end_crit_edge, %entry %cj.sroa.0.0.lcssa = phi i8 [ %split, %for.cond.for.end_crit_edge ], [ undef, %entry ] %conv4 = sext i8 %cj.sroa.0.0.lcssa to i32

%call = tail call fastcc i32 @fn5(i32 %conv4)
%conv5 = trunc i32 %call to i16
store i16 %conv5, i16* @ci, align 2                                                                                                                                                  %.pr1 = load i16, i16* @ag, align 2
%tobool = icmp eq i16 %.pr1, 0                                                                                                                                                       br i1 %tobool, label %for.end.while.end.split_crit_edge, label %for.end.for.end.split_crit_edge

This is not quite right I think as we can't DCE an instruction which has side effect, and this is what happens here (the call to @fn5 is DCE'd).

For reference, here's the debug output of the passes with and without the commit:

Replace with '  %cj.sroa.0.03 = phi i8 [ undef, %for.body.lr.ph.split ], [ %.cj.sroa.0.0, %for.body ]':   %.cj.sroa.0.0 = select i1 false, i8 6, i8 %cj.sroa.0.03
Replace with 'i8 undef':   %cj.sroa.0.03 = phi i8 [ undef, %for.body.lr.ph.split ], [ %cj.sroa.0.03, %for.body ]
Replace with 'i8 undef':   %split.ph = phi i8 [ undef, %for.body ]                                                                                                                   
Replace with 'i8 6':   %.cj.sroa.0.0.us = select i1 true, i8 6, i8 %cj.sroa.0.03.us                                                                                                 
Remove dead instruction '  %cj.sroa.0.03.us = phi i8 [ undef, %for.body.lr.ph.split.us ], [ 6, %for.body.us ]
Replace with 'i8 6':   %split.ph.us = phi i8 [ 6, %for.body.us ]        
Replace with 'i8 6':   %split = phi i8 [ undef, %for.cond.for.end_crit_edge.us-lcssa ], [ 6, %for.cond.for.end_crit_edge.us-lcssa.us ]                                               
Replace with 'i8 6':   %cj.sroa.0.0.lcssa = phi i8 [ 6, %for.cond.for.end_crit_edge ], [ undef, %entry ]                                                                             
Replace with 'i32 6':   %conv4 = sext i8 6 to i32                                                                                                                                    
Replace with 'i32 6':   %call = tail call fastcc i32 @fn5(i32 6)                                                                                                                     
Replace with 'i16 6':   %conv5 = trunc i32 6 to i16


Replace with '  %cj.sroa.0.03 = phi i8 [ undef, %for.body.lr.ph.split ], [ %.cj.sroa.0.0, %for.body ]':   %.cj.sroa.0.0 = select i1 false, i8 6, i8 %cj.sroa.0.03                    
Replace with 'i8 undef':   %cj.sroa.0.03 = phi i8 [ undef, %for.body.lr.ph.split ], [ %cj.sroa.0.03, %for.body ]                                                                     
Replace with 'i8 undef':   %split.ph = phi i8 [ undef, %for.body ]                                                                                                                   
Replace with 'i8 6':   %.cj.sroa.0.0.us = select i1 true, i8 6, i8 %cj.sroa.0.03.us                                                                                                 
Remove dead instruction '  %cj.sroa.0.03.us = phi i8 [ undef, %for.body.lr.ph.split.us ], [ 6, %for.body.us ]
Replace with 'i8 6':   %split.ph.us = phi i8 [ 6, %for.body.us ]        
Replace with 'i8 6':   %split = phi i8 [ undef, %for.cond.for.end_crit_edge.us-lcssa ], [ 6, %for.cond.for.end_crit_edge.us-lcssa.us ]                                               
Replace with 'i8 6':   %cj.sroa.0.0.lcssa = phi i8 [ 6, %for.cond.for.end_crit_edge ], [ undef, %entry ]                                                                             
Replace with 'i32 6':   %conv4 = sext i8 6 to i32

Diff Detail

Event Timeline

davide created this revision.Apr 28 2017, 4:48 PM
davide added inline comments.Apr 28 2017, 4:49 PM
lib/Transforms/Scalar/LoopUnswitch.cpp
1278–1279

Alternatively this could use isInstructionTriviallyDead. At that point LoopUnswitch should require TLI I guess and we should thread it properly in the pass.
I don't mind doing that, FWIW, if you prefer and/or it's better.

davide added inline comments.Apr 28 2017, 5:10 PM
lib/Transforms/Scalar/LoopUnswitch.cpp
1278–1279

Nevermind, I was wrong, it doesn't require TLI.

chandlerc accepted this revision.Apr 28 2017, 5:15 PM
chandlerc added a subscriber: chandlerc.

This seems correct and safe. Happy for you to land as-is, or to try a different approach if you want. I don't think you're making anything worse, that's for sure. =D

lib/Transforms/Scalar/LoopUnswitch.cpp
1278–1279

FWIW, I would keep this surgical given the work I'm doing on LoopUnswitch. But I don't feel strongly either way.

This revision is now accepted and ready to land.Apr 28 2017, 5:15 PM
This revision was automatically updated to reflect the committed changes.