This is an archive of the discontinued LLVM Phabricator instance.

[JumpThreading] Put a limit on the PHI nodes when duplicating a BB.
ClosedPublic

Authored by mnadeem on Oct 25 2022, 3:43 PM.

Details

Summary

Do not duplicate a BB if it has a lot of PHI nodes.
If a threadable chain is too long then the number of duplicated PHI nodes
can add up, leading to a substantial increase in compile time when rewriting
the SSA.

Fixes https://github.com/llvm/llvm-project/issues/58203

The threshold of 76 in this patch is reasonably high and reduces the compile
time of cldwat2m_macro.f90 in SPEC2017/cam4 from 80+min to <2min.

Diff Detail

Event Timeline

mnadeem created this revision.Oct 25 2022, 3:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 25 2022, 3:43 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
mnadeem requested review of this revision.Oct 25 2022, 3:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 25 2022, 3:43 PM
tblah added a subscriber: tblah.Oct 27 2022, 2:58 AM

This approach seems fine. Is there any particular reason behind the threshold 76?

llvm/lib/Transforms/Scalar/JumpThreading.cpp
542

Since we're iterating over the beginning of the block anyway, can we use the computed non-PHI instruction?

vzakhari accepted this revision.Oct 31 2022, 9:15 AM
vzakhari added a subscriber: vzakhari.

Looks good to me.

This revision is now accepted and ready to land.Oct 31 2022, 9:15 AM

This approach seems fine. Is there any particular reason behind the threshold 76?

I didn't want to miss any threading opportunities if the compile time was reasonable, 76 seemed high enough (after some experimentation) to cover a lot of cases in SPEC2017/cam4 while maintaining a reasonable compile time.

mnadeem updated this revision to Diff 472087.Oct 31 2022, 11:42 AM

Reuse the computed non-PHI instruction.

mnadeem marked an inline comment as done.Oct 31 2022, 11:42 AM