This is an archive of the discontinued LLVM Phabricator instance.

[ARM][ParallelDSP] Change search for muls
ClosedPublic

Authored by samparker on Aug 23 2019, 8:41 AM.

Details

Summary

rL369567 reverted a couple of recent changes made to ARMParallelDSP because of a miscompilation: PR43073. The issue stemmed from an underlying bug: muls were added into a reduction before it had been proved that they could be executed in parallel with another mul. Most of the changes here are from the previously reverted commits. The additional changes are:

  1. The Search function now doesn't insert any muls into the Reduction object. That now happens once the search has successfully finished.
  2. For any muls that were added into the reduction but weren't paired, we accumulate the value as an input into the smlad.
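
The two changes listed above can be sketched with a small standalone model. This is not the code from ARMParallelDSP.cpp: the `Mul` and `Reduction` structs, the `Slot` adjacency rule used to decide pairing, and the helper names `search`, `createPairs`, `smlad`, `buildReduction` and `lower` are all assumptions made for illustration. It only shows the ordering: the search collects candidates without touching the reduction, the reduction is filled in once the search has succeeded, and any mul left unpaired is folded into the accumulator that feeds the smlad.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of a mul leaf. Two muls are treated as pairable when
// their Slot values are adjacent (a stand-in for sequential loads).
struct Mul {
  int LHS;
  int RHS;
  int Slot;
  bool Supported; // false models a value the search has to reject
  bool Paired;
};

// Hypothetical stand-in for the pass's Reduction object.
struct Reduction {
  std::vector<Mul> Muls;
  std::vector<std::pair<std::size_t, std::size_t>> Pairs;
  int Acc = 0; // incoming accumulator value
};

// Change 1: the search only collects candidates into a local list. On
// failure it returns false and nothing has been committed anywhere.
bool search(const std::vector<Mul> &Candidates, std::vector<Mul> &Found) {
  std::vector<Mul> Local;
  for (const Mul &M : Candidates) {
    if (!M.Supported)
      return false;
    Local.push_back(M);
  }
  Found = std::move(Local);
  return true;
}

// Pair muls with adjacent slots; each mul joins at most one pair.
void createPairs(Reduction &R) {
  for (std::size_t I = 0; I < R.Muls.size(); ++I) {
    if (R.Muls[I].Paired)
      continue;
    for (std::size_t J = 0; J < R.Muls.size(); ++J) {
      if (I == J || R.Muls[J].Paired)
        continue;
      if (R.Muls[J].Slot == R.Muls[I].Slot + 1) {
        R.Muls[I].Paired = true;
        R.Muls[J].Paired = true;
        R.Pairs.emplace_back(I, J);
        break;
      }
    }
  }
}

// Arithmetic model of a dual multiply-accumulate: Acc + A*A' + B*B'.
int smlad(const Mul &A, const Mul &B, int Acc) {
  return Acc + A.LHS * A.RHS + B.LHS * B.RHS;
}

// Muls are inserted into the reduction only after the search has succeeded.
bool buildReduction(const std::vector<Mul> &Candidates, Reduction &R) {
  std::vector<Mul> Found;
  if (!search(Candidates, Found))
    return false;
  R.Muls = std::move(Found);
  createPairs(R);
  return true;
}

// Change 2: unpaired muls are accumulated first, and that value is then
// used as the accumulator input of the smlad chain.
int lower(const Reduction &R) {
  int Acc = R.Acc;
  for (const Mul &M : R.Muls)
    if (!M.Paired)
      Acc += M.LHS * M.RHS;
  for (const auto &P : R.Pairs)
    Acc = smlad(R.Muls[P.first], R.Muls[P.second], Acc);
  return Acc;
}
```

In this model the lowered value always equals the plain sum of all products plus the incoming accumulator, whether or not a given mul found a partner, and a failed search leaves the `Reduction` empty.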

Event Timeline

samparker created this revision. Aug 23 2019, 8:41 AM
efriedma added inline comments. Aug 23 2019, 1:25 PM
lib/Target/ARM/ARMParallelDSP.cpp
675

Maybe add LLVM_DEBUG logging to note the unpaired multiplies?

Should we move the Acc = ConstantInt::get([...] code below this loop, to avoid creating an unnecessary add i32 %foo, 0?
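
The effect of the second suggestion can be illustrated with a small string-based model. This is only a sketch and does not use the LLVM IRBuilder API: the `Emitter` struct and the `accumulateEager` and `accumulateDeferred` helpers are hypothetical names. It contrasts initialising the accumulator to a constant zero before the loop over unpaired muls with deferring that initialisation until after the loop.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical instruction emitter that records textual adds.
struct Emitter {
  std::vector<std::string> Insts;
  std::string add(const std::string &A, const std::string &B) {
    std::string Name = "%t" + std::to_string(Insts.size());
    Insts.push_back(Name + " = add i32 " + A + ", " + B);
    return Name;
  }
};

// Eager: the accumulator starts as constant 0, so the first unpaired mul
// produces an add with a zero operand.
std::string accumulateEager(Emitter &E,
                            const std::vector<std::string> &Unpaired) {
  std::string Acc = "0";
  for (const std::string &M : Unpaired)
    Acc = E.add(Acc, M);
  return Acc;
}

// Deferred: the accumulator stays empty until a value is available, and a
// zero is only materialised after the loop if nothing supplied one.
std::string accumulateDeferred(Emitter &E,
                               const std::vector<std::string> &Unpaired) {
  std::string Acc; // empty means "no accumulator yet"
  for (const std::string &M : Unpaired)
    Acc = Acc.empty() ? M : E.add(Acc, M);
  if (Acc.empty())
    Acc = "0";
  return Acc;
}
```

With two unpaired muls, the eager version records two adds (one of them with a zero operand) while the deferred version records one.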

samparker updated this revision to Diff 217338. Aug 27 2019, 2:58 AM

Thanks. I've added the debug info and moved the accumulator initialisation code.

This revision is now accepted and ready to land. Aug 27 2019, 11:58 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. Aug 28 2019, 1:50 AM