This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][CodeGen] Fix of incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr + peephole rules for Cmp+Brc
Needs RevisionPublic

Authored by eastig on Mar 21 2016, 9:30 AM.

Details

Reviewers
jmolloy
Summary

AArch64 peephole optimizer tries to remove a compare instruction. One case is comparison with 0: CmpInst %vreg, 0. It removes CmpInst and modifies the instruction defining ‘%vreg’ to the S version. It is not checked that the S version and CmpInst produce the same condition flags.
For example the following code

%vreg7<def> = ANDWri %vreg6<kill>, 15; GPR32sp:%vreg7 GPR32:%vreg6
%vreg8<def> = SUBSWri %vreg7<kill>, 0, 0, %NZCV<imp-def>; GPR32:%vreg8 GPR32sp:%vreg7
Bcc 3, <BB#2>, %NZCV<imp-use>

is transformed to the incorrect code:

%vreg7<def> = ANDSWri %vreg6<kill>, 15, %NZCV<imp-def>; GPR32common:%vreg7 GPR32:%vreg6
Bcc 3, <BB#2>, %NZCV<imp-use>

It's incorrect because 'SUBSWri %vreg7<kill>, 0' always sets the carry flag to 1 but 'ANDSWri %vreg6<kill>, 15' always sets the carry flag to 0. As a result Bcc which checks the carry flag is not working as expected.

The submitted patch contains the fix of the defect and additional peephole rules for Cmp+Brc sequences. New regression tests are created.

Changes description:

• Refactoring:

a. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’ to reflect that the function can check modifying accesses, reading accesses or both.
b. Function ‘AArch64InstrInfo::optimizeCompareInstr’

  • Documented the function
  • Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV
  • The code for the case of substituting CmpInstr is put into separate functions the main of them is ‘substituteCmpInstr’.

• Fix of the defect:

A new algorithm was implemented:

  1. It’s checked that the same condition code is read by instructions after CmpInstr and before the next modification of NZCV. The optimization is not applied if different condition codes are used. It might be difficult to find a candidate for substitution to satisfy all of them. I think this case with multiple used condition codes does not happen often.
  2. Then it’s checked that the instruction which defines a register for CmpInstr can produce the needed condition code itself or its S variant. If it or its S variant can produce then CmpInstr is removed.

• New peephole rules for Cmp+Brc:

The following rules were implemented:

SUBS reg, 0; B.LO => B <false path>
SUBS reg, 0; B.HS <label> => B <label>

• Testing status: all existing + new tests passed.

Diff Detail

Event Timeline

eastig updated this revision to Diff 51171.Mar 21 2016, 9:30 AM
eastig retitled this revision from to [AArch64][CodeGen] Fix of incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr + peephole rules for Cmp+Brc.
eastig updated this object.
eastig added a subscriber: llvm-commits.
eastig updated this object.Mar 21 2016, 9:35 AM
eastig updated this revision to Diff 51272.Mar 22 2016, 4:54 AM

Used arc to have the context of changes.

jmolloy requested changes to this revision.Mar 30 2016, 3:11 AM
jmolloy added a reviewer: jmolloy.
jmolloy added a subscriber: jmolloy.

Hi Evgeny,

Thanks for doing this. The documentation and cleanup is especially welcome.

I have comments below, but the main comment is that this needs to be split into two separate patches. One that performs a NFC refactoring, and another that actually fixes your bug. As is, it's very difficult for me to validate that the functional changes you've introduced are all required and what exactly they do differently.

Cheers,

James

lib/Target/AArch64/AArch64InstrInfo.cpp
825

Can you not just use

std::find(MachineBasicBlock::reverse_iterator(To), To->getParent()->rend(), *From) ?
834

I think inlining the isAccess call in here would make things easier to read:

if ( (AccessToCheck & AK_Write) && ...)
854

This could be better written as :

auto BeginInstrI = std::next(Instr->getIterator());
871

Braces go on the same line as the case.

887

Braces go on the same line as the case.

912

The formatting of this is too verbose. The original code had one LoC per case, you have three. Please reformat as the original.

960

Not properly indented here

969

Unneeded line break here

1013

Brace should be on the same line as the case.

This revision now requires changes to proceed.Mar 30 2016, 3:11 AM

I have comments below, but the main comment is that this needs to be split into two separate patches. One that performs a NFC refactoring, and another that actually fixes your bug.

Thank you for reviewing and advice. I will split it into three patches:

  1. Refactoring
  2. Bug fix
  3. New peephole rules for cmp+brc
eastig added inline comments.Mar 30 2016, 3:44 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
825

No, we cannot. '==' is not defined for MachineInstr. We can check either iterators or pointers.

eastig edited edge metadata.Mar 30 2016, 9:27 AM
eastig changed the visibility from "Public (No Login Required)" to "No One".
eastig changed the visibility from "No One" to "Public (No Login Required)".Mar 30 2016, 10:14 AM