[InstCombine] Known-bits optimization for ARM MVE VADC.

Authored by simon_tatham on Sep 11 2019, 2:29 AM.


[InstCombine] Known-bits optimization for ARM MVE VADC.

The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this,

s0 = vadcq_u32(a0, b0, &carry);
s1 = vadcq_u32(a1, b1, &carry);

then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.

This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a

Reviewers: dmgreen, miyuki, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67162


simon_tathamOct 24 2019, 8:33 AM
Differential Revision
D67162: [InstCombine] Known-bits optimization for ARM MVE VADC.
rG08074cc96557: [clang,ARM] Initial ACLE intrinsics for MVE.