Page MenuHomePhabricator

[InstCombine] Known-bits optimization for ARM MVE VADC.
Needs ReviewPublic

Authored by simon_tatham on Wed, Sep 4, 5:49 AM.

Details

Summary

The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this,

s0 = vadcq_u32(a0, b0, &carry);
s1 = vadcq_u32(a1, b1, &carry);

then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.

This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a
result.

Event Timeline

simon_tatham created this revision.Wed, Sep 4, 5:49 AM

Sounds good.

We may also be able to do something similar in ISel, during the optimisation there, but this looks like a more general fix (and I'm not sure that would add a lot over what is already here).

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3301

This looks like something else entirely!

It's setting the range metadata on vminv's?

3336

Is this ever not 32? If so the 32 below should be the same value

3340

I think the last parameter is depth, and the default is 0? If so it might as well be left off.

llvm/test/CodeGen/Thumb2/mve-intrinsics/vadc-multiple.ll
21

Can you include a test for arm_mve_vadc_predicated?

simon_tatham marked an inline comment as done.Thu, Sep 5, 9:03 AM
simon_tatham added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3301

D'oh! Well spotted. That's what I get for writing my commit messages at the last minute.

simon_tatham marked 3 inline comments as done.Wed, Sep 11, 4:57 AM
simon_tatham added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3336

It should always be 32, because that intrinsic argument represents a value you can VMSR into FPSCR. I'll get rid of CarryWidth completely and make it an assertion.

Moved the vmin range metadata into a separate patch.